I noticed uneven CPU usage across the pods: around 30 data node pods sit at 100% all the time, while some others are mostly idle.
What is the right way to find out where this busyness is coming from?
Check disk I/O in atop. The most common issue is high CPU due to a busy disk.
Disk I/O looks fine. I'm not seeing util% even hit 1% for these NVMe drives.
The CPU usage of each OpenSearch data node pod is high:
root@swiftsre001:~# docker stats | grep opensearch-data
34f60b15b504 k8s_opensearch-data_opensearch-cluster-data-77_cs-opensearch_2049077b-ec8d-4fdc-919f-7913ff720ab2_0 166.96% 18.07GiB / 32GiB 56.48% 0B / 0B 150MB / 110GB 106
61cf2e90316e k8s_opensearch-data_opensearch-cluster-data-9_cs-opensearch_3211505c-8f38-4dc9-8842-dc089ee01af5_0 200.42% 17.91GiB / 32GiB 55.97% 0B / 0B 166MB / 116GB 99
7aca0b8741c4 k8s_opensearch-data_opensearch-cluster-data-38_cs-opensearch_49fc3d2f-2efa-4f06-8f9d-913925938e60_0 196.32% 17.99GiB / 32GiB 56.22% 0B / 0B 161MB / 115GB 99
ef107121bee7 k8s_opensearch-data_opensearch-cluster-data-55_cs-opensearch_2ca4e20f-8d9c-4b93-b43c-42b8d2ffef0d_0 202.81% 18.09GiB / 32GiB 56.52% 0B / 0B 158MB / 115GB 107
fe1ce845e8d4 k8s_opensearch-data_opensearch-cluster-data-70_cs-opensearch_f6a3b235-4b1d-460c-91fc-fd4e606c42c8_0 177.46% 17.89GiB / 32GiB 55.91% 0B / 0B 136MB / 108GB 104
623eff6b90b9 k8s_opensearch-data_opensearch-cluster-data-19_cs-opensearch_6ba81df6-6306-4cf6-8277-50f837673c36_0 187.65% 17.92GiB / 32GiB 56.01% 0B / 0B 166MB / 120GB 103
8bf4ed37c5d6 k8s_opensearch-data_opensearch-cluster-data-83_cs-opensearch_b721328e-4c5f-4b42-86f8-205752752d44_0 208.78% 17.96GiB / 32GiB 56.11% 0B / 0B 191MB / 121GB 107
dabb9bec4662 k8s_opensearch-data_opensearch-cluster-data-43_cs-opensearch_80ae8caa-d482-475c-b8ab-2d427094734f_0 175.89% 18.2GiB / 32GiB 56.88% 0B / 0B 144MB / 116GB 99
b14d0153cb31 k8s_opensearch-data_opensearch-cluster-data-4_cs-opensearch_6966fdf9-0b10-474f-8d49-cfb67814f24c_0 193.23% 17.97GiB / 32GiB 56.15% 0B / 0B 151MB / 116GB 99
dd55f4809e91 k8s_opensearch-data_opensearch-cluster-data-50_cs-opensearch_fefa51e2-b26a-4ba4-9aa4-582d82df9497_0 203.36% 17.96GiB / 32GiB 56.12% 0B / 0B 144MB / 116GB 97
23c95c66136e k8s_opensearch-data_opensearch-cluster-data-29_cs-opensearch_a58ed33a-c1ed-448c-b25c-2c5b049260e2_0 170.97% 18.55GiB / 32GiB 57.98% 0B / 0B 140MB / 116GB 104
0e2497b27239 k8s_opensearch-data_opensearch-cluster-data-25_cs-opensearch_7f860502-140f-4bb1-82da-e36d80fd07bd_0 201.81% 17.87GiB / 32GiB 55.86% 0B / 0B 158MB / 122GB 102
135b4319121b k8s_opensearch-data_opensearch-cluster-data-14_cs-opensearch_7cc3d07d-3610-4d4b-828a-fc1e343e46a9_0 203.52% 17.87GiB / 32GiB 55.84% 0B / 0B 155MB / 115GB 101
34f60b15b504 k8s_opensearch-data_opensearch-cluster-data-77_cs-opensearch_2049077b-ec8d-4fdc-919f-7913ff720ab2_0 37.82% 18.07GiB / 32GiB 56.48% 0B / 0B 150MB / 110GB 106
61cf2e90316e k8s_opensearch-data_opensearch-cluster-data-9_cs-opensearch_3211505c-8f38-4dc9-8842-dc089ee01af5_0 104.82% 17.91GiB / 32GiB 55.97% 0B / 0B 166MB / 116GB 99
7aca0b8741c4 k8s_opensearch-data_opensearch-cluster-data-38_cs-opensearch_49fc3d2f-2efa-4f06-8f9d-913925938e60_0 178.91% 17.99GiB / 32GiB 56.22% 0B / 0B 161MB / 115GB 99
ef107121bee7 k8s_opensearch-data_opensearch-cluster-data-55_cs-opensearch_2ca4e20f-8d9c-4b93-b43c-42b8d2ffef0d_0 147.72% 18.09GiB / 32GiB 56.52% 0B / 0B 158MB / 115GB 107
fe1ce845e8d4 k8s_opensearch-data_opensearch-cluster-data-70_cs-opensearch_f6a3b235-4b1d-460c-91fc-fd4e606c42c8_0 32.49% 17.89GiB / 32GiB 55.92% 0B / 0B 136MB / 108GB 104
623eff6b90b9 k8s_opensearch-data_opensearch-cluster-data-19_cs-opensearch_6ba81df6-6306-4cf6-8277-50f837673c36_0 110.59% 17.92GiB / 32GiB 56.01% 0B / 0B 166MB / 120GB 103
8bf4ed37c5d6 k8s_opensearch-data_opensearch-cluster-data-83_cs-opensearch_b721328e-4c5f-4b42-86f8-205752752d44_0 42.53% 17.96GiB / 32GiB 56.11% 0B / 0B 191MB / 121GB 107
dabb9bec4662 k8s_opensearch-data_opensearch-cluster-data-43_cs-opensearch_80ae8caa-d482-475c-b8ab-2d427094734f_0 213.08% 18.2GiB / 32GiB 56.88% 0B / 0B 144MB / 116GB 99
Hey @hugok,
It is worth inspecting what each node is doing. Could you please capture the hot threads stats [1] and share them here if possible? Thank you.
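For example, something along these lines (the standard nodes hot_threads endpoint; host and any credentials will need adjusting for your setup):
curl -s 'http://localhost:9200/_nodes/hot_threads?threads=3&interval=500ms'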
Why is the %util of sdb that high? xD
@reta
Here is the hot_threads output from one of the opensearch-data nodes running at 100%. The output is not easy to interpret; is this a disk I/O bottleneck or a CPU bottleneck?
I can see a write thread and a merge thread consuming CPU.
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.42.8.89 34 100 100 24.63 24.52 24.66 d - opensearch-cluster-data-20
::: {opensearch-cluster-data-20}{suHu5HqQTyqLWYStHzp_mA}{dWs14VEYR6eWiE3ri2kh7A}{10.42.8.89}{10.42.8.89:9300}{d}{shard_indexing_pressure_enabled=true}
Hot threads at 2022-05-31T05:54:46.569Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
87.2% (436.2ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][write][T#1]'
2/10 snapshots sharing following 26 elements
app//org.opensearch.index.translog.Translog.ensureSynced(Translog.java:871)
app//org.opensearch.index.translog.Translog.ensureSynced(Translog.java:892)
app//org.opensearch.index.engine.InternalEngine.ensureTranslogSynced(InternalEngine.java:599)
app//org.opensearch.index.shard.IndexShard$5.write(IndexShard.java:3552)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:121)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:109)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:97)
app//org.opensearch.index.shard.IndexShard.sync(IndexShard.java:3575)
app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:478)
app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult.runPostReplicationActions(TransportWriteAction.java:316)
app//org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:176)
app//org.opensearch.action.support.replication.ReplicationOperation$$Lambda$3633/0x0000000800fc5440.accept(Unknown Source)
app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
app//org.opensearch.action.ActionListener.completeWith(ActionListener.java:351)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.finishRequest(TransportShardBulkAction.java:264)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:227)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
2/10 snapshots sharing following 19 elements
app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:429)
app//org.opensearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:135)
app//org.opensearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:90)
app//org.opensearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:252)
app//org.opensearch.index.shard.IndexShard.prepareIndex(IndexShard.java:926)
app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:883)
app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
2/10 snapshots sharing following 25 elements
app//org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:527)
app//org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:491)
app//org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:208)
app//org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:415)
app//org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1471)
app//org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1757)
app//org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1400)
app//org.opensearch.index.engine.InternalEngine.addDocs(InternalEngine.java:1321)
app//org.opensearch.index.engine.InternalEngine.indexIntoLucene(InternalEngine.java:1257)
app//org.opensearch.index.engine.InternalEngine.index(InternalEngine.java:1060)
app//org.opensearch.index.shard.IndexShard.index(IndexShard.java:965)
app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:909)
app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
2/10 snapshots sharing following 34 elements
app//org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction$$Lambda$3630/0x0000000800fc4840.accept(Unknown Source)
app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
app//org.opensearch.action.ActionListener$4.onResponse(ActionListener.java:178)
app//org.opensearch.action.support.replication.ReplicationOperation.finish(ReplicationOperation.java:420)
app//org.opensearch.action.support.replication.ReplicationOperation.decPendingAndFinishIfNeeded(ReplicationOperation.java:406)
app//org.opensearch.action.support.replication.ReplicationOperation.access$300(ReplicationOperation.java:70)
app//org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:184)
app//org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:176)
app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult$1.onSuccess(TransportWriteAction.java:309)
app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.maybeFinish(TransportWriteAction.java:451)
app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.lambda$run$1(TransportWriteAction.java:480)
app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction$$Lambda$3644/0x0000000800fd8440.accept(Unknown Source)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.notifyList(AsyncIOProcessor.java:135)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:113)
app//org.opensearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:97)
app//org.opensearch.index.shard.IndexShard.sync(IndexShard.java:3575)
app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:478)
app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult.runPostReplicationActions(TransportWriteAction.java:316)
app//org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:176)
app//org.opensearch.action.support.replication.ReplicationOperation$$Lambda$3633/0x0000000800fc5440.accept(Unknown Source)
app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
app//org.opensearch.action.ActionListener.completeWith(ActionListener.java:351)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.finishRequest(TransportShardBulkAction.java:264)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:227)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
2/10 snapshots sharing following 18 elements
app//org.opensearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:128)
app//org.opensearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:90)
app//org.opensearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:252)
app//org.opensearch.index.shard.IndexShard.prepareIndex(IndexShard.java:926)
app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:883)
app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
62.1% (310.4ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][transport_worker][T#1]'
2/10 snapshots sharing following 49 elements
app//org.opensearch.action.support.replication.TransportReplicationAction$$Lambda$2624/0x0000000800cd2840.messageReceived(Unknown Source)
org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceivedDecorate(SecuritySSLRequestHandler.java:193)
org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:336)
org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:153)
org.opensearch.security.OpenSearchSecurityPlugin$7$1.messageReceived(OpenSearchSecurityPlugin.java:647)
org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:118)
org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43)
app//org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:91)
app//org.opensearch.transport.InboundHandler.handleRequest(InboundHandler.java:244)
app//org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:127)
app//org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:109)
app//org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:759)
org.opensearch.transport.netty4.Netty4MessageChannelHandler$$Lambda$3131/0x0000000800e5b440.accept(Unknown Source)
app//org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:170)
app//org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:145)
app//org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:110)
org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:94)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1371)
io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283)
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
2/10 snapshots sharing following 20 elements
io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283)
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
4/10 snapshots sharing following 3 elements
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
61.5% (307.4ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][[filebeat-swift-v1-proxy-server-2022.05.31][26]: Lucene Merge Thread #16]'
2/10 snapshots sharing following 17 elements
app//org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:52)
app//org.apache.lucene.store.RAMOutputStream.writeTo(RAMOutputStream.java:90)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlock(BlockTreeTermsWriter.java:848)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:606)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
6/10 snapshots sharing following 14 elements
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:606)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
2/10 snapshots sharing following 14 elements
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:628)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
The PVC is backed by nvme8n1 on the k8s host. df and iostat output below:
[opensearch@opensearch-cluster-data-20 ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
overlay 878G 94G 740G 12% /
tmpfs 64M 0 64M 0% /dev
tmpfs 378G 0 378G 0% /sys/fs/cgroup
/dev/mapper/system-root 878G 94G 740G 12% /etc/hosts
shm 64M 2.1M 62M 4% /dev/shm
/dev/nvme8n1 5.8T 1.3T 4.3T 23% /usr/share/opensearch/data
tmpfs 378G 12K 378G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 378G 0 378G 0% /proc/acpi
tmpfs 378G 0 378G 0% /proc/scsi
tmpfs 378G 0 378G 0% /sys/firmware
root@swiftsre003:/mnt/nvme8n1# iostat -dmx 3 /dev/nvme8n1
Linux 4.15.0-167-generic (swiftsre003) 05/31/2022 _x86_64_ (80 CPU)
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme8n1 2.27 1569.05 0.27 22.39 0.00 1125.34 0.00 41.77 0.20 0.08 0.08 123.38 14.61 0.00 0.13
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme8n1 0.00 4244.67 0.00 29.48 0.00 3113.00 0.00 42.31 0.00 0.02 0.00 0.00 7.11 0.00 0.00
Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
nvme8n1 0.00 4042.00 0.00 39.15 0.00 3046.33 0.00 42.98 0.00 0.03 0.04 0.00 9.92 0.00 0.53
Why is the %util of sdb that high? xD
sdb is an LVM physical volume for the OS root partition. Not sure why it went high; I'm not seeing such high usage of sdb today.
thx
Looks like translog sync. Try:
index.translog.durability: "async"
index.translog.sync_interval: "30s"
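These are dynamic index settings, so they should be applicable to existing indexes via the settings API, roughly like this (a sketch; the filebeat-* pattern and the plain-HTTP, no-auth endpoint are just placeholders):
curl -s -XPUT 'http://localhost:9200/filebeat-*/_settings' -H 'Content-Type: application/json' -d '
{
  "index.translog.durability": "async",
  "index.translog.sync_interval": "30s"
}'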
Thanks,
Matt Weber
Applied the translog async settings to the indexes. I'm not seeing a significant change in CPU usage; maybe the CPU freed up from translog syncing has been taken over by the Lucene merge threads?
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.42.6.50 34 100 100 16.16 16.12 13.29 d - opensearch-cluster-data-23
Is it normal to have merge throttling?
::: {opensearch-cluster-data-75}{mdo83cflRRO5IpN0mlsYVw}{aNMNoWagSqyv-1SSdwuNrA}{10.42.4.117}{10.42.4.117:9300}{d}{shard_indexing_pressure_enabled=true}
Hot threads at 2022-06-01T05:16:50.807Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
33.3% (166.3ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][[filebeat-swift-v1-proxy-server-2022.06.01][12]: Lucene Merge Thread #9]'
10/10 snapshots sharing following 14 elements
app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.doAddSortedField(Lucene80DocValuesConsumer.java:682)
app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addSortedSetField(Lucene80DocValuesConsumer.java:891)
app//org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:804)
app//org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:145)
org.opensearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer.merge(KNN80DocValuesConsumer.java:212)
app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:155)
app//org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:195)
app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:150)
app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
31.4% (157.2ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][transport_worker][T#1]'
9/10 snapshots sharing following 3 elements
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
28.7% (143.6ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][write][T#1]'
4/10 snapshots sharing following 10 elements
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
6/10 snapshots sharing following 10 elements
java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:743)
java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:684)
java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1366)
app//org.opensearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:176)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
Hi @hugok,
Merge throttling seems to be enabled by default (the index.merge.scheduler.auto_throttle setting). I am wondering what values the index.merge.scheduler.max_thread_count and index.merge.scheduler.max_merge_count settings have. Also, how many CPUs have you allocated per node?
@reta
I don't see these settings at the cluster or index level, so I guess they just use the defaults (one way to check is sketched below the formula). For max_thread_count, I believe the default is:
Math.max(1, Math.min(4, Runtime.getRuntime().availableProcessors() / 2))
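One way to check what is actually in effect (a sketch, assuming the get-settings API with defaults included; adjust host/auth as needed):
curl -s 'http://localhost:9200/filebeat-swift-v1-proxy-server-2022.06.01/_settings?include_defaults=true&flat_settings=true&pretty' | grep merge.scheduler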
What numbers do you think are worth trying? For example:
- index.merge.scheduler.max_thread_count: 10
- index.merge.scheduler.max_merge_count: 100
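If these look worth a try, they should be settable per index in the same way as the translog settings above (a sketch; the index pattern is a placeholder, and whether both settings are dynamically updatable should be verified first):
curl -s -XPUT 'http://localhost:9200/filebeat-*/_settings' -H 'Content-Type: application/json' -d '
{
  "index.merge.scheduler.max_thread_count": 10,
  "index.merge.scheduler.max_merge_count": 100
}'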
How many CPUs have you allocated per node?
This is the key point, and I'm updating this topic with our recent tuning results. There is a very important Kubernetes setting when running OpenSearch: I had left both the CPU request (reservation) and the CPU limit empty, thinking that meant no limitation for the OpenSearch pods. Unfortunately, that's not true. When the OpenSearch API showed the data node CPU at 100%, the hosts running these pods were not CPU-busy at all.
Finally, if you set the request and limit to concrete numbers, Kubernetes ensures the requested CPU time is actually reserved for the pod. THIS IS SO IMPORTANT. Our OpenSearch on k8s now runs about 4x faster for both indexing and searching. It could be even faster with a larger CPU reservation, but there are other services in the k8s cluster, and the current setting already meets our requirements. CPU usage bursts up to the limit under heavy load and drops to 25~40% under normal traffic.
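For reference, the change is just the resources block on the data-node container in the StatefulSet spec. A minimal sketch (the CPU figures are illustrative placeholders, not our exact production values; 32Gi matches the memory limit visible in the docker stats above):
resources:
  requests:
    cpu: "8"         # reserved: scheduling and cgroup CPU shares guarantee this much under contention
    memory: "32Gi"
  limits:
    cpu: "16"        # hard cap the pod can burst up to
    memory: "32Gi"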