How to find the source of 100% CPU usage on data nodes

I noticed uneven CPU usage across pods. As you can see, around 30 data node pods hit 100% all the time while some others are pretty idle.
What's the right way to find where the load is coming from?

Check disk I/O in atop. The most common issue is high CPU due to a busy disk.
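For example, a quick way to watch per-device activity while the CPU spikes (a minimal sketch; iostat comes from the sysstat package, and /dev/nvme0n1 is a placeholder for whichever device backs your data path):

# atop refreshed every 3 seconds; the DSK lines show per-device busy %
atop 3

# or extended per-device stats for a single disk
iostat -dmx 3 /dev/nvme0n1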


Disk I/O looks fine. I'm not seeing %util exceed 1% on these NVMes.

The CPU usage of each OpenSearch data node pod is high:

root@swiftsre001:~# docker stats | grep opensearch-data
34f60b15b504   k8s_opensearch-data_opensearch-cluster-data-77_cs-opensearch_2049077b-ec8d-4fdc-919f-7913ff720ab2_0                                         166.96%   18.07GiB / 32GiB      56.48%    0B / 0B   150MB / 110GB     106
61cf2e90316e   k8s_opensearch-data_opensearch-cluster-data-9_cs-opensearch_3211505c-8f38-4dc9-8842-dc089ee01af5_0                                          200.42%   17.91GiB / 32GiB      55.97%    0B / 0B   166MB / 116GB     99
7aca0b8741c4   k8s_opensearch-data_opensearch-cluster-data-38_cs-opensearch_49fc3d2f-2efa-4f06-8f9d-913925938e60_0                                         196.32%   17.99GiB / 32GiB      56.22%    0B / 0B   161MB / 115GB     99
ef107121bee7   k8s_opensearch-data_opensearch-cluster-data-55_cs-opensearch_2ca4e20f-8d9c-4b93-b43c-42b8d2ffef0d_0                                         202.81%   18.09GiB / 32GiB      56.52%    0B / 0B   158MB / 115GB     107
fe1ce845e8d4   k8s_opensearch-data_opensearch-cluster-data-70_cs-opensearch_f6a3b235-4b1d-460c-91fc-fd4e606c42c8_0                                         177.46%   17.89GiB / 32GiB      55.91%    0B / 0B   136MB / 108GB     104
623eff6b90b9   k8s_opensearch-data_opensearch-cluster-data-19_cs-opensearch_6ba81df6-6306-4cf6-8277-50f837673c36_0                                         187.65%   17.92GiB / 32GiB      56.01%    0B / 0B   166MB / 120GB     103
8bf4ed37c5d6   k8s_opensearch-data_opensearch-cluster-data-83_cs-opensearch_b721328e-4c5f-4b42-86f8-205752752d44_0                                         208.78%   17.96GiB / 32GiB      56.11%    0B / 0B   191MB / 121GB     107
dabb9bec4662   k8s_opensearch-data_opensearch-cluster-data-43_cs-opensearch_80ae8caa-d482-475c-b8ab-2d427094734f_0                                         175.89%   18.2GiB / 32GiB       56.88%    0B / 0B   144MB / 116GB     99
b14d0153cb31   k8s_opensearch-data_opensearch-cluster-data-4_cs-opensearch_6966fdf9-0b10-474f-8d49-cfb67814f24c_0                                          193.23%   17.97GiB / 32GiB      56.15%    0B / 0B   151MB / 116GB     99
dd55f4809e91   k8s_opensearch-data_opensearch-cluster-data-50_cs-opensearch_fefa51e2-b26a-4ba4-9aa4-582d82df9497_0                                         203.36%   17.96GiB / 32GiB      56.12%    0B / 0B   144MB / 116GB     97
23c95c66136e   k8s_opensearch-data_opensearch-cluster-data-29_cs-opensearch_a58ed33a-c1ed-448c-b25c-2c5b049260e2_0                                         170.97%   18.55GiB / 32GiB      57.98%    0B / 0B   140MB / 116GB     104
0e2497b27239   k8s_opensearch-data_opensearch-cluster-data-25_cs-opensearch_7f860502-140f-4bb1-82da-e36d80fd07bd_0                                         201.81%   17.87GiB / 32GiB      55.86%    0B / 0B   158MB / 122GB     102
135b4319121b   k8s_opensearch-data_opensearch-cluster-data-14_cs-opensearch_7cc3d07d-3610-4d4b-828a-fc1e343e46a9_0                                         203.52%   17.87GiB / 32GiB      55.84%    0B / 0B   155MB / 115GB     101
34f60b15b504   k8s_opensearch-data_opensearch-cluster-data-77_cs-opensearch_2049077b-ec8d-4fdc-919f-7913ff720ab2_0                                         37.82%    18.07GiB / 32GiB      56.48%    0B / 0B   150MB / 110GB     106
61cf2e90316e   k8s_opensearch-data_opensearch-cluster-data-9_cs-opensearch_3211505c-8f38-4dc9-8842-dc089ee01af5_0                                          104.82%   17.91GiB / 32GiB      55.97%    0B / 0B   166MB / 116GB     99
7aca0b8741c4   k8s_opensearch-data_opensearch-cluster-data-38_cs-opensearch_49fc3d2f-2efa-4f06-8f9d-913925938e60_0                                         178.91%   17.99GiB / 32GiB      56.22%    0B / 0B   161MB / 115GB     99
ef107121bee7   k8s_opensearch-data_opensearch-cluster-data-55_cs-opensearch_2ca4e20f-8d9c-4b93-b43c-42b8d2ffef0d_0                                         147.72%   18.09GiB / 32GiB      56.52%    0B / 0B   158MB / 115GB     107
fe1ce845e8d4   k8s_opensearch-data_opensearch-cluster-data-70_cs-opensearch_f6a3b235-4b1d-460c-91fc-fd4e606c42c8_0                                         32.49%    17.89GiB / 32GiB      55.92%    0B / 0B   136MB / 108GB     104
623eff6b90b9   k8s_opensearch-data_opensearch-cluster-data-19_cs-opensearch_6ba81df6-6306-4cf6-8277-50f837673c36_0                                         110.59%   17.92GiB / 32GiB      56.01%    0B / 0B   166MB / 120GB     103
8bf4ed37c5d6   k8s_opensearch-data_opensearch-cluster-data-83_cs-opensearch_b721328e-4c5f-4b42-86f8-205752752d44_0                                         42.53%    17.96GiB / 32GiB      56.11%    0B / 0B   191MB / 121GB     107
dabb9bec4662   k8s_opensearch-data_opensearch-cluster-data-43_cs-opensearch_80ae8caa-d482-475c-b8ab-2d427094734f_0                                         213.08%   18.2GiB / 32GiB       56.88%    0B / 0B   144MB / 116GB     99

Hey @hugok ,

It is worth inspecting what each node is doing. Could you please capture the hot threads stats [1] and share them afterwards if possible? Thank you.

[1] Nodes hot threads - OpenSearch documentation
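For reference, a minimal way to capture it (a sketch; host, port, and credentials are placeholders and the flags depend on your security setup):

# top 3 busiest threads per node, sampled over 500ms, idle threads ignored
curl -sk -u admin:admin "https://localhost:9200/_nodes/hot_threads?threads=3&interval=500ms&ignore_idle_threads=true"

# or target a single busy data node by name
curl -sk -u admin:admin "https://localhost:9200/_nodes/opensearch-cluster-data-20/hot_threads?threads=3"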

Why is the %util of sdb that high? xD

@reta
Here is the hot_threads output from one of the 100%-utilized opensearch-data nodes. The output is not easy to understand. Is it a disk I/O or a CPU bottleneck?

I can see a write thread and a merge thread consuming CPU.

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.42.8.89             34         100 100   24.63   24.52    24.66 d         -      opensearch-cluster-data-20
::: {opensearch-cluster-data-20}{suHu5HqQTyqLWYStHzp_mA}{dWs14VEYR6eWiE3ri2kh7A}{10.42.8.89}{10.42.8.89:9300}{d}{shard_indexing_pressure_enabled=true}
   Hot threads at 2022-05-31T05:54:46.569Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   87.2% (436.2ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][write][T#1]'
     2/10 snapshots sharing following 26 elements
       app//org.opensearch.index.translog.Translog.ensureSynced(Translog.java:871)
       app//org.opensearch.index.translog.Translog.ensureSynced(Translog.java:892)
       app//org.opensearch.index.engine.InternalEngine.ensureTranslogSynced(InternalEngine.java:599)
       app//org.opensearch.index.shard.IndexShard$5.write(IndexShard.java:3552)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.processList(AsyncIOProcessor.java:121)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:109)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:97)
       app//org.opensearch.index.shard.IndexShard.sync(IndexShard.java:3575)
       app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:478)
       app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult.runPostReplicationActions(TransportWriteAction.java:316)
       app//org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:176)
       app//org.opensearch.action.support.replication.ReplicationOperation$$Lambda$3633/0x0000000800fc5440.accept(Unknown Source)
       app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
       app//org.opensearch.action.ActionListener.completeWith(ActionListener.java:351)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.finishRequest(TransportShardBulkAction.java:264)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:227)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     2/10 snapshots sharing following 19 elements
       app//org.opensearch.index.mapper.DocumentParser.parseObjectOrNested(DocumentParser.java:429)
       app//org.opensearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:135)
       app//org.opensearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:90)
       app//org.opensearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:252)
       app//org.opensearch.index.shard.IndexShard.prepareIndex(IndexShard.java:926)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:883)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
       app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     2/10 snapshots sharing following 25 elements
       app//org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:527)
       app//org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:491)
       app//org.apache.lucene.index.DocumentsWriterPerThread.updateDocuments(DocumentsWriterPerThread.java:208)
       app//org.apache.lucene.index.DocumentsWriter.updateDocuments(DocumentsWriter.java:415)
       app//org.apache.lucene.index.IndexWriter.updateDocuments(IndexWriter.java:1471)
       app//org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1757)
       app//org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1400)
       app//org.opensearch.index.engine.InternalEngine.addDocs(InternalEngine.java:1321)
       app//org.opensearch.index.engine.InternalEngine.indexIntoLucene(InternalEngine.java:1257)
       app//org.opensearch.index.engine.InternalEngine.index(InternalEngine.java:1060)
       app//org.opensearch.index.shard.IndexShard.index(IndexShard.java:965)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:909)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
       app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     2/10 snapshots sharing following 34 elements
       app//org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction$$Lambda$3630/0x0000000800fc4840.accept(Unknown Source)
       app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
       app//org.opensearch.action.ActionListener$4.onResponse(ActionListener.java:178)
       app//org.opensearch.action.support.replication.ReplicationOperation.finish(ReplicationOperation.java:420)
       app//org.opensearch.action.support.replication.ReplicationOperation.decPendingAndFinishIfNeeded(ReplicationOperation.java:406)
       app//org.opensearch.action.support.replication.ReplicationOperation.access$300(ReplicationOperation.java:70)
       app//org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:184)
       app//org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:176)
       app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult$1.onSuccess(TransportWriteAction.java:309)
       app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.maybeFinish(TransportWriteAction.java:451)
       app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.lambda$run$1(TransportWriteAction.java:480)
       app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction$$Lambda$3644/0x0000000800fd8440.accept(Unknown Source)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.notifyList(AsyncIOProcessor.java:135)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.drainAndProcessAndRelease(AsyncIOProcessor.java:113)
       app//org.opensearch.common.util.concurrent.AsyncIOProcessor.put(AsyncIOProcessor.java:97)
       app//org.opensearch.index.shard.IndexShard.sync(IndexShard.java:3575)
       app//org.opensearch.action.support.replication.TransportWriteAction$AsyncAfterWriteAction.run(TransportWriteAction.java:478)
       app//org.opensearch.action.support.replication.TransportWriteAction$WritePrimaryResult.runPostReplicationActions(TransportWriteAction.java:316)
       app//org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:176)
       app//org.opensearch.action.support.replication.ReplicationOperation$$Lambda$3633/0x0000000800fc5440.accept(Unknown Source)
       app//org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:78)
       app//org.opensearch.action.ActionListener.completeWith(ActionListener.java:351)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.finishRequest(TransportShardBulkAction.java:264)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:227)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     2/10 snapshots sharing following 18 elements
       app//org.opensearch.index.mapper.DocumentParser.internalParseDocument(DocumentParser.java:128)
       app//org.opensearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:90)
       app//org.opensearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:252)
       app//org.opensearch.index.shard.IndexShard.prepareIndex(IndexShard.java:926)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperation(IndexShard.java:883)
       app//org.opensearch.index.shard.IndexShard.applyIndexOperationOnPrimary(IndexShard.java:810)
       app//org.opensearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:351)
       app//org.opensearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:212)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
   
   62.1% (310.4ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][transport_worker][T#1]'
     2/10 snapshots sharing following 49 elements
       app//org.opensearch.action.support.replication.TransportReplicationAction$$Lambda$2624/0x0000000800cd2840.messageReceived(Unknown Source)
       org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceivedDecorate(SecuritySSLRequestHandler.java:193)
       org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:336)
       org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:153)
       org.opensearch.security.OpenSearchSecurityPlugin$7$1.messageReceived(OpenSearchSecurityPlugin.java:647)
       org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:118)
       org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43)
       app//org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:91)
       app//org.opensearch.transport.InboundHandler.handleRequest(InboundHandler.java:244)
       app//org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:127)
       app//org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:109)
       app//org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:759)
       org.opensearch.transport.netty4.Netty4MessageChannelHandler$$Lambda$3131/0x0000000800e5b440.accept(Unknown Source)
       app//org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:170)
       app//org.opensearch.transport.InboundPipeline.doHandleBytes(InboundPipeline.java:145)
       app//org.opensearch.transport.InboundPipeline.handleBytes(InboundPipeline.java:110)
       org.opensearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:94)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:280)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1371)
       io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
       io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283)
       io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
       io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
       io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
       io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
       io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
       io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)
       io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)
       io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     2/10 snapshots sharing following 20 elements
       io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1234)
       io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1283)
       io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510)
       io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449)
       io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
       io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
       io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
       io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
       io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
       io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
       io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623)
       io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586)
       io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     4/10 snapshots sharing following 3 elements
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
   
   61.5% (307.4ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-20][[filebeat-swift-v1-proxy-server-2022.05.31][26]: Lucene Merge Thread #16]'
     2/10 snapshots sharing following 17 elements
       app//org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:52)
       app//org.apache.lucene.store.RAMOutputStream.writeTo(RAMOutputStream.java:90)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlock(BlockTreeTermsWriter.java:848)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:606)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
       app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
       app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
       app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
       app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
       app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
       app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
       app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
       app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
       app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
       app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
     6/10 snapshots sharing following 14 elements
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:606)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
       app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
       app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
       app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
       app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
       app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
       app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
       app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
       app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
       app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
       app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
     2/10 snapshots sharing following 14 elements
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:628)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.pushTerm(BlockTreeTermsWriter.java:947)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:912)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:318)
       app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
       app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:197)
       app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
       app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
       app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
       app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
       app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
       app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
       app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
       app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)

The PVC is backed by nvme8n1 on a k8s host. The iostat output is below.

[opensearch@opensearch-cluster-data-20 ~]$ df -h
Filesystem               Size  Used Avail Use% Mounted on
overlay                  878G   94G  740G  12% /
tmpfs                     64M     0   64M   0% /dev
tmpfs                    378G     0  378G   0% /sys/fs/cgroup
/dev/mapper/system-root  878G   94G  740G  12% /etc/hosts
shm                       64M  2.1M   62M   4% /dev/shm
/dev/nvme8n1             5.8T  1.3T  4.3T  23% /usr/share/opensearch/data
tmpfs                    378G   12K  378G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                    378G     0  378G   0% /proc/acpi
tmpfs                    378G     0  378G   0% /proc/scsi
tmpfs                    378G     0  378G   0% /sys/firmware

root@swiftsre003:/mnt/nvme8n1# iostat -dmx 3 /dev/nvme8n1
Linux 4.15.0-167-generic (swiftsre003) 	05/31/2022 	_x86_64_	(80 CPU)

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme8n1          2.27 1569.05      0.27     22.39     0.00  1125.34   0.00  41.77    0.20    0.08   0.08   123.38    14.61   0.00   0.13

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme8n1          0.00 4244.67      0.00     29.48     0.00  3113.00   0.00  42.31    0.00    0.02   0.00     0.00     7.11   0.00   0.00

Device            r/s     w/s     rMB/s     wMB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
nvme8n1          0.00 4042.00      0.00     39.15     0.00  3046.33   0.00  42.98    0.00    0.03   0.04     0.00     9.92   0.00   0.53

Why is the %util of sdb that high? xD

It's a PV of the OS root partition's LVM. Not sure why it went high; I'm not seeing such high usage on sdb today.

thx

Looks like translog sync. Try:

index.translog.durability: "async"
index.translog.sync_interval: "30s"
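For example, these could be applied to the existing indices with the settings API (or put in an index template for new ones); a sketch, with the index pattern and credentials as placeholders:

curl -sk -u admin:admin -X PUT "https://localhost:9200/filebeat-swift-v1-*/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.translog.durability": "async", "index.translog.sync_interval": "30s"}'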

Thanks,
Matt Weber

I applied the async translog settings to the indexes, but I'm not seeing a significant change in CPU usage. Maybe the CPU freed up from translog syncing has been taken over by the Lucene merge threads?

ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.42.6.50             34         100 100   16.16   16.12    13.29 d         -      opensearch-cluster-data-23

Is it normal to have merge throttling?

[Screenshot attached: Screen Shot 2022-06-01 at 1.47.33 PM]
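One way to quantify the throttling is the merge section of the index stats, which reports how long merges have spent throttled (a sketch; index pattern and credentials are placeholders):

# look for merges.total_throttled_time_in_millis in the response
curl -sk -u admin:admin "https://localhost:9200/filebeat-swift-v1-*/_stats/merge"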

::: {opensearch-cluster-data-75}{mdo83cflRRO5IpN0mlsYVw}{aNMNoWagSqyv-1SSdwuNrA}{10.42.4.117}{10.42.4.117:9300}{d}{shard_indexing_pressure_enabled=true}
   Hot threads at 2022-06-01T05:16:50.807Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   33.3% (166.3ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][[filebeat-swift-v1-proxy-server-2022.06.01][12]: Lucene Merge Thread #9]'
     10/10 snapshots sharing following 14 elements
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.doAddSortedField(Lucene80DocValuesConsumer.java:682)
       app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addSortedSetField(Lucene80DocValuesConsumer.java:891)
       app//org.apache.lucene.codecs.DocValuesConsumer.mergeSortedSetField(DocValuesConsumer.java:804)
       app//org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:145)
       org.opensearch.knn.index.codec.KNN80Codec.KNN80DocValuesConsumer.merge(KNN80DocValuesConsumer.java:212)
       app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.merge(PerFieldDocValuesFormat.java:155)
       app//org.apache.lucene.index.SegmentMerger.mergeDocValues(SegmentMerger.java:195)
       app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:150)
       app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4757)
       app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4361)
       app//org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:5920)
       app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:626)
       app//org.opensearch.index.engine.OpenSearchConcurrentMergeScheduler.doMerge(OpenSearchConcurrentMergeScheduler.java:118)
       app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:684)
   
   31.4% (157.2ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][transport_worker][T#1]'
     9/10 snapshots sharing following 3 elements
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
   
   28.7% (143.6ms out of 500ms) cpu usage by thread 'opensearch[opensearch-cluster-data-75][write][T#1]'
     4/10 snapshots sharing following 10 elements
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       app//org.opensearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:276)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:165)
       app//org.opensearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:98)
       app//org.opensearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:220)
       app//org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:792)
       app//org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:50)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)
     6/10 snapshots sharing following 10 elements
       java.base@11.0.14.1/jdk.internal.misc.Unsafe.park(Native Method)
       java.base@11.0.14.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
       java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:743)
       java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.xfer(LinkedTransferQueue.java:684)
       java.base@11.0.14.1/java.util.concurrent.LinkedTransferQueue.take(LinkedTransferQueue.java:1366)
       app//org.opensearch.common.util.concurrent.SizeBlockingQueue.take(SizeBlockingQueue.java:176)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
       java.base@11.0.14.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
       java.base@11.0.14.1/java.lang.Thread.run(Thread.java:829)

Hi @hugok ,

The merge throttling seems to be enabled by default (the index.merge.scheduler.auto_throttle setting). I am wondering what values the index.merge.scheduler.max_thread_count and index.merge.scheduler.max_merge_count settings have. How many CPUs have you allocated per node?
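For example, a quick way to see the effective values, including defaults (a sketch; index pattern and credentials are placeholders):

# flat_settings makes the keys easy to grep; include_defaults shows values not explicitly set
curl -sk -u admin:admin "https://localhost:9200/filebeat-swift-v1-*/_settings?include_defaults=true&flat_settings=true" | grep -o '"index\.merge\.scheduler[^,}]*'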

@reta
I'm not seeing those settings at the cluster or index level, so I guess it just uses the default:

Math.max(1, Math.min(4, Runtime.getRuntime().availableProcessors() / 2))
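(For what it's worth, with no cgroup CPU limit the JVM should see all 80 host CPUs, so that formula works out to max(1, min(4, 80 / 2)) = 4 merge threads per shard.)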

What numbers would you suggest trying?

  • index.merge.scheduler.max_thread_count: 10
  • index.merge.scheduler.max_merge_count: 100
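Both are dynamic index settings, so they could be applied the same way as the translog ones above; a sketch, with the index pattern and credentials as placeholders:

curl -sk -u admin:admin -X PUT "https://localhost:9200/filebeat-swift-v1-*/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.merge.scheduler.max_thread_count": 10, "index.merge.scheduler.max_merge_count": 100}'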

How many CPUs have you allocated per node?

This is the key point. I'm about to update this ticket with the recent tuning result: a very important Kubernetes setting when running OpenSearch. I had left both the CPU request (reservation) and the CPU limit empty, thinking that meant no limitation on the OpenSearch pod. Unfortunately, that's not true. When the OpenSearch API showed a data node's CPU at 100%, the host running those k8s pods wasn't CPU-busy at all.

Finally, if you set the request and limit to a specific number, Kubernetes ensures the requested CPU time is reserved for the pod. THIS IS SO IMPORTANT. Our OpenSearch on k8s now runs 4x faster for indexing and searching. It could be even faster with a larger CPU reservation, but there are other services in the k8s cluster, and the current setting already meets our requirements. CPU usage bursts to the limit under heavy load and drops to 25~40% for normal traffic.
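For anyone else hitting this, the fix boils down to setting explicit CPU (and memory) requests and limits on the data pods. A sketch with kubectl, assuming the data nodes run as a StatefulSet named opensearch-cluster-data in the cs-opensearch namespace, that the container is named opensearch-data, and that 8 CPUs / 32Gi per pod is the target (all of these are placeholders for your own sizing):

# give the data container an explicit CPU/memory request and limit
kubectl -n cs-opensearch set resources statefulset/opensearch-cluster-data \
  --containers=opensearch-data \
  --requests=cpu=8,memory=32Gi \
  --limits=cpu=8,memory=32Gi

If every container in the pod ends up with requests equal to limits, the pod also gets the Guaranteed QoS class, which makes its CPU allocation predictable under contention.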