Describe the issue:
I am aiming to optimize the performance of our OpenSearch cluster, specifically looking to improve search performance during peak load times. Currently, during peak operations, the average CPU usage on our data nodes hits 60%, which suggests that our current shard configuration may need adjustment.
I am considering how best to adjust the number of primary and replica shards to manage search load more effectively and reduce CPU strain. Any insights or recommendations on optimal shard configurations for better performance and load distribution would be greatly appreciated.
Configuration:
Node Details: 4 data nodes on AWS r6g.large instances, supported by 3 master nodes
Index Details: A single index with 7 million documents
Current Shard Setup: 2 primary shards with 1 replica shard for each
Workload Characteristics: Mainly search-heavy, with peak search requests of 100 req/s
Thank you for the advice.
Each shard is balanced in the 4 data nodes!
(Excerpt from response of GET /_cat/shards?v)
index_name 0 r STARTED 3710610 1.5gb x.x.x.x 8bbb2axxxxxxxxxx
index_name 0 p STARTED 3710610 1.4gb x.x.x.x a39fdxxxxxxxxxx
index_name 1 r STARTED 3708589 1.4gb x.x.x.x f9567xxxxxxxxxx
index_name 1 p STARTED 3708590 1.4gb x.x.x.x a2a42xxxxxxxxxx
Thank you for the article, it was helpful. However, I’m not sure if our sharding configuration is optimal. Actually our index data is only 3GB, so according to the article’s guideline of 10-30GiB per shard, perhaps we should have just one primary shard. But I’m concerned that this could result in the indexing process load being concentrated on a single data node.