Hi everyone, my name is Soumitra Dutta, and I’m an Oxford-based entrepreneur and photographer. I’m setting up an OpenSearch cluster and I want to make sure it’s configured for optimal performance. I’m specifically trying to understand the best practices for configuring nodes, shards, and replicas.
Do you have any tips, recommendations, or experiences to share on how to set these up efficiently? Any guidance on balancing performance, reliability, and resource usage would be really helpful!
Regards
Soumitra Dutta
@soumitradutta26 this is a very general question, so I can only give a general answer, but I hope it will be a good starting point.
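Nodes: Use at least 3 master-eligible nodes in production (recent OpenSearch versions call this role "cluster manager"). With three, a majority can still elect a leader if one node fails, which avoids split-brain issues. For larger clusters, run the master-eligible nodes as dedicated nodes, separate from the data nodes, to prevent resource contention.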
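To see why three is the usual minimum, here is a small sketch of the majority arithmetic. It assumes elections need a strict majority of the voting nodes; the function names are made up for illustration and are not part of any OpenSearch API.

```python
def quorum(voting_nodes: int) -> int:
    """Smallest strict majority of the voting nodes."""
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voting nodes can be lost while a majority remains."""
    return voting_nodes - quorum(voting_nodes)


# Print node count, quorum size, and tolerated failures for a few sizes.
for n in (1, 2, 3, 5):
    print(n, quorum(n), tolerated_failures(n))
```

Two nodes tolerate no more failures than one, so three is the first size that survives the loss of a node.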
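Shards: Aim for 10–50 GB per shard. Too many small shards add overhead on the master (every shard is tracked in the cluster state), and too few large ones make recovery and rebalancing slow. The primary shard count cannot be changed in place after index creation (changing it means reindexing or using the shrink/split APIs), so plan ahead. For time-series data, use Index State Management (ISM) with rollover to keep shard sizes manageable automatically.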
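As a rough sizing aid, the sketch below turns an expected index size into a primary shard count. The 30 GB default target is only an assumption picked from the middle of the 10–50 GB range, and the function names are hypothetical.

```python
import math


def primary_shard_count(expected_index_gb: float, target_shard_gb: float = 30.0) -> int:
    """Primaries needed so that each shard stays at or below the target size."""
    if expected_index_gb <= 0 or target_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, math.ceil(expected_index_gb / target_shard_gb))


def shard_size_gb(expected_index_gb: float, primaries: int) -> float:
    """Average size of one primary shard for the given shard count."""
    return expected_index_gb / primaries


# Hypothetical example: an index expected to grow to about 400 GB.
primaries = primary_shard_count(400)
print(primaries, round(shard_size_gb(400, primaries), 1))
```

Estimate from the size you expect the index to reach, not its size today, since the count is awkward to change later.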
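Replicas: One replica per primary is the usual baseline for production. It provides redundancy and boosts read throughput, at the cost of doubling storage. Unlike primary shards, replicas can be adjusted at any time. Temporarily setting replicas to 0 during bulk indexing can significantly speed things up, but the index has no redundancy until you restore the replica count, so do this only for data you can re-ingest.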
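One way to make sure the replica count always gets restored is to wrap the bulk load in a context manager. This is only a sketch: `put_settings` is a placeholder for whatever client call you use to send `PUT /<index>/_settings`, and the index name is hypothetical. A recording fake stands in for the client here so the example runs without a cluster.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List, Tuple


def replica_settings(count: int) -> Dict:
    """Request body for PUT /<index>/_settings that sets the replica count."""
    return {"index": {"number_of_replicas": count}}


@contextmanager
def replicas_disabled(
    put_settings: Callable[[str, Dict], None],
    index: str,
    restore_to: int = 1,
) -> Iterator[None]:
    """Drop replicas to 0 for the duration of the block, then restore them."""
    put_settings(index, replica_settings(0))
    try:
        yield
    finally:
        # Runs even if the bulk load raises, so replicas are not left at 0.
        put_settings(index, replica_settings(restore_to))


# Stand-in for a real client: records each call instead of sending it.
calls: List[Tuple[str, Dict]] = []


def fake_put_settings(index: str, body: Dict) -> None:
    calls.append((index, body))


with replicas_disabled(fake_put_settings, "logs-000001"):
    pass  # the bulk indexing would go here

print(calls)
```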
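Use _cat/shards and _cluster/health to monitor allocation and catch imbalances early: _cat/shards shows the state of each shard and the node holding it, and _cluster/health reports the overall status (green, yellow, or red) along with the number of unassigned shards.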
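If you want to check balance programmatically, something like the following could work on the output of `_cat/shards?format=json`. The rows below are invented sample data that use only the `index`, `shard`, `prirep`, `state`, and `node` columns, and the helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def shards_per_node(cat_shards: List[Dict]) -> Counter:
    """Count started shards on each node."""
    return Counter(row["node"] for row in cat_shards if row.get("state") == "STARTED")


def unassigned(cat_shards: List[Dict]) -> List[Dict]:
    """Rows for shards that have no node assigned."""
    return [row for row in cat_shards if row.get("state") == "UNASSIGNED"]


# Invented sample shaped like a _cat/shards?format=json response.
sample = [
    {"index": "logs-000001", "shard": "0", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-000001", "shard": "0", "prirep": "r", "state": "STARTED", "node": "node-b"},
    {"index": "logs-000001", "shard": "1", "prirep": "p", "state": "STARTED", "node": "node-a"},
    {"index": "logs-000001", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "node": None},
]

counts = shards_per_node(sample)
spread = max(counts.values()) - min(counts.values())
print(dict(counts), "spread:", spread, "unassigned:", len(unassigned(sample)))
```

A large spread between the busiest and quietest node, or any unassigned replica, is worth looking into before it turns into a yellow or red cluster.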