Ideal Opensearch Cluster?

vnovotny98 · May 26, 2023, 9:41am

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
I have Opensearch 1.3.8 - I will update this cluster to Opensearch 2 after some Optimization

Describe the issue:
We have like 70% allocation on hot and 90% on warm nodes. We need to expand.

Configuration:
We have 3 warm nodes, 3 hot nodes and 3 master nodes.
On data nodes, we have 32GB heap in docker, 50GB RAM for docker allocation and 64 GB on OS.
I think we have no problem with Heap and GC.

But I need to expand allocations. We have 1 TB for HOT and 6 TB for WARM.
Can you give me some tips? I need like 6 TB on SSD and 30 TB on WARM, at least, maybe 50 TB.
Should I add 3 hot and warm nodes with same configuration or can i extend storage from 1 to 2 and from 6 to 12 TB?
I have 3 datacenters and one node in every datacenter. What if I will have maintenance in one datacenter and I will stop this datacenter. If I had 6 hot nodes, 2 of them will be here and I shut them down. I use 1 master and 2 replicas shards. So If I will have only 4 nodes up, they maybe rellocate to other nodes and there would be big traffic and maybe no space left. Can I make somehow that 1 shard will be in every datacenter? If I shutdown 2 nodes in the same datacenter I will have 2 shards of every index still ready. Thanks

vnovotny98 · May 29, 2023, 1:06pm

Can anyone help me please?

mkhl · May 30, 2023, 9:32am

If cpu utilization is moderate, you can add just storage.
You can disable allocation during maintenance to avoid shard relocation.

radu.gheorghe · June 7, 2023, 8:36am

Judging by the graphs you shared, you don’t need 32GB of heap, I assume 10GB is enough. In fact, 32GB isn’t a great idea because the JVM doesn’t compress pointers, so you have less usable heap than with e.g. 30GB.

And this is with warm nodes, I assume hot nodes are OK with the same amount of heap, too.

I also assume that IO isn’t that much of an issue (maybe you have local SSDs?) because you have much more data than OS cache space.

That leaves us with CPU, as @mkhl suggested. If that’s averaging under 30% or so, you could probably safely squeeze 3x the data on the same amount of nodes. But you need 6x, so probably double the nodes?

The above is just a rule of thumb, maybe you have other bottlenecks. I’m not sure which metrics you currently collect, I’d recommend our OpenSearch monitoring for a comprehensive solution.

If you have 3 datacenters and you only need to support one DC/node failing at a time, I would only have one replica per shard (I mean two copies, primary + replica). I would use forced awareness to make sure shards don’t migrate on other DCs when one DC is down for maintenance. The downside is that if something else fails during that maintenance window, your cluster will be mostly unavailable.

The upside is, with one replica instead of two you can store 50% more data on the same hardware So maybe if your CPU usage is 20% or so, you can squeeze all the data on existing nodes. Or maybe that’s only the case for the warm tier and you only have to expand the hot tier or something like that.

Topic		Replies	Views
Need to know impact of increasing primary shards for existing index OpenSearch	1	54	September 20, 2024
Opensearch Query Very Slow OpenSearch troubleshoot	1	67	June 18, 2025
How to reduce the node's heap size? OpenSearch	4	1149	September 12, 2023
Opensearch Resource requirements OpenSearch	4	4459	August 31, 2023
Mukesh karkey : What are the recommended hardware requirements for my OpenSearch cluster? OpenSearch install	1	14	September 4, 2025

Ideal Opensearch Cluster?

Related topics