Versions (OpenSearch/Server OS): 3.4.0/Rocky Linux 9.4
Describe the issue:
I’d like to restart a node without data being shuffled around in the cluster. I tried:
- index.unassigned.node_left.delayed_timeout: 5m (doesn’t work at all, I get unassigned shards immediately with unassigned.reason=NODE_LEFT)
- cluster.routing.allocation.cluster_concurrent_rebalance: 0 (triggers shard initializations)
- cluster.routing.allocation.enable: none (triggers shard relocations when re-enabled)
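For anyone reproducing this, here is a minimal sketch of the request bodies for the three attempts above. It only builds and prints the JSON, it does not contact a cluster. The helper names (`cluster_settings_body`, `index_settings_body`, `render`) are hypothetical, and two things are assumptions rather than part of my tests as described: using the `persistent` scope, and applying the index-level setting to all indices via `/_all/_settings`.

```python
import json


def cluster_settings_body(settings, scope="persistent"):
    """Wrap flat cluster settings in the body used by PUT /_cluster/settings.

    The scope ("persistent" or "transient") is an assumption; pick whichever
    matches how the cluster is managed.
    """
    if scope not in ("persistent", "transient"):
        raise ValueError(f"unknown scope: {scope}")
    return {scope: dict(settings)}


def index_settings_body(settings):
    """Wrap flat index settings in the body used by PUT /<index>/_settings."""
    return {"settings": dict(settings)}


# One entry per attempt listed above: (HTTP method, path, JSON body).
REQUESTS = [
    (
        "PUT",
        "/_all/_settings",  # assumption: apply to every index
        index_settings_body(
            {"index.unassigned.node_left.delayed_timeout": "5m"}
        ),
    ),
    (
        "PUT",
        "/_cluster/settings",
        cluster_settings_body(
            {"cluster.routing.allocation.cluster_concurrent_rebalance": 0}
        ),
    ),
    (
        "PUT",
        "/_cluster/settings",
        cluster_settings_body({"cluster.routing.allocation.enable": "none"}),
    ),
]


def render(requests):
    """Format the requests as text that can be pasted into a REST console."""
    blocks = []
    for method, path, body in requests:
        blocks.append(f"{method} {path}\n{json.dumps(body, indent=2)}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(render(REQUESTS))
```

Each attempt was tried on its own, so the three requests are alternatives, not a sequence.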
Note: my settings are
cluster.routing.allocation.balance.index: 1.0
cluster.routing.allocation.balance.threshold: 1.0
cluster.routing.allocation.balance.shard: 0
I also tried with the default of 0.55 for index.
Configuration: 115 di, 5m
All my tests were done with zero reads and zero writes on the cluster.
Is it even possible? If not, I can work with that, but I’d like to understand why it’s so hard…
Thank you!
Related Q: Is there a difference between a relocation and an initialization? Both show up in the _cat/recovery endpoint and consume the same amount of network/CPU…
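One way to compare the two cases is to tally the rows returned by `GET _cat/recovery?format=json` by their `type` and `stage` columns. This is a minimal sketch: the helper name `count_recoveries` is hypothetical and the sample rows are made up for illustration, not taken from my cluster.

```python
from collections import Counter


def count_recoveries(rows):
    """Count recovery rows by (type, stage).

    `rows` is the parsed JSON list from GET _cat/recovery?format=json.
    Rows missing a column are counted under "unknown".
    """
    return Counter(
        (row.get("type", "unknown"), row.get("stage", "unknown"))
        for row in rows
    )


if __name__ == "__main__":
    # Made-up sample rows; real output contains many more columns.
    sample = [
        {"index": "logs-1", "shard": "0", "type": "peer", "stage": "index"},
        {"index": "logs-1", "shard": "1", "type": "peer", "stage": "done"},
        {"index": "logs-2", "shard": "0", "type": "existing_store", "stage": "done"},
    ]
    for (rtype, stage), n in sorted(count_recoveries(sample).items()):
        print(f"{rtype:15} {stage:8} {n}")
```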