Cluster manager node unable to rejoin the cluster after being rebuilt

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue:

I created an OpenSearch cluster with 3 nodes:

opensearch-cluster-manager : role cluster_manager
opensearch-cluster-node1 : roles cluster_manager, data, ingest
opensearch-cluster-node2 : roles cluster_manager, data, ingest

After deployment the cluster is healthy, as I can confirm with API requests:

{
  "cluster_name" : "opensearch-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 2,
  "discovered_master" : true,
  "discovered_cluster_manager" : true,
  "active_primary_shards" : 3,
  "active_shards" : 6,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

When I test high availability by deleting the first node, it works: I can see a new cluster manager election.

But when the deleted node is rebuilt, it cannot join the cluster, with this error:

[2026-01-27T17:33:04,363][WARN ][o.o.c.c.ClusterFormationFailureHelper] [opensearch-cluster-manager-0] cluster-manager not discovered or elected yet, an election requires at least 2 nodes with ids from [Y6H56qDxQT6iffpGXuE4Bw, sA4pC1rvREi7xJ_aWbKZLQ, yHHpA2j4TzKbit3beOHpPg], have discovered [{opensearch-cluster-manager-0}{Y6H56qDxQT6iffpGXuE4Bw}{zd0vpeClS-C6l4NJyCOIYQ}{100.72.246.91}{100.72.246.91:9300}{m}{shard_indexing_pressure_enabled=true}] which is not a quorum; discovery will continue using from hosts providers and [{opensearch-cluster-manager-0}{Y6H56qDxQT6iffpGXuE4Bw}{zd0vpeClS-C6l4NJyCOIYQ}{100.72.246.91}{100.72.246.91:9300}{m}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 2, last-accepted version 28 in term 2
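The key part of that warning is "an election requires at least 2 nodes with ids from [...] which is not a quorum": an election needs a strict majority of the last-committed voting configuration, and the rebuilt node has only discovered itself. A minimal sketch of that quorum rule (node IDs taken from the log above; `has_quorum` is my own name, not an OpenSearch API):

```python
def has_quorum(voting_config, discovered):
    # An election needs a strict majority of the node IDs in the
    # last-committed voting configuration to be discovered.
    agreeing = voting_config & discovered
    return 2 * len(agreeing) > len(voting_config)

# The three voting node IDs listed in the warning.
voting_config = {
    "Y6H56qDxQT6iffpGXuE4Bw",
    "sA4pC1rvREi7xJ_aWbKZLQ",
    "yHHpA2j4TzKbit3beOHpPg",
}

# The rebuilt node only sees itself: 1 of 3 is not a majority.
print(has_quorum(voting_config, {"Y6H56qDxQT6iffpGXuE4Bw"}))  # False

# Discovering one more voting node would restore quorum: 2 of 3.
print(has_quorum(voting_config,
                 {"Y6H56qDxQT6iffpGXuE4Bw", "sA4pC1rvREi7xJ_aWbKZLQ"}))  # True
```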

My parameters:

gateway.recover_after_data_nodes: 1  ==> tried without this but same error
gateway.expected_data_nodes: 2  ==> tried without this but same error
gateway.recover_after_time: "5m"  ==> tried without this but same error
cluster.auto_shrink_voting_configuration: true  ==> tried without this but same error

# Bind to all interfaces because we don't know what IP address Docker will assign to us.
network.bind_host: 0.0.0.0
discovery.seed_hosts:
  - "opensearch-cluster-manager-0.opensearch-cluster-manager-headless.cuvms-multus.svc.cluster.local:9300"
  - "opensearch-cluster-node1-0.opensearch-cluster-node1-headless.cuvms-multus.svc.cluster.local:9300"
  - "opensearch-cluster-node2-0.opensearch-cluster-node2-headless.cuvms-multus.svc.cluster.local:9300"
cluster.initial_cluster_manager_nodes:   ==> tried with only one but same error
  - "opensearch-cluster-manager-0"
  - "opensearch-cluster-node1-0"
  - "opensearch-cluster-node2-0"
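Since all three entries in discovery.seed_hosts are headless-service DNS names, one thing worth ruling out is that the rebuilt pod can actually resolve them at startup; a quick sketch (`check_seed_hosts` is my own helper, not an OpenSearch API, and the hostnames are the ones from the config above):

```python
import socket

def check_seed_hosts(hosts):
    """Return the seed hosts whose DNS name does not resolve."""
    unresolved = []
    for host in hosts:
        name, _, port = host.rpartition(":")
        try:
            socket.getaddrinfo(name, int(port))
        except (socket.gaierror, ValueError):
            unresolved.append(host)
    return unresolved

seed_hosts = [
    "opensearch-cluster-manager-0.opensearch-cluster-manager-headless.cuvms-multus.svc.cluster.local:9300",
    "opensearch-cluster-node1-0.opensearch-cluster-node1-headless.cuvms-multus.svc.cluster.local:9300",
    "opensearch-cluster-node2-0.opensearch-cluster-node2-headless.cuvms-multus.svc.cluster.local:9300",
]
# Any host printed here is unreachable by name from where this runs.
print(check_seed_hosts(seed_hosts))
```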

Configuration: 3.4.0

An update to this topic: if I delete one of my data node pods, a new pod is created and rejoins the existing cluster without any problem.

I only have the issue with my quorum node, the one that is cluster_manager only. When the new pod is created, it creates a new cluster and is alone in it. (I removed the cluster.initial_cluster_manager_nodes section in my values.yaml and ran "helm upgrade" on my cluster_manager before deletion.)