Hi, I have a problem creating a cluster with 3 nodes.
Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.13.0
Describe the issue:
I have 3 nodes. Each node has 2 private IP addresses, each linked to one of the other nodes.
Each node needs to use a different IP to reach each of the other nodes. Here is my configuration for each node (summarized).
Node 1
# Listen on the 3 interfaces
network.bind_host: ['XX.XX.1.1', 'XX.XX.3.2', 'other_private_address']
network.publish_host: ['XX.XX.1.1', 'XX.XX.3.2']
http.bind_host: ['XX.XX.1.1', 'XX.XX.3.2', 'other_private_address']
http.publish_host: ['XX.XX.1.1', 'XX.XX.3.2']
transport.bind_host: ['XX.XX.1.1', 'XX.XX.3.2', 'other_private_address']
transport.publish_host: ['XX.XX.1.1', 'XX.XX.3.2']
# Name the cluster
cluster.name: opensearch-xxx-cluster
# Node name
node.name: xxx01.xxx
# Discovery hosts to automatically join the cluster
discovery.seed_hosts: ['XX.XX.1.2', 'XX.XX.3.1']
cluster.initial_cluster_manager_nodes: ['xxx01.xxx']
# Path to directory where to store the data (separate multiple locations by comma):
path.data: /var/lib/opensearch
# Path to log files:
path.logs: /var/log/opensearch
##### Certificates
plugins.security....
##### Tuning
# Disable JVM heap memory swapping
bootstrap.memory_lock: true
When I start Node 1, everything is OK: I can curl my cluster and check that the node is OK.
When I start Node 2, it works and I can see the node join the cluster.
When I start Node 3, it doesn't work and logs these errors.
[2024-05-16T11:22:09,106][WARN ][o.o.d.HandshakingTransportAddressConnector] [xxx03.xxx] [connectToRemoteMasterNode[XX.XX.2.1:9300]] completed handshake with [{xxx02.xxx}{X0GTmdd1R-Wvvo2vFAyYPg}{wRvQxxlUTC-SNHnWFte65g}{XX.XX.1.2}{XX.XX.1.2:9300}{dimr}{shard_indexing_pressure_enabled=true}] but followup connection failed
[2024-05-16T11:22:09,105][WARN ][o.o.d.HandshakingTransportAddressConnector] [xxx03.xxx] [connectToRemoteMasterNode[XX.XX.3.2:9300]] completed handshake with [{xxx01.xxx}{B48Chv2EQbqpdhXHVfNaog}{9UPPR7-jT-auWWtiEfadIA}{XX.XX.1.1}{XX.XX.1.1:9300}{dimr}{shard_indexing_pressure_enabled=true}] but followup connection failed
So what I suppose is that Node 3 takes the transport_address of each of the other two nodes to attempt the connection, and fails because their transport_address values are set to XX.XX.1.1 and XX.XX.1.2 respectively.
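To make that mismatch explicit, here is a small Python sketch that pulls the probed address and the published address out of the two log lines above. The regular expression and the helper name `compare_addresses` are my own, and the field order inside the braces (name, node id, ephemeral id, host, transport address) is my reading of the log format, so treat it as an assumption. The log lines are shortened to the relevant parts.

```python
import re

# The two WARN lines from Node 3, shortened (timestamp, logger and node
# attributes removed). Addresses are the redacted ones from this post.
LOG_LINES = [
    "[xxx03.xxx] [connectToRemoteMasterNode[XX.XX.2.1:9300]] completed handshake with "
    "[{xxx02.xxx}{X0GTmdd1R-Wvvo2vFAyYPg}{wRvQxxlUTC-SNHnWFte65g}{XX.XX.1.2}{XX.XX.1.2:9300}{dimr}] "
    "but followup connection failed",
    "[xxx03.xxx] [connectToRemoteMasterNode[XX.XX.3.2:9300]] completed handshake with "
    "[{xxx01.xxx}{B48Chv2EQbqpdhXHVfNaog}{9UPPR7-jT-auWWtiEfadIA}{XX.XX.1.1}{XX.XX.1.1:9300}{dimr}] "
    "but followup connection failed",
]

# Assumed field order inside the braces: name, node id, ephemeral id,
# host, transport address. Only name and transport address are captured.
PATTERN = re.compile(
    r"connectToRemoteMasterNode\[(?P<probed>[^\]]+)\]\] completed handshake with "
    r"\[\{(?P<name>[^}]+)\}\{[^}]+\}\{[^}]+\}\{[^}]+\}\{(?P<published>[^}]+)\}"
)


def compare_addresses(lines):
    """Return (node name, probed address, published address) per matching line."""
    results = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            results.append((match["name"], match["probed"], match["published"]))
    return results


for name, probed, published in compare_addresses(LOG_LINES):
    print(f"{name}: probed {probed}, published {published}, same={probed == published}")
```

In both lines the address Node 3 probed differs from the address the remote node published, which is what makes me think the follow-up connection goes to the published address.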
My question is: what am I doing wrong? Why is the transport_address set to the first IP address in the array (I guess from transport.publish_host) when I defined an array and not a single IP?
I checked the transport_address with a curl request that displays node information:
"ID": {
"name": "xxx02.xxx",
"transport_address": "XX.XX.1.2:9300",
"host": "XX.XX.1.2",
"ip": "XX.XX.1.2",
"version": "2.13.0",
"build_type": "deb",
"build_hash": "7ec678d1b7c87d6e779fdef94e33623e1f1e2647",
"total_indexing_buffer": 13314398617,
Thanks for your help!