Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.8.0 and OpenSearch Dashboards 2.8.0
Describe the issue:
Hi All,
We are trying to deploy OpenSearch on our multi-node Docker Swarm cluster. There are three physical hosts in total, but no matter what I do, I can't get the nodes to join the cluster. If I deploy all three containers on a single host, everything works as expected, but once I split them out across the hosts to balance the load, they will not form a quorum. I have looked over some of the other threads on this, namely "Opensearch with multiple nodes on different servers not working", and tried the solutions suggested there, but with no success.
Can someone look over my config below and let me know where I'm going wrong?
The stack is deployed with:
docker stack deploy opensearch -c opensearch_docker_compose.yml
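If it's useful, here is how the placement can be verified after deploying (standard Swarm commands; the opensearch_ service prefix comes from the stack name above):

# list the services in the stack and their replica counts
docker stack services opensearch

# show which swarm node each task was scheduled on
docker service ps opensearch_os-node1
docker service ps opensearch_os-node2
docker service ps opensearch_os-node3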
Configuration:
---
version: "3.4"
services:
  ###########################################
  # OpenSearch Start
  ###########################################
  os-node1:
    image: opensearchproject/opensearch:2.8.0
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=os-node1
      - discovery.seed_hosts=os-node1,os-node2,os-node3
      - cluster.initial_master_nodes=os-node1,os-node2,os-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9300:9300
      - 9600:9600
    networks:
      - opensearch_net
      - internal_prod
    deploy:
      placement:
        constraints:
          - node.hostname == cube-node-1
  os-node2:
    image: opensearchproject/opensearch:2.8.0
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=os-node2
      - discovery.seed_hosts=os-node1,os-node2,os-node3
      - cluster.initial_master_nodes=os-node1,os-node2,os-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch_net
    deploy:
      placement:
        constraints:
          - node.hostname == cube-node-2
  os-node3:
    image: opensearchproject/opensearch:2.8.0
    container_name: opensearch-node3
    environment:
      - cluster.name=opensearch-cluster
      - node.name=os-node3
      - discovery.seed_hosts=os-node1,os-node2,os-node3
      - cluster.initial_master_nodes=os-node1,os-node2,os-node3
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - "DISABLE_SECURITY_PLUGIN=true"
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data3:/usr/share/opensearch/data
    networks:
      - opensearch_net
    deploy:
      placement:
        constraints:
          - node.hostname == cube-node-3
  ###########################################
  # OpenSearch End
  ###########################################
  ###########################################
  # OpenSearch Dashboards Start
  ###########################################
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:2.8.0
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      - 'OPENSEARCH_HOSTS=["http://os-node1:9200","http://os-node2:9200","http://os-node3:9200"]'
      - "DISABLE_SECURITY_DASHBOARDS_PLUGIN=false"
    networks:
      - opensearch_net
      - internal_prod
  ###########################################
  # OpenSearch Dashboards End
  ###########################################
volumes:
  opensearch-data1:
  opensearch-data2:
  opensearch-data3:
networks:
  opensearch_net:
    external: true
  internal_prod:
    external: true
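One thing that stands out in the logs below: os-node1 advertises a 10.0.0.x transport address while os-node2 and os-node3 advertise 10.0.10.x addresses, and the seed hosts resolve to yet other addresses (10.0.12.24, 10.0.10.21, 10.0.10.13). os-node1 is the only OpenSearch service attached to both overlay networks, and its published ports also attach it to the swarm ingress network, so it has several interfaces to choose from when publishing. A possible adjustment would be to pin the publish address per node, something like the sketch below (untested; the dotted setting is passed through to opensearch.yml the same way as discovery.seed_hosts above, but _eth0_ is a guess, since overlay interface ordering inside the container isn't guaranteed):

    environment:
      # Untested sketch: pin the transport publish address to the
      # interface on opensearch_net so peers dial a reachable IP.
      # _eth0_ is an assumption -- check `ip addr` inside the task.
      - network.publish_host=_eth0_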
Relevant Logs or Screenshots:
Node 1
[2023-07-25T02:56:49,650][WARN ][o.o.c.c.ClusterFormationFailureHelper] [os-node1] cluster-manager not discovered or elected yet, an election requires 2 nodes with ids [I4dTdjuzRrC_spgVMveBHw, P9h3-UXnSTGnsdPSw8IzYQ], have discovered [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is a quorum; discovery will continue using [10.0.12.24:9300, 10.0.10.21:9300, 10.0.10.13:9300] from hosts providers and [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 2, last-accepted version 0 in term 0
[2023-07-25T02:56:59,650][WARN ][o.o.c.c.ClusterFormationFailureHelper] [os-node1] cluster-manager not discovered or elected yet, an election requires 2 nodes with ids [I4dTdjuzRrC_spgVMveBHw, P9h3-UXnSTGnsdPSw8IzYQ], have discovered [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is a quorum; discovery will continue using [10.0.12.24:9300, 10.0.10.21:9300, 10.0.10.13:9300] from hosts providers and [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 2, last-accepted version 0 in term 0
[2023-07-25T02:57:09,651][WARN ][o.o.c.c.ClusterFormationFailureHelper] [os-node1] cluster-manager not discovered or elected yet, an election requires 2 nodes with ids [I4dTdjuzRrC_spgVMveBHw, P9h3-UXnSTGnsdPSw8IzYQ], have discovered [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, {os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}] which is a quorum; discovery will continue using [10.0.12.24:9300, 10.0.10.21:9300, 10.0.10.13:9300] from hosts providers and [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}] from last-known cluster state; node term 2, last-accepted version 0 in term 0
[2023-07-25T02:57:11,341][INFO ][o.o.c.c.JoinHelper ] [os-node1] failed to join {os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={os-node1}{I4dTdjuzRrC_spgVMveBHw}{wfIBFc5dS1mGVmbjqQUFUw}{10.0.0.24}{10.0.0.24:9300}{dimr}{shard_indexing_pressure_enabled=true}, minimumTerm=2, optionalJoin=Optional.empty}
org.opensearch.transport.RemoteTransportException: [os-node3][10.0.10.14:9300][internal:cluster/coordination/join]
Caused by: org.opensearch.transport.ConnectTransportException: [os-node1][10.0.0.24:9300] connect_timeout[30s]
at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onTimeout(TcpTransport.java:1082) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) ~[opensearch-2.8.0.jar:2.8.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
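The connect_timeout above is against node 1's advertised address, 10.0.0.24:9300. A raw TCP check from inside one of the other node containers would confirm whether that address is reachable at all over the overlay (hypothetical container lookup; assuming curl is present in the image):

# find the task container for os-node3 on its host
docker ps --filter name=os-node3

# curl's telnet scheme just opens the socket; a hang here
# reproduces the connect timeout seen in the log
docker exec -it <container-id> curl -v telnet://10.0.0.24:9300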
Node 2
[2023-07-25T02:14:10,813][INFO ][o.o.n.Node ] [os-node2] initialized
[2023-07-25T02:14:10,813][INFO ][o.o.n.Node ] [os-node2] starting ...
[2023-07-25T02:14:10,895][INFO ][o.o.t.TransportService ] [os-node2] publish_address {10.0.10.22:9300}, bound_addresses {0.0.0.0:9300}
[2023-07-25T02:14:10,997][INFO ][o.o.b.BootstrapChecks ] [os-node2] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2023-07-25T02:14:15,227][INFO ][o.o.c.c.JoinHelper ] [os-node2] failed to join {os-node1}{I4dTdjuzRrC_spgVMveBHw}{A8n71U64R0iwJmaV5S92hw}{10.0.0.23}{10.0.0.23:9300}{dimr}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, minimumTerm=0, optionalJoin=Optional[Join{term=1, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, targetNode={os-node1}{I4dTdjuzRrC_spgVMveBHw}{A8n71U64R0iwJmaV5S92hw}{10.0.0.23}{10.0.0.23:9300}{dimr}{shard_indexing_pressure_enabled=true}}]}
org.opensearch.transport.NodeNotConnectedException: [os-node1][10.0.0.23:9300] Node not connected
at org.opensearch.transport.ClusterConnectionManager.getConnection(ClusterConnectionManager.java:206) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.TransportService.getConnection(TransportService.java:904) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.TransportService.sendRequest(TransportService.java:820) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:335) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:263) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.lambda$new$2(JoinHelper.java:201) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:113) [opensearch-index-management-2.8.0.0.jar:2.8.0.0]
at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43) [opensearch-performance-analyzer-2.8.0.0.jar:2.8.0.0]
at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:106) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:453) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.8.0.jar:2.8.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
[2023-07-25T02:14:15,238][INFO ][o.o.c.c.JoinHelper ] [os-node2] failed to join {os-node1}{I4dTdjuzRrC_spgVMveBHw}{A8n71U64R0iwJmaV5S92hw}{10.0.0.23}{10.0.0.23:9300}{dimr}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, minimumTerm=0, optionalJoin=Optional[Join{term=1, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}, targetNode={os-node1}{I4dTdjuzRrC_spgVMveBHw}{A8n71U64R0iwJmaV5S92hw}{10.0.0.23}{10.0.0.23:9300}{dimr}{shard_indexing_pressure_enabled=true}}]}
org.opensearch.transport.NodeNotConnectedException: [os-node1][10.0.0.23:9300] Node not connected
at org.opensearch.transport.ClusterConnectionManager.getConnection(ClusterConnectionManager.java:206) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.TransportService.getConnection(TransportService.java:904) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.TransportService.sendRequest(TransportService.java:820) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:335) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.sendJoinRequest(JoinHelper.java:263) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.cluster.coordination.JoinHelper.lambda$new$2(JoinHelper.java:201) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:113) [opensearch-index-management-2.8.0.0.jar:2.8.0.0]
at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43) [opensearch-performance-analyzer-2.8.0.0.jar:2.8.0.0]
at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:106) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:453) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.8.0.jar:2.8.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
[2023-07-25T02:14:15,482][INFO ][o.o.c.c.Coordinator ] [os-node2] setting initial configuration to VotingConfiguration{{bootstrap-placeholder}-os-node1,bVYnmh1NRA-wo7jxLqcSGA,P9h3-UXnSTGnsdPSw8IzYQ}
[2023-07-25T02:14:15,576][INFO ][o.o.c.c.CoordinationState] [os-node2] cluster UUID set to [vPTgjomxR4OufpYvcyz8RA]
[2023-07-25T02:14:15,597][INFO ][o.o.c.s.ClusterApplierService] [os-node2] cluster-manager node changed {previous [], current [{os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}]}, added {{os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}}, term: 2, version: 1, reason: ApplyCommitRequest{term=2, version=1, sourceNode={os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}}
[2023-07-25T02:14:15,600][INFO ][o.o.a.c.ADClusterEventListener] [os-node2] Cluster is not recovered yet.
Node 3
[2023-07-25T02:14:15,223][INFO ][o.o.n.Node ] [os-node3] initialized
[2023-07-25T02:14:15,223][INFO ][o.o.n.Node ] [os-node3] starting ...
[2023-07-25T02:14:15,299][INFO ][o.o.t.TransportService ] [os-node3] publish_address {10.0.10.14:9300}, bound_addresses {0.0.0.0:9300}
[2023-07-25T02:14:15,402][INFO ][o.o.b.BootstrapChecks ] [os-node3] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2023-07-25T02:14:15,461][INFO ][o.o.c.c.Coordinator ] [os-node3] setting initial configuration to VotingConfiguration{{bootstrap-placeholder}-os-node1,bVYnmh1NRA-wo7jxLqcSGA,P9h3-UXnSTGnsdPSw8IzYQ}
[2023-07-25T02:14:15,536][INFO ][o.o.c.s.MasterService ] [os-node3] elected-as-cluster-manager ([2] nodes joined)[{os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true} elect leader, {os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true} elect leader, _BECOME_CLUSTER_MANAGER_TASK_, _FINISH_ELECTION_], term: 2, version: 1, delta: cluster-manager node changed {previous [], current [{os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}]}, added {{os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}}
[2023-07-25T02:14:15,574][INFO ][o.o.c.c.CoordinationState] [os-node3] cluster UUID set to [vPTgjomxR4OufpYvcyz8RA]
[2023-07-25T02:14:15,622][INFO ][o.o.c.s.ClusterApplierService] [os-node3] cluster-manager node changed {previous [], current [{os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}]}, added {{os-node2}{P9h3-UXnSTGnsdPSw8IzYQ}{GLuWLKrDRYK9QIAlIQnezA}{10.0.10.22}{10.0.10.22:9300}{dimr}{shard_indexing_pressure_enabled=true}}, term: 2, version: 1, reason: Publication{term=2, version=1}
[2023-07-25T02:14:15,629][INFO ][o.o.a.c.ADClusterEventListener] [os-node3] Cluster is not recovered yet.
[2023-07-25T02:14:15,633][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [os-node3] Detected cluster change event for destination migration
[2023-07-25T02:14:15,639][INFO ][o.o.c.r.a.DiskThresholdMonitor] [os-node3] skipping monitor as a check is already in progress
[2023-07-25T02:14:15,648][INFO ][o.o.i.i.ManagedIndexCoordinator] [os-node3] Cache cluster manager node onClusterManager time: 1690251255648
[2023-07-25T02:14:15,650][INFO ][o.o.m.a.MLModelAutoReDeployer] [os-node3] Model auto reload configuration is false, not performing auto reloading!
[2023-07-25T02:14:15,653][WARN ][o.o.p.c.s.h.ConfigOverridesClusterSettingHandler] [os-node3] Config override setting update called with empty string. Ignoring.
[2023-07-25T02:14:15,658][INFO ][o.o.d.PeerFinder ] [os-node3] setting findPeersInterval to [1s] as node commission status = [true] for local node [{os-node3}{bVYnmh1NRA-wo7jxLqcSGA}{ONfMrMEiQzSHiuYXeVmI0w}{10.0.10.14}{10.0.10.14:9300}{dimr}{shard_indexing_pressure_enabled=true}]
[2023-07-25T02:14:15,660][INFO ][o.o.h.AbstractHttpServerTransport] [os-node3] publish_address {10.0.10.14:9200}, bound_addresses {0.0.0.0:9200}
[2023-07-25T02:14:15,661][INFO ][o.o.n.Node ] [os-node3] started
[2023-07-25T02:14:15,661][INFO ][o.o.s.OpenSearchSecurityPlugin] [os-node3] Node started
[2023-07-25T02:14:15,661][INFO ][o.o.s.OpenSearchSecurityPlugin] [os-node3] 0 OpenSearch Security modules loaded so far: []
[2023-07-25T02:14:15,686][INFO ][o.o.a.c.HashRing ] [os-node3] Node added: [bVYnmh1NRA-wo7jxLqcSGA, P9h3-UXnSTGnsdPSw8IzYQ]
[2023-07-25T02:14:15,687][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [os-node3] Detected cluster change event for destination migration
[2023-07-25T02:14:18,615][INFO ][o.o.c.r.a.AllocationService] [os-node3] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[.kibana_1][0]]]).
[2023-07-25T02:14:18,646][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [os-node3] Detected cluster change event for destination migration
[2023-07-25T02:14:45,456][WARN ][o.o.d.HandshakingTransportAddressConnector] [os-node3] [connectToRemoteMasterNode[10.0.10.19:9300]] completed handshake with [{os-node1}{I4dTdjuzRrC_spgVMveBHw}{A8n71U64R0iwJmaV5S92hw}{10.0.0.23}{10.0.0.23:9300}{dimr}{shard_indexing_pressure_enabled=true}] but followup connection failed
org.opensearch.transport.ConnectTransportException: [os-node1][10.0.0.23:9300] connect_exception
at org.opensearch.transport.TcpTransport$ChannelsConnectedListener.onFailure(TcpTransport.java:1076) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.action.ActionListener.lambda$toBiConsumer$2(ActionListener.java:215) ~[opensearch-2.8.0.jar:2.8.0]
at org.opensearch.common.concurrent.CompletableContext.lambda$addListener$0(CompletableContext.java:57) ~[opensearch-common-2.8.0.jar:2.8.0]
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
at org.opensearch.common.concurrent.CompletableContext.completeExceptionally(CompletableContext.java:72) ~[opensearch-common-2.8.0.jar:2.8.0]
at org.opensearch.transport.netty4.Netty4TcpChannel.lambda$addListener$0(Netty4TcpChannel.java:81) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:629) ~[?:?]
at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:118) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:262) ~[?:?]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[?:?]
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[?:?]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174) ~[?:?]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[?:?]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: 10.0.0.23/10.0.0.23:9300
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[?:?]
... 9 more
Docker Network Configuration
docker network ls | grep opensearch_net
59k0hn8zxn6k opensearch_net overlay swarm
docker network ls | grep internal_prod
6ebe0bze3ztm internal_prod overlay swarm
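For reference, the subnets behind each network, and which containers are attached on a given host, can be pulled with docker network inspect:

# full details: subnet, attached containers on this host
docker network inspect opensearch_net

# just the IPAM subnets
docker network inspect opensearch_net --format '{{json .IPAM.Config}}'
docker network inspect internal_prod --format '{{json .IPAM.Config}}'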