@pablo Thank you very much, you pointed me in the right direction.
I finally got it working, with only one exception left:
opensearch-node2 | [2023-04-25T12:02:53,665][INFO ][o.o.c.c.JoinHelper ] [opensearch-node2] failed to join {opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={opensearch-node2}{oLDXZoDSQR-TQ0puoKZcsQ}{V1yPQBpeSFK499yxp_RmIg}{192.168.16.2}{192.168.16.2:9300}{dimr}{shard_indexing_pressure_enabled=true}, minimumTerm=0, optionalJoin=Optional[Join{term=1, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={opensearch-node2}{oLDXZoDSQR-TQ0puoKZcsQ}{V1yPQBpeSFK499yxp_RmIg}{192.168.16.2}{192.168.16.2:9300}{dimr}{shard_indexing_pressure_enabled=true}, targetNode={opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true}}]}
opensearch-node2 | org.opensearch.transport.RemoteTransportException: [opensearch-node1][192.168.16.4:9300][internal:cluster/coordination/join]
opensearch-node2 | Caused by: org.opensearch.cluster.coordination.CoordinationStateRejectedException: incoming term 1 does not match current term 2
opensearch-node2 | at org.opensearch.cluster.coordination.CoordinationState.handleJoin(CoordinationState.java:256) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.cluster.coordination.Coordinator.handleJoin(Coordinator.java:1179) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at java.util.Optional.ifPresent(Optional.java:178) ~[?:?]
opensearch-node2 | at org.opensearch.cluster.coordination.Coordinator.processJoinRequest(Coordinator.java:647) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.cluster.coordination.Coordinator.lambda$handleJoinRequest$7(Coordinator.java:610) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:80) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.transport.ClusterConnectionManager.connectToNode(ClusterConnectionManager.java:138) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.transport.TransportService.connectToNode(TransportService.java:450) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.transport.TransportService.connectToNode(TransportService.java:430) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.cluster.coordination.Coordinator.handleJoinRequest(Coordinator.java:592) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.cluster.coordination.JoinHelper.lambda$new$1(JoinHelper.java:190) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceivedDecorate(SecuritySSLRequestHandler.java:192) ~[?:?]
opensearch-node2 | at org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:278) ~[?:?]
opensearch-node2 | at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:152) ~[?:?]
opensearch-node2 | at org.opensearch.security.OpenSearchSecurityPlugin$7$1.messageReceived(OpenSearchSecurityPlugin.java:659) ~[?:?]
opensearch-node2 | at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:108) ~[?:?]
opensearch-node2 | at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43) ~[?:?]
opensearch-node2 | at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:106) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.transport.InboundHandler$RequestHandler.doRun(InboundHandler.java:453) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.6.0.jar:2.6.0]
opensearch-node2 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
opensearch-node2 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
opensearch-node2 | at java.lang.Thread.run(Thread.java:833) [?:?]
opensearch-node2 | [2023-04-25T12:02:53,747][INFO ][o.o.c.s.MasterService ] [opensearch-node2] elected-as-cluster-manager ([2] nodes joined)[{opensearch-node2}{oLDXZoDSQR-TQ0puoKZcsQ}{V1yPQBpeSFK499yxp_RmIg}{192.168.16.2}{192.168.16.2:9300}{dimr}{shard_indexing_pressure_enabled=true} elect leader, {opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true} elect leader, _BECOME_CLUSTER_MANAGER_TASK_, _FINISH_ELECTION_], term: 2, version: 1, delta: cluster-manager node changed {previous [], current [{opensearch-node2}{oLDXZoDSQR-TQ0puoKZcsQ}{V1yPQBpeSFK499yxp_RmIg}{192.168.16.2}{192.168.16.2:9300}{dimr}{shard_indexing_pressure_enabled=true}]}, added {{opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true}}
opensearch-node1 | [2023-04-25T12:02:53,816][INFO ][o.o.c.c.JoinHelper ] [opensearch-node1] failed to join {opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true} with JoinRequest{sourceNode={opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true}, minimumTerm=0, optionalJoin=Optional[Join{term=1, lastAcceptedTerm=0, lastAcceptedVersion=0, sourceNode={opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true}, targetNode={opensearch-node1}{rpJrJk1XQ5itqukmivi_Vg}{UgcfUXIyR4CQV2kJtoMk5w}{192.168.16.4}{192.168.16.4:9300}{dimr}{shard_indexing_pressure_enabled=true}}]}
opensearch-node1 | org.opensearch.transport.RemoteTransportException: [opensearch-node1][192.168.16.4:9300][internal:cluster/coordination/join]
opensearch-node1 | Caused by: org.opensearch.cluster.coordination.CoordinationStateRejectedException: became follower
opensearch-node1 | at org.opensearch.cluster.coordination.JoinHelper$CandidateJoinAccumulator.lambda$close$3(JoinHelper.java:570) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at java.util.HashMap$Values.forEach(HashMap.java:1065) [?:?]
opensearch-node1 | at org.opensearch.cluster.coordination.JoinHelper$CandidateJoinAccumulator.close(JoinHelper.java:570) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at org.opensearch.cluster.coordination.Coordinator.becomeFollower(Coordinator.java:745) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at org.opensearch.cluster.coordination.Coordinator.onFollowerCheckRequest(Coordinator.java:344) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at org.opensearch.cluster.coordination.FollowersChecker$2.doRun(FollowersChecker.java:228) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.6.0.jar:2.6.0]
opensearch-node1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
opensearch-node1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
opensearch-node1 | at java.lang.Thread.run(Thread.java:833) [?:?]
I think this exception is related to node2 not being able to join the cluster, but I am not sure why.
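One way to check whether the join eventually succeeded is to filter the logs down to the election and rejection lines. The log above does contain an `elected-as-cluster-manager ([2] nodes joined)` line from opensearch-node2, so the rejected join may only be a leftover from the election round. The sketch below is not a definitive diagnosis; the function names are made up for illustration, and the `docker compose logs` invocation assumes the stack runs through Docker Compose in the current directory.

```shell
#!/bin/sh
# Keep only the cluster-coordination lines that matter when diagnosing a join:
# election results and rejected joins. Reads log text from stdin.
# (Hypothetical helper name, not part of OpenSearch.)
coordination_summary() {
    grep -E 'elected-as-cluster-manager|CoordinationStateRejectedException'
}

# Print the node count reported by the most recent election line,
# i.e. the N in "elected-as-cluster-manager ([N] nodes joined)".
# Prints nothing if no election line is present.
joined_node_count() {
    sed -n 's/.*elected-as-cluster-manager (\[\([0-9][0-9]*\)] nodes joined).*/\1/p' | tail -n 1
}

# Example usage (assumption: Docker Compose project in the current directory):
#   docker compose logs --no-color | coordination_summary
#   docker compose logs --no-color | joined_node_count
```

If the reported count matches the number of nodes in the compose file, the cluster most likely formed despite the exception.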
For anyone else looking for a solution, here are the steps that worked for me:
- I created self-signed certificates for the `admin`, hence a `root-ca-admin.pem`, following this guide in the documentation.
- I added the certificates from DigiCert, plus the `root-ca-admin.pem`, to the `root-ca.pem`:
$ cat DigiCertCA2.pem TrustedRoot.pem root-ca-admin.pem > root-ca.pem
- I used `star_company_com.pem` as the `node.pem` certificate and `wildcard_company_com_key.pem` as the `node-key.pem`.
- The next step was to set the right permissions and owners on the files we want to add to the `volumes` section of `docker-compose.yml`:
$ chown 1000:1000 custom-opensearch.yml node-key.pem node.pem admin-key.pem admin.pem root-ca.pem
$ chmod 0600 custom-opensearch.yml node-key.pem node.pem admin-key.pem admin.pem root-ca.pem
- I also updated my `custom-opensearch.yml` to add another entry (for the second node) to this directive:
plugins.security.nodes_dn:
- 'CN=*.company.com,O=Company\, Inc.,L=CITY,ST=STATE,C=US'
- 'CN=*.company.com,O=Company\, Inc.,L=CITY,ST=STATE,C=US'
Make sure to properly escape special characters such as commas (`\,`).
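The file-preparation steps above can be sanity-checked before starting the containers. This is a minimal sketch, not an official procedure: the helper names are hypothetical, and it only checks that the CA bundle contains the expected number of PEM certificates and that each mounted file has the `0600` mode set by the `chmod` step.

```shell
#!/bin/sh
# Count the PEM certificates in a file (one BEGIN CERTIFICATE line each).
count_certs() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}

# Succeed only if the file's permission bits are exactly the given octal mode.
has_mode() {
    [ -n "$(find "$1" -prune -perm "$2")" ]
}

# Report every file that is not mode 0600; return non-zero if any is found.
check_mounted_files() {
    status=0
    for f in "$@"; do
        if ! has_mode "$f" 600; then
            echo "unexpected mode: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example usage with the file names from the steps above:
#   count_certs root-ca.pem
#   check_mounted_files custom-opensearch.yml node-key.pem node.pem \
#       admin-key.pem admin.pem root-ca.pem
```

Assuming each of the three input files holds a single certificate, `count_certs root-ca.pem` should report 3 for the bundle built with `cat`.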
It is possible that the second entry caused the issue with the second node not being able to join the cluster. I am going to test again with the second entry removed.
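To compare the `nodes_dn` entries with what the certificate actually presents, something like the following may help. It is a sketch under assumptions: the helper names are hypothetical, the duplicate check only handles the simple one-entry-per-line layout shown above, and `cert_subject_dn` needs the `openssl` command-line tool. If both nodes present the same wildcard certificate, they share one subject DN, so a single entry would probably cover both.

```shell
#!/bin/sh
# List nodes_dn entries that appear more than once in a config file.
# Only understands lines of the form:  - 'CN=...'
duplicate_dn_entries() {
    sed -n "s/^[[:space:]]*-[[:space:]]*'\(CN=.*\)'[[:space:]]*\$/\1/p" "$1" | sort | uniq -d
}

# Print a certificate's subject in RFC 2253 form, the comma-separated
# style used in nodes_dn (commas inside a value are escaped as "\,").
cert_subject_dn() {
    openssl x509 -in "$1" -noout -subject -nameopt RFC2253 | sed 's/^subject= *//'
}

# Example usage with the file names from the steps above:
#   duplicate_dn_entries custom-opensearch.yml
#   cert_subject_dn node.pem
```

Any line printed by `duplicate_dn_entries` is an entry that is listed more than once and could be trimmed to a single occurrence for the retest.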
I hope this helps other people in my situation.
Thanks