At the moment, I am trying to create an OpenDistro 1.7 Elasticsearch cluster with 3 nodes. After testing with the demo certificates on a single node, I am now using my own PKI to manage the node and client certificates.

On a single-node server, everything runs fine.
In cluster mode, all nodes come up but log the following error at high frequency:

[2020-05-19T14:48:15,794][ERROR][c.a.o.s.s.t.OpenDistroSecuritySSLNettyTransport] [] Exception during establishing a SSL connection: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)

Note: I have read that this is a known Java issue and that the message does not affect operation, but for me it does (reference: Troubleshoot - Open Distro Documentation).

In between these errors, I can see that the master election has not completed:

[2020-05-19T15:09:00,369][WARN ][o.e.c.c.ClusterFormationFailureHelper] [] master not discovered yet, this node has not previously joined a bootstrapped (v7+) cluster, and this node must discover master-eligible nodes [,,] to bootstrap a cluster: have discovered [{}{vpOXuYYNRkeqoMQ8kbv8cw}{F3vzvzkgSrqlMj_qhklEtw}{}{}{dim}]; discovery will continue using [xx.aa.b.54:29300, xx.aa.b.55:29300, xx.aa.b.56:29300] from hosts providers and [{}{vpOXuYYNRkeqoMQ8kbv8cw}{F3vzvzkgSrqlMj_qhklEtw}{}{}{dim}] from last-known cluster state; node term 0, last-accepted version 0 in term 0

I am running 3 docker containers on 3 different VMs.

  • discovery.seed_hosts and cluster.initial_master_nodes are set to the 3 host names.
  • is the FQDN of each server
  • transport.profiles.default.port is set to 29300

The Certificate chain seems to be fine since I can use and my client certificate.

When I do a TLS test connection with my node certificate, everything also seems to be fine:

openssl s_client -connect -cert ./ -key ./xxx.yyy.zzz.key.pem

=> works

Leaving out the client cert/key:
139939766322832:error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:s3_pkt.c:1498:SSL alert number 42
=> fails (as expected)

I am using a JKS keystore and JKS truststore for OpenDistro.
Checking the stores with keytool, everything seems to be fine.
The PKI was created using Search Guard's PKI scripts.

opendistro_security.nodes_dn is also configured to the DNs of the node certs.

My “feeling” is that OpenDistro does not use the node certificate as a client certificate when negotiating with the other nodes.

I finally got this working. It must have been one of the following parameters that was not set correctly:

discovery.seed_hosts: "{{ ansible_play_hosts_all|join(',') }}"
cluster.initial_master_nodes: "{{ ansible_play_hosts_all[0] }}"
transport.profiles.default.port: "{{ group.elasticsearch.transport_port }}"
transport.port: "{{ group.elasticsearch.transport_port }}"
http.port: "{{ group.elasticsearch.rest_port }}"
network.publish_host: "{{ ansible_eth0.ipv4.address }}"
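Since the fix came down to ports and the published address, a quick reachability check of the transport port from each node can rule out plain networking problems before digging into TLS. The following is only a minimal sketch: the function name `check_transport_ports` is made up for illustration, and it tests nothing more than whether a TCP connection can be opened (no TLS handshake, no certificate validation).

```python
import socket


def check_transport_ports(hosts, port, timeout=3.0):
    """Try a plain TCP connection to each host on the given port.

    Returns a dict mapping each host to None if the connection succeeded,
    or to a short error description if it failed.
    """
    results = {}
    for host in hosts:
        try:
            # create_connection resolves the name and opens a TCP socket.
            with socket.create_connection((host, port), timeout=timeout):
                results[host] = None
        except OSError as exc:
            # Covers refused connections, timeouts and DNS failures.
            results[host] = f"{type(exc).__name__}: {exc}"
    return results
```

Run it from inside each container with the same host names or addresses that appear in discovery.seed_hosts and the transport port (29300 in the question). Any host that maps to an error string is not reachable on that port from where the script runs.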

Troubleshooting is really hard if only the following error occurs: Insufficient buffer remaining for AEAD cipher fragment (2). Needs to be more than tag size (16)

Is there any way to debug/trace the Elastic node-to-node communication better?

@shakazulu you can set the property in as below:

rootLogger.level = trace

This will be very verbose, however, so it is recommended to stop anything else that talks to the nodes (Kibana, Logstash, etc.) while tracing.
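Because trace output at the root level is so noisy, it can help to filter the log file down to the SSL, transport and cluster-formation loggers afterwards. The sketch below assumes the log line layout shown in the messages quoted earlier in this thread (`[timestamp][LEVEL][logger] [node] message`); the function name `filter_log` and the default keywords are my own choices, not anything provided by Open Distro.

```python
import re

# Matches lines such as:
# [2020-05-19T14:48:15,794][ERROR][c.a.o.s.s.t.SomeClass] [] message text
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]"
    r"\[(?P<level>[A-Z]+)\s*\]"
    r"\[(?P<logger>[^\]]+)\]\s*"
    r"\[(?P<node>[^\]]*)\]\s*"
    r"(?P<msg>.*)$"
)


def filter_log(lines, keywords=("SSL", "ClusterFormation", "Transport")):
    """Yield (timestamp, level, logger, message) for matching log lines.

    A line matches if its logger name contains any keyword
    (case-insensitive). Lines that do not follow the expected layout,
    such as stack trace continuations, are skipped.
    """
    lowered = [k.lower() for k in keywords]
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m is None:
            continue
        logger = m.group("logger").strip()
        if any(k in logger.lower() for k in lowered):
            yield m.group("ts"), m.group("level"), logger, m.group("msg")
```

For example, `for entry in filter_log(open(path)): print(entry)` prints only the matching entries, where `path` is whatever log file your node writes to.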

I faced a similar issue while upgrading to OpenSearch 1.2.1. In my case, I had to set "majorVersion": "7" in the opensearch.yml file to make it work.

This issue is described in more detail in [1]: "[BUG] Opensearch SSL transport error, master not discovered or elected yet" (Opensearch-Project/Helm-Charts).

This seems to be an issue related to the JDK.

It is fixed in JDK 17.