LDAP Authentication and handshake failed for [connectToRemoteMasterNode[172.25.84.226:9300]]

Hi,

I am trying to configure an OpenSearch 2.0 cluster with LDAP. I have a secure LDAP server running at “ldap.example.com”. I am having issues with the OpenSearch cluster both when the LDAPS endpoint runs with a self-signed certificate and when it runs with an “official” one configured in AWS ACM for *.example.com.
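
To double-check which certificate chain the LDAPS endpoint actually presents (the self-signed one vs. the ACM-issued one), I run something like the following. This is only a diagnostic sketch; the host and port are the ones from my setup:

openssl s_client -connect ldap.example.com:636 -servername ldap.example.com -showcerts </dev/null
# the issuer/subject lines show which CA actually signed the certificate being served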

---
_meta:
  type: "config"
  config_version: 2

config:
  dynamic:
...
    authc:
...
      ldap:
        description: "Authenticate via LDAP or Active Directory"
        http_enabled: true
        transport_enabled: false
        order: 0
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: false
            hosts:
            - ldap.example.com:636
            bind_dn: 'cn=example,ou=clients,dc=ldap,dc=example,dc=com'
            password: 'mypassword'
            userbase: 'ou=users,dc=ldap,dc=example,dc=com'
            usersearch: '(uid={0})'
            username_attribute: uid
    authz:
      roles_from_myldap:
        description: "Authorize via LDAP or Active Directory"
        http_enabled: true
        transport_enabled: false
        authorization_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: false
            hosts:
            - ldap.example.com:636
            bind_dn: 'cn=example,ou=clients,dc=ldap,dc=example,dc=com'
            password: 'mypassword'
            rolebase: 'ou=groups,dc=ldap,dc=example,dc=com'
            rolesearch: '(member={0})'
            userroleattribute: null
            userrolename: disabled
            rolename: cn
            resolve_nested_roles: true
            userbase: 'ou=users,dc=example,dc=org'
            usersearch: '(uid={0})'
            skip_users:
            - "admin"
            - "kibanaserver"

This is the opensearch.yml for one of the nodes:

---
network.host: 172.25.80.6
cluster.name: treasurup
node.name: ip-172-25-80-6.eu-west-1.compute.internal
discovery.seed_hosts: ip-172-25-82-154.eu-west-1.compute.internal,ip-172-25-80-6.eu-west-1.compute.internal,ip-172-25-84-226.eu-west-1.compute.internal
cluster.initial_master_nodes: ip-172-25-82-154.eu-west-1.compute.internal,ip-172-25-80-6.eu-west-1.compute.internal,ip-172-25-84-226.eu-west-1.compute.internal
bootstrap.memory_lock: true

path.data: /usr/share/opensearch/
path.logs: /var/log/opensearch

plugins.security.ssl.http.pemcert_filepath: ip-172-25-80-6.eu-west-1.compute.internal.pem
plugins.security.ssl.http.pemkey_filepath: ip-172-25-80-6.eu-west-1.compute.internal.key
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.pemcert_filepath: access_key.pub
plugins.security.ssl.transport.pemkey_filepath: access_key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca-ldap.pem
plugins.security.ssl.transport.truststore_filepath: truststore.jks
plugins.security.ssl.http.enabled: true
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.resolve_hostname: false
plugins.security.authcz.admin_dn: ['CN=admin,dc=example,dc=org']
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.nodes_dn:
    - 'CN=ip-172-25-82-154.eu-west-1.compute.internal,dc=example,dc=org'
    - 'CN=ip-172-25-80-6.eu-west-1.compute.internal,dc=example,dc=org'
    - 'CN=ip-172-25-84-226.eu-west-1.compute.internal,dc=example,dc=org'
    - 'CN=filebeat,dc=example,dc=org'

plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.allow_default_init_securityindex: true
cluster.routing.allocation.disk.threshold_enabled: false
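
Given that the transport handshake between the nodes is failing, one thing I plan to check is whether each node certificate still verifies against the CA file the TLS settings point at. A minimal check (this uses the http cert/CA pair from above; the transport pair should pass the same kind of check):

openssl verify -CAfile root-ca.pem ip-172-25-80-6.eu-west-1.compute.internal.pem
# "OK" means the certificate chains to that CA; a failure here would explain broken node-to-node TLS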

Before the LDAP integration, I had the cluster running fine. Back then I had this configured:

plugins.security.ssl.http.pemcert_filepath: ip-172-25-80-6.eu-west-1.compute.internal.pem
plugins.security.ssl.http.pemkey_filepath: ip-172-25-80-6.eu-west-1.compute.internal.key
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.pemcert_filepath: ip-172-25-80-6.eu-west-1.compute.internal.pem
plugins.security.ssl.transport.pemkey_filepath: ip-172-25-80-6.eu-west-1.compute.internal.key
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem

This resulted in a working cluster with proper synchronisation between the nodes.

When I then made the changes for the LDAP integration, I had to update pemtrustedcas_filepath to include the CA certificate of ldap.example.com. But then I got an error that was discussed/solved in: Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target - #2 by DrEdWilliams
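
I am not sure which of these is the intended fix, but two things I am considering: if I read the security plugin docs correctly, the ldap backend config in config.yml accepts its own pemtrustedcas_filepath, so the transport CA would not have to change at all; alternatively, a combined CA bundle could let a single file trust both issuers. A rough sketch of the bundle approach (file names as above; combined-ca.pem is just a name I made up):

cat root-ca.pem root-ca-ldap.pem > combined-ca.pem
# point the relevant pemtrustedcas_filepath at combined-ca.pem so both the cluster CA and the LDAP CA are trusted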

So I tried both the self-signed certificate on ldap.example.com and the signed certificate from AWS ACM, but I keep getting errors and now the cluster fails to start correctly.

[2022-06-27T11:44:59,335][WARN ][o.o.d.HandshakingTransportAddressConnector] [ip-172-25-80-6.eu-west-1.compute.internal] handshake failed for [connectToRemoteMasterNode[172.25.84.226:9300]]
org.opensearch.transport.RemoteTransportException: [ip-172-25-84-226.eu-west-1.compute.internal][172.25.84.226:9300][internal:transport/handshake]
Caused by: org.opensearch.OpenSearchException: Transport client authentication no longer supported.
        at org.opensearch.security.ssl.util.ExceptionUtils.createTransportClientNoLongerSupportedException(ExceptionUtils.java:63) ~[?:?]
        at org.opensearch.security.transport.SecurityRequestHandler.messageReceivedDecorate(SecurityRequestHandler.java:270) ~[?:?]
        at org.opensearch.security.ssl.transport.SecuritySSLRequestHandler.messageReceived(SecuritySSLRequestHandler.java:153) ~[?:?]
        at org.opensearch.security.OpenSearchSecurityPlugin$7$1.messageReceived(OpenSearchSecurityPlugin.java:651) ~[?:?]
        at org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1.messageReceived(RollupInterceptor.kt:118) ~
        at org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler.messageReceived(PerformanceAnalyzerTransportRequestHandler.java:43) ~[?:?]
        at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:103) ~[opensearch-2.0.0.jar:2.0.0]
        at org.opensearch.transport.InboundHandler.handleRequest(InboundHandler.java:249) ~[opensearch-2.0.0.jar:2.0.0]
        at org.opensearch.transport.InboundHandler.messageReceived(InboundHandler.java:132) ~[opensearch-2.0.0.jar:2.0.0]
        at org.opensearch.transport.InboundHandler.inboundMessage(InboundHandler.java:114) ~[opensearch-2.0.0.jar:2.0.0]
        at org.opensearch.transport.TcpTransport.inboundMessage(TcpTransport.java:769) ~[opensearch-2.0.0.jar:2.0.0]
        at org.opensearch.transport.InboundPipeline.forwardFragments(InboundPipeline.java:175) ~[opensearch-2.0.0.jar:2.0.0]

I thought that with the officially signed TLS certificate for ldap.example.com everything would be fine and no further changes would be needed in opensearch.yml. But if I only configure plugins.security.ssl.transport.pemtrustedcas_filepath, it also requires plugins.security.ssl.transport.pemcert_filepath and plugins.security.ssl.transport.pemkey_filepath to be set. So I am not sure how to solve this so that the OpenSearch cluster is able to work with ldap.example.com (hopefully with the signed certificate).

If any other information is needed, don’t hesitate to ask.
Thanks in advance.

Kind regards,
Werner

@wdijkerman Would you mind sharing the results of the commands below?

openssl x509 -in ip-172-25-80-6.eu-west-1.compute.internal.pem -noout -text
openssl x509 -in ip-172-25-84-226.eu-west-1.compute.internal.pem -noout -text

@wdijkerman Just noticed that you’ve used the domain DN.
The value of plugins.security.nodes_dn: should be the TLS certificate DN (subject).
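
For example, the exact string to put into plugins.security.nodes_dn can be taken from the certificate subject:

openssl x509 -in ip-172-25-80-6.eu-west-1.compute.internal.pem -noout -subject
# whatever this prints (CN=..., possibly with O=/OU=/DC= components) is what nodes_dn has to match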

Sorry, that is a copy/paste issue from removing $WORK-specific information. But it does contain several O=bla, OU=bla, or dc=bla components.

I can, but then I have to remove some $WORK-specific information; I’m not sure that will even work.

But before making the changes to get LDAP authentication working, everything worked fine with these TLS certificates in the cluster.