LDAP login does not work after upgrade from 2.19.1 to 3.1.0

Versions
OpenSearch 3.1.0
OpenSearch Dashboards 3.1.0
Both running in Podman containers

Describe the issue:
After upgrading from 2.19.1 to 3.1.0, our previously working security configuration no longer works: users cannot log in when LDAP is used for authc.
However, when SSO over OpenID is used for authc, OpenSearch successfully fetches the LDAP roles.
The issue seems to be tied to our configuration, since a completely new cluster with the same configuration produces the same error.
Has anyone else encountered this issue? What further debugging steps could we take? We have reviewed every parameter in the security configuration.

Configuration:

_meta:
  type: "config"
  config_version: 2

config:
  dynamic:
    kibana:
      multitenancy_enabled: true
      private_tenant_enabled: false
      server_username: dashboards_system
      index: '.kibana'
    do_not_fail_on_forbidden: true
    http:
      anonymous_auth_enabled: false
    authc:
      basic_internal_auth_domain:
        description: "Authenticate via HTTP Basic against internal users database"
        http_enabled: true
        transport_enabled: true
        order: 0
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: internal
      openid_auth_domain:
        http_enabled: true
        transport_enabled: true
        order: 1
        http_authenticator:
          type: openid
          challenge: false
          config:
            subject_key: preferred_username
            openid_connect_url: "{{openid_connect_url}}"
            jwt:
              expiry: NOW+1440
            enable_ssl: true
            verify_hostnames: true
        authentication_backend:
          type: noop
      ldap:
        http_enabled: true
        transport_enabled: true
        order: 2
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: true
            hosts:
              - ldap.internal.com:636
            bind_dn: CN={{ldap_account_name}},OU=accounts,OU=IDM,OU=RS,DC=internal,DC=com
            password: "{{svc_opensearch_pw}}"
            userbase: 'DC=internal,DC=com'
            usersearch: '(sAMAccountName={0})'
            username_attribute: cn
            pemtrustedcas_content: |-
              -----BEGIN CERTIFICATE-----
             ......
    authz:
      ldap:
        http_enabled: true
        transport_enabled: true
        authorization_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: true
            hosts:
              - ldap.internal.com:636
            bind_dn: CN={{ldap_account_name}},OU=accounts,OU=IDM,OU=RS,DC=internal,DC=com
            password: "{{svc_opensearch_pw}}"
            userbase: 'DC=internal,DC=com'
            usersearch: '(sAMAccountName={0})'
            username_attribute: null
            userroleattribute: null
            userrolename: memberOf
            rolesearch_enabled: false
            rolename: cn
            resolve_nested_roles: false
            skip_users:
              - dashboards_system
              - admin
              - logstash
            pemtrustedcas_content: |-

Relevant Logs or Screenshots:
Trying to log in with an LDAP account produces the following error:

[2025-07-07T13:59:45,569][ERROR][o.o.s.a.BackendRegistry  ] [devnode1] Cannot retrieve roles for User [name=admin1, backend_roles=[], requestedTenant=null] from ldap due to OpenSearchSecurityException[java.lang.NullPointerException: Cannot invoke "org.ldaptive.Connection.getProviderConnection()" because the return value of "org.ldaptive.SearchOperation.getConnection()" is null]; nested: NullPointerException[Cannot invoke "org.ldaptive.Connection.getProviderConnection()" because the return value of "org.ldaptive.SearchOperation.getConnection()" is null];
org.opensearch.OpenSearchSecurityException: java.lang.NullPointerException: Cannot invoke "org.ldaptive.Connection.getProviderConnection()" because the return value of "org.ldaptive.SearchOperation.getConnection()" is null
        at org.opensearch.security.auth.ldap.backend.LDAPAuthorizationBackend.addRoles(LDAPAuthorizationBackend.java:1012) ~[?:?]
        at org.opensearch.security.auth.BackendRegistry.authz(BackendRegistry.java:620) ~[?:?]
        at org.opensearch.security.auth.BackendRegistry$5.call(BackendRegistry.java:671) ~[?:?]
        at org.opensearch.security.auth.BackendRegistry$5.call(BackendRegistry.java:660) ~[?:?]
        at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4860) ~[?:?]
        at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3551) ~[?:?]
        at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2302) ~[?:?]
        at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2177) ~[?:?]
        at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2068) ~[?:?]
        at com.google.common.cache.LocalCache.get(LocalCache.java:3986) ~[?:?]
        at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4855) ~[?:?]
        at org.opensearch.security.auth.BackendRegistry.authcz(BackendRegistry.java:660) ~[?:?]
        at org.opensearch.security.auth.BackendRegistry.authenticate(BackendRegistry.java:393) ~[?:?]
        at org.opensearch.security.filter.SecurityRestFilter.checkAndAuthenticateRequest(SecurityRestFilter.java:306) ~[?:?]
        at org.opensearch.security.ssl.http.netty.Netty4HttpRequestHeaderVerifier.channelRead0(Netty4HttpRequestHeaderVerifier.java:90) ~[?:?]
        at org.opensearch.security.ssl.http.netty.Netty4HttpRequestHeaderVerifier.channelRead0(Netty4HttpRequestHeaderVerifier.java:37) ~[?:?]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:107) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1519) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1377) ~[?:?]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1428) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:530) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:469) ~[?:?]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) ~[?:?]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[?:?]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[?:?]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:796) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:697) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:660) ~[?:?]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) ~[?:?]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998) ~[?:?]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.ldaptive.Connection.getProviderConnection()" because the return value of "org.ldaptive.SearchOperation.getConnection()" is null
        at org.ldaptive.SearchOperation.executeSearch(SearchOperation.java:103) ~[?:?]
        at org.ldaptive.SearchOperation.invoke(SearchOperation.java:85) ~[?:?]
        at org.ldaptive.SearchOperation.invoke(SearchOperation.java:15) ~[?:?]
        at org.ldaptive.AbstractOperation.execute(AbstractOperation.java:126) ~[?:?]
        at org.opensearch.security.auth.ldap.util.LdapHelper$1.run(LdapHelper.java:74) ~[?:?]
        at org.opensearch.security.auth.ldap.util.LdapHelper$1.run(LdapHelper.java:58) ~[?:?]
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) ~[?:?]
        at org.opensearch.security.auth.ldap.util.LdapHelper.search(LdapHelper.java:58) ~[?:?]
        at org.opensearch.security.auth.ldap.util.LdapHelper.lookup(LdapHelper.java:101) ~[?:?]
        at org.opensearch.security.auth.ldap.backend.LDAPAuthorizationBackend.getRoleFromEntry(LDAPAuthorizationBackend.java:1184) ~[?:?]
        at org.opensearch.security.auth.ldap.backend.LDAPAuthorizationBackend.addRoles(LDAPAuthorizationBackend.java:982) ~[?:?]
        ... 53 more
[2025-07-07T13:59:45,571][INFO ][o.o.s.p.PrivilegesEvaluator] [devnode1] No cluster-level perm match for User [name=admin1, backend_roles=[], requestedTenant=null] Resolved [aliases=[*], allIndices=[*], types=[*], originalRequested=[*], remoteIndices=[]] [Action [cluster:monitor/health]] [RolesChecked [own_index]]. No permissions for [cluster:monitor/health]

@mgelszinnis What is your LDAP server?

We are using Microsoft Active Directory.

@mgelszinnis I’ve tested the upgrade from 2.19.1 to 3.1.0 with LDAP authentication enabled and had no issues.
My LDAP server is MS AD 2019.

Did you check your AD with ldapsearch? Can you get roles directly from the AD server?

@pablo Our LDAP is working correctly. As I said, it worked previously on 2.19.1, and our other clusters on 2.18.0 are also working as expected. They use the same LDAP server and the same bind_dn as our 3.1.0 cluster.
Using Active Directory Explorer, I can also read all the necessary fields.

@mgelszinnis Can you test with a clean 3.1.0 installation?

@mgelszinnis Do you see any other errors before the reported one?
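
If nothing else shows up at the default log level, raising the log level for the LDAP backend may surface the failing step. A minimal sketch for opensearch.yml, assuming the logger name matches the package shown in the stack trace above (org.opensearch.security.auth.ldap). This is an illustration, not a setting verified against your deployment:

```yaml
# opensearch.yml
# Assumption: the logger name is taken from the package in the stack trace;
# adjust it if your version logs under a different package.
logger.org.opensearch.security.auth.ldap: debug
```

Remove the line again after testing, since debug output from the LDAP backend can be verbose.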

According to your config, you’ve disabled role search in authz (rolesearch_enabled: false).
I tested with that setting and got the same result as you did. When I set it to true, all assigned AD groups appeared and there were no errors in the logs.

@pablo I was able to log in once I configured the LDAP role search as in “Approach 2” of this documentation: Active Directory and LDAP - OpenSearch Documentation
However, it now seems that “Approach 1” (which we were using before) simply does not work anymore.

For anyone interested, here is my current configuration:

_meta:
  type: "config"
  config_version: 2

config:
  dynamic:
    kibana:
      multitenancy_enabled: true
      private_tenant_enabled: false
      server_username: dashboards_system
      index: '.kibana'
    do_not_fail_on_forbidden: true
    http:
      anonymous_auth_enabled: false
    authc:
      basic_internal_auth_domain:
        description: "Authenticate via HTTP Basic against internal users database"
        http_enabled: true
        transport_enabled: true
        order: 0
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: internal
      openid_auth_domain:
        http_enabled: true
        transport_enabled: true
        order: 1
        http_authenticator:
          type: openid
          challenge: false
          config:
            subject_key: preferred_username
            openid_connect_url: "{{openid_connect_url}}"
            jwt:
              expiry: NOW+1440
            enable_ssl: true
            verify_hostnames: true
        authentication_backend:
          type: noop
      ldap:
        http_enabled: true
        transport_enabled: true
        order: 2
        http_authenticator:
          type: basic
          challenge: false
        authentication_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: true
            hosts:
              - ldap.internal.com:636
            bind_dn: CN={{ldap_account_name}},OU=accounts,OU=IDM,OU=RS,DC=internal,DC=com
            password: "{{svc_opensearch_pw}}"
            userbase: 'DC=internal,DC=com'
            usersearch: '(sAMAccountName={0})'
            username_attribute: cn
            pemtrustedcas_content: |-
              -----BEGIN CERTIFICATE-----
             ......
    authz:
      ldap:
        http_enabled: true
        transport_enabled: true
        authorization_backend:
          type: ldap
          config:
            enable_ssl: true
            enable_start_tls: false
            enable_ssl_client_auth: false
            verify_hostnames: true
            hosts:
              - ldap.internal.com:636
            bind_dn: CN={{ldap_account_name}},OU=accounts,OU=IDM,OU=RS,DC=internal,DC=com
            password: "{{svc_opensearch_pw}}"
            userbase: 'DC=internal,DC=com'
            usersearch: '(sAMAccountName={0})'
            username_attribute: cn
            userroleattribute: null
            userrolename: memberOf
            rolesearch_enabled: true
            rolesearch: '(member={0})'
            rolebase: "DC=internal,DC=com"
            rolename: cn
            resolve_nested_roles: false
            skip_users:
              - dashboards_system
              - admin
              - logstash
            pemtrustedcas_content: |-
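
For quick comparison, the only keys that differ from the original configuration are in the authz section. The fragment below is a condensed view, not a complete config; every key not listed is unchanged from what is posted above, and the comments show the earlier values:

```yaml
config:
  dynamic:
    authz:
      ldap:
        authorization_backend:
          config:
            username_attribute: cn          # was: null
            rolesearch_enabled: true        # was: false
            rolesearch: '(member={0})'      # not present before
            rolebase: "DC=internal,DC=com"  # not present before
```

Which of these keys is strictly required to avoid the NullPointerException was not isolated in this thread; the login started working with all four changes applied together.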

@pablo Thank you for your patient support and for the idea that led to the solution.