On every manager node restart, getting "OpenSearch Security not initialized"

Versions: OpenSearch 3.3.2

Describe the issue:

On every manager node restart, I get a lot of these errors:

[2025-11-11T10:18:42,763][ERROR][o.o.s.a.BackendRegistry  ] [os-mn02] OpenSearch Security not initialized. (you may need to run securityadmin)

After a few minutes (10-20) the error disappears and queries can again be submitted by accounts that authenticate without a certificate (LDAP). While the error persists, those logins fail because the LDAP configuration has not been loaded yet; internal users connecting with a certificate keep working.
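
To see whether the plugin has finished initializing on a node, the Security plugin's health endpoint can be polled; a minimal sketch, assuming the REST layer on localhost:9200 as in the securityadmin call below:

curl -sk "https://localhost:9200/_plugins/_security/health?pretty"
# Once the security index has been loaded the response should report
# "status" : "UP"; before that it indicates a non-initialized state.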

As a workaround, I resend the security configuration:

docker exec -it os-mn01 sh plugins/opensearch-security/tools/securityadmin.sh -cd config/opensearch-security/ -icl -nhnv -cacert config/root-ca.pem -cert config/admin.pem -key config/admin-key.pem
Security Admin v7
Will connect to localhost:9200 ... done
Connected as "CN=admin,EMAILADDRESS=..."
OpenSearch Version: 3.3.2
Contacting opensearch cluster 'opensearch' and wait for YELLOW clusterstate ...
Clustername: os
Clusterstate: GREEN
Number of nodes: 22
Number of data nodes: 19
.opendistro_security index already exists, so we do not need to create one.
Populate config from /usr/share/opensearch/config/opensearch-security
Will update '/config' with config/opensearch-security/config.yml
   SUCC: Configuration for 'config' created or updated
Will update '/roles' with config/opensearch-security/roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update '/rolesmapping' with config/opensearch-security/roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update '/internalusers' with config/opensearch-security/internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update '/actiongroups' with config/opensearch-security/action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Will update '/tenants' with config/opensearch-security/tenants.yml
   SUCC: Configuration for 'tenants' created or updated
Will update '/nodesdn' with config/opensearch-security/nodes_dn.yml
   SUCC: Configuration for 'nodesdn' created or updated
Will update '/audit' with config/opensearch-security/audit.yml
   SUCC: Configuration for 'audit' created or updated
Will update '/allowlist' with config/opensearch-security/allowlist.yml
   SUCC: Configuration for 'allowlist' created or updated
SUCC: Expected 9 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","actiongroups","config","internalusers"],"updated_config_size":9,"message":null} is 9 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","actiongroups","config","internalusers"]) due to: null
(identical SUCC line repeated for each of the remaining 21 nodes)
Done with success

It’s quite strange that not all shards are the same size.

.opendistro_security                                                 0     p      STARTED                   10.33.15.7  os-dn07h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.8  os-dn08h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.13 os-dn13h
.opendistro_security                                                 0     r      STARTED        9   80.5kb 10.33.15.15 os-dn15h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.1  os-dn01h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.18 os-dn18h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.12 os-dn12h
.opendistro_security                                                 0     r      STARTED                   10.33.15.19 os-dn19h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.14 os-dn14h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.4  os-dn04h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.9  os-dn09h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.5  os-dn05h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.2  os-dn02h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.10 os-dn10h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.16 os-dn16h
.opendistro_security                                                 0     r      STARTED        9   80.6kb 10.33.15.3  os-dn03h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.11 os-dn11h
.opendistro_security                                                 0     r      STARTED        9  103.6kb 10.33.15.6  os-dn06h
.opendistro_security                                                 0     r      STARTED        9   80.6kb 10.33.15.17 os-dn17h
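
(For reference, a listing like the one above can be produced along these lines; the exact column selection is my assumption, run with the super-admin certificate from inside the container as with securityadmin:)

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  "https://localhost:9200/_cat/shards/.opendistro_security?v&h=index,shard,prirep,state,docs,store,ip,node"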

I also tried deleting all replicas, but after re-initializing and restarting a manager node, the same problem persisted.
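
For completeness, reducing the replica count directly would look roughly like the sketch below (run with the super-admin certificate; the plugin may reject direct writes to its protected index, in which case securityadmin's -er flag is the supported route):

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  -X PUT "https://localhost:9200/.opendistro_security/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"auto_expand_replicas": false, "number_of_replicas": 0}}'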

@joelp can you provide the following information please (redact any sensitive details):

  1. config.yml
  2. How is this deployed (Helm, RPM, etc.) and where (k8s, …)?
  3. opensearch.yml
  4. Are you seeing the same behaviour using only basic_auth?
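
A quick basic_auth check could look like this, for example (any internal user; the password is a placeholder):

curl -sk -u some_internal_user:CHANGEME "https://localhost:9200/_cluster/health?pretty"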

It seems that timeouts could be the cause. When I create an internal user or a role, I get a timeout, yet the user is actually created.

Failed to create role
Failed to create role. You may refresh the page to retry or see browser console for more information.

Failed to save tester
Failed to save tester. You may refresh the page to retry or see browser console for more information.
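
When these writes time out, it may be worth checking whether cluster-state updates are backing up; a sketch using the standard API:

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  "https://localhost:9200/_cluster/pending_tasks?pretty"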

And securityadmin returns timeouts too:

docker exec -it os-mn01 sh plugins/opensearch-security/tools/securityadmin.sh -cd config/opensearch-security/ -icl -nhnv -cacert config/root-ca.pem -cert config/admin.pem -key config/admin-key.pem
Security Admin v7
Will connect to localhost:9200 ... done
Connected as "CN=admin,..."
OpenSearch Version: 3.3.2
Contacting opensearch cluster 'opensearch' and wait for YELLOW clusterstate ...
Clustername: os
Clusterstate: GREEN
Number of nodes: 22
Number of data nodes: 19
.opendistro_security index already exists, so we do not need to create one.
Populate config from /usr/share/opensearch/config/opensearch-security
Will update '/config' with config/opensearch-security/config.yml
   FAIL: Configuration for 'config' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/roles' with config/opensearch-security/roles.yml
   FAIL: Configuration for 'roles' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/rolesmapping' with config/opensearch-security/roles_mapping.yml
   FAIL: Configuration for 'rolesmapping' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/internalusers' with config/opensearch-security/internal_users.yml
   FAIL: Configuration for 'internalusers' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/actiongroups' with config/opensearch-security/action_groups.yml
   FAIL: Configuration for 'actiongroups' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/tenants' with config/opensearch-security/tenants.yml
   FAIL: Configuration for 'tenants' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/nodesdn' with config/opensearch-security/nodes_dn.yml
   FAIL: Configuration for 'nodesdn' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/audit' with config/opensearch-security/audit.yml
   FAIL: Configuration for 'audit' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
Will update '/allowlist' with config/opensearch-security/allowlist.yml
   FAIL: Configuration for 'allowlist' failed because of java.io.IOException: Timeout due to inactivity (30000 MILLISECONDS)
ERR: cannot upload configuration, see errors above
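
To narrow down where the nodes are stuck during these timeouts, the hot-threads API can help; a sketch:

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  "https://localhost:9200/_nodes/hot_threads?threads=3&ignore_idle_threads=true"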

Yesterday, the only solution was to restart all manager nodes; otherwise it was impossible to get rid of the error “OpenSearch Security not initialized. (you may need to run securityadmin)”.

For test purposes, I created an internal user named tester. During the aforementioned errors, this user’s queries proceeded without issue.

All nodes are deployed in Docker (docker compose).

Here is my opensearch.yml:

plugins.security.authcz.admin_dn:
  - 'CN=admin,...'
plugins.security.nodes_dn:
  - 'CN=os-*,...'

plugins.security.ssl.transport.pemcert_filepath: node.pem
plugins.security.ssl.transport.pemkey_filepath: node-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false

plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: node.pem
plugins.security.ssl.http.pemkey_filepath: node-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem

plugins.security.ssl.http.clientauth_mode: "OPTIONAL"
plugins.security.allow_default_init_securityindex: true

plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opendistro-asynchronous-search-response*"]

docker-compose.yml

services:
  opensearch-node:
    container_name: ${hostname}
    image: opensearchproject/opensearch:3.3.2
    hostname: ${hostname}-pod
    deploy:
      restart_policy:
        condition: any
        delay: 5s
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    environment:
      - cluster.name=os
      - node.name=${hostname}
      - node.roles=${nodeRoles}
      - node.attr.temp=${nodeTemp}
      - node.attr.rack_id=${RackID}
      - discovery.seed_hosts=10.33.15.41,10.33.15.42,10.33.15.43
      - cluster.initial_master_nodes=10.33.15.41,10.33.15.42,10.33.15.43
      - bootstrap.memory_lock=true
      - "OPENSEARCH_JAVA_OPTS=-Xms50g -Xmx50g"
      - network.host=${networkHost}
      - http.host=${httpHost}
      - plugins.query.datasources.encryption.masterkey=${pluginsQueryDatasourcesEncryptionMasterkey}
      - "DISABLE_INSTALL_DEMO_CONFIG=true"
      - TZ=Europe/Prague
    volumes:
      - /OpenSearch:/usr/share/opensearch/data
      - ./certs/root-ca.pem:/usr/share/opensearch/config/root-ca.pem
      - ./certs/${hostname}.pem:/usr/share/opensearch/config/node.pem
      - ./certs/${hostname}-key.pem:/usr/share/opensearch/config/node-key.pem
      - ./certs/admin.pem:/usr/share/opensearch/config/admin.pem
      - ./certs/admin-key.pem:/usr/share/opensearch/config/admin-key.pem
      - ./opensearch.yml:/usr/share/opensearch/config/opensearch.yml
      - ./securityconfig/config.yml:/usr/share/opensearch/config/opensearch-security/config.yml
      - ./securityconfig/internal_users.yml:/usr/share/opensearch/config/opensearch-security/internal_users.yml
      - ./securityconfig/roles_mapping.yml:/usr/share/opensearch/config/opensearch-security/roles_mapping.yml
      - ./securityconfig/roles.yml:/usr/share/opensearch/config/opensearch-security/roles.yml
      - ./securityconfig/tenants.yml:/usr/share/opensearch/config/opensearch-security/tenants.yml
      - ./securityconfig/action_groups.yml:/usr/share/opensearch/config/opensearch-security/action_groups.yml
      - ./securityconfig/nodes_dn.yml:/usr/share/opensearch/config/opensearch-security/nodes_dn.yml
      - ./securityconfig/whitelist.yml:/usr/share/opensearch/config/opensearch-security/whitelist.yml
    network_mode: host
    logging:
        driver: fluentd
        options:
            fluentd-address: 127.0.0.11:24224
            tag: "pod.opensearch.syslog"
            fluentd-async: "true"

.env of one manager node

hostname=os-mn01
networkHost=10.33.15.41
httpHost=localhost,10.33.15.41,10.55.20.83
nodeRoles=cluster_manager,ingest
pluginsQueryDatasourcesEncryptionMasterkey=...
nodeTemp=
RackID=Rack2-2
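
(Presumably each node is then brought up with its own env file, e.g.:)

docker compose --env-file .env up -d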

@joelp It would appear that the propagation of the security index state might be lagging. Can you confirm how much RAM is on these nodes? I can see you have assigned 50 GB for heap; this should usually not be over 30-32 GB, and should not be more than 50% of the total RAM available, otherwise you will start running into problems.
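
One reason for the 30-32 GB guideline: above roughly 32 GB the JVM loses compressed ordinary object pointers. The nodes info API reports whether they are in use; a sketch (field names as I recall them from the nodes info output):

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  "https://localhost:9200/_nodes/jvm?pretty&filter_path=nodes.*.jvm.using_compressed_ordinary_object_pointers,nodes.*.jvm.mem.heap_max"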

Also, the security index is currently set to auto-expand its replicas, which is the default. You can try reducing this to 3-4 replicas for your testing, using the command below, to see whether this resolves the inconsistent sizes:

docker exec -it os-mn01 \
  /usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh \
  -cd /usr/share/opensearch/config/opensearch-security \
  -icl -nhnv \
  -cacert /usr/share/opensearch/config/root-ca.pem \
  -cert /usr/share/opensearch/config/admin.pem \
  -key /usr/share/opensearch/config/admin-key.pem \
  -er <number_of_replicas>
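
Afterwards you can verify that the setting took effect, e.g.:

docker exec -it os-mn01 curl -sk \
  --cert config/admin.pem --key config/admin-key.pem \
  "https://localhost:9200/.opendistro_security/_settings?pretty"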

Good point. Previously, the documentation mentioned half of the RAM up to a maximum of 32 GB; now I only see “Sets the size of the Java heap (we recommend half of system RAM).” So I decided to increase this value. Every node has 117 GB of RAM. Unfortunately, I don’t remember whether these problems started before or after the heap increase. I will definitely try going back to 30 GB. My idea was that by increasing the heap I would increase the query cache.

What do you think: does it make sense to run two pods on one node? I would like to make better use of the available RAM.

@joelp I would recommend first running 1 pod per node and making sure the performance issues are fully resolved. Then, if you really want to utilize the available RAM, add the second pod while still maintaining the 50% limit: 2 pods at 28 GB heap each (56 GB total < 50% of 117 GB).
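
A rough sketch of what the second pod could look like in the compose file (the service name, ports, and data path below are hypothetical and must not collide with the first pod, since you use network_mode: host):

  opensearch-node2:
    image: opensearchproject/opensearch:3.3.2
    environment:
      - node.name=${hostname}b
      - "OPENSEARCH_JAVA_OPTS=-Xms28g -Xmx28g"   # 2 x 28 GB stays under 50% of 117 GB RAM
      - http.port=9201        # avoid clashing with the first pod's 9200
      - transport.port=9301   # avoid clashing with the first pod's 9300
      # ... plus the remaining settings from the first service
    volumes:
      - /OpenSearch2:/usr/share/opensearch/data   # separate data path per pod
    network_mode: host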

I did a few restarts and it looks good. Thanks for the advice.
