Startup probe failed for OpenSearch cluster pods

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

OS:  Talos Linux 1.8.0
(base) raphy@raohy:~/.talos/openmetadata$ helm search repo opensearch
NAME                                   	CHART VERSION	APP VERSION	DESCRIPTION
opensearch-operator/opensearch-cluster 	3.1.0        	2.8.0      	A Helm chart for OpenSearch Cluster
opensearch-operator/opensearch-operator	2.8.0        	2.8.0      	The OpenSearch Operator Helm chart for Kubernetes

Describe the issue:

(base) raphy@raohy:~/.talos/openmetadata/opensearch$ git clone https://github.com/opensearch-project/opensearch-k8s-operator.git

Configuration:

(base) raphy@raohy:~/.talos/openmetadata$ nano opensearch/opensearch-k8s-operator/opensearch-operator/examples/2.x/omd-os-cluster.yaml:

# Minimal configuration of a cluster with version 2.x of the operator.
# Note the replacement of the 'master' role with 'cluster_manager' in the nodePools section below.
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: omd-os-cluster
  namespace: default
spec:
  security:
    config:
    tls:
      http:
        generate: true
      transport:
        generate: true
        perNode: true
  general:
    httpPort: 9200
    serviceName: omd-os-cluster
    version: 2.14.0
    pluginsList: ["repository-s3"]
    drainDataNodes: true
  dashboards:
    tls:
      enable: true
      generate: true
    version: 2.14.0
    enable: true
    replicas: 1
    resources:
      requests:
        memory: "512Mi"
        cpu: "200m"
      limits:
        memory: "512Mi"
        cpu: "200m"
  nodePools:
    - component: masters
      replicas: 3
      resources:
        requests:
          memory: "8Gi"
          cpu: "1000m"
        limits:
          memory: "8Gi"
          cpu: "1000m"
      roles:
        - "data"
        - "cluster_manager"
      persistence:
        emptyDir: {}

The OpenSearch cluster pods are all failing their startup probes:

(base) raphy@raohy:~/.talos/openmetadata$ kubectl get pods -o wide
NAME                                                      READY   STATUS             RESTARTS          AGE   IP            NODE            NOMINATED NODE   READINESS GATES
omd-os-cluster-bootstrap-0                                0/1     Running            388 (5m25s ago)   26h   10.244.1.69   talos-mrt-ge0   <none>           <none>
omd-os-cluster-dashboards-5b9fbdfd45-hhjkg                0/1     Running            323 (3m16s ago)   26h   10.244.1.73   talos-mrt-ge0   <none>           <none>
omd-os-cluster-masters-0                                  0/1     CrashLoopBackOff   410 (2m15s ago)   26h   10.244.1.68   talos-mrt-ge0   <none>           <none>
omd-os-cluster-securityconfig-update-jp8g8                0/1     Unknown            0                 26h   <none>        talos-mrt-ge0   <none>           <none>
opensearch-operator-controller-manager-7448949c9b-gcwph   2/2     Running            118 (3m33s ago)   46h   10.244.1.70   talos-mrt-ge0   <none>           <none>
postgres-65d7c9cb49-wmswv                                 1/1     Running            41 (67m ago)      2d    10.244.1.71   talos-mrt-ge0   <none>           <none>


(base) raphy@raohy:~/.talos/openmetadata$ kubectl describe pod omd-os-cluster-dashboards-5b9fbdfd45-hhjkg
Name:             omd-os-cluster-dashboards-5b9fbdfd45-hhjkg
Namespace:        default
Priority:         0
Service Account:  default
Node:             talos-mrt-ge0/37.59.120.237
Start Time:       Sat, 08 Nov 2025 12:18:24 +0100
Labels:           opensearch.cluster.dashboards=omd-os-cluster
                  pod-template-hash=5b9fbdfd45
Annotations:      checksum/dashboards.yml: 58dd97503c53a4255035e77f9df02ad465b99af8
Status:           Running
IP:               10.244.1.73
IPs:
  IP:           10.244.1.73
Controlled By:  ReplicaSet/omd-os-cluster-dashboards-5b9fbdfd45
Containers:
  dashboards:
    Container ID:  containerd://0b657473617926d36fd4fd0c4bcb0c77d49a4a12ca8a453a528d419fb5e343e0
    Image:         docker.io/opensearchproject/opensearch-dashboards:2.14.0
    Image ID:      docker.io/opensearchproject/opensearch-dashboards@sha256:94a42c94e179d8acbef4afc516d88686bb7424086279238c72cae2d03b64b081
    Port:          5601/TCP
    Host Port:     0/TCP
    Command:
      /bin/bash
      -c
      ./opensearch-dashboards-docker-entrypoint.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sun, 09 Nov 2025 14:59:46 +0100
      Finished:     Sun, 09 Nov 2025 15:03:06 +0100
    Ready:          False
    Restart Count:  323
    Limits:
      cpu:     200m
      memory:  512Mi
    Requests:
      cpu:     200m
      memory:  512Mi
    Liveness:  http-get https://:5601/api/reporting/stats delay=10s timeout=5s period=20s #success=1 #failure=10
    Startup:   http-get https://:5601/api/reporting/stats delay=10s timeout=5s period=20s #success=1 #failure=10
    Environment:
      OPENSEARCH_HOSTS:     https://omd-os-cluster.default.svc.cluster.local:9200
      SERVER_HOST:          0.0.0.0
      OPENSEARCH_USERNAME:  kibanaserver
      OPENSEARCH_PASSWORD:  kibanaserver
    Mounts:
      /usr/share/opensearch-dashboards/certs from tls-cert (rw)
      /usr/share/opensearch-dashboards/config/opensearch_dashboards.yml from dashboards-config (rw,path="opensearch_dashboards.yml")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jl4dp (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  tls-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  omd-os-cluster-dashboards-cert
    Optional:    false
  dashboards-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      omd-os-cluster-dashboards-config
    Optional:  false
  kube-api-access-jl4dp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Guaranteed
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From     Message
  ----     ------     ----                   ----     -------
  Normal   Killing    64m                    kubelet  Container dashboards failed startup probe, will be restarted
  Normal   Pulled     64m (x2 over 67m)      kubelet  Container image "docker.io/opensearchproject/opensearch-dashboards:2.14.0" already present on machine
  Normal   Created    64m (x2 over 67m)      kubelet  Created container dashboards
  Normal   Started    64m (x2 over 67m)      kubelet  Started container dashboards
  Warning  Unhealthy  38m (x6 over 66m)      kubelet  Startup probe failed: Get "https://10.244.1.73:5601/api/reporting/stats": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  18m (x34 over 67m)     kubelet  Startup probe failed: Get "https://10.244.1.73:5601/api/reporting/stats": dial tcp 10.244.1.73:5601: connect: connection refused
  Warning  BackOff    7m54s (x100 over 47m)  kubelet  Back-off restarting failed container dashboards in pod omd-os-cluster-dashboards-5b9fbdfd45-hhjkg_default(e2848957-7fb9-4d7f-9072-6e00fbd5fbd0)
  Warning  Unhealthy  4m5s (x82 over 66m)    kubelet  Startup probe failed: HTTP probe failed with statuscode: 503

What’s wrong with this configuration, and how can I make it work?

Relevant Logs or Screenshots:

@Raphy10 I’ve just tested your config and it works as expected. Are you able to see any of the logs from the failing pods?
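
For example, the last terminated attempt of each crashing container can be dumped with kubectl logs --previous (pod names taken from your output above):

kubectl logs omd-os-cluster-masters-0 --previous
kubectl logs omd-os-cluster-dashboards-5b9fbdfd45-hhjkg --previous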

One possible reason the pods are failing is that vm.max_map_count needs to be increased on the nodes; on minikube, for example, this can be done with the following command:

minikube ssh 'sudo sysctl -w vm.max_map_count=262144'
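
Since you are on Talos Linux (which has no SSH access), a rough equivalent would be to set the sysctl through the machine config and apply it to each node; a sketch, assuming a patch file named sysctl-patch.yaml and placeholder node IPs, would look something like:

# sysctl-patch.yaml (hypothetical file name)
machine:
  sysctls:
    vm.max_map_count: "262144"

talosctl --nodes <node-ip> patch machineconfig --patch @sysctl-patch.yaml

You can then check the applied value on each node with, for example, talosctl --nodes <node-ip> read /proc/sys/vm/max_map_count.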

Hi Anthony.

Thank you for your kind suggestion. After increasing the max_map_count parameter on each of the two nodes (control plane and worker) of my small Kubernetes cluster, the OpenSearch cluster pods now seem to be working fine.

But I do not understand the “Completed” status of the pod omd-os-cluster-securityconfig-update-f887q:

(base) raphy@raohy:~/.talos/openmetadata$ kubectl get pods
NAME                                                      READY   STATUS      RESTARTS      AGE
omd-os-cluster-dashboards-5b9fbdfd45-xknns                1/1     Running     0             8m25s
omd-os-cluster-masters-0                                  0/1     Running     0             92s
omd-os-cluster-masters-1                                  1/1     Running     0             6m31s
omd-os-cluster-masters-2                                  1/1     Running     0             4m27s
omd-os-cluster-securityconfig-update-f887q                0/1     Completed   0             9m3s
opensearch-operator-controller-manager-7448949c9b-6kzlm   2/2     Running     1 (28m ago)   29m
(base) raphy@raohy:~/.talos/openmetadata$ kubectl describe pod omd-os-cluster-securityconfig-update-f887q
Name:             omd-os-cluster-securityconfig-update-f887q
Namespace:        default
Priority:         0
Service Account:  default
Node:             talos-mrt-ge0/37.59.120.237
Start Time:       Mon, 10 Nov 2025 16:41:03 +0100
Labels:           batch.kubernetes.io/controller-uid=6cee4f9f-6b6c-44f5-b1e2-80e938b2b8b6
                  batch.kubernetes.io/job-name=omd-os-cluster-securityconfig-update
                  controller-uid=6cee4f9f-6b6c-44f5-b1e2-80e938b2b8b6
                  job-name=omd-os-cluster-securityconfig-update
Annotations:      <none>
Status:           Succeeded
IP:               10.244.1.138
IPs:
  IP:           10.244.1.138
Controlled By:  Job/omd-os-cluster-securityconfig-update
Containers:
  updater:
    Container ID:  containerd://1b32f33e3c999e62acc3e78e0bf51515c8e91ee8aee09cc091743f695399cdec
    Image:         docker.io/opensearchproject/opensearch:2.14.0
    Image ID:      docker.io/opensearchproject/opensearch@sha256:466a49f379bb8889af29d615475e69b7b990898c6987d28470cd7105df9046ff
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
    Args:
      ADMIN=/usr/share/opensearch/plugins/opensearch-security/tools/securityadmin.sh;
      chmod +x $ADMIN;
      until curl -k --silent https://omd-os-cluster.default.svc.cluster.local:9200;
      do
      echo 'Waiting to connect to the cluster'; sleep 120;
      done;count=0;
      until $ADMIN -cacert /certs/ca.crt -cert /certs/tls.crt -key /certs/tls.key -cd /usr/share/opensearch/config/opensearch-security -icl -nhnv -h omd-os-cluster.default.svc.cluster.local -p 9200 || (( count++ >= 20 ));
      do
      sleep 20;
      done;
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Mon, 10 Nov 2025 16:41:03 +0100
      Finished:     Mon, 10 Nov 2025 16:43:09 +0100
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /certs from admin-cert (rw)
      /usr/share/opensearch/config/tls-http from http-cert (rw)
      /usr/share/opensearch/config/tls-transport from transport-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mcg2k (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   False 
  Initialized                 True 
  Ready                       False 
  ContainersReady             False 
  PodScheduled                True 
Volumes:
  transport-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  omd-os-cluster-transport-cert
    Optional:    false
  http-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  omd-os-cluster-http-cert
    Optional:    false
  admin-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  omd-os-cluster-admin-cert
    Optional:    false
  kube-api-access-mcg2k:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  9m15s  default-scheduler  Successfully assigned default/omd-os-cluster-securityconfig-update-f887q to talos-mrt-ge0
  Normal  Pulled     9m15s  kubelet            Container image "docker.io/opensearchproject/opensearch:2.14.0" already present on machine
  Normal  Created    9m15s  kubelet            Created container updater
(base) raphy@raohy:~/.talos/openmetadata$ kubectl logs omd-os-cluster-securityconfig-update-f887q
Waiting to connect to the cluster
OpenSearch Security not initialized.**************************************************************************
** This tool will be deprecated in the next major release of OpenSearch **
** https://github.com/opensearch-project/security/issues/1755           **
**************************************************************************
Security Admin v7
Will connect to omd-os-cluster.default.svc.cluster.local:9200 ... done
Connected as "CN=admin,OU=omd-os-cluster"
OpenSearch Version: 2.14.0
Contacting opensearch cluster 'opensearch' and wait for YELLOW clusterstate ...
Clustername: omd-os-cluster
Clusterstate: GREEN
Number of nodes: 2
Number of data nodes: 1
.opendistro_security index does not exists, attempt to create it ... done (0-all replicas)
Populate config from /usr/share/opensearch/config/opensearch-security/
Will update '/config' with /usr/share/opensearch/config/opensearch-security/config.yml 
   SUCC: Configuration for 'config' created or updated
Will update '/roles' with /usr/share/opensearch/config/opensearch-security/roles.yml 
   SUCC: Configuration for 'roles' created or updated
Will update '/rolesmapping' with /usr/share/opensearch/config/opensearch-security/roles_mapping.yml 
   SUCC: Configuration for 'rolesmapping' created or updated
Will update '/internalusers' with /usr/share/opensearch/config/opensearch-security/internal_users.yml 
   SUCC: Configuration for 'internalusers' created or updated
Will update '/actiongroups' with /usr/share/opensearch/config/opensearch-security/action_groups.yml 
   SUCC: Configuration for 'actiongroups' created or updated
Will update '/tenants' with /usr/share/opensearch/config/opensearch-security/tenants.yml 
   SUCC: Configuration for 'tenants' created or updated
Will update '/nodesdn' with /usr/share/opensearch/config/opensearch-security/nodes_dn.yml 
   SUCC: Configuration for 'nodesdn' created or updated
Will update '/whitelist' with /usr/share/opensearch/config/opensearch-security/whitelist.yml 
   SUCC: Configuration for 'whitelist' created or updated
Will update '/audit' with /usr/share/opensearch/config/opensearch-security/audit.yml 
   SUCC: Configuration for 'audit' created or updated
Will update '/allowlist' with /usr/share/opensearch/config/opensearch-security/allowlist.yml 
   SUCC: Configuration for 'allowlist' created or updated
SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"]) due to: null
SUCC: Expected 10 config types for node {"updated_config_types":["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"],"updated_config_size":10,"message":null} is 10 (["allowlist","tenants","rolesmapping","nodesdn","audit","roles","whitelist","actiongroups","config","internalusers"]) due to: null
Done with success

@Raphy10 This is expected: this pod applies the security configuration and certificates, and exits once it has completed.
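
If in doubt, you can check the parent Job (shown under “Controlled By” in your describe output) rather than the pod; a successful run should report 1/1 completions, for example:

kubectl get job omd-os-cluster-securityconfig-update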