Hello,
I’m running OpenSearch on Kubernetes using the OpenSearch Operator and I’m currently stuck after renewing TLS certificates. I’d appreciate confirmation on the correct recovery procedure.
Environment
- OpenSearch version: 2.11.1
- Deployment method: OpenSearch Operator (Helm)
- Kubernetes managed via Rancher
- TLS certificates generated by the operator (security.tls.http.generate: true, security.tls.transport.generate: true)
- Security configuration provided via securityconfig-secret
- LDAP + internal auth configured
- Multiple node pools: masters (3), datas, coordinators, mls
What happened
- One of the TLS certificates expired (PKIX / CertificateExpired errors).
- Pods started failing SSL handshakes and cluster communication.
- I deleted the expired TLS secrets (http/transport), and the operator correctly regenerated them with valid dates.
- After that, all OpenSearch pods start but remain in Running (not Ready).
- Logs on the master show repeatedly:
  Not yet initialized (you may need to run securityadmin) ClusterManagerNotDiscoveredException
Current state
- opensearch-operator-controller-manager is running fine.
- The Job opensearch-cluster-securityconfig-update exists but is Completed (1/1) and does not rerun.
- No OpenSearch pod ever becomes Ready.
- Cluster health remains unknown.
Important detail
The security configuration is fully defined in securityconfig-secret and referenced in the OpenSearchCluster CR:
security:
  config:
    adminCredentialsSecret: admin-credentials-secret
    securityConfigSecret: securityconfig-secret
No dynamic security changes were made outside this secret.
Question
Is the correct and safe recovery step to:
- Delete the completed Job opensearch-cluster-securityconfig-update
- Let the operator recreate and rerun it to re-apply the same security configuration
- Allow the cluster to initialize again
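For concreteness, here is a rough sketch of the steps I have in mind. The namespace is a placeholder, the run wrapper is a hypothetical helper that only prints the commands unless DRY_RUN=0 is set, and the sketch assumes the operator recreates the Job after deletion, which is exactly the part I would like confirmed.

```shell
#!/bin/sh
# Sketch only. NAMESPACE is a placeholder (assumption); adjust to your cluster.
NAMESPACE="${NAMESPACE:-opensearch}"
# Job name as it appears in my cluster.
JOB="opensearch-cluster-securityconfig-update"

# Hypothetical helper: print each command by default, execute only if DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Delete the completed Job so it can be recreated.
run kubectl -n "$NAMESPACE" delete job "$JOB"

# 2. Check whether the operator has recreated the Job (assumed behavior).
run kubectl -n "$NAMESPACE" get job "$JOB"

# 3. Check whether the OpenSearch pods become Ready afterwards.
run kubectl -n "$NAMESPACE" get pods
```

Running it as-is only prints the three kubectl commands, so nothing in the cluster is touched until DRY_RUN=0 is set explicitly.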
I want to confirm that:
- Deleting this Job will not wipe users/roles beyond what is defined in securityconfig-secret
- This is the expected procedure after TLS renewal when the cluster is no longer initialized
Any confirmation or recommended best practice would be very helpful.
Thanks in advance.