Securityconfig-update Job Ignores caSecret

Describe the issue:

When using the OpenSearch Kubernetes Operator with externally provided TLS certificates (generate: false), the securityconfig-update job (which runs securityadmin.sh) fails with an "Unable to read the file … ca.crt" error if the CA certificate is provided via a separate caSecret rather than being bundled inside the main TLS secret.

The CRD documents that tls.transport.secret and tls.http.secret should contain ca.crt, tls.key and tls.crt data, but also states: “If ca.crt is in a different secret, provide it via the caSecret field”. While the OpenSearch node pods correctly mount and use the caSecret for TLS, the securityconfig-update job does not. It only mounts the main TLS secret and expects ca.crt to be present there.

This is a problem when using external cert-manager issuers (e.g., enterprise PKI issuers) that only populate tls.crt and tls.key in the TLS secret and do not include ca.crt. The CA certificate is available in a separate secret, which is exactly the use case caSecret is designed for, yet the securityconfig-update job does not honor it.

Expected behavior

When caSecret is configured, the securityconfig-update job should mount ca.crt from the caSecret (just like the OpenSearch node pods do), rather than only looking for it in the main TLS secret.

Actual behavior

The securityconfig-update job only mounts the main TLS secret and fails because ca.crt is missing from it. The caSecret field is ignored by the job.

Steps to reproduce

  1. Create a TLS secret with only tls.key and tls.crt (no ca.crt); this is what many enterprise cert-manager issuers produce.
  2. Create a separate secret containing ca.crt.
  3. Configure the OpenSearchCluster CR with generate: false, referencing the TLS secret via secret and the CA secret via caSecret:
apiVersion: opensearch.opster.io/v1
kind: OpenSearchCluster
metadata:
  name: my-cluster
spec:
  security:
    tls:
      transport:
        generate: false
        secret:
          name: my-tls-secret        # contains tls.key + tls.crt only
        caSecret:
          name: my-ca-secret         # contains ca.crt
        nodesDn:
          - "CN=cluster,O=org"
        adminDn:
          - "CN=cluster,O=org"
      http:
        generate: false
        secret:
          name: my-tls-secret
        caSecret:
          name: my-ca-secret
    config:
      adminCredentialsSecret:
        name: my-admin-creds
      adminSecret:
        name: my-tls-secret
  4. OpenSearch node pods start correctly (they honor caSecret).
  5. The securityconfig-update job fails because it mounts only my-tls-secret and looks for ca.crt, which doesn’t exist there.
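For completeness, the two secrets from steps 1 and 2 can be created like this (file names are placeholders; this assumes the PEM files are on disk):

```shell
# TLS secret with only tls.crt/tls.key, mimicking an issuer that omits ca.crt
kubectl create secret tls my-tls-secret --cert=tls.crt --key=tls.key

# Separate secret carrying only the CA certificate
kubectl create secret generic my-ca-secret --from-file=ca.crt=ca.crt
```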

Workaround

We work around this by programmatically patching ca.crt into the main TLS secret before creating the OpenSearchCluster CR. Our operator reads ca.crt from the CA secret and injects it into the TLS secret using a Kubernetes merge patch. This ensures ca.crt is present in the TLS secret by the time the securityconfig-update job runs.

This works, but it is a brittle workaround: it requires coordination to ensure the patch happens before the CR is created, and it must handle cert-manager rotation cycles that could overwrite the secret and remove the injected ca.crt.
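The patch itself is small; a minimal sketch of the workaround using kubectl, with the secret names from the reproduction above (this assumes both secrets are in the same namespace, and must re-run after each rotation):

```shell
# Read the base64-encoded ca.crt from the CA secret and merge-patch it
# into the cert-manager-owned TLS secret before creating the CR.
CA_B64=$(kubectl get secret my-ca-secret -o jsonpath='{.data.ca\.crt}')
kubectl patch secret my-tls-secret --type=merge \
  -p "{\"data\":{\"ca.crt\":\"${CA_B64}\"}}"
```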


Configuration:

  • OpenSearch Operator version: 2.8
  • OpenSearch cluster version: 3.5.0
  • Kubernetes version: 1.31
  • TLS setup: generate: false with externally provided certificates via cert-manager
  • cert-manager issuer: Enterprise PKI issuer (does not populate ca.crt in the TLS secret)
  • Deployment: Multi-tenant OpenSearch-as-a-Service platform with per-customer clusters

Relevant CR snippet:

spec:
  security:
    tls:
      transport:
        generate: false
        secret:
          name: os-<ID>-tls-certs    # cert-manager populates tls.key + tls.crt only
        caSecret:
          name: root-ca              # separate secret with ca.crt
        nodesDn: ["CN=<ID>,O=<cID>"]
        adminDn: ["CN=<ID>,O=<cID>"]
      http:
        generate: false
        secret:
          name: os-<ID>-tls-certs
        caSecret:
          name: root-ca
    config:
      adminCredentialsSecret:
        name: os-<ID>-admin-creds
      adminSecret:
        name: os-<ID>-tls-certs

Relevant Logs or Screenshots:

securityconfig-update job failure log:

OpenSearch Security Admin v7
Will connect to localhost:9300 ... done
ERR: Unable to read the file tls-http/ca.crt

Job pod volume mounts (observed):

The securityconfig-update job mounts only the main TLS secret:

volumeMounts:
  - name: tls-http
    mountPath: /certs/tls-http      # contains tls.crt, tls.key — but NO ca.crt
  - name: tls-transport
    mountPath: /certs/tls-transport  # same secret, same issue

volumes:
  - name: tls-http
    secret:
      secretName: os-<ID>-tls-certs   # ← only this secret is mounted
  - name: tls-transport
    secret:
      secretName: os-<ID>-tls-certs

Missing: The caSecret (root-ca) is not mounted in the job pod, even though it’s configured in the CR.

Expected job volume mounts:

volumes:
  - name: tls-http
    secret:
      secretName: os-<ID>-tls-certs
  - name: tls-http-ca          # ← should be added when caSecret is set
    secret:
      secretName: root-ca
  - name: tls-transport
    secret:
      secretName: os-<ID>-tls-certs
  - name: tls-transport-ca     # ← should be added when caSecret is set
    secret:
      secretName: root-ca

Alternatively, the job could project ca.crt from caSecret into the same mount path alongside tls.crt and tls.key.
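That projection variant could look roughly like this in the job pod spec, using a standard Kubernetes projected volume (volume name and item paths are illustrative):

```yaml
volumes:
  - name: tls-http
    projected:
      sources:
        - secret:
            name: os-<ID>-tls-certs   # provides tls.crt + tls.key
        - secret:
            name: root-ca             # provides ca.crt
            items:
              - key: ca.crt
                path: ca.crt
```

This keeps the hard-coded ca.crt path in the securityadmin command template working unchanged, since all three files land in one directory.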


Questions:

  1. Is this a known issue? Is there an existing GitHub issue or discussion tracking this behavior where the securityconfig-update job does not honor the caSecret field?

  2. Is there a better workaround? We are currently patching ca.crt into the main TLS secret programmatically before the OpenSearchCluster CR is created. This works but is fragile — cert-manager rotation can overwrite the secret and remove our injected ca.crt, requiring re-injection. Is there a recommended or cleaner way to handle this scenario (e.g., a CR field we’re missing, a job override, or a projected volume approach)?

  3. Is this a bug that needs a fix? The CRD documentation explicitly says “If ca.crt is in a different secret provide it via the caSecret field” — but the securityconfig-update job doesn’t follow this contract. If this is confirmed as a bug, we’d be happy to contribute a fix. A few follow-up questions on that:

    • Which release branch should we target for the fix (e.g., main, a specific release branch)?
    • Are there contribution guidelines or a CONTRIBUTING.md we should follow for the opensearch-k8s-operator repo?
    • Would this be scoped to just the securityconfig-update job, or are there other jobs/components in the operator that also need to honor caSecret?

Follow-up: prior-art check

I did some digging to confirm whether this is a known issue or has already been fixed, and wanted to share what I found in case it helps anyone else hitting this.

On the current main branch of opensearch-k8s-operator, the gap still exists. The securityconfig-update job is built in opensearch-operator/pkg/reconcilers/securityconfig.go by calling builders.NewSecurityconfigUpdateJob(...) and passing in r.reconcilerContext.Volumes / VolumeMounts. In the generate: false path, those volumes are driven by the admin/TLS secret. I could not find any code that appends a separate volume for tls.transport.caSecret or tls.http.caSecret into the job pod spec. The ApplyAllYmlCmdTmpl command template still hard-codes ca.crt as a path inside the single-mounted TLS secret directory.

This specific bug does not appear to have been filed before. I went through the open issues and found several that are adjacent but none that pin down this exact symptom:

  • Issue #623 — tls.http.caSecret: {} with generate: true breaks securityadmin init on OpenSearch 1.3. Related area (caSecret ↔ securityadmin), different root cause.
  • Issue #540 — SIGSEGV when caSecret.name: null is passed with generate: true. Null-handling bug.
  • Issue #941 — securityconfig-update job subPath mounts don’t pick up secret updates. Same job, different mount bug.
  • Issue #1054 — OpenShift chmod failure in the same job pod. Unrelated to caSecret.
  • Discussion #424 — user with generate: false hit “not yet initialized” and worked around it by bundling ca.crt into the TLS secret, which is essentially the same workaround many of us are using. The maintainer’s first instinct was to check the securityconfig-update pod logs — which is where this bug surfaces — but nobody ever named it as a distinct contract violation.
  • Forum post 11802 (the 2022 “ca-cert not being picked up during deployment” thread) — the same structural pattern surfacing at a different layer: caSecret wasn’t honored on the node pods either back then. That’s since been fixed for node pods but not for this job.

So the pattern across several years is that caSecret wiring has been an afterthought in multiple places, and the recurring community workaround has been “just bundle ca.crt into the main TLS secret.” That doesn’t work cleanly when the TLS secret is owned by cert-manager with an external issuer that doesn’t populate ca.crt.

I’m going to file a GitHub issue against opensearch-project/opensearch-k8s-operator [#1396] to track this properly and will link it back here once it’s open. Happy to take a crack at the fix as well. The pattern is already implemented correctly for the node pods, so mirroring that logic into the securityconfig reconciler’s volume assembly should be a relatively contained change.
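To make the proposed change concrete, the shape of the volume-assembly fix could be something like the following Go sketch. All identifiers here are hypothetical and do not match the operator's actual field paths; it only illustrates "append an extra volume and mount when caSecret is set", mirroring what the node-pod path already does:

```go
package main

import corev1 "k8s.io/api/core/v1"

// appendCASecretVolume adds a dedicated volume/mount for the CA secret so
// the securityconfig-update job can read ca.crt without it being bundled
// into the main TLS secret. Names are illustrative, not the operator's.
func appendCASecretVolume(
	vols []corev1.Volume,
	mounts []corev1.VolumeMount,
	caSecretName, interfaceName string, // interfaceName: "http" or "transport"
) ([]corev1.Volume, []corev1.VolumeMount) {
	if caSecretName == "" {
		return vols, mounts // caSecret unset: keep current behavior
	}
	name := "tls-" + interfaceName + "-ca" // e.g. tls-http-ca
	vols = append(vols, corev1.Volume{
		Name: name,
		VolumeSource: corev1.VolumeSource{
			Secret: &corev1.SecretVolumeSource{SecretName: caSecretName},
		},
	})
	mounts = append(mounts, corev1.VolumeMount{
		Name:      name,
		MountPath: "/certs/" + name,
	})
	return vols, mounts
}
```

The command template would then need to reference ca.crt from the new mount path (or the projected-volume variant above could be used instead, which avoids touching the template at all).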

If anyone else has landed on a cleaner workaround that survives cert-manager rotation, I’d love to hear it.