Describe the issue: The OpenSearch pods were coming up fine until yesterday. Now they are suddenly crashing with an error stating that root-ca.pem cannot be located or may not be readable due to permissions. This is on GKE.
Hello @pablo, yes, I used the latest chart version, 2.31.0, with demo certs to deploy the OpenSearch cluster on GCP. It had been running fine for the past week, but the pods suddenly started crashing yesterday with the root-ca.pem error.
Apart from a GKE node version upgrade performed by Google's standard maintenance, nothing changed, at least from the OpenSearch deployment perspective.
The certs should be present. Just to make sure, I have uninstalled and reinstalled the Helm release multiple times since yesterday, but I am still seeing the same issue.
@Nagpraveen By “reinstalled”, do you mean destroying the cluster and deploying a new one?
Do you know what causes the restarts? A missing cert shouldn’t cause that, as it is read only once during the OpenSearch service start.
Sure, will do. Meanwhile, I found something in the log during pod startup. Could this be the reason why the cert isn’t written? Just a hint. The part marked in blue says: "opensearch.yml seems to be already configured for security. Quit."
The OpenSearch installation has worked perfectly fine with this values.yaml since last week, so I wouldn’t think I am missing anything critical here. I did not need to modify any other templates.
kubectl exec -it <pod_name> -- ls -l config wouldn’t run because it cannot connect to the opensearch container:
kubectl -n apollo exec -it opensearch-cluster-master-0 -- ls -l config
Defaulted container "opensearch" out of: opensearch, fsgroup-volume (init), configfile (init)
error: Internal error occurred: unable to upgrade connection: container not found ("opensearch")
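When kubectl exec fails with "container not found", the container is usually not running at that moment (for example, it is between crash-loop restarts), so there is nothing to attach to. Below is a minimal sketch of how the crash output could be collected instead. It assumes the namespace, pod name, and container names from the output above; the diagnose_pod function name is only illustrative.

```shell
#!/bin/sh
# Sketch: inspect a crashing pod without exec'ing into it.
NS="apollo"                          # namespace from the command above
POD="opensearch-cluster-master-0"    # pod name from the command above

diagnose_pod() {
  # Events, init container status, restart count, last termination reason.
  kubectl -n "$NS" describe pod "$POD"
  # Logs from the previous (crashed) instance of the main container.
  kubectl -n "$NS" logs "$POD" -c opensearch --previous
  # Logs from the init container that prepares the config file.
  kubectl -n "$NS" logs "$POD" -c configfile
}
```

Calling diagnose_pod against the affected pod should show the startup log lines (including any message from the demo configuration script) even though exec is unavailable.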
@Nagpraveen By default, OpenSearch images contain node, admin, and root certificates, which are located inside the OpenSearch config folder.
These certs were created in February last year and are all valid for 10 years.
If for some reason they are missing, then maybe the config folder was overwritten by a volume mount.
a) The security section isn’t provided, as I am using demo certs, and this used to work fine, as I mentioned earlier. As per the docs, I have overridden only the values.yaml file with the values required to bring the nodes up.
b) I am passing only one values.yaml file to the default OpenSearch chart via the command below:
I reinstalled OpenSearch on a different node pool in the same GKE cluster, with the same values.yaml. This time the OpenSearch service started fine, and when I checked the logs, it looked like the root-ca.pem cert was accessible.
It is strange, though, that the certs were not found, were corrupted, or had altered permissions in the earlier attempts on the previous node pool.
The Helm release name was different this time, but the values.yaml was the same as above, with no changes.
I probably know what the bug was: the installer was not running the shell script that sets the admin password and installs the certs. Note that one of the screenshots in my earlier messages stated: "opensearch.yml seems to be already configured for security. Quit."
But this time OpenSearch ran the shell script to set the admin password and install the demo certs, as shown below:
@pablo: One last question. Can you please help me understand on what basis the demo script assumed opensearch.yml was already set up and didn’t proceed, and how to avoid this?
Also, if there is a sample for setting up custom certs, could you please share it?
Let’s say I have .key and .cert files instead of .pem. Can I use them for the key and cert? And is root-ca always necessary?
Thanks!
@Nagpraveen I did further testing and found that install_demo_configuration.sh stops executing when it finds security plugin configuration in opensearch.yml. This also prevents the demo certs from being recreated in the /usr/share/opensearch/config folder.
I wasn’t aware of that.
I had to delete all the security plugin configuration in opensearch.yml, and then install_demo_configuration.sh completed successfully.
I wouldn’t call it a bug, as this script should be used only once to set up the demo configuration. The securityadmin.sh script should be used to manage and update the security plugin afterwards.
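To illustrate the behaviour described above, here is a minimal sketch of such a check. It assumes that the script's test amounts to searching opensearch.yml for keys containing plugins.security; the actual script may differ in detail. The has_security_config helper and the two sample files are hypothetical and exist only for the demonstration.

```shell
#!/bin/sh
# Sketch: approximate the "already configured for security" check.
# Returns 0 if the given file contains any plugins.security setting.
has_security_config() {
  grep -q -i 'plugins\.security' "$1"
}

# Two throwaway sample files (hypothetical contents).
tmp="$(mktemp -d)"
printf 'cluster.name: demo\n' > "$tmp/clean.yml"
printf 'cluster.name: demo\nplugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem\n' \
  > "$tmp/configured.yml"

for f in "$tmp/clean.yml" "$tmp/configured.yml"; do
  if has_security_config "$f"; then
    echo "$(basename "$f"): security settings found, demo script would quit"
  else
    echo "$(basename "$f"): no security settings, demo script would proceed"
  fi
done
```

Running the same grep against config/opensearch.yml inside a running pod is one way to see in advance whether the demo script would skip its setup.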
However, I couldn’t reproduce that by recreating the OpenSearch pod or by forcing the pod to restart by restarting the Kubernetes host.
Not sure how you ended up with the same opensearch.yml, as it is held in an emptyDir volume and is not preserved.
Mounts:
/usr/share/opensearch/config/opensearch.yml from config-emptydir (rw,path="opensearch.yml")
Deleting pods or scaling down and up the statefulset of the OpenSearch cluster should fix your issue.
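The two options above could be sketched as follows. The namespace comes from the earlier kubectl output, and the StatefulSet name is the one implied by the pod name opensearch-cluster-master-0. The replica count of 3 is an assumption and should match your values.yaml; the function names are only illustrative.

```shell
#!/bin/sh
# Sketch: recreate OpenSearch pods so each one starts with a fresh emptyDir.
NS="apollo"                        # namespace from the earlier output
STS="opensearch-cluster-master"    # StatefulSet implied by the pod name
REPLICAS="${REPLICAS:-3}"          # ASSUMPTION: set to your real replica count

# Option 1: delete one pod; the StatefulSet controller recreates it.
recreate_pod() {
  kubectl -n "$NS" delete pod "$1"
}

# Option 2: scale the StatefulSet down to zero and back up.
restart_statefulset() {
  kubectl -n "$NS" scale statefulset "$STS" --replicas=0
  # Wait until the first pod is gone before scaling back up.
  kubectl -n "$NS" wait --for=delete "pod/${STS}-0" --timeout=180s
  kubectl -n "$NS" scale statefulset "$STS" --replicas="$REPLICAS"
  kubectl -n "$NS" rollout status "statefulset/$STS"
}
```

For example, recreate_pod opensearch-cluster-master-0 replaces a single pod, while restart_statefulset cycles the whole set. Both make the cluster nodes briefly unavailable, so this is best done in a maintenance window.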
Regarding your second question.
root-ca.pem is always necessary as that is used to validate your node certificates.
Yes, you can use .key and .crt files instead.
To use custom certificates, you need to set the following in the values.yaml file:
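As a rough illustration of why the root CA is needed, the sketch below creates a throwaway CA, signs a node certificate with it, and checks the node certificate against the CA with openssl. All file names and subject names here are made up for the example and are not the ones shipped in the OpenSearch image. It also shows that a .crt file produced this way is PEM-encoded, which is why the extension itself does not matter.

```shell
#!/bin/sh
# Sketch: a node certificate is validated against the root CA.
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# Throwaway root CA: private key plus self-signed certificate.
openssl req -x509 -newkey rsa:2048 -nodes -days 30 \
  -subj "/CN=example-root-ca" \
  -keyout root-ca.key -out root-ca.pem 2>/dev/null

# Node private key and certificate signing request.
openssl req -newkey rsa:2048 -nodes \
  -subj "/CN=example-node" \
  -keyout node.key -out node.csr 2>/dev/null

# Sign the node certificate with the root CA.
openssl x509 -req -in node.csr -CA root-ca.pem -CAkey root-ca.key \
  -CAcreateserial -days 30 -out node.crt 2>/dev/null

# Validate the node certificate against the root CA.
verify_result="$(openssl verify -CAfile root-ca.pem node.crt)"
echo "$verify_result"

# The first line shows the PEM header despite the .crt extension.
crt_header="$(head -n 1 node.crt)"
echo "$crt_header"
```

Without root-ca.pem, the verify step has no trust anchor and fails. This is the same role the root CA plays when nodes validate each other's certificates.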