I’m unable to upgrade from 2.17.1 to 2.18. I’ve tried several times on different machines and I keep running into the same issue.
The short version is that it won’t start because it’s trying to load an expired SSL certificate. It doesn’t give any indication of where that certificate is, though.
Caused by: java.security.cert.CertificateExpiredException: NotAfter: Wed Dec 15 08:00:00 UTC 2021
at java.base/sun.security.x509.CertificateValidity.valid(CertificateValidity.java:182) ~[?:?]
at java.base/sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:534) ~[?:?]
at java.base/sun.security.x509.X509CertImpl.checkValidity(X509CertImpl.java:507) ~[?:?]
at org.opensearch.security.ssl.config.KeyStoreUtils.validateKeyStoreCertificates(KeyStoreUtils.java:147) ~[?:?]
at org.opensearch.security.ssl.config.TrustStoreConfiguration.createTrustManagerFactory(TrustStoreConfiguration.java:61) ~[?:?]
at org.opensearch.security.ssl.SslConfiguration.lambda$buildServerSslContext$0(SslConfiguration.java:84) ~[?:?]
at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) ~[?:?]
at org.opensearch.security.ssl.SslConfiguration.buildServerSslContext(SslConfiguration.java:73) ~[?:?]
at org.opensearch.security.ssl.SslContextHandler.<init>(SslContextHandler.java:42) ~[?:?]
at org.opensearch.security.ssl.SslContextHandler.<init>(SslContextHandler.java:38) ~[?:?]
at org.opensearch.security.ssl.SslSettingsManager.lambda$buildSslContexts$0(SslSettingsManager.java:96) ~[?:?]
at java.base/java.util.Optional.ifPresentOrElse(Optional.java:196) ~[?:?]
at org.opensearch.security.ssl.SslSettingsManager.buildSslContexts(SslSettingsManager.java:95) ~[?:?]
at org.opensearch.security.ssl.SslSettingsManager.<init>(SslSettingsManager.java:80) ~[?:?]
at org.opensearch.security.ssl.OpenSearchSecuritySSLPlugin.<init>(OpenSearchSecuritySSLPlugin.java:249) ~[?:?]
at org.opensearch.security.OpenSearchSecurityPlugin.<init>(OpenSearchSecurityPlugin.java:318) ~[?:?]
at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62) ~[?:?]
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502) ~[?:?]
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486) ~[?:?]
at org.opensearch.plugins.PluginsService.loadPlugin(PluginsService.java:796) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.plugins.PluginsService.loadBundle(PluginsService.java:744) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.plugins.PluginsService.loadBundles(PluginsService.java:545) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.plugins.PluginsService.<init>(PluginsService.java:197) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.node.Node.<init>(Node.java:523) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.node.Node.<init>(Node.java:450) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:242) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.bootstrap.Bootstrap.setup(Bootstrap.java:242) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:404) ~[opensearch-2.18.0.jar:2.18.0]
at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:181) ~[opensearch-2.18.0.jar:2.18.0]
... 6 more
We just upgraded to 2.17.1 about a week ago and haven’t had any issues with that, but 2.18.0 doesn’t work at all.
@reshippie ,
Which environment is your cluster running in (Docker, Helm, or an operator)?
Nodes in your cluster connect to each other over a transport layer encrypted with TLS, so there will be several certificates (e.g. admin-cert, transport-cert, https-cert…).
Please check the validity of the certificates using the command below:
We’re running on bare metal. We use the Debian packages.
Which certificate do you mean? echo "[crt]" | base64 --decode doesn’t work because “[crt]” isn’t base64 encoded.
The cert on both hosts, set for both plugins.security.ssl.transport.pemcert_filepath and plugins.security.ssl.http.pemcert_filepath has
Validity
Not Before: Oct 7 20:29:51 2024 GMT
Not After : Jan 5 20:29:50 2025 GMT
They’re both working fine under 2.17.1, but get the same error about a 3-year-old certificate under 2.18.0.
Is there any way I can determine which file OpenSearch is trying to load?
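One way to narrow this down is to list every certificate file that opensearch.yml points at and print its validity dates. The following is only a rough sketch: it assumes openssl is installed, that relative paths are resolved against the config directory (for example /etc/opensearch), and the function names list_pem_paths and show_cert_dates are made up for illustration.

```shell
#!/bin/sh
# Print the values of the PEM certificate and trusted-CA settings found in
# the opensearch.yml file given as $1 (private key settings are skipped).
list_pem_paths() {
    awk '
        /^[ \t]*plugins\.security\.ssl\.(transport|http)\.pem(cert|trustedcas)_filepath[ \t]*:/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the setting name and the colon
            gsub(/^"|"$/, "")          # drop surrounding double quotes, if any
            print
        }
    ' "$1"
}

# Show subject and validity dates for each configured file.
# $1 is the config directory (assumed layout: $1/opensearch.yml).
show_cert_dates() {
    conf_dir=$1
    list_pem_paths "$conf_dir/opensearch.yml" | while IFS= read -r path; do
        case $path in
            /*) file=$path ;;             # absolute path, use as-is
            *)  file=$conf_dir/$path ;;   # assumption: relative to the config dir
        esac
        echo "== $file"
        # Note: openssl x509 reads only the FIRST certificate in a file,
        # so a multi-certificate CA bundle needs to be split to be checked fully.
        openssl x509 -noout -subject -dates -in "$file"
    done
}

# Example invocation (commented out so that sourcing this file has no side effects):
# show_cert_dates /etc/opensearch
```

Because openssl x509 only looks at the first certificate in a file, a clean result here does not rule out an expired entry further down in a CA bundle.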
I had the same issue. I am using certificates signed by my own CA. All certificates and the CA certificate are within their validity range. plugins.security.ssl.transport.pemtrustedcas_filepath and plugins.security.ssl.http.pemtrustedcas_filepath were pointing to the /etc/ssl/certs/ca-certificates.crt file, which contains all CA certs (including my own CA cert).
I fixed the issue by changing the above values and have them point to my own CA cert only.
Our security plugin certs are signed by Let’s Encrypt. Our API certs are signed by an internal cert, but none of them have 2021 for any part of their validity.
I checked all of the files that the Debian package installs and only found certificates in the test directories, none of which will expire any time soon.
I can’t figure out where the 2021 certificate is coming from, or why it doesn’t exist in 2.17.1.
The cacerts file is copied directly from /usr/share/opensearch/jdk/lib/security/cacerts
The star_backblaze_pet certs are signed by Let’s Encrypt.
The ssl/ca-certificates.crt file is /etc/ssl/certs/ca-certificates.crt with an internal CA tacked on at the end, which is valid and does not have 2021 in the Not Before or Not After fields.
Without that line LDAP authentication doesn’t work.
Unable to connect to ldapserver ldap.internal:636 due to OpenSearchException[Empty file path for plugins.security.ssl.transport.truststore_filepath]. Try next.
@reshippie, got it. This is a separate bug, and I’ll fix it. To address the current issue, you can set the CA certificates for LDAP in the LDAP configuration as described in the documentation: Active Directory and LDAP - OpenSearch Documentation and do not use
The recommendations from @willyborankin and @cwperks worked to allow me to remove the truststore_filepath line from opensearch.yml under 2.17.1, but didn’t change anything with respect to upgrading to 2.18.0. I’m still getting the same error about a certificate that expired in 2021.
I think I figured out where the faulty certificate is coming from. The CA bundle needs to be in a subdirectory of /etc/opensearch, so we cat our internal CA and /etc/ssl/certs/ca-certificates.crt and put that data into /etc/opensearch/ssl/ca-certificates.crt. There are CA certs that our OS (Debian 11) has in its CA bundle that are expired, including 2 with NotAfter: Wed Dec 15 08:00:00 UTC 2021.
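For anyone who wants to check their own bundle, here is a minimal sketch of how the expired entries can be found. It assumes openssl and mktemp are available; split_bundle and report_expired are hypothetical helper names, not part of OpenSearch or Debian.

```shell
#!/bin/sh
# Split the PEM bundle $1 into one file per certificate inside directory $2.
# Files are named cert-0001.pem, cert-0002.pem, ... in bundle order.
split_bundle() {
    awk -v dir="$2" '
        /-----BEGIN CERTIFICATE-----/ {
            n++
            out = sprintf("%s/cert-%04d.pem", dir, n)
            writing = 1
        }
        writing { print > out }
        /-----END CERTIFICATE-----/ { writing = 0; close(out) }
    ' "$1"
}

# Print subject and end date of every certificate in bundle $1 that has expired.
report_expired() {
    tmpdir=$(mktemp -d) || return 1
    split_bundle "$1" "$tmpdir"
    for cert in "$tmpdir"/cert-*.pem; do
        [ -e "$cert" ] || continue
        # -checkend 0 exits non-zero if the certificate is already expired
        # (it also fails for files openssl cannot parse).
        if ! openssl x509 -noout -checkend 0 -in "$cert" >/dev/null 2>&1; then
            openssl x509 -noout -subject -enddate -in "$cert"
        fi
    done
    rm -rf "$tmpdir"
}

# Example invocation (commented out so that sourcing this file has no side effects):
# report_expired /etc/opensearch/ssl/ca-certificates.crt
```

The bundle has to be split first because openssl x509 only reads the first certificate in a file, which is why inspecting the bundle directly does not show the expired entries.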
Now that we know where the error is coming from, we’re 1 step closer to fixing it. Since we didn’t change anything in our CA bundle, and can consistently reproduce the error, it seems like something must have changed in the handling of CA bundles from 2.17.1 to 2.18.
We are having a very similar issue. It’s the same error and only started occurring after upgrade from 2.16 to 2.18. We did have an old cert on the system, but even after removal of the references in the yml file AND completely deleting it from the system, we are continuing to get this exact same error.
We even wrote a script to go through every certificate on the machine (OS related or not), check the expiration date, and list them all out. There are no other certificates with the exact date in the error except for the one we deleted.
Our guess at this point is that something is cached in the nodes folder (/var/lib/opensearch/nodes/0). That’s the only thing that makes sense to us right now. The one thing we have found but not tried, because it’s destructive, is deleting the contents of the nodes folder to reset any caching. The old approach of deleting everything and ingesting it again doesn’t work when re-ingesting isn’t a possibility.
Luckily we have only upgraded 1 of our nodes, but without a non-destructive solution we won’t be able to move forward with the upgrade and may have to try downgrading it to 2.17.1 before the backport.
If anyone has any other suggestions on something we can try to fix this, we’d love to get this figured out.
In my case I had to regenerate the CA bundle, not just remove the certs on disk. Debian maintains a list, in /etc/ca-certificates.conf, of which certs will be included in the CA bundle in /etc/ssl/certs/ca-certificates.crt. To regenerate that bundle, I needed to run /usr/sbin/update-ca-certificates, then copy the new bundle to my OpenSearch config directory.
To put all of the data into 1 post:
The issue was that the CA bundle contained expired certificates. As of 2.18.0 OpenSearch will not start if it finds an expired cert in any file, including the CA bundle.
I identified a number of certs that my OS distribution (Debian 11) includes which have expired.
for i in /usr/share/ca-certificates/mozilla/*; do echo "$i"; openssl x509 -noout -text -in "$i" | grep "Not After" | grep 202; done
I then put ! in front of those certificate names in /etc/ca-certificates.conf to have them excluded from the system bundle and regenerated the bundle by running /usr/sbin/update-ca-certificates.
Once that was done I copied the regenerated CA bundle to the OpenSearch config directory and I was able to upgrade to 2.18.0.
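If editing /etc/ca-certificates.conf is not an option, a different approach (not the one used above) would be to build the OpenSearch copy of the bundle from only the certificates that are still valid. This is an untested-in-production sketch under the assumption that openssl and mktemp are available; filter_bundle and cert_is_current are hypothetical names.

```shell
#!/bin/sh
# Succeeds if the single-certificate PEM file $1 has not expired yet.
cert_is_current() {
    openssl x509 -noout -checkend 0 -in "$1" >/dev/null 2>&1
}

# filter_bundle SRC DEST: write to DEST only those certificates from the
# PEM bundle SRC that pass cert_is_current, keeping their original order.
filter_bundle() {
    src=$1
    dest=$2
    work=$(mktemp -d) || return 1
    # Split the bundle into one numbered file per certificate.
    awk -v dir="$work" '
        /-----BEGIN CERTIFICATE-----/ {
            n++
            out = sprintf("%s/cert-%04d.pem", dir, n)
            writing = 1
        }
        writing { print > out }
        /-----END CERTIFICATE-----/ { writing = 0; close(out) }
    ' "$src"
    : > "$dest"   # start with an empty destination file
    for cert in "$work"/cert-*.pem; do
        [ -e "$cert" ] || continue
        if cert_is_current "$cert"; then
            cat "$cert" >> "$dest"
        fi
    done
    rm -rf "$work"
}

# Example invocation (commented out so that sourcing this file has no side effects):
# filter_bundle /etc/ssl/certs/ca-certificates.crt /etc/opensearch/ssl/ca-certificates.crt
```

Any internal CA would still need to be appended to the filtered file afterwards, and the filtering has to be repeated whenever the system bundle changes or another CA expires.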