HELP! "unable to find valid certification path to requested target" when creating an HTTPS OpenSearch cluster

Hi, I ran into the problem in the title. Let me describe it in detail.

Version: OpenSearch 2.3.0

issue:
I plan to use 3 EC2 instances to create an OpenSearch cluster with HTTPS: 2 model nodes and 1 data node, named model0, model1 and data. After finishing all the steps I started the 3 nodes in turn, and every one of them shows the “unable to find valid certification path to requested target” error message.

Here is my opensearch.yml file. I also wrote a shell script to generate all of the *.pem files.

Configuration:
generate.sh on model0 (the script has the same content on every node):

#!/bin/sh
# Root CA
openssl genrsa -out root-ca-key.pem 2048
openssl req -new -x509 -sha256 -key root-ca-key.pem -subj "/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml/CN=root" -out root-ca.pem -days 3650

# Admin cert
openssl genrsa -out admin-key-temp.pem 2048
openssl pkcs8 -inform PEM -outform PEM -in admin-key-temp.pem -topk8 -nocrypt -v1 PBE-SHA1-3DES -out admin-key.pem
openssl req -new -key admin-key.pem -subj "/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml/CN=admin" -out admin.csr
openssl x509 -req -in admin.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial -sha256 -out admin.pem -days 3650

# model0 cert
openssl genrsa -out model0-key-temp.pem 2048
openssl pkcs8 -inform PEM -outform PEM -in model0-key-temp.pem -topk8 -nocrypt -v1 PBE-SHA1-3DES -out model0-key.pem
openssl req -new -key model0-key.pem -subj "/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml/CN=ip-172-31-34-109.ap-northeast-1.compute.internal" -out model0.csr
openssl x509 -req -in model0.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial -sha256 -out model0.pem -days 3650 


# model1 cert
openssl genrsa -out model1-key-temp.pem 2048
openssl pkcs8 -inform PEM -outform PEM -in model1-key-temp.pem -topk8 -nocrypt -v1 PBE-SHA1-3DES -out model1-key.pem
openssl req -new -key model1-key.pem -subj "/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml/CN=ip-172-31-40-170.ap-northeast-1.compute.internal" -out model1.csr
openssl x509 -req -in model1.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial -sha256 -out model1.pem -days 3650


# data cert
openssl genrsa -out data-key-temp.pem 2048
openssl pkcs8 -inform PEM -outform PEM -in data-key-temp.pem -topk8 -nocrypt -v1 PBE-SHA1-3DES -out data-key.pem
openssl req -new -key data-key.pem -subj "/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml/CN=ip-172-31-33-89.ap-northeast-1.compute.internal" -out data.csr
openssl x509 -req -in data.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial -sha256 -out data.pem -days 3650

# Cleanup
rm *temp.pem *csr

And opensearch.yml (each node's file differs only in the pem file names, e.g. model0.pem, model1.pem, data.pem, and in node.name):

cluster.name: opensearch-cluster
node.name: model0
node.roles: [ cluster_manager,ml ]
node.processors: 5
path.data: /opt/opensearch-2.3.0/data
path.logs: /opt/opensearch-2.3.0/logs
bootstrap.memory_lock: true
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["172.31.33.89", "172.31.34.109", "172.31.40.170"]
cluster.initial_cluster_manager_nodes: ["172.31.33.89", "172.31.34.109", "172.31.40.170"]
gateway.recover_after_nodes: 3
action.destructive_requires_name: true
node.max_local_storage_nodes: 3

plugins.security.ssl.transport.pemcert_filepath: model0.pem
plugins.security.ssl.transport.pemkey_filepath: model0-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.resolve_hostname: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: model0.pem
plugins.security.ssl.http.pemkey_filepath: model0-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem

plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn:
  - CN=admin,OU=ml,O=mycompany,L=shanghai,ST=shanghai,C=CN
plugins.security.nodes_dn:
  - CN=ip-172-31-34-109.ap-northeast-1.compute.internal,OU=ml,O=mycompany,L=shanghai,ST=shanghai,C=CN
  - CN=ip-172-31-40-170.ap-northeast-1.compute.internal,OU=ml,O=mycompany,L=shanghai,ST=shanghai,C=CN
  - CN=ip-172-31-33-89.ap-northeast-1.compute.internal,OU=ml,O=mycompany,L=shanghai,ST=shanghai,C=CN

plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opendistro-asynchronous-search-response*", ".replication-metadata-store"]

The log is below:

[2022-10-29T18:32:50,740][WARN ][o.o.t.TcpTransport       ] [model0] exception caught on transport layer [Netty4TcpChannel{localAddress=/172.31.34.109:44104, remoteAddress=/172.31.40.170:9300}], closing connection
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:480) ~[netty-codec-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:279) ~[netty-codec-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:623) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:586) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.79.Final.jar:4.1.79.Final]
	at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
	at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:371) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:314) ~[?:?]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:309) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.checkServerCerts(CertificateMessage.java:1357) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.onConsumeCertificate(CertificateMessage.java:1232) ~[?:?]
	at sun.security.ssl.CertificateMessage$T13CertificateConsumer.consume(CertificateMessage.java:1175) ~[?:?]
	at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:396) ~[?:?]
	at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:480) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1277) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:1264) ~[?:?]
	at java.security.AccessController.doPrivileged(AccessController.java:712) ~[?:?]
	at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:1209) ~[?:?]
	at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1549) ~[netty-handler-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1395) ~[netty-handler-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1236) ~[netty-handler-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1285) ~[netty-handler-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:510) ~[netty-codec-4.1.79.Final.jar:4.1.79.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:449) ~[netty-codec-4.1.79.Final.jar:4.1.79.Final]
	... 16 more

Can someone reach out and help me?

Eagerly waiting for a reply~ :sob:

Hey,
Did you try to put the full path of the location of the certificate files?

Example:
plugins.security.ssl.transport.pemcert_filepath: /HERE_PUT_FULL_PATH/model0.pem

The opensearch.yml and *.pem files are all under the “/opt/opensearch-2.3.0/config” directory, so I wrote just the file name, such as “model0.pem”, in opensearch.yml.

The “YAML files - OpenSearch documentation” page also uses only the file name, without the full path.

So I think it may not be the key to the solution, but I can try it and see what happens.

thanks~

@Simha You have the below in your opensearch.yml.

discovery.seed_hosts: ["172.31.33.89", "172.31.34.109", "172.31.40.170"]
cluster.initial_cluster_manager_nodes: ["172.31.33.89", "172.31.34.109", "172.31.40.170"]

Also, your error reports the IP address and not the FQDN.

This might be the root cause as your certificates do not contain any IP address in SAN.
Try adding the IP addresses to the SAN or using FQDN instead of IP in the discovery.seed_hosts and cluster.initial_cluster_manager_nodes.
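
A minimal sketch of the first option, assuming OpenSSL 1.1.1 or later. It works in a throwaway directory with a demo root CA; the file name san.ext is illustrative, while the CN and IP are those of model0 from the first post. `openssl x509 -req` does not copy extensions from the CSR by default, so the SAN is supplied through `-extfile` at signing time.

```shell
#!/bin/sh
# Sketch: sign a node certificate that carries both its DNS name and its IP in the SAN.
set -eu
cd "$(mktemp -d)"   # throwaway working directory

subj="/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml"
fqdn="ip-172-31-34-109.ap-northeast-1.compute.internal"
ip="172.31.34.109"

# Demo root CA (in a real setup, reuse the one existing root-ca.pem / root-ca-key.pem).
openssl genrsa -out root-ca-key.pem 2048
openssl req -new -x509 -sha256 -key root-ca-key.pem -subj "$subj/CN=root" -out root-ca.pem -days 3650

# Node key (converted to PKCS#8) and CSR, following the original script.
openssl genrsa -out model0-key-temp.pem 2048
openssl pkcs8 -inform PEM -outform PEM -in model0-key-temp.pem -topk8 -nocrypt -out model0-key.pem
openssl req -new -key model0-key.pem -subj "$subj/CN=$fqdn" -out model0.csr

# The SAN entries live in an extension file that is applied when the CA signs the CSR.
printf 'subjectAltName = DNS:%s, IP:%s\n' "$fqdn" "$ip" > san.ext
openssl x509 -req -in model0.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial \
  -sha256 -out model0.pem -days 3650 -extfile san.ext

# Show the SAN that ended up in the certificate.
san=$(openssl x509 -in model0.pem -noout -text | grep -A1 'Subject Alternative Name')
echo "$san"
```

The last command is also a quick way to inspect an existing certificate and confirm which names and addresses it covers.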

Sorry, that didn't work either. :sob:

When I changed it to:

discovery.seed_hosts: [model0,model1,data]
cluster.initial_cluster_manager_nodes: [model0,model1,data]

then after a restart each of the 3 nodes shows error messages like the ones below:

[2022-11-06T06:09:32,957][WARN ][o.o.d.SeedHostsResolver  ] [model0] failed to resolve host [model1]
java.net.UnknownHostException: model1
	at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
	at java.net.InetAddress.getAllByName0(InetAddress.java:1519) ~[?:?]
	at java.net.InetAddress.getAllByName(InetAddress.java:1377) ~[?:?]
	at java.net.InetAddress.getAllByName(InetAddress.java:1305) ~[?:?]
	at org.opensearch.transport.TcpTransport.parse(TcpTransport.java:615) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.transport.TcpTransport.addressesFromString(TcpTransport.java:557) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.transport.TransportService.addressesFromString(TransportService.java:1016) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.discovery.SeedHostsResolver.lambda$resolveHostsLists$0(SeedHostsResolver.java:182) ~[opensearch-2.3.0.jar:2.3.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) ~[opensearch-2.3.0.jar:2.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
[2022-11-06T06:09:32,958][WARN ][o.o.d.SeedHostsResolver  ] [model0] failed to resolve host [data]
java.net.UnknownHostException: data
	at java.net.InetAddress$CachedAddresses.get(InetAddress.java:801) ~[?:?]
	at java.net.InetAddress.getAllByName0(InetAddress.java:1519) ~[?:?]
	at java.net.InetAddress.getAllByName(InetAddress.java:1377) ~[?:?]
	at java.net.InetAddress.getAllByName(InetAddress.java:1305) ~[?:?]
	at org.opensearch.transport.TcpTransport.parse(TcpTransport.java:615) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.transport.TcpTransport.addressesFromString(TcpTransport.java:557) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.transport.TransportService.addressesFromString(TransportService.java:1016) ~[opensearch-2.3.0.jar:2.3.0]
	at org.opensearch.discovery.SeedHostsResolver.lambda$resolveHostsLists$0(SeedHostsResolver.java:182) ~[opensearch-2.3.0.jar:2.3.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) ~[opensearch-2.3.0.jar:2.3.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]

@wjunshen model0, model1 and data don't match your CN or SAN (e.g. ip-172-31-34-109.ap-northeast-1.compute.internal).

If you'd like to use them, you have to add them to the certificates' SAN. Also, your nodes have to be able to resolve these names.

Sorry to interrupt, @pablo.
I think @wjunshen's problem is more about the TLS handshake itself than about hostname/IP verification, since he disabled hostname verification.

@wjunshen could you please share your installation method: did you use the RPM package or the tar.gz archive?
I assume that either not every node has the same CA certificate, or the private key you use does not belong to the certificate.
I can also imagine that you may have to set the full path to the certificates, as @Simha suggested.
It's just a test; maybe he was right about this.
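
Both assumptions can be checked with plain openssl. The sketch below creates demo files in a throwaway directory so that it runs anywhere; the names node.pem, node-key.pem and other-key.pem are illustrative, and on a real node the same two checks would be pointed at the files in the config directory.

```shell
#!/bin/sh
# Sketch: print the CA fingerprint and check whether a private key belongs to a certificate.
set -eu
cd "$(mktemp -d)"   # throwaway working directory

# Demo material: a CA, a node certificate signed by it, and an unrelated key.
openssl genrsa -out root-ca-key.pem 2048
openssl req -new -x509 -sha256 -key root-ca-key.pem -subj "/CN=root" -out root-ca.pem -days 30
openssl genrsa -out node-key.pem 2048
openssl req -new -key node-key.pem -subj "/CN=example-node" -out node.csr
openssl x509 -req -in node.csr -CA root-ca.pem -CAkey root-ca-key.pem -CAcreateserial \
  -sha256 -out node.pem -days 30
openssl genrsa -out other-key.pem 2048

# 1) CA fingerprint: run this on every node; all nodes should print the same value.
openssl x509 -in root-ca.pem -noout -fingerprint -sha256

# 2) A key belongs to a certificate when both yield the same public key.
pubkey_hash_of_cert() { openssl x509 -in "$1" -noout -pubkey | openssl sha256; }
pubkey_hash_of_key()  { openssl pkey -in "$1" -pubout | openssl sha256; }

cert_hash=$(pubkey_hash_of_cert node.pem)
key_hash=$(pubkey_hash_of_key node-key.pem)
other_hash=$(pubkey_hash_of_key other-key.pem)

if [ "$cert_hash" = "$key_hash" ]; then
  echo "node-key.pem matches node.pem"
fi
if [ "$cert_hash" != "$other_hash" ]; then
  echo "other-key.pem does not match node.pem"
fi
```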

hi, my friends

@vi4life @Simha @pablo

I used the tar.gz method to install OpenSearch.

And I wrote a shell script to generate the certs for all 3 nodes. You can see the content of that script in my first post.

@vi4life Nodes must resolve the names of the other nodes in the cluster to be able to communicate with each other. As @wjunshen stated this is tar.gz installation which requires name resolution either with /etc/hosts or an external DNS service. That wouldn’t be the case in the docker environment as docker uses internal DNS to resolve container names.

@wjunshen I strongly suggest fixing the name resolution between your OpenSearch nodes and then fixing SAN in the certificates.
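
Without an external DNS service, one way to get that name resolution is an /etc/hosts entry per node. The sketch below only prints the entries and changes nothing on the system. The mapping of model0, model1 and data to IP addresses is inferred from the certificate CNs and the log in the first post, so treat it as an assumption; `hosts_entries` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print /etc/hosts entries mapping each node's IP to its FQDN and short name.
set -eu

domain="ap-northeast-1.compute.internal"

hosts_entries() {
  # Input lines: "<ip> <short-name>"; output lines: "<ip> <fqdn> <short-name>".
  while read -r ip short; do
    fqdn="ip-$(echo "$ip" | tr . -).$domain"
    printf '%s %s %s\n' "$ip" "$fqdn" "$short"
  done <<EOF
172.31.34.109 model0
172.31.40.170 model1
172.31.33.89 data
EOF
}

hosts_entries
```

If the output looks right, it could be appended on each node with something like `hosts_entries | sudo tee -a /etc/hosts`.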

Hi all,

I have created the OpenSearch cluster successfully.

The main cause of this error was that I ran my “generate.sh” script to generate the *.pem files on every node. That is not correct~

I should use the “securityadmin.sh” tool to copy the pem files to the other two nodes.

For example:

I ran “generate.sh” on the model0 node to generate the *.pem files, then used “securityadmin.sh” as below:

cd /opt/opensearch-2.3.0/plugins/opensearch-security/tools/
 
chmod +x securityadmin.sh

# set the host to the public IP of the model1 node; copy model1's pem files to the model1 node
./securityadmin.sh -h 18.182.46.139 -cd ../../../config/opensearch-security/ -icl -nhnv -cacert ../../../config/root-ca.pem -cert ../../../config/model1.pem -key ../../../config/model1-key.pem

# set the host to the public IP of the data node; copy data's pem files to the data node
./securityadmin.sh -h 35.76.110.67 -cd ../../../config/opensearch-security/ -icl -nhnv -cacert ../../../config/root-ca.pem -cert ../../../config/data.pem -key ../../../config/data-key.pem

Note that the OpenSearch cluster must be started before using the “securityadmin.sh” tool.

After copying the pem files to the other two nodes, I restarted the OpenSearch cluster to be safe. I don't know whether it would work without restarting; I haven't tried it.

At last, Thank you for your reply and help during these days.

Thank you very much~ :pray: :pray: :pray:
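
A small sketch of why running the generation script separately on each node would produce this error: every run creates a new root CA key, so a certificate signed on one node does not verify against the root-ca.pem created on another, even though both CAs have the same subject. The directory names nodeA and nodeB and the CN example-node are illustrative and not part of the original setup.

```shell
#!/bin/sh
# Sketch: two independently generated root CAs do not trust each other's certificates.
set -eu
cd "$(mktemp -d)"   # throwaway working directory

subj="/C=CN/ST=shanghai/L=shanghai/O=mycompany/OU=ml"

# Simulate running the generation script on two different nodes.
for node in nodeA nodeB; do
  mkdir "$node"
  openssl genrsa -out "$node/root-ca-key.pem" 2048
  openssl req -new -x509 -sha256 -key "$node/root-ca-key.pem" \
    -subj "$subj/CN=root" -out "$node/root-ca.pem" -days 3650
done

# Issue a node certificate from nodeA's CA only.
openssl genrsa -out node-key.pem 2048
openssl req -new -key node-key.pem -subj "$subj/CN=example-node" -out node.csr
openssl x509 -req -in node.csr -CA nodeA/root-ca.pem -CAkey nodeA/root-ca-key.pem \
  -CAcreateserial -sha256 -out node.pem -days 3650

# The certificate verifies against the CA that signed it ...
if openssl verify -CAfile nodeA/root-ca.pem node.pem >/dev/null 2>&1; then
  same_ca=ok
else
  same_ca=fail
fi

# ... but not against the other CA, although its subject is identical.
if openssl verify -CAfile nodeB/root-ca.pem node.pem >/dev/null 2>&1; then
  other_ca=ok
else
  other_ca=fail
fi

echo "same CA: $same_ca, other CA: $other_ca"
```

This suggests generating the CA and all node certificates once, then placing the same root-ca.pem plus each node's own certificate and key on every node.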


@wjunshen Just a short note about the securityadmin.sh script. There is no need to run securityadmin.sh against each node in the cluster. The initial run of securityadmin.sh creates the .opendistro_security index, which holds all of the security plugin configuration and is shared across all the nodes in the cluster.

The securityadmin.sh script requires connectivity to a running OpenSearch node on port 9200.
