OpenSearch cluster in RED state with SSL handshake exception

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.14.0

Describe the issue:
An OpenSearch cluster built in ECK using a Helm chart shows RED status when I run the query below in Dev Tools:
GET /_cluster/health?pretty
{
  "cluster_name": "opensearch-cluster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 12,
  "number_of_data_nodes": 8,
  "discovered_master": true,
  "discovered_cluster_manager": true,
  "active_primary_shards": 617,
  "active_shards": 859,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 392,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 68.66506794564349
}
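
For readers following along: the response above reports "yellow" even though the post describes the cluster as RED. In general, yellow means every primary shard is assigned but some replicas are not, while red means at least one primary is unassigned. The sketch below simply recomputes the percentage from the shard counts in the response; `summarize_health` is a hypothetical helper name, and the formula (active over active plus initializing plus unassigned) is an assumption that happens to reproduce the reported value.

```python
import json

# Shard counts copied from the _cluster/health response quoted in the post.
HEALTH_JSON = """
{
  "status": "yellow",
  "active_primary_shards": 617,
  "active_shards": 859,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 392
}
"""


def summarize_health(health):
    """Recompute the active-shard percentage and count shards that are not active."""
    active = health["active_shards"]
    not_active = health["initializing_shards"] + health["unassigned_shards"]
    total = active + not_active
    return {
        "status": health["status"],
        "total_shards": total,
        "not_active": not_active,
        # Assumed formula: active shards as a share of all shard copies.
        "active_percent": 100.0 * active / total if total else 100.0,
    }


summary = summarize_health(json.loads(HEALTH_JSON))
print(summary)
```

With 392 of 1251 shard copies unassigned, the percentage matches `active_shards_percent_as_number` in the response, which suggests the monitoring view and the API are at least describing the same shard counts.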
An image from the cluster monitoring is attached (OSForum_topic2.png).

In between, I could see the warm nodes' logs showing the error below (OSForum_topic2_NOdelog.png).

Configuration:

Relevant Logs or Screenshots:


Hi @Deepa, Have you checked if your SSL certificates are still valid? When did that start happening (immediately after deploying the cluster or some time later)?

best,
mj

How can I check whether the certificates are valid?
The cluster did not have this problem when it was first built; it has been happening for the past month.

Where is your cluster deployed?

EKS, ECS?

best,
mj

@Deepa, you could try something as per below:

curl --verbose --insecure -u admin:<password> https://<OpenSearch_node_FQDN_or_IP>:9200

the output should look something like:

best,
mj
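
A possible way to turn the output of that curl command into a yes/no answer on certificate validity: OpenSSL-backed curl builds typically print an `expire date:` line in their verbose output. The sketch below parses such a line with Python's standard `ssl.cert_time_to_seconds`. The sample text and its dates are made up for illustration, and `cert_expiry_epoch` and `days_until_expiry` are hypothetical helper names, not part of any OpenSearch tooling.

```python
import re
import ssl
import time

# Hypothetical excerpt of `curl --verbose` output; subject and dates are invented.
SAMPLE_CURL_OUTPUT = """\
* Server certificate:
*  subject: CN=opensearch-cluster
*  start date: Jan  1 00:00:00 2023 GMT
*  expire date: Jan  1 00:00:00 2024 GMT
"""


def cert_expiry_epoch(curl_verbose_output):
    """Return the certificate expiry as seconds since the epoch (UTC)."""
    match = re.search(r"expire date:\s*(.+?GMT)", curl_verbose_output)
    if match is None:
        raise ValueError("no 'expire date:' line found in the curl output")
    # cert_time_to_seconds parses the 'Mon DD HH:MM:SS YYYY GMT' format.
    return ssl.cert_time_to_seconds(match.group(1))


def days_until_expiry(curl_verbose_output, now=None):
    """Days left before expiry; a negative value means the certificate has expired."""
    if now is None:
        now = time.time()
    return (cert_expiry_epoch(curl_verbose_output) - now) / 86400.0


remaining = days_until_expiry(SAMPLE_CURL_OUTPUT)
print("expired" if remaining < 0 else "valid for %.1f more days" % remaining)
```

A negative result on any node would point at certificate expiry as a likely cause of the SSL errors. This only covers the HTTP certificate served on port 9200; the transport certificates would need to be checked separately.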

It's EKS.

I don't have terminal access to run this in any of the pods.
We deploy our OpenSearch cluster through an in-built ArgoCD application, and terminal access is not available for that production environment.
Is there any other option?

@Deepa, you do not need to run it on the pods. You can run it on any machine that has curl and access to https://<OpenSearch_node_FQDN_or_IP>:9200

Can you access your pods with kubectl?

curl --verbose --insecure -u admin:<password> https://<OpenSearch_node_FQDN_or_IP>:9200

best,
mj


I got this on one of the nodes:

When I run this in a Docker container, I get the result below:

Hi @Deepa,

Can you share your opensearch.yml content?

best,
mj

cluster.name: opensearch-cluster
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: opensearch-cluster-discovery
plugins.security.ssl.http.enabled_protocols=[TLSv1: TLSv1.1, TLSv1.2, TLSv1.3]
plugins.security.ssl.transport.enabled_protocols: [TLSv1, TLSv1.1, TLSv1.2, TLSv1.3]
network.bind_host: 0.0.0.0
cluster.initial_master_nodes: opensearch-cluster-bootstrap-0
compatibility.override_main_response_version: true
plugins.security.audit.type: internal_opensearch
plugins.security.authcz.admin_dn:
  - "CN=admin"
  - "CN=admin,OU=opensearch-cluster"
  - "OU=opensearch-cluster,CN=admin,"
  - "OU=opensearch-cluster,CN=opensearch-cluster"
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.nodes_dn:
  - "CN=opensearch-cluster"
  - "CN=opensearch-cluster,OU=opensearch-cluster"
plugins.security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: cert/opensearch-node.pem
plugins.security.ssl.http.pemkey_filepath: cert/opensearch-node-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: cert/root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.resolve_hostname: false
plugins.security.ssl.transport.pemcert_filepath: cert/opensearch-node.pem
plugins.security.ssl.transport.pemkey_filepath: cert/opensearch-node-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: cert/root-ca.pem
plugins.security.allow_unsafe_democertificates: false
Plugins.security.allow_default_init_securityindex: true
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opensearch-observability", ".opendistro-asynchronous-search-response*", ".replication-metadata-store"]
indices.query.bool.max_clause_count: 1500
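
As pasted, two lines in the configuration above do not follow the `key: value` shape of the rest: the `plugins.security.ssl.http.enabled_protocols` line uses `=` and has a `:` inside the list, and one key starts with a capital `Plugins`. It is not known whether these are in the real file or were introduced while pasting. A rough heuristic for spotting such lines might look like the sketch below; `suspicious_lines` and its pattern are assumptions for illustration, not an OpenSearch validator.

```python
import re

# Heuristic: a top-level setting should look like `lowercase.dotted.key: value`.
SETTING_RE = re.compile(r"^[a-z_][a-z0-9_.]*:(\s|$)")

# Three lines copied from the configuration posted in the thread.
SAMPLE_CONFIG = """\
plugins.security.ssl.http.enabled_protocols=[TLSv1: TLSv1.1, TLSv1.2, TLSv1.3]
plugins.security.ssl.transport.enabled_protocols: [TLSv1, TLSv1.1, TLSv1.2, TLSv1.3]
Plugins.security.allow_default_init_securityindex: true
"""


def suspicious_lines(config_text):
    """Return (line_number, text) pairs for lines that do not look like `key: value`."""
    findings = []
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines, comments, and YAML list items.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if not SETTING_RE.match(line):
            findings.append((number, line))
    return findings


for number, text in suspicious_lines(SAMPLE_CONFIG):
    print("line %d looks malformed: %s" % (number, text))
```

On the sample, the first and third lines are flagged. If the deployed file really contains them in this form, the HTTP `enabled_protocols` setting would presumably not be applied as intended, which could be worth ruling out before looking further at the certificates.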

Hi @Deepa, have you had any progress on this?

Could you try (on each node):

curl -XGET "http://localhost:9200/_nodes" (instead of …/_cat/_nodes)

I would like you to compare the output with:

and

best,
mj