OpenSearch Observability and Security indices go unassigned after maintenance

Version:

{
  "name": "opensearch-dev-c1-master-105-132",
  "cluster_name": "cluster-c1",
  "cluster_uuid": "f1PkJj3ZShGeMIJeNgz9Yw",
  "version": {
    "distribution": "opensearch",
    "number": "2.6.0",
    "build_type": "deb",
    "build_hash": "7203a5af21a8a009aece1474446b437a3c674db6",
    "build_date": "2023-02-24T18:58:38.730482464Z",
    "build_snapshot": false,
    "lucene_version": "9.5.0",
    "minimum_wire_compatibility_version": "7.10.0",
    "minimum_index_compatibility_version": "7.0.0"
  },
  "tagline": "The OpenSearch Project: https://opensearch.org/"
}

I have 6 data nodes: 3 nodes per zone, 2 zones in total.
I shut down one zone to test HA (that worked fine). After that I started the zone I had just shut down again, and now I have 3 unassigned shards:

.opensearch-observability (1 unassigned shard)
.opendistro_security (2 unassigned shards)

Allocation Explain:

  "index": ".opensearch-observability",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "NODE_LEFT",
    "at": "2023-04-28T05:45:40.672Z",
    "details": "node_left [a4cCIqilSwmQOvirGAcqHA]",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "52UfYA7BTu-ftyLUQYD1fg",
      "node_name": "opensearch-dev-c1-data-94-34",
      "transport_address": "10.5.94.34:9300",
      "node_attributes": {
        "zone": "zoneNTL",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "there are too many copies of the shard allocated to nodes with attribute [zone], there are [3] total configured shard copies for this shard id and [4] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
        }
      ]
    }

Cluster Settings:

{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "awareness": {
            "attributes": "zone",
            "force": {
              "zone": {
                "values": [
                  "zoneA",
                  "zoneB"
                ]
              }
            }
          }
        }
      }
    },
    "plugins": {
      "index_state_management": {
        "template_migration": {
          "control": "-1"
        }
      }
    }
  },
  "transient": {}
}

I tried running a cluster reroute, but it did not work. What can I do?

The allocation explain shows that allocation didn’t work on the remaining nodes (because you have forced awareness). I’m wondering:

  • Did it work later? Maybe the shards were unassigned for a while because of delayed allocation (see "Delaying allocation when a node leaves" in the Elasticsearch Guide [8.7]).
  • Or did it still not work once you brought the nodes back up, for some other reason? In that case, can you post the whole output of the allocation explain call? I think what you already posted is only from one of the nodes that didn’t get restarted.
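
When going through the full allocation explain output, it can help to reduce it to one line per node and decider. A minimal sketch in Python, assuming the response was saved as JSON and has the fields visible in the excerpt above (`node_allocation_decisions`, `node_name`, `node_attributes`, `deciders`). The helper name `summarize_decisions` is made up for illustration:

```python
import json


def summarize_decisions(explain):
    """Return (node_name, zone, decider, decision) for every decider on every node."""
    rows = []
    for node in explain.get("node_allocation_decisions", []):
        zone = node.get("node_attributes", {}).get("zone")
        for decider in node.get("deciders", []):
            rows.append(
                (node.get("node_name"), zone, decider.get("decider"), decider.get("decision"))
            )
    return rows


# Sample input trimmed from the excerpt posted above (one node only).
sample = json.loads("""
{
  "index": ".opensearch-observability",
  "shard": 0,
  "primary": false,
  "node_allocation_decisions": [
    {
      "node_name": "opensearch-dev-c1-data-94-34",
      "node_attributes": {"zone": "zoneNTL"},
      "node_decision": "no",
      "deciders": [{"decider": "awareness", "decision": "NO"}]
    }
  ]
}
""")

for row in summarize_decisions(sample):
    print(*row)
```

With the complete output, this makes it easy to see whether the restarted nodes are rejected by the same decider as the ones that stayed up.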

Yeah, I did start the nodes I had shut down for the HA test. This is just for testing. I guess I have to restart them one by one :cry:

Update: it was a misconfiguration on my side.
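
For anyone landing here with the same symptom: the thread does not say what the misconfiguration was, so the following is only a guess. The numbers in the awareness explanation are consistent with the forced awareness values (`zoneA`, `zoneB`) not matching the `zone` attribute the nodes actually report (`zoneNTL` in the excerpt). The sketch below assumes the decider counts the union of forced and discovered attribute values and caps the copies per value at ceil(copies / values), which matches the [3], [4] and [1] in the message. `zoneOther` is a hypothetical name for the second real zone, which the thread does not show, and `upper_bound_per_zone` is an illustrative helper, not an OpenSearch API.

```python
import math


def upper_bound_per_zone(total_copies, forced_values, discovered_values):
    """Max shard copies allowed per attribute value (assumed: ceil(copies / distinct values))."""
    values = set(forced_values) | set(discovered_values)
    return math.ceil(total_copies / len(values))


forced = {"zoneA", "zoneB"}             # from the cluster settings above
discovered = {"zoneNTL", "zoneOther"}   # "zoneOther" is hypothetical

# Forced values differ from the real zone names: 4 distinct values in total.
mismatched = upper_bound_per_zone(3, forced, discovered)

# Forced values equal to the real zone names: 2 distinct values in total.
matched = upper_bound_per_zone(3, discovered, discovered)

print(mismatched, matched)
```

Under that assumption, a bound of 1 rejects a second copy in the same zone, which is the "NO" in the explanation, while a bound of 2 would allow it.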