CCR and "shard could not be allocated to any of the nodes" problem

Versions: 2.9.

Two single-node clusters with demo certificates. I am trying to set up CCR according to the docs.
On the follower:

PUT _cluster/settings?pretty
{
  "persistent": {
    "cluster": {
      "remote": {
        "my-connection-alias": {
          "seeds": ["77.78.107.141:9300"]
        }
      }
    }
  }
}
PUT _plugins/_replication/restored_leader1_e2/_start?pretty
{
   "leader_alias": "my-connection-alias",
   "leader_index": "restored_leader1",
   "use_roles":{
      "leader_cluster_role": "all_access",
      "follower_cluster_role": "all_access"
   }
}

The follower index is created in red state and, according to
GET _plugins/_replication/restored_leader1_e2/_status, bootstrapping ends with this error:

{
  "status": "FAILED",
  "reason": "shard could not be allocated to any of the nodes",
  "leader_alias": "my-connection-alias",
  "leader_index": "restored_leader1",
  "follower_index": "restored_leader1_e2"
}

The number of replicas is set to 0, there is no problem with disk space, and reinstalling OpenSearch did not help. Any ideas? Thanks.

Hi @djanko ,

Can you please share opensearch.yml for both clusters?

Sure.
Leader (all other lines are the defaults, i.e. commented out):

######## Start OpenSearch Security Demo Configuration ########
# WARNING: revise all the lines below before you go into production
plugins.security.ssl.transport.pemcert_filepath: esnode.pem
plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: esnode.pem
plugins.security.ssl.http.pemkey_filepath: esnode-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.allow_unsafe_democertificates: true
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn: ['CN=kirk,OU=client,O=client,L=test,C=de']
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: [all_access, security_rest_api_access]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [.plugins-ml-agent, .plugins-ml-config, .plugins-ml-connector,
  .plugins-ml-controller, .plugins-ml-model-group, .plugins-ml-model, .plugins-ml-task,
  .plugins-ml-conversation-meta, .plugins-ml-conversation-interactions, .plugins-ml-memory-meta,
  .plugins-ml-memory-message, .plugins-ml-stop-words, .opendistro-alerting-config,
  .opendistro-alerting-alert*, .opendistro-anomaly-results*, .opendistro-anomaly-detector*,
  .opendistro-anomaly-checkpoints, .opendistro-anomaly-detection-state, .opendistro-reports-*,
  .opensearch-notifications-*, .opensearch-notebooks, .opensearch-observability, .ql-datasources,
  .opendistro-asynchronous-search-response*, .replication-metadata-store, .opensearch-knn-models,
  .geospatial-ip2geo-data*, .plugins-flow-framework-config, .plugins-flow-framework-templates,
  .plugins-flow-framework-state]
node.max_local_storage_nodes: 3
######## End OpenSearch Security Demo Configuration ########
network.host: 0.0.0.0
discovery.type: single-node
plugins.security.disabled: false
path.repo: ["/mnt/backup/"]

Follower:

######## Start OpenSearch Security Demo Configuration ########
# WARNING: revise all the lines below before you go into production
plugins.security.ssl.transport.pemcert_filepath: esnode.pem
plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: esnode.pem
plugins.security.ssl.http.pemkey_filepath: esnode-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.allow_unsafe_democertificates: true
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn: ['CN=kirk,OU=client,O=client,L=test,C=de']
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: [all_access, security_rest_api_access]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [.plugins-ml-agent, .plugins-ml-config, .plugins-ml-connector,
  .plugins-ml-controller, .plugins-ml-model-group, .plugins-ml-model, .plugins-ml-task,
  .plugins-ml-conversation-meta, .plugins-ml-conversation-interactions, .plugins-ml-memory-meta,
  .plugins-ml-memory-message, .plugins-ml-stop-words, .opendistro-alerting-config,
  .opendistro-alerting-alert*, .opendistro-anomaly-results*, .opendistro-anomaly-detector*,
  .opendistro-anomaly-checkpoints, .opendistro-anomaly-detection-state, .opendistro-reports-*,
  .opensearch-notifications-*, .opensearch-notebooks, .opensearch-observability, .ql-datasources,
  .opendistro-asynchronous-search-response*, .replication-metadata-store, .opensearch-knn-models,
  .geospatial-ip2geo-data*, .plugins-flow-framework-config, .plugins-flow-framework-templates,
  .plugins-flow-framework-state]
node.max_local_storage_nodes: 3
######## End OpenSearch Security Demo Configuration ########
network.host: 0.0.0.0
discovery.type: single-node
plugins.security.disabled: false

Thanks.

I created a user with the right roles and ran this on the follower:

curl -XPUT -k -H 'Content-Type: application/json' -u 'replication_user:9l9/u/Ojc91hez1vhuoL+g==' 'https://localhost:9200/_plugins/_replication/restored_leader1_e2/_start?pretty' -d '
{
   "leader_alias": "my-connection-alias",
   "leader_index": "restored_leader1",
   "use_roles":{
      "leader_cluster_role": "cross_cluster_replication_leader_full_access",
      "follower_cluster_role": "cross_cluster_replication_follower_full_access"
   }
}'

as you recommended, but I got the following error:

type "reason" : "no permissions for [indices:admin/plugins/replication/index/start] and User [name=replication_user, backend_roles=[cross_cluster_replication_follower_full_access], requestedTenant=null]"

According to the "Replication security" page of the OpenSearch documentation, those permissions should be sufficient(?).
But I am not sure whether the permissions problem is related to the shard allocation problem.
As admin the request goes through, but I end up with the original error.
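
One detail in the error message above may be worth checking: it lists cross_cluster_replication_follower_full_access under backend_roles, which could mean the name was assigned to the user as a backend role rather than mapped as a security role. This is an assumption drawn from the message, not something confirmed in this thread. Below is a minimal Python sketch of the role-mapping request from the security REST API. It only builds the request; `build_role_mapping_request` is a hypothetical helper, and sending it (with admin credentials and TLS settings suitable for the demo certificates) is left out.

```python
import json
import urllib.request


def build_role_mapping_request(
    base_url: str, role: str, users: list[str]
) -> urllib.request.Request:
    """Build (but do not send) a PUT that maps the given users to a security role."""
    body = json.dumps({"users": users}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_plugins/_security/api/rolesmapping/{role}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Values taken from the thread; the base URL is the one used in the curl call.
request = build_role_mapping_request(
    "https://localhost:9200",
    "cross_cluster_replication_follower_full_access",
    ["replication_user"],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Note that a PUT on a role mapping replaces any existing mapping for that role. If this turns out to be the cause, the leader role would presumably need the same mapping on the leader cluster.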

@djanko I’ve just run the example provided in the docs using the admin account and can confirm that it works as expected, so the issue does not seem to be with this configuration.

Can you please execute the following and provide the results:

GET /_cluster/allocation/explain?pretty (on the follower)
GET /restored_leader1_e2/_settings
GET /restored_leader1/_settings

Can you also provide the mappings and a sample doc from the leader index so I can try to reproduce the issue?
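
When reading the allocation explain output, the deciders that answered NO are usually the interesting part. The following is a minimal Python sketch, assuming the response has the usual node_allocation_decisions / deciders layout. The function name `blocking_deciders` and the sample response are made up for illustration and are not output from the clusters in this thread.

```python
from typing import Any


def blocking_deciders(explain: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Collect (node_name, decider, explanation) for each decider that voted NO."""
    blockers = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blockers.append(
                    (
                        node.get("node_name", "<unknown>"),
                        decider.get("decider", "<unknown>"),
                        decider.get("explanation", ""),
                    )
                )
    return blockers


# Hypothetical response fragment, for illustration only (not from this thread).
SAMPLE_EXPLAIN = {
    "index": "restored_leader1_e2",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "node_allocation_decisions": [
        {
            "node_name": "example-node",
            "node_decision": "no",
            "deciders": [
                {
                    "decider": "example_decider",
                    "decision": "NO",
                    "explanation": "placeholder explanation",
                },
                {
                    "decider": "other_decider",
                    "decision": "YES",
                    "explanation": "placeholder",
                },
            ],
        }
    ],
}

for node_name, decider, explanation in blocking_deciders(SAMPLE_EXPLAIN):
    print(f"{node_name}: {decider}: {explanation}")
```

To use it on real output, save the JSON body returned by GET /_cluster/allocation/explain to a file, load it with json.load, and pass the resulting dict to `blocking_deciders`.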