How to find out whether shards did a full sync or a partial sync after a node restart or storage reuse

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.6.0, Ubuntu 22.04

Describe the issue:
For a cluster of 4 nodes: opensearch-0, opensearch-1, opensearch-2, opensearch-3.

And given the following index:

PUT /data
{"settings": {"number_of_shards": 3, "number_of_replicas": 4}}

I would like to be able to tell, for the following 2 scenarios, whether the shards were fully copied to opensearch-3 or whether only an incremental copy happened (only the missing data):

  1. opensearch-3 Node down for 30min, then brought back online
  2. opensearch-3 Node removed, while keeping the storage. Then Node recreated with the previously kept storage attached to this new node.

I tried the recovery and shards APIs, but they don’t seem to differentiate between “full sync” and “differential sync”.

(the shards API states: INITIALIZING: The shard is recovering from a peer shard or gateway. – but unsure whether that means that it’s a partial sync or a full sync)

I also enabled the logs to include anything index and translog related. But I don’t see this specific piece of information I’m looking for.

IndexShard: state: [CREATED]->[RECOVERING]->[POST_RECOVERY]->[STARTED]

Any idea?
Thank you

Use the GET <target>/_recovery API. The response contains the recovery type, which tells you whether the recovery came from local storage (EXISTING_STORE) or from another node (PEER), and the stage field shows the recovery progress.
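
As a rough sketch of how that response could be read, the Python below labels each shard recovery that targeted a given node. It assumes you have already fetched and JSON-decoded the output of GET data/_recovery. The type, stage, target.name and index.files.reused / index.files.recovered fields are part of the recovery response; the function names, the labels, and the SAMPLE dict are made up for illustration and are not real cluster output. The rule used for PEER recoveries (no files recovered means only operations were replayed, no files reused means every segment file was copied) is a heuristic, not an official definition of "full" versus "partial" sync.

```python
def classify_recovery(shard):
    """Label one shard entry of a _recovery response (heuristic, see lead-in)."""
    rtype = shard.get("type")
    files = shard.get("index", {}).get("files", {})
    reused = files.get("reused", 0)        # files already present on the target
    recovered = files.get("recovered", 0)  # files actually copied to the target

    if rtype == "EXISTING_STORE":
        return "local"              # recovered from the node's own disk
    if rtype != "PEER":
        return "other"              # e.g. SNAPSHOT, EMPTY_STORE
    if recovered == 0:
        return "peer-ops-only"      # no segment files were copied
    if reused == 0:
        return "peer-full-copy"     # every file came from the source node
    return "peer-partial-copy"      # some files reused, some copied


def recoveries_on_node(response, node_name):
    """Return (index, shard id, stage, label) for recoveries targeting node_name."""
    rows = []
    for index_name, body in response.items():
        for shard in body.get("shards", []):
            if shard.get("target", {}).get("name") == node_name:
                rows.append((index_name, shard.get("id"),
                             shard.get("stage"), classify_recovery(shard)))
    return rows


# Hypothetical, hand-written fragment shaped like a _recovery response.
SAMPLE = {
    "data": {
        "shards": [
            {"id": 0, "type": "PEER", "stage": "DONE",
             "target": {"name": "opensearch-3"},
             "index": {"files": {"total": 10, "reused": 7, "recovered": 3}}},
            {"id": 1, "type": "PEER", "stage": "DONE",
             "target": {"name": "opensearch-3"},
             "index": {"files": {"total": 0, "reused": 0, "recovered": 0}}},
            {"id": 2, "type": "EXISTING_STORE", "stage": "DONE",
             "target": {"name": "opensearch-1"},
             "index": {"files": {"total": 5, "reused": 5, "recovered": 0}}},
        ]
    }
}

for row in recoveries_on_node(SAMPLE, "opensearch-3"):
    print(row)
```

The translog section of each shard entry (for example translog.recovered) can be checked the same way to see how many operations were replayed on top of the files.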
