Document/Record count is not matching for leader and follower when CCR Plugin is enabled

sanjay · November 7, 2022, 8:56am

Hi Team,

I have set up a Cross-Cluster Replication (CCR) environment and found a
mismatch in the document/record count of the indices between the follower
and the leader clusters. In fact, the document/record count on the follower
sometimes seems to be higher when I check the records with _cat/indices.
However, when I leave the cluster alone for a long time, the counts
eventually match and settle.

Example:

[root@root leader]# curl http://${LEADER}/_cat/indices?v
health status index                uuid                     pri rep  docs.count  docs.deleted store.size pri.store.size
green  open   log-test-2022.11.07  2yDUiNCaTQeIiLXWHIA-QQ   1   1    4156        0            2.3mb      1.2


[root@root leader]# curl http://${FOLLOWER}/_cat/indices?v
health status index                uuid                     pri rep  docs.count  docs.deleted store.size pri.store.size
green  open   log-test-2022.11.07  XF_ebkznSueHHF4x85rZNA   1   1    35          0            2.3mb      1.2

But when I fetch the total hit count with the curl commands below, I see
consistent values on both the leader and follower OpenSearch clusters. All
the commands were run within a very short interval (within 2 or 3 seconds).

[root@root leader]# curl http://${LEADER}/log-test-2022.11.07/_search?pretty| jq .hits.total.value
% Total    % Received % Xferd  Average Speed   Time  Time   Time  Current Dload  Upload   Total   Spent    Left  Speed
  100 23322  100 23322    0     0  63835      0 --:--:-- --:--:-- --:--:-- 63895
  6636

[root@root follower]# curl http://${FOLLOWER}/log-test-2022.11.07/_search?pretty | jq .hits.total.value
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current Dload  Upload   Total   Spent    Left  Speed
  100 23322  100 23322    0     0  63835      0 --:--:-- --:--:-- --:--:-- 63895
  6636

Is this expected? Does the data take time to be reflected in both clusters
when we use the _cat/indices API? This mismatch was also observed in a single OpenSearch cluster, where the document/record counts were completely different between the _cat/indices and <index_name>/_search?pretty APIs.

Second query: When CCR is configured, is it mandatory to provide all the nodes of the leader (including cluster_manager, data, and ingest nodes) as the seed hosts in the follower settings, or is just the master node enough for replication? Can this be a service name whose endpoints contain the list of OpenSearch nodes of the leader cluster, or should we list the individual nodes of the leader cluster for the follower to follow the leader?
Also, is it mandatory to configure the remote_cluster_client role in node.roles for all the nodes of the follower (i.e. the ingest, cluster_manager, and data nodes)?

TIA,
Sanjay

ankikala · November 14, 2022, 4:58am

The document/record
count in the follower seems to be more sometimes when i try to hit the
records from _cat/indices

The doc count can sometimes be inaccurate, as new docs are reflected only after a flush. So if the follower index's docs have been flushed but the leader's have not, you might see a higher count on the follower.

For verification, can you do a flush and a search, and then verify the doc count on both leader and follower?
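
A minimal sketch of that check in Python, in case it helps. It assumes the clusters are reachable over plain HTTP without authentication (as in the curl commands above) and that the follower index accepts a flush request; the helper names are made up for illustration. It flushes the index on a cluster, reads the count from the _count API, and describes the difference.

```python
import json
import urllib.request


def request_json(url, method="GET"):
    """Send an HTTP request and decode the JSON response body."""
    req = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def doc_count_after_flush(host, index):
    """Flush the index, then return the count reported by the _count API.

    `host` is a host:port string, like ${LEADER} or ${FOLLOWER} above.
    """
    base = f"http://{host}/{index}"
    request_json(f"{base}/_flush", method="POST")
    return request_json(f"{base}/_count")["count"]


def describe_counts(leader_count, follower_count):
    """Summarize how the follower count relates to the leader count."""
    if leader_count == follower_count:
        return "in sync"
    if follower_count < leader_count:
        return f"follower behind by {leader_count - follower_count}"
    return f"follower ahead by {follower_count - leader_count}"
```

With the leader and follower addresses at hand, `describe_counts(doc_count_after_flush(leader, "log-test-2022.11.07"), doc_count_after_flush(follower, "log-test-2022.11.07"))` gives a one-line summary. Since replication is asynchronous, a small lag on the follower right after indexing would not be surprising.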

sanjay · November 18, 2022, 10:06am

Hi @ankikala, any update on the second query in the description above?
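
For context on the second query: the follower is pointed at the leader through a remote cluster connection, set in the follower's cluster settings under cluster.remote.<alias>.seeds. The sketch below only builds the request body for PUT _cluster/settings on the follower. The alias and the seed address are hypothetical placeholders, and the sketch does not settle how many leader nodes have to be listed.

```python
import json


def remote_seed_settings(alias, seeds):
    """Build a _cluster/settings body that registers leader seed addresses."""
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": list(seeds)}}}}}


# Hypothetical alias and transport address (host:transport_port); replace
# them with the values of the actual leader cluster.
body = remote_seed_settings("leader-cluster", ["leader-node-1:9300"])
print(json.dumps(body, indent=2))
```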
