Document/Record count is not matching for leader and follower when CCR Plugin is enabled

Hi Team,

  1. I have set up a Cross-Cluster Replication (CCR) environment in which
    the document/record count of an index differs between the follower and
    the leader cluster. In fact, the count reported by _cat/indices is
    sometimes higher on the follower than on the leader. If I leave the
    clusters alone for a long time, the counts eventually match and settle.

    Example:

    [root@root leader]# curl http://${LEADER}/_cat/indices?v
    health status index                uuid                     pri rep  docs.count  docs.deleted store.size pri.store.size
    green  open   log-test-2022.11.07  2yDUiNCaTQeIiLXWHIA-QQ   1   1    4156        0            2.3mb      1.2
    
    
    [root@root leader]# curl http://${FOLLOWER}/_cat/indices?v
    health status index                uuid                     pri rep  docs.count  docs.deleted store.size pri.store.size
    green  open   log-test-2022.11.07  XF_ebkznSueHHF4x85rZNA   1   1    35          0            2.3mb      1.2
    

But when I fetch the total hit count with the curl commands below, I see
the same value on both the leader and the follower OpenSearch clusters.
All the commands were run within a very short interval (2 or 3 seconds).

[root@root leader]# curl http://${LEADER}/log-test-2022.11.07/_search?pretty| jq .hits.total.value
% Total    % Received % Xferd  Average Speed   Time  Time   Time  Current Dload  Upload   Total   Spent    Left  Speed
  100 23322  100 23322    0     0  63835      0 --:--:-- --:--:-- --:--:-- 63895
  6636

[root@root follower]# curl http://${FOLLOWER}/log-test-2022.11.07/_search?pretty | jq .hits.total.value
 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current Dload  Upload   Total   Spent    Left  Speed
  100 23322  100 23322    0     0  63835      0 --:--:-- --:--:-- --:--:-- 63895
  6636

Is this expected? Does the data take time to be reflected in both clusters
when we use the _cat/indices API?
I also observed this mismatch on a single OpenSearch cluster, where the document/record counts returned by the _cat/indices and <index_name>/_search?pretty APIs were completely different.

  2. Second query: When CCR is configured, is it mandatory to list all the nodes of the leader (including cluster_manager, data and ingest nodes) as seed hosts in the follower settings, or is the cluster_manager (master) node alone enough for replication? Can the seed be a service name whose endpoints contain the list of OpenSearch nodes of the leader cluster, or must we list the individual nodes of the leader cluster for the follower to follow the leader?
    Also, is it mandatory to configure the remote_cluster_client role in node.roles for all the nodes of the follower (i.e. the ingest, cluster_manager and data nodes)?

TIA,
Sanjay

The document/record
count in the follower seems to be more sometimes when i try to hit the
records from _cat/indices

The doc count can sometimes be inaccurate because new docs become available only after a flush. So if the follower index has been flushed but the leader’s has not, you might see a higher count on the follower.

For verification, can you do a flush and a search, and then verify the doc count on both the leader and the follower?
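
A minimal sketch of that check in Python, in case it helps. It assumes both clusters are reachable over plain HTTP without authentication (as in the curl commands in the description) and that the follower index accepts flush and refresh requests. The function names are made up for illustration. It uses the `_count` API rather than `_search`, since `hits.total.value` from `_search` is capped at 10,000 by default.

```python
import json
import urllib.request


def http_json(method, url):
    """Send a body-less HTTP request and decode the JSON response."""
    req = urllib.request.Request(url, method=method)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def flushed_doc_count(base_url, index, request=http_json):
    """Flush and refresh `index`, then return its document count from _count."""
    base = base_url.rstrip("/")
    request("POST", f"{base}/{index}/_flush")    # commit pending operations to disk
    request("POST", f"{base}/{index}/_refresh")  # make recent writes visible to search
    return request("GET", f"{base}/{index}/_count")["count"]


def compare_clusters(leader_url, follower_url, index, request=http_json):
    """Return (leader_count, follower_count, counts_match) for one index."""
    leader = flushed_doc_count(leader_url, index, request)
    follower = flushed_doc_count(follower_url, index, request)
    return leader, follower, leader == follower
```

For example, `compare_clusters("http://<leader-host:port>", "http://<follower-host:port>", "log-test-2022.11.07")` returns both counts and whether they match. If writes are still arriving on the leader, a small difference is still possible until replication catches up.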

Hi @ankikala, is there any update on the second query in the description above?