Hello everyone, I’m testing data replication with some test data.
In the leader cluster, I'm using the elasticsearch test data script to insert test data.
I set up autofollow for every index:
POST _plugins/_replication/_autofollow?pretty
{
  "leader_alias": "leader-cluster",
  "name": "replication-from-leader",
  "pattern": "*",
  "use_roles": {
    "leader_cluster_role": "all_access",
    "follower_cluster_role": "all_access"
  }
}
Here is my problem. I use the elasticsearch test data script to insert data into OpenSearch in the leader cluster with this command:
python3 es_test_data.py --es-url=http://my-ip:9200 --username=admin --password=admin --index_name=my-index-0 --batch_size=1000 --count=50000
I ran it 3 times. The first and second runs seemed fine, but after the third run the document count did not increase to 150000. It remained at 100000 in the leader cluster:
green open my-index-0 cO44ybVmSc21ukqTInKLoA 1 1 100000 0 19.3mb 9.6mb
But in the follower cluster:
green open my-index-0 k6HYqjf7Szyxg2jviLZsMw 1 1 119000 0 21.5mb 10.6mb
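To make the mismatch explicit, here is a minimal sketch that parses the two `_cat/indices` lines above and compares their document counts. It assumes the default `_cat/indices` column order; the helper name `parse_cat_line` is made up for this example.

```python
# Default column order of `_cat/indices` output (assumed, no custom `h=` parameter).
CAT_COLUMNS = [
    "health", "status", "index", "uuid", "pri", "rep",
    "docs.count", "docs.deleted", "store.size", "pri.store.size",
]


def parse_cat_line(line):
    """Split one `_cat/indices` row into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(CAT_COLUMNS):
        raise ValueError(f"expected {len(CAT_COLUMNS)} columns, got {len(values)}")
    row = dict(zip(CAT_COLUMNS, values))
    # Convert the numeric columns so they can be compared.
    for key in ("pri", "rep", "docs.count", "docs.deleted"):
        row[key] = int(row[key])
    return row


leader = parse_cat_line(
    "green open my-index-0 cO44ybVmSc21ukqTInKLoA 1 1 100000 0 19.3mb 9.6mb"
)
follower = parse_cat_line(
    "green open my-index-0 k6HYqjf7Szyxg2jviLZsMw 1 1 119000 0 21.5mb 10.6mb"
)

expected = 3 * 50000  # three runs with --count=50000
doc_gap = follower["docs.count"] - leader["docs.count"]
missing_on_leader = expected - leader["docs.count"]

print("follower ahead of leader by:", doc_gap)
print("leader short of expected by:", missing_on_leader)
```

So at that moment the follower reports 19000 more documents than the leader, and the leader is 50000 short of the expected total.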
And this log appears in the leader cluster:
[2022-03-31T18:29:53,831][INFO ][o.o.c.m.MetadataCreateIndexService] [opensearch-c1-leader-dev] [my-index-0] creating index, cause [auto(bulk api)], templates [], shards [1]/[1]
[2022-03-31T18:29:54,157][INFO ][o.o.c.m.MetadataMappingService] [opensearch-c1-leader-dev] [my-index-0/cO44ybVmSc21ukqTInKLoA] create_mapping [test_type]
[2022-03-31T18:29:54,711][INFO ][o.o.c.r.a.AllocationService] [opensearch-c1-leader-dev] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[my-index-0][0]]]).
[2022-03-31T18:30:07,843][INFO ][o.o.c.s.IndexScopedSettings] [opensearch-c1-leader-dev] [my-index-0] updating [index.translog.generation_threshold_size] from [64mb] to [32mb]
[2022-03-31T18:30:07,908][INFO ][o.o.c.s.IndexScopedSettings] [opensearch-c1-leader-dev] [my-index-0] updating [index.translog.generation_threshold_size] from [64mb] to [32mb]
[2022-03-31T18:30:07,908][INFO ][o.o.c.s.IndexScopedSettings] [opensearch-c1-leader-dev] [my-index-0] updating [index.plugins.replication.translog.retention_lease.pruning.enabled] from [false] to [true]
After 15 minutes or more, both the leader and the follower show the correct number of documents I inserted.
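To measure how long the counts take to converge, something like the following polling loop could be used. This is only a sketch: `wait_until_equal` is a made-up helper, and the two count functions are stubs with made-up values; in a real test they would query the document count on each cluster.

```python
import time


def wait_until_equal(get_leader_count, get_follower_count,
                     timeout_s=1800, interval_s=10):
    """Poll both counts until they match; return (elapsed_seconds, count)."""
    start = time.monotonic()
    while True:
        leader = get_leader_count()
        follower = get_follower_count()
        elapsed = time.monotonic() - start
        if leader == follower:
            return elapsed, leader
        if elapsed >= timeout_s:
            raise TimeoutError(
                f"counts still differ after {elapsed:.0f}s: "
                f"leader={leader}, follower={follower}"
            )
        time.sleep(interval_s)


# Stubbed counts for demonstration only (made-up values, not real measurements).
leader_counts = iter([150000, 150000, 150000])
follower_counts = iter([119000, 140000, 150000])

elapsed, count = wait_until_equal(
    lambda: next(leader_counts),
    lambda: next(follower_counts),
    interval_s=0,  # no waiting in the demo
)
print("converged at", count, "documents")
```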
I have 3 nodes with the same master and data spec; each node has a 1G heap (this is just for quick testing).
Index settings (they seem to be auto-created):
{
  "my-index-0" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "translog" : {
          "generation_threshold_size" : "32mb"
        },
        "plugins" : {
          "replication" : {
            "translog" : {
              "retention_lease" : {
                "pruning" : {
                  "enabled" : "true"
                }
              }
            }
          }
        },
        "provided_name" : "qwe1",
        "creation_date" : "1648725841414",
        "number_of_replicas" : "1",
        "uuid" : "sCZtIR4gT9St6l3y_37AYg",
        "version" : {
          "created" : "135247827"
        }
      }
    }
  }
}
I tested with another index. This time I inserted 50k documents per run and ran it 4 times.
Both the leader and the follower show this:
green open os-1 16G6gZ_ZSoCzBz_SbpPhlg 1 1 100000 0 23mb 11.1mb
But when I check the replication status:
{
  "status" : "SYNCING",
  "reason" : "User initiated",
  "leader_alias" : "leader-cluster",
  "leader_index" : "os-1",
  "follower_index" : "os-1",
  "syncing_details" : {
    "leader_checkpoint" : 199999,
    "follower_checkpoint" : 149999,
    "seq_no" : 149999
  }
}
Hmmm, interesting. Why does the leader show 100000 documents while the leader checkpoint is 199999?
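One way to read the checkpoint numbers, as a rough sketch and not an explanation of the root cause: sequence numbers start at 0, so a checkpoint of N corresponds to N + 1 operations. The code below assumes every inserted document was exactly one indexing operation (no updates or deletes), which is my assumption about the test script.

```python
import json

# Replication status response for os-1, copied from above.
status = json.loads("""
{
  "status": "SYNCING",
  "reason": "User initiated",
  "leader_alias": "leader-cluster",
  "leader_index": "os-1",
  "follower_index": "os-1",
  "syncing_details": {
    "leader_checkpoint": 199999,
    "follower_checkpoint": 149999,
    "seq_no": 149999
  }
}
""")

details = status["syncing_details"]

# Checkpoints are zero-based sequence numbers, hence the + 1.
leader_ops = details["leader_checkpoint"] + 1
follower_ops = details["follower_checkpoint"] + 1
lag_ops = details["leader_checkpoint"] - details["follower_checkpoint"]

expected_docs = 4 * 50000  # four runs of 50k documents
reported_docs = 100000     # docs.count shown for os-1 on both clusters

print("operations on leader:  ", leader_ops)
print("operations on follower:", follower_ops)
print("replication lag (ops): ", lag_ops)
print("leader ops match expected inserts:", leader_ops == expected_docs)
print("ops not yet visible in docs.count:", leader_ops - reported_docs)
```

Under that assumption the leader has processed all 200000 operations, the follower is 50000 operations behind, and the reported document count of 100000 is what lags behind the checkpoint.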
I'm not sure whether this is a bug or expected behavior. I will report back once I can confirm it is a bug.
P.S.: I haven't tested without replication.