Describe the issue:
My issue is similar to this issue
After a replication job has been paused for a number of hours, stopping it does not delete the task, and when I attempt to recreate it, it states that a task is already running.
I do see several paused replication tasks when querying the tasks:
"action": "indices:admin/plugins/replication/index/pause"
However, they’re not cancellable.
"cancellable": false,
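For anyone trying to list these stuck tasks, here is a minimal sketch that filters a parsed `GET _tasks` response for replication plugin tasks. It assumes the usual nodes -> tasks nesting of that response; the function name and the sample node and task IDs are made up for illustration.

```python
def replication_tasks(tasks_response):
    """Collect replication-plugin tasks from a parsed `GET _tasks` response."""
    found = []
    for node_id, node in tasks_response.get("nodes", {}).items():
        for task_id, task in node.get("tasks", {}).items():
            # Replication plugin actions contain "plugins/replication".
            if "plugins/replication" in task.get("action", ""):
                found.append({
                    "task_id": task_id,
                    "node": node_id,
                    "action": task["action"],
                    "cancellable": task.get("cancellable", False),
                })
    return found


# Hypothetical sample; only the action string comes from the post above.
sample = {
    "nodes": {
        "node-A": {
            "tasks": {
                "node-A:101": {
                    "action": "indices:admin/plugins/replication/index/pause",
                    "cancellable": False,
                },
                "node-A:102": {
                    "action": "cluster:monitor/tasks/lists",
                    "cancellable": False,
                },
            }
        }
    }
}

stuck = replication_tasks(sample)
for task in stuck:
    print(task["task_id"], task["action"], "cancellable:", task["cancellable"])
```

This only reports the tasks; it does not cancel or remove anything.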
Configuration:
Relevant Logs or Screenshots:
{"error":{"root_cause":[{"type":"resource_already_exists_exception","reason":"task with id {replication:index:index_v1} already exist"}],"type":"resource_already_exists_exception","reason":"task with id {replication:index:indesx_v1} already exist"},"status":400}
What can I do to restart these replication jobs? This occurs with both single-index replication and auto-follow replication.
@ankikala
Any advice? I am trying to start replication with:
PUT /_plugins/_replication/eddie-aws-000002/_start?pretty
{ "leader_alias": "connection-to-aws", "leader_index": "eddie-aws-000002", "use_roles":{ "leader_cluster_role":"all_access","follower_cluster_role":"all_access" }}
In Dashboards I see this error:
{
"error" : {
"root_cause" : [
{
"type" : "parse_exception",
"reason" : "request body or source parameter is required"
}
],
"type" : "parse_exception",
"reason" : "request body or source parameter is required"
},
"status" : 400
}
In the log I see these errors:
[WARN ][o.o.p.PersistentTasksClusterService] [pesmaster-node2] persistent task replication:index:eddie-aws-000002 failed
Jan 5 12:28:54 pesmaster02-spc pesmaster-node2[1664]: java.lang.IllegalArgumentException: this node does not have the remote_cluster_client role
[WARN ][o.o.c.s.ClusterApplierService] [pesmaster-node2] failed to notify ClusterStateListener
Jan 5 12:28:54 pesmaster02-spc pesmaster-node2[1664]: java.lang.IllegalStateException: p must not be null
[ERROR][o.o.r.a.i.TransportReplicateIndexMasterNodeAction] [pesmaster-node2] Failed to trigger replication for eddie-aws-000002 - java.lang.IllegalStateException: Timed out when waiting for persistent task after 30s
I have the remote cluster client on the other side and I am already replicating one index, but I can't replicate another one.
@vnovotny98 It looks like the request and the logs are not related.
Regarding the request, the 400 error suggests that it is not constructed correctly. If you are trying from Dev Tools, make sure that there are no extra line breaks between the request line and the body, or try executing it with curl.
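As a rough sketch of the curl route, the snippet below builds a one-line curl command for the start request so that no stray line breaks end up in the body. The endpoint and body fields are taken from the request above; the host `https://localhost:9200` and the helper name are assumptions, and any authentication or TLS options your cluster needs are left out.

```python
import json
import shlex


def build_start_command(host, follower_index, leader_alias, leader_index,
                        role="all_access"):
    """Return a single-line curl command for the replication _start request."""
    body = {
        "leader_alias": leader_alias,
        "leader_index": leader_index,
        "use_roles": {
            "leader_cluster_role": role,
            "follower_cluster_role": role,
        },
    }
    # Compact separators keep the JSON body on one line.
    payload = json.dumps(body, separators=(",", ":"))
    argv = [
        "curl", "-XPUT",
        f"{host}/_plugins/_replication/{follower_index}/_start?pretty",
        "-H", "Content-Type: application/json",
        "-d", payload,
    ]
    # shlex.join quotes each argument so the result can be pasted into a shell.
    return shlex.join(argv)


# The host is a placeholder; replace it with the follower cluster endpoint.
command = build_start_command(
    "https://localhost:9200",
    "eddie-aws-000002",
    "connection-to-aws",
    "eddie-aws-000002",
)
print(command)
```

Run the printed command against the follower cluster, since that is where the start request is sent.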
Regarding the logs,
java.lang.IllegalArgumentException: this node does not have the remote_cluster_client role
It looks like the remote cluster is not set up correctly.
If you’ve overridden node.roles in opensearch.yml on the follower cluster, make sure it also includes the remote_cluster_client role.
Also, verify the remote cluster configuration in the cluster settings.
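Both checks can be sketched against parsed API responses. The snippet assumes the default nested shapes of `GET _nodes` (a `roles` list per node) and `GET _cluster/settings` (remote clusters under `cluster.remote.<alias>.seeds`); the helper names and all sample values other than the node name and connection alias are hypothetical.

```python
def nodes_missing_role(nodes_response, role="remote_cluster_client"):
    """Return names of nodes whose roles list lacks `role` (from `GET _nodes`)."""
    return sorted(
        info.get("name", node_id)
        for node_id, info in nodes_response.get("nodes", {}).items()
        if role not in info.get("roles", [])
    )


def remote_seeds(settings_response, alias):
    """Return the seeds configured for `alias` in `GET _cluster/settings`, or []."""
    # Transient settings take precedence over persistent ones.
    for scope in ("transient", "persistent"):
        remote = settings_response.get(scope, {}).get("cluster", {}).get("remote", {})
        seeds = remote.get(alias, {}).get("seeds")
        if seeds:
            return seeds
    return []


# Hypothetical samples; the roles and seed address are made up for illustration.
sample_nodes = {
    "nodes": {
        "n1": {"name": "pesmaster-node2", "roles": ["cluster_manager"]},
        "n2": {"name": "data-node1", "roles": ["data", "ingest", "remote_cluster_client"]},
    }
}
sample_settings = {
    "persistent": {
        "cluster": {"remote": {"connection-to-aws": {"seeds": ["10.0.0.1:9300"]}}}
    },
    "transient": {},
}

print("Nodes without the role:", nodes_missing_role(sample_nodes))
print("Seeds for connection-to-aws:", remote_seeds(sample_settings, "connection-to-aws"))
```

An empty seeds list for the alias used in `leader_alias` would suggest the connection is not configured on the follower cluster.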