New index went to RED after cleaning up old one

manoj · October 23, 2023, 10:00am

Versions: OpenSearch 1.2.4

Configuration:

2 Data nodes with 5GB storage on each node
ISM policy to delete index test-index-* if min_size > 3GB.
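
For readers trying to reproduce this, a policy of the kind described above might look roughly like the sketch below. The actual policy body is not shown in this post, so the state names (`hot`, `delete`), the description, and the `priority` value are assumptions; only the `test-index-*` pattern and the 3GB `min_size` condition come from the configuration above.

```python
import json

# Hypothetical ISM policy body. State names, description and priority are
# assumptions for illustration; the pattern and the size condition are from the post.
policy = {
    "policy": {
        "description": "Delete test-index-* once primary storage exceeds 3GB",
        "default_state": "hot",
        "states": [
            {
                "name": "hot",
                "actions": [],
                # Move to the delete state once total primary shard size reaches 3GB.
                "transitions": [
                    {"state_name": "delete", "conditions": {"min_size": "3gb"}}
                ],
            },
            {
                "name": "delete",
                "actions": [{"delete": {}}],
                "transitions": [],
            },
        ],
        # Attach the policy automatically to newly created matching indices.
        "ism_template": {"index_patterns": ["test-index-*"], "priority": 100},
    }
}

# Print the JSON body that would be sent when creating the policy.
print(json.dumps(policy, indent=2))
```

The printed JSON is what would typically be sent in the body of a policy creation request (for example `PUT _plugins/_ism/policies/<policy_id>`).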

Describe the issue:

Hi,
I’m running OpenSearch 1.2.4 with the ISM plugin, 2 data nodes, and 5GB of data storage on each node.
I’ve created an ISM policy to delete indices matching test-index-* once the primary shard size exceeds 3GB.

Once the index test-index-1 crossed 3GB, ISM started its transition.
But it took a long time to delete the index, even though job_interval was set to 1 minute.
Because of this delay, the index grew past 4GB and disk usage crossed 90%, with error logs saying: high disk watermark [90%] exceeded on [-M9VQsyxSlKaSwz66EDvIQ][node-0][/opt/opensearch/data/nodes/0] free: 249.7mb[5%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete.
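
To make the numbers concrete, here is a rough back-of-the-envelope sketch of the watermark arithmetic. It assumes the disk is exactly 5GB (5120 MB) and that the high watermark is the 90% used (10% free) value shown in the log; the function names are hypothetical helpers, not part of OpenSearch.

```python
def free_percent(total_mb, free_mb):
    """Return free disk space as a percentage of total capacity."""
    return 100.0 * free_mb / total_mb


def mb_to_free(total_mb, free_mb, watermark_used_pct=90.0):
    """Return how many MB must be freed to get back under the watermark."""
    required_free_mb = total_mb * (100.0 - watermark_used_pct) / 100.0
    return max(0.0, required_free_mb - free_mb)


# Assumption: each data node has exactly 5GB of storage.
TOTAL_MB = 5 * 1024
# Free space reported in the high disk watermark log message.
FREE_MB = 249.7

print(f"free: {free_percent(TOTAL_MB, FREE_MB):.1f}%")
print(f"must free at least {mb_to_free(TOTAL_MB, FREE_MB):.1f} MB")
```

On a disk this small, 10% free is only about 512 MB, so a few hundred MB of extra ingestion while the delete is pending is enough to cross the high watermark.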

As soon as ISM deleted the index test-index-1, a new index with the same name test-index-1 was created, since Logstash is configured to send logs to index test-index-1.
But the newly created index test-index-1 went to RED state with the message: after allocating [[test-index-1][2], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (9.6% free), preventing allocation.

The new index is stuck in RED state indefinitely, and no new logs are being ingested into the cluster.
Can anyone help me understand the cause of this scenario?

Relevant Logs or Screenshots:

[node-1] high disk watermark [90%] exceeded on [-M9VQsyxSlKaSwz66EDvIQ][node-0][/opt/opensearch/data/nodes/0] free: 250.4mb[5%], shards will be relocated away from this node; currently relocating away shards totalling [0] bytes; the node is expected to continue to exceed the high disk watermark when these relocations are complete
[node-1] low disk watermark [85%] no longer exceeded on [-M9VQsyxSlKaSwz66EDvIQ][node-0][/opt/opensearch/data/nodes/0] free: 1.9gb[40.2%]
[node-1] releasing read-only-allow-delete block on indices: [[.opendistro-ism-config, test-index-1, .opendistro-job-scheduler-lock]]
[node-1] JobSweeper started listening to operations on index .opendistro-ism-config
[node-1] [test-index-1/71OqucRrRpCkC5dgF1gYpA] deleting index
[node-1] [test-index-1] creating index, cause [auto(bulk api)], templates [default], shards [5]/[1]
[node-1] after allocating [[test-index-1][2], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (9.6% free), preventing allocation
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8% free), preventing allocation
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8.7% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9.7% free), preventing allocation
[node-1] Cluster health status changed from [YELLOW] to [RED] (reason: [index [test-index-1] created]).
[node-1] Index [test-index-1] matched ISM policy template and will be managed by policy_3
[node-1] [test-index-1/earVkmofReelu-PiWvhVwg] create_mapping [_doc]
[node-1] [test-index-1/earVkmofReelu-PiWvhVwg] update_mapping [_doc]
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8% free), preventing allocation
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8.7% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9.7% free), preventing allocation
[node-1] after allocating [[test-index-1][1], node[null], [R], recovery_source[peer recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8.1% free), preventing allocation
[node-1] after allocating [[test-index-1][2], node[null], [R], recovery_source[peer recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8% free), preventing allocation
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8% free), preventing allocation
[node-1] after allocating [[test-index-1][3], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8.7% free), preventing allocation
[node-1] after allocating [[test-index-1][4], node[null], [P], recovery_source[new shard recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[deciders_no]]] node [-M9VQsyxSlKaSwz66EDvIQ] would have more than the allowed 10% free disk threshold (9.7% free), preventing allocation
[node-1] after allocating [[test-index-1][1], node[null], [R], recovery_source[peer recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8.1% free), preventing allocation
[node-1] after allocating [[test-index-1][2], node[null], [R], recovery_source[peer recovery], s[UNASSIGNED], unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] would have more than the allowed 10% free disk threshold (8% free), preventing allocation
[node-1] No Old History Indices to delete
[node-1] No Old History Indices to delete
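
In case it helps anyone reading the logs above, the allocation messages can be summarized per node with a small script. This is only a sketch: the regular expression is written against the message format seen in this post and is not an official log format specification, and the helper names are hypothetical. The sample lines are copied from the logs above.

```python
import re

# Matches the "after allocating ... preventing allocation" messages seen in this post.
PATTERN = re.compile(
    r"after allocating \[\[(?P<index>[^\]]+)\]\[(?P<shard>\d+)\], node\[null\], "
    r"\[(?P<kind>[PR])\].*?\] node \[(?P<node>[^\]]+)\] would have .*?"
    r"\((?P<free>[\d.]+)% free\)"
)


def parse_line(line):
    """Extract index, shard, primary/replica flag, node id and projected free %."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "index": match.group("index"),
        "shard": int(match.group("shard")),
        "primary": match.group("kind") == "P",
        "node": match.group("node"),
        "free_pct": float(match.group("free")),
    }


def lowest_free_by_node(lines):
    """Return the lowest projected free-disk percentage reported for each node."""
    lowest = {}
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # not an allocation message
        node = record["node"]
        lowest[node] = min(lowest.get(node, record["free_pct"]), record["free_pct"])
    return lowest


SAMPLE = [
    "[node-1] after allocating [[test-index-1][2], node[null], [P], "
    "recovery_source[new shard recovery], s[UNASSIGNED], "
    "unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], "
    "delayed=false, allocation_status[no_attempt]]] node [6k_6bdRfRQGA6nnnVqUqDQ] "
    "would have more than the allowed 10% free disk threshold (9.6% free), "
    "preventing allocation",
    "[node-1] after allocating [[test-index-1][3], node[null], [P], "
    "recovery_source[new shard recovery], s[UNASSIGNED], "
    "unassigned_info[[reason=INDEX_CREATED], at[2023-10-16T23:37:53.369Z], "
    "delayed=false, allocation_status[no_attempt]]] node [-M9VQsyxSlKaSwz66EDvIQ] "
    "would have more than the allowed 10% free disk threshold (9% free), "
    "preventing allocation",
    "[node-1] Cluster health status changed from [YELLOW] to [RED] "
    "(reason: [index [test-index-1] created]).",
]

print(lowest_free_by_node(SAMPLE))
```

Run against the full log, this shows that both data nodes are projected to fall below the 10% free threshold, which is why shards 2, 3 and 4 cannot be assigned anywhere.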
