Deadlock with 'disk usage exceeded flood-stage watermark'

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 2.10

Describe the issue:

I’m using OpenSearch through Wazuh. Single node.
I ran out of space on this node. I then stopped it and expanded the partition from 30 GB to 100 GB, leaving the server with about 70 GB of free disk space.

[o.o.e.NodeEnvironment    ] [node-1] using [1] data paths, mounts [[/ (/dev/sda1)]], net usable_space [72.5gb], net total_space [89.1gb], types [ext4]

Unfortunately, I’m not able to restart the server.

I’m getting this error:

[INFO ][o.o.s.c.ConfigurationRepository] [node-1] Wait for cluster to be available ...
[INFO ][o.o.c.s.ClusterSettings  ] [node-1] updating [plugins.index_state_management.template_migration.control] from [0] to [-1]
[INFO ][o.o.a.c.HashRing         ] [node-1] Node added: [p_LMT0O-TOmzGaTlOwEbBg]
[INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [node-1] Detected cluster change event for destination migration
[INFO ][o.o.a.c.HashRing         ] [node-1] Add data node to AD version hash ring: p_LMT0O-TOmzGaTlOwEbBg
[INFO ][o.o.a.c.HashRing         ] [node-1] All nodes with known AD version: {p_LMT0O-TOmzGaTlOwEbBg=ADNodeInfo{version=2.10.0, isEligibleDataNode=true}}
[INFO ][o.o.a.c.HashRing         ] [node-1] Rebuild AD hash ring for realtime AD with cooldown, nodeChangeEvents size 0
[INFO ][o.o.a.c.HashRing         ] [node-1] Build AD version hash ring successfully
[INFO ][o.o.a.c.ADDataMigrator   ] [node-1] Start migrating AD data
[INFO ][o.o.a.c.ADDataMigrator   ] [node-1] AD job index doesn't exist, no need to migrate
[INFO ][o.o.a.c.ADClusterEventListener] [node-1] Init AD version hash ring successfully
[ERROR][o.o.s.l.LogTypeService   ] [node-1] Failed creating LOG_TYPE_INDEX
org.opensearch.cluster.block.ClusterBlockException: index [.opensearch-sap-log-types-config] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];
at org.opensearch.cluster.block.ClusterBlocks.indicesBlockedException(ClusterBlocks.java:243) ~[opensearch-2.10.0.jar:2.10.0]

When I searched online, the suggestions all involve sending commands to the cluster (for reference, they look like the sketch below); unfortunately that is not possible in my case because the node stops right after this error.
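For reference, the suggestions typically boil down to two REST calls like the ones below (a sketch only; it assumes the API is reachable on https://localhost:9200 with admin credentials, which in my case it is not, since the node dies first):

# Stop the flood-stage block from being re-applied by the disk threshold decider
curl -k -u admin:<password> -X PUT "https://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"cluster.routing.allocation.disk.threshold_enabled": false}}'

# Remove the existing read-only-allow-delete block from all indices
curl -k -u admin:<password> -X PUT "https://localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'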

Is there something we can do to make it happy and start again? :pray:

Configuration:

network.host: 127.0.0.1
node.name: node-1
discovery.type: single-node

cluster.name: wazuh

http.port: 9200-9299
transport.tcp.port: 9300-9399
node.max_local_storage_nodes: "3"
path.data: /var/lib/wazuh-indexer
path.logs: /var/log/wazuh-indexer

...

plugins.security.authcz.admin_dn:
- "CN=admin,OU=Wazuh,O=Wazuh,L=California,C=US"
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.nodes_dn:
- "CN=node-1,OU=Wazuh,O=Wazuh,L=California,C=US"
plugins.security.restapi.roles_enabled:
- "all_access"
- "security_rest_api_access"

plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opensearch-observability", ".opendistro-asynchronous-search-response*", ".replication-metadata-store"]

### Option to allow Filebeat-oss 7.10.2 to work ###
compatibility.override_main_response_version: true

Hi @Mathieu,

Have you tried adjusting the settings below? Note that all of them are dynamic, so they can be changed at runtime (see the sketch after the list):

cluster.routing.allocation.disk.watermark.low
cluster.routing.allocation.disk.watermark.high
cluster.routing.allocation.disk.watermark.flood_stage
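A minimal sketch of applying them at runtime, assuming the REST API is reachable on https://localhost:9200 with admin credentials (the percentage values here are only placeholders, pick ones that match your disk size):

curl -k -u admin:<password> -X PUT "https://localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{
        "persistent": {
          "cluster.routing.allocation.disk.watermark.low": "90%",
          "cluster.routing.allocation.disk.watermark.high": "95%",
          "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
        }
      }'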

More details on these settings are in the OpenSearch documentation; please also see there how to configure OpenSearch (in your case, via opensearch.yml).

Yes, I tried, even cluster.routing.allocation.disk.threshold_enabled: false.

The documentation mentions:

This will also remove any existing index.blocks.read_only_allow_delete index blocks when disabled

Unfortunately it has no effect on my side :sob:

Is there a way to "clean up" an index that has the read-only-allow-delete block?
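If the node could be kept up long enough, the per-index call I would want to run is something like this (a sketch; the credentials are assumptions, and the security plugin may require the admin certificate for a system index like .opensearch-sap-log-types-config):

curl -k -u admin:<password> -X PUT "https://localhost:9200/.opensearch-sap-log-types-config/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'

My full opensearch.yml, for reference: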

network.host: 127.0.0.1
node.name: node-1
discovery.type: single-node

cluster.name: wazuh

http.port: 9200-9299
transport.tcp.port: 9300-9399
node.max_local_storage_nodes: "3"
path.data: /var/lib/wazuh-indexer
path.logs: /var/log/wazuh-indexer

cluster.routing.allocation.disk.threshold_enabled: false

cluster.routing.allocation.disk.watermark.low: 86%
cluster.routing.allocation.disk.watermark.high: 91%
cluster.routing.allocation.disk.watermark.flood_stage: 96%

plugins.security.ssl.http.pemcert_filepath: /etc/wazuh-indexer/certs/node-1.pem
plugins.security.ssl.http.pemkey_filepath: /etc/wazuh-indexer/certs/node-1-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
plugins.security.ssl.transport.pemcert_filepath: /etc/wazuh-indexer/certs/node-1.pem
plugins.security.ssl.transport.pemkey_filepath: /etc/wazuh-indexer/certs/node-1-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
plugins.security.ssl.http.enabled: true
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.resolve_hostname: false

plugins.security.authcz.admin_dn:
- "CN=admin,OU=Wazuh,O=Wazuh,L=California,C=US"
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.nodes_dn:
- "CN=node-1,OU=Wazuh,O=Wazuh,L=California,C=US"
plugins.security.restapi.roles_enabled:
- "all_access"
- "security_rest_api_access"

plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [".opendistro-alerting-config", ".opendistro-alerting-alert*", ".opendistro-anomaly-results*", ".opendistro-anomaly-detector*", ".opendistro-anomaly-checkpoints", ".opendistro-anomaly-detection-state", ".opendistro-reports-*", ".opendistro-notifications-*", ".opendistro-notebooks", ".opensearch-observability", ".opendistro-asynchronous-search-response*", ".replication-metadata-store"]

### Option to allow Filebeat-oss 7.10.2 to work ###
compatibility.override_main_response_version: true

OK, I finally managed to fix this issue.

For information, I moved all the plugins into a backup directory and commented out the related configuration; then I managed to start the server :ok_hand:

I ran the command to remove the read-only block, then shut the server down and put the plugins and configuration settings back (rough sequence below).

It seems one of the plugins was not happy about that block.
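For anyone hitting the same problem, the whole sequence looked roughly like this (a sketch only; the install path and service name are assumptions based on a default Wazuh indexer package and may differ on your system):

# Move the plugins out of the way (path assumed for a default Wazuh indexer install)
mv /usr/share/wazuh-indexer/plugins /usr/share/wazuh-indexer/plugins.bak
mkdir /usr/share/wazuh-indexer/plugins

# Comment out the plugin-related settings in opensearch.yml, then start the node
systemctl start wazuh-indexer

# With the security plugin removed, the API is plain HTTP and unauthenticated;
# remove the read-only-allow-delete block from all indices
curl -X PUT "http://localhost:9200/_all/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.read_only_allow_delete": null}'

# Stop the node, put the plugins and the original configuration back, then start again
systemctl stop wazuh-indexer
rmdir /usr/share/wazuh-indexer/plugins
mv /usr/share/wazuh-indexer/plugins.bak /usr/share/wazuh-indexer/plugins
systemctl start wazuh-indexer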

Thanks!
M.
