Cluster down after typo on search backpressure cluster setting

Important: Don’t try this on a production cluster.

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch v2.7.0

Describe the issue:
I wanted to try out search backpressure in a test cluster, so to enable it, instead of doing

PUT _cluster/settings
{
  "persistent": {
    "search_backpressure.mode": "enforced"
  }
}

I did (note the typo in enforcedd):

PUT _cluster/settings
{
  "persistent": {
    "search_backpressure.mode": "enforcedd"
  }
}

just to check out how it’ll behave. Instead of rejecting it, like any other cluster setting, e.g.:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.enable": "blah"
  }
}

that gives

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Illegal allocation.enable value [BLAH]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Illegal allocation.enable value [BLAH]"
  },
  "status": 400
}

it brought the cluster down, since it cannot apply the value enforcedd.
Now the cluster is completely confused (can’t blame it), as it can’t apply the value to the nodes.
Restarts won’t help, as I applied the value in the persistent settings.
Nodes won’t talk to each other, as they are busy trying to apply that wrong value.

Is there any way to remove this persistent setting to allow the cluster to come up again?
Or shall I say goodbye?

Relevant Logs or Screenshots:

[2023-07-24T17:12:51,082][INFO ][o.o.c.s.ClusterSettings  ] [osarally101-sokratis1_master] updating [search_backpressure.mode] from [monitor_only] to [enforcedd]
[2023-07-24T17:12:51,082][WARN ][o.o.c.s.ClusterSettings  ] [osarally101-sokratis1_master] failed to apply settings
java.lang.IllegalArgumentException: Invalid SearchBackpressureMode: enforcedd
        at org.opensearch.search.backpressure.settings.SearchBackpressureMode.fromName(SearchBackpressureMode.java:50) ~[opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.search.backpressure.settings.SearchBackpressureSettings.lambda$new$0(SearchBackpressureSettings.java:117) ~[opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.settings.Setting$Updater.apply(Setting.java:1241) ~[opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.settings.AbstractScopedSettings$SettingUpdater.lambda$updater$0(AbstractScopedSettings.java:696) ~[opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.settings.AbstractScopedSettings.applySettings(AbstractScopedSettings.java:232) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:556) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:484) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:186) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:282) [opensearch-2.7.0.jar:2.7.0]
        at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:245) [opensearch-2.7.0.jar:2.7.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:833) [?:?]

@spapadop This setting could be reverted by setting it to a null value. However, as you’ve noticed, the nodes are stuck in a loop on the reported error, which prevents any cluster settings modifications.
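
For reference, on a cluster that still accepts settings updates, the null-value reset could look like the sketch below. The OPENSEARCH_URL variable and the two function names are made up for illustration, and the example assumes an unsecured HTTP endpoint on localhost; add TLS options and credentials as your setup requires.

```shell
# Hypothetical endpoint; adjust host, port, TLS and credentials for your cluster.
OPENSEARCH_URL="${OPENSEARCH_URL:-http://localhost:9200}"

# Build the request body. A null value removes the persistent override,
# so the setting falls back to opensearch.yml or its default.
reset_setting_body() {
  printf '{"persistent":{"%s":null}}' "$1"
}

# Send the reset request. This only works while the cluster can still
# process settings updates, which was not the case in this thread.
reset_setting() {
  curl -sS -X PUT "$OPENSEARCH_URL/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(reset_setting_body "$1")"
}
```

It would be called as: reset_setting search_backpressure.mode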

An OpenSearch node resolves settings in the following order of precedence:

  1. Transient settings
  2. Persistent settings
  3. opensearch.yml
  4. Default settings

So even if you set the reported cluster setting in opensearch.yml, it wouldn’t override the persistent setting.
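
To illustrate why opensearch.yml cannot win here, the precedence can be modeled as "first non-empty value wins". This is only a toy model of the order listed above, not how OpenSearch implements it; effective_setting is a hypothetical helper name.

```shell
# Toy model: arguments are passed from highest to lowest precedence
# (transient, persistent, opensearch.yml, default); an empty string
# means "not set at that level".
effective_setting() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1  # nothing set at any level
}

# No transient value, a broken persistent value, and a valid value in
# opensearch.yml: the persistent value is still the one that is picked.
effective_setting "" "enforcedd" "monitor_only" "monitor_only"
```

The last line prints enforcedd, which matches the behaviour described above: the bad persistent value shadows anything placed in opensearch.yml.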

Take a look at this reported bug: “OpenSearch handling for invalid setting value instead of corrupting the state” (Issue #7598, opensearch-project/OpenSearch on GitHub).

That issue also describes a method for removing an unwanted setting. However, please be aware that the tool describes itself as “A CLI tool to do unsafe cluster and index manipulations on current node”.

Many thanks @pablo for the prompt response and solution. I did indeed bring the cluster back to life, and I’m happy this bug is being followed up.

How did you manage to get the cluster back?

Hi @rlevitsky, I had to run this:

OPENSEARCH_PATH_CONF=/etc/path/to/my/conf /usr/share/opensearch/bin/opensearch-node remove-settings search_backpressure.mode
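
For anyone repeating this, a rough sketch of the whole sequence around that command is below. Several things in it are assumptions rather than details from this thread: a systemd service named opensearch, /etc/opensearch as the fallback config directory, the OPENSEARCH_HOME variable, and the helper names run and remove_persistent_setting. The tool is meant to be run against a stopped node, it may ask for confirmation, and it will probably need to be repeated on each affected node. Setting DRY_RUN=1 only prints the commands.

```shell
# Print the command instead of executing it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Assumed layout: systemd service "opensearch", package-style paths.
remove_persistent_setting() {
  setting="$1"
  conf_dir="${OPENSEARCH_PATH_CONF:-/etc/opensearch}"
  tool="${OPENSEARCH_HOME:-/usr/share/opensearch}/bin/opensearch-node"

  # Stop the node first; the tool works on the on-disk cluster state.
  run systemctl stop opensearch
  # Remove the offending persistent setting from the stored state.
  run env OPENSEARCH_PATH_CONF="$conf_dir" "$tool" remove-settings "$setting"
  # Bring the node back up once the setting is gone.
  run systemctl start opensearch
}
```

A dry run to review the commands first: DRY_RUN=1 remove_persistent_setting search_backpressure.mode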