Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.11, deployed with the official Helm chart on Kubernetes
Describe the issue:
I have a three-node OpenSearch setup in Kubernetes. I have a single index that I write to at 2 am every night; nothing else happens on the cluster. One day at 5 pm we had a blackout and all pods went down immediately. When the Kubernetes cluster came back up, one container would not start:
{"type": "server", "timestamp": "2023-11-24T14:52:02,526Z", "level": "ERROR", "component": "o.o.b.OpenSearchUncaughtExceptionHandler", "cluster.name": "os", "node.name": "os-mngr-1", "message": "uncaught exception in thread [main]",
"stacktrace": ["org.opensearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/opensearch/data]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?",
...
Usually I would delete the PVC/disk of that pod, restart it, and everything would be running fine again (because my index has two replicas). This time I tried a gentler approach that I eventually want to automate in the Helm chart: deleting the lock files before starting the container.
So I deleted the following two files, and the container started without any warnings or errors:
/usr/share/opensearch/data/nodes/0/node.lock
/usr/share/opensearch/data/nodes/0/_state/write.lock
While the brute-force approach of deleting the entire disk works like a charm, deleting the lock files leaves some of my indices in a yellow state.
GET _cat/allocation?v
shows me that there are 4 unassigned shards:
shards disk.indices disk.used disk.avail disk.total disk.percent node
    18       43.8mb    45.6mb      1.9gb        2gb            2 os-mngr-0
     1         208b    40.3mb      1.9gb        2gb            1 os-mngr-1
    18       40.1mb    41.9mb      1.9gb        2gb            2 os-mngr-2
     4                                                           UNASSIGNED
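I have not run the allocation-explain API against these shards yet, but I assume a request along these lines (the index name below is just a placeholder for mine) would show why they stay unassigned:
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
As far as I understand, the same request without a body should explain the first unassigned shard it finds, which would probably be enough here.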
But my cluster settings are pretty much all at their defaults (e.g. cluster.routing.allocation.enable):
GET _cluster/settings
{
  "persistent": {
    "plugins": {
      "index_state_management": {
        "template_migration": {
          "control": "-1"
        }
      }
    }
  },
  "transient": {}
}
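I have only checked the persistent and transient settings above; to confirm that cluster.routing.allocation.enable really is at its default, I assume the effective defaults could be included in the response with something like:
GET _cluster/settings?include_defaults=true&flat_settings=true
and then looking for cluster.routing.allocation.enable in the defaults section.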
I would expect that recovering the node this way (by removing the lock files) would bring all my indices back to a green state, or, if the node's copies turn out to be stale or broken, that they would be re-synced from the other two nodes.
Is my approach not working? Is deleting the disk my only option in this case?
Configuration:
Pretty vanilla configuration through the Helm chart
Relevant Logs or Screenshots: