On a server, I have a node with the data role, let’s say data-node1, with the following setting in opensearch.yml:
path.data: "/var/lib/opensearch/data"
Now, I want to stop data-node1, bootstrap another node (e.g. data-node2) and give it the same configuration, so that it “picks up the work” from where data-node1 left off:
path.data: "/var/lib/opensearch/data"
That isn’t possible: it creates a conflict, since the path has already been used by another data node. Is there any way to achieve this, other than setting a new data path for data-node2 and draining the old node towards that new data path?
I could have used node.max_local_storage_nodes but firstly it’s deprecated and secondly this isn’t really what I want to achieve.
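For reference, the "drain" alternative mentioned above is usually done by excluding the old node from shard allocation through the cluster settings API. Below is a minimal Python sketch that only builds the request and does not send it. The helper name build_drain_request is hypothetical, and the node name is taken from this post; it assumes draining by node name via the cluster.routing.allocation.exclude._name setting.

```python
import json


def build_drain_request(node_name):
    """Build (method, path, body) for a cluster settings update that
    tells the cluster to move all shards away from the named node."""
    body = {
        "persistent": {
            # Shards will not be allocated to nodes matching this name,
            # and existing shards on it are relocated elsewhere.
            "cluster.routing.allocation.exclude._name": node_name
        }
    }
    return "PUT", "/_cluster/settings", json.dumps(body)


if __name__ == "__main__":
    method, path, body = build_drain_request("data-node1")
    # Send this with any HTTP client against a node of the cluster.
    print(method, path)
    print(body)
```

Once the old node holds no shards it can be stopped, and the exclusion can be cleared by setting the same key to null.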
Is that what you’re referring to? Or is this data path on a separate/different shared volume?
If not, then each OpenSearch node has its own data path. The way nodes share information is by clustering: you would have master/leader nodes and data nodes.
Thanks for the response. I had a typo in my post above, fixed now. Let me try to elaborate.
Here are the contents of /var/lib/opensearch/data:
-rw-r--r--. 1 opensearch opensearch 5 Jun 13 09:15 batch_metrics_enabled.conf
-rw-r--r--. 1 opensearch opensearch 5 Jun 13 09:15 logging_enabled.conf
drwxr-xr-x. 3 opensearch opensearch 15 Jun 13 09:15 nodes
-rw-r--r--. 1 opensearch opensearch 5 Jun 13 09:15 performance_analyzer_enabled.conf
-rw-r--r--. 1 opensearch opensearch 5 Jun 13 09:15 rca_enabled.conf
-rw-r--r--. 1 opensearch opensearch 5 Jun 13 09:15 thread_contention_monitoring_enabled.conf
A happy data-node1 is writing data to it, business as usual.
Now, I want to stop data-node1 and retire it. Then, I bootstrap a new node, data-node2, which should continue from where data-node1 left off, i.e. read all the data in /var/lib/opensearch/data, “declare” to the cluster that all respective shards now belong to data-node2 and continue business as usual.
The actual motivation is that the data in /var/lib/opensearch/data may live in a CephFS or S3 cluster and just be mounted on a host. If the above could work, it would enable us to change backend hosts transparently, i.e. kill a RHEL 8 host that had mounted this data, bootstrap a RHEL 9 host, mount the same path and continue working happily.
What worries me is that metadata about the old data-node1 lives within that data, and my understanding is that this metadata cannot (at the moment) simply be reset by a new data-node2 connecting to the data path. And probably for a good reason. But it would be nice to explore such possibilities to support use cases like the one I mention above.
Let me know if something remains unclear and again thanks a lot for the discussion.
OK, I understand now. To use the same index/data from node-1 on node-2, it would be best to create a snapshot, then restore it on node-2 and start services.
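A rough sketch of that snapshot route, in Python, building the three REST calls involved (register a repository, take a snapshot, restore it). Nothing is sent here. The helper name, repository name, snapshot name and location are all hypothetical, and a shared filesystem ("fs") repository is assumed, whose location would also need to be listed under path.repo in opensearch.yml on every node.

```python
import json


def build_snapshot_requests(repo, snapshot, location, indices="*"):
    """Return the (method, path, body) triples for a snapshot and restore
    cycle using a shared filesystem repository."""
    register = (
        "PUT",
        f"/_snapshot/{repo}",
        json.dumps({"type": "fs", "settings": {"location": location}}),
    )
    create = (
        "PUT",
        # Block until the snapshot has finished before returning.
        f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true",
        "",
    )
    restore = (
        "POST",
        f"/_snapshot/{repo}/{snapshot}/_restore",
        json.dumps({"indices": indices}),
    )
    return [register, create, restore]


if __name__ == "__main__":
    # All three names below are placeholders for illustration only.
    for method, path, body in build_snapshot_requests(
        "my-repo", "before-migration", "/mnt/snapshots"
    ):
        print(method, path, body)
```

Note that a restore fails for indices that already exist and are open in the target cluster, so the indices pattern may need narrowing in practice.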
Second option
As for bootstrapping, the only thing I can think of is if OpenSearch node-1 had two volumes, like this.