Describe the issue: Is a “dedicated coordinating cluster” to support Cross Cluster Search (CCS) a best practice, or an architecture pattern that is widely used or proposed? As I understand it, a “dedicated coordinating cluster” would contain coordinating nodes only (and perhaps master nodes as well). Its purpose is just to receive a search request and pass it to one or more remote OpenSearch clusters where the data is stored.
The dedicated coordinating cluster becomes a single access point for client requests and intelligently routes them to indices on the relevant clusters. It can be scaled independently of the other clusters as required.
@farman Thanks for the question. I think there is a misunderstanding around the word “cluster” in this case. Coordinating nodes should not be running as a separate OpenSearch cluster. There can be one or more coordinating nodes (perhaps that's where the word “cluster” causes confusion), managed by the same cluster manager node as the rest of the nodes in that particular OpenSearch cluster. But there is only one OpenSearch cluster in this case. If your question is whether it is best practice to use coordinating nodes to prevent bottlenecks for search-heavy workloads, including cross cluster search, then the answer is yes, this is a recommended best practice.
@Anthony Thanks for your response. The OpenSearch documentation on CCS distinguishes between the “coordinating cluster” (the cluster that receives the search request and forwards it to the “remote cluster”) and the “remote cluster” (the cluster where the request is actually executed). The “coordinating cluster” can have all types of nodes. However, when there are multiple OpenSearch clusters storing data, say one per geography, it makes more sense to have a dedicated coordinating cluster that queries the different clusters, combines their results, and sends the response to the client, instead of having the remote clusters communicate with each other. The below figure highlights this scenario. In such a case it makes sense for the coordinating cluster to have just “coordinating” nodes, right?
@farman This is a very interesting question. Although I have never seen anyone attempt this, I tested it locally; see my findings below:
(Tested with OpenSearch 2.19.0)
Can a cluster form with no data nodes?
Yes, it can. A cluster with only cluster_manager and remote_cluster_client nodes will form and elect a manager successfully. However, it will permanently sit in red status because system indices (like .opensearch-observability, .kibana etc.) cannot have their shards allocated anywhere. Worth factoring into your monitoring setup since you will have constant red health on the coordinating cluster.
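For the monitoring side, here is a minimal sketch of how an alerting check could treat that permanent red status as expected rather than as an incident. It assumes you already fetch the JSON from GET _cluster/health yourself; classify_health and its return labels are hypothetical names for illustration, and only the status and number_of_data_nodes fields of the response are used.

```python
import json


def classify_health(health: dict) -> str:
    """Classify a _cluster/health response from the coordinating cluster.

    Hypothetical helper: "expected-red" means the cluster is red only
    because it has no data nodes to hold system index shards.
    """
    status = health.get("status")
    data_nodes = health.get("number_of_data_nodes", 0)
    if status == "red" and data_nodes == 0:
        # No data nodes at all: red is the steady state described above.
        return "expected-red"
    if status == "red":
        # Red even though data nodes exist: worth a real alert.
        return "alert"
    return "ok"


# Sample payload shaped like a _cluster/health response (values made up).
sample = json.loads('{"status": "red", "number_of_data_nodes": 0}')
print(classify_health(sample))
```

Whether you suppress the alert or just downgrade it is a judgment call; the point is only that the raw status colour is not a useful signal on a cluster with no data nodes.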
What roles do the nodes actually need?
The minimum viable role for your coordinating cluster nodes is node.roles: [remote_cluster_client]. This is enough to:
Join the cluster
Accept client search requests (coordinating behavior is implicit on all nodes, not a role you set)
Initiate CCS requests to the remote clusters
Without remote_cluster_client, the node cannot reach remote clusters at all, so an empty roles list would break CCS even if it were accepted.
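Besides the role, the coordinating cluster also has to know where the remote clusters are. As a rough sketch, this builds the request body for PUT _cluster/settings that registers a remote under an alias via cluster.remote.<alias>.seeds. The helper name remote_cluster_settings and the seed address remote-host.example:9300 are placeholders I made up, so substitute the transport address of a node in your own remote cluster.

```python
import json


def remote_cluster_settings(alias: str, seeds: list[str]) -> dict:
    """Build a PUT _cluster/settings body registering one remote cluster.

    alias is the name used later in CCS index expressions (alias:index);
    seeds are transport addresses (host:port) of nodes in that remote.
    """
    return {"persistent": {"cluster.remote": {alias: {"seeds": seeds}}}}


# "remote-host.example:9300" is a placeholder, not a real host.
body = remote_cluster_settings("remote-cluster", ["remote-host.example:9300"])
print(json.dumps(body, indent=2))
```

Send the printed JSON to the coordinating cluster, not to the remote one; the remote cluster needs no knowledge of the coordinating cluster for a one-way search.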
Does CCS actually work through such a node?
Yes. I ran a CCS query against remote-cluster:test-index via a [remote_cluster_client]-only node and got results back correctly.
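For reference, the request targets the index as alias:index, and several remotes can be queried in one request by separating the targets with commas. A small sketch of building that search path; ccs_search_path is a hypothetical helper, and the eu and us aliases in the second call are made-up examples for the geography-wise setup you describe.

```python
from urllib.parse import quote


def ccs_search_path(targets: list[tuple[str, str]]) -> str:
    """Build a _search path for one or more (cluster_alias, index) pairs."""
    expression = ",".join(f"{alias}:{index}" for alias, index in targets)
    # Keep ':' ',' and '*' literal; they are part of the index expression.
    return "/" + quote(expression, safe=":,*") + "/_search"


# The single-remote query from the test above.
print(ccs_search_path([("remote-cluster", "test-index")]))
# Fan-out across two hypothetical regional clusters.
print(ccs_search_path([("eu", "logs-*"), ("us", "logs-*")]))
```

The coordinating node merges the per-cluster results before responding, so the client sees a single search response either way.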
One more very important caveat - security
If you plan to run with the Security plugin enabled (which is likely in a real deployment), you will hit a bootstrap deadlock: the Security plugin stores its configuration in the .opendistro_security index, and creating that index requires a data node. A coordinating cluster with no data nodes cannot initialize security at all. You would either need to disable security on all the clusters, because a non-secured node can’t perform the necessary handshake with secured ones (which of course is NOT recommended), or include at least one data node in the coordinating cluster solely for system index allocation.
So the architecture is viable for an unsecured setup, but in production with security enabled it needs rethinking.
@Anthony Thank you very much for trying this out and for your detailed response. As I understand from your response, if I include a single “data node” for system index allocation within the “coordinating cluster” this should work, right ? Or do you foresee any other issues as well ?
@farman A single data node will work. The cluster will be in yellow state because no replicas can be assigned, but the functionality you are looking for will indeed work.
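The yellow state follows from the rule that a replica is never allocated on the same node as its primary. A simplified sketch of that reasoning, assuming every index asks for the same number of replicas and nothing else (disk watermarks, allocation filtering) blocks allocation; expected_status is a hypothetical helper, not an OpenSearch API.

```python
def expected_status(data_nodes: int, replicas: int) -> str:
    """Estimate cluster health from data node count and replicas per shard."""
    if data_nodes == 0:
        # Primaries have nowhere to go.
        return "red"
    if replicas > data_nodes - 1:
        # Each replica needs a node other than the one holding its primary.
        return "yellow"
    return "green"


print(expected_status(data_nodes=1, replicas=1))  # single data node case
print(expected_status(data_nodes=2, replicas=1))  # one spare node for the replica
```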
Thanks @Anthony. Then it is probably best to have 2 or 3 data nodes so that there is replication and redundancy. Since only security-related data is stored on them, the resources these nodes require, including storage, will be very small.