I need to set up a fresh OpenSearch install, and I want to use a cluster with nodes on multiple VMs for better availability. According to the docs:
"There are many ways to design a cluster."
So I’m wondering what the best cluster architecture is for my requirements:
High availability for ingest
OpenSearch Dashboards available at a single domain name
Distribution of the data over multiple VMs in case one of them goes down
Some more questions:
How do I decide what roles to give to each node?
Do I need a dedicated coordinating node (vs. a cluster manager) if I have more than one data node?
For high availability, can I ingest documents into any of the data nodes, or must they all go through a certain node?
Which node should I install Dashboards on?
@merlinz01, your nodes will form a cluster, which is managed by the cluster manager (elected from the cluster-manager-eligible nodes in your cluster). By default, each node is a cluster-manager-eligible, data, ingest, and coordinating node. You can change the roles per node as needed; see: Creating a cluster - OpenSearch Documentation.
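To make the role assignment concrete, here is a minimal sketch of what opensearch.yml could look like on one node of a three-node cluster. The cluster name, node names, and host names are hypothetical placeholders, and the role list is just one possible choice, not a recommendation for your workload.

```yaml
# opensearch.yml on one node (sketch; all names below are hypothetical)
cluster.name: my-cluster
node.name: node-1

# Explicit roles for this node. Leaving node.roles unset keeps the defaults.
node.roles: [cluster_manager, data, ingest]

# Addresses of the nodes to contact for discovery.
discovery.seed_hosts:
  - "node-1.example.internal"
  - "node-2.example.internal"
  - "node-3.example.internal"

# Only used when bootstrapping a brand-new cluster.
cluster.initial_cluster_manager_nodes: ["node-1", "node-2", "node-3"]

# A coordinating-only node would instead set an empty role list:
# node.roles: []
```

Each node gets its own copy of this file with a different node.name and, if desired, a different role list.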
OpenSearch Dashboards is installed and configured separately from your nodes. The relevant setting in its configuration looks like this:
# The URLs of the OpenSearch instances to use for all your queries.
opensearch.hosts: ["http://localhost:9200"]
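Since opensearch.hosts is a list, it can hold more than one URL. As a rough sketch, assuming three nodes that all belong to the same cluster (the host names are hypothetical), the entry might look like this:

```yaml
# opensearch_dashboards.yml (sketch; host names are hypothetical)
# All URLs listed here are assumed to point at nodes of the same cluster.
opensearch.hosts:
  - "http://node-1.example.internal:9200"
  - "http://node-2.example.internal:9200"
  - "http://node-3.example.internal:9200"
```

Listing several nodes means Dashboards does not depend on a single node being up.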
Will all the ingest network traffic have to route through the manager node? Or will the clients be able to load-balance which node they send the logs to? If I send data to various nodes with the ingest role on the same cluster, will that cause problems?
In the opensearch.hosts configuration, do I put all the cluster nodes, or just the manager node?