OpenSearch multiple master cluster configuration

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Using OpenSearch and Dashboards 2.6

Describe the issue:
I am setting up a single OpenSearch cluster with 3 master (cluster manager) nodes and would appreciate recommendations or suggestions for improvement:

We have two datacentres that are connected to Azure. We plan to set up 2 data nodes and 1 dedicated manager node at each datacentre, plus one dedicated manager node in Azure for quorum.

The flow of logs is Fluent Bit -> Fluentd -> OpenSearch -> OpenSearch Dashboards.

  • How does one deal with the changing IP of the active OpenSearch manager node, given that the destination OpenSearch IP has to be hardcoded in the Fluentd config? My understanding is that when the active manager node fails, another one takes over, but Fluentd would still be sending data to the old manager node’s IP, which creates a problem.

  • Can the cluster function without dedicated coordinating or ingest nodes? Right now, I have a dev setup with just one manager node and 2 data nodes, and it seems to be functioning well.

Relevant Logs or Screenshots:

@Amith Did you have a look at this section in the OpenSearch documentation?

Using master nodes as ingest nodes is not the best approach in my opinion.

@pablo I went through the documentation, but I can’t work out how to specify the OpenSearch IP when there are multiple masters. Below is my sample Fluentd config:

<match Win.*>
  @type opensearch
  host 10.223.16.11 #IP address of single master node
  port 9200
  user admin
  password admin
  scheme https
  index_name winlog
</match>

If more masters are added, I assume the configuration file on OpenSearch master node m1 would look like the one below, but how can I specify the destination IP on the Fluentd side?

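network.host: 0.0.0.0
# Placeholder names; seed hosts must be resolvable hostnames or IP addresses (no spaces)
discovery.seed_hosts: ["node-d1", "node-d2", "node-m1", "node-m2", "node-m3"]
# Entries must match the node.name of each cluster-manager-eligible node
cluster.initial_cluster_manager_nodes: ["node-m1", "node-m2", "node-m3"]
node.roles: [cluster_manager]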

@Amith According to Fluentd’s documentation, you can specify multiple OpenSearch nodes in the opensearch output.
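
As a minimal sketch of that multi-node form, assuming the `hosts` parameter of fluent-plugin-opensearch (a comma-separated `host:port` list used in place of the single `host`/`port` pair). The second and third IP addresses are placeholders for illustration, not addresses from this cluster:

```
<match Win.*>
  @type opensearch
  # Placeholder addresses: list the nodes that should receive HTTP requests
  hosts 10.223.16.11:9200,10.223.16.12:9200,10.223.16.13:9200
  user admin
  password admin
  scheme https
  index_name winlog
</match>
```

With several hosts listed, the output should be able to keep delivering when one of them is unreachable. The listed nodes do not have to include the elected manager: any node that serves the REST API on port 9200 can accept the request and route it within the cluster.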

Alternatively, you can put a reverse proxy in front of the master nodes and use the reverse proxy’s IP address as the output address in Fluentd.

Thanks @pablo, I have decided to use a load balancer/reverse proxy in our setup. I have restructured my cluster to have 8 nodes as shown below. This single cluster will be spread across 2 on-premises datacentres and Azure. Does this look like a good starting point for a cluster? I am planning to add more data/ingest nodes as the need arises.

Also, when specifying the IP addresses for this cluster, do they always have to be the master nodes’ IP addresses? I am referring to the IP addresses that will be used as the backend pool for the second load balancer, where data will be sent from Fluentd. I am assuming these will be the 3 IP addresses of the master nodes, but wanted to clarify…