What is the optimal node role to receive data from a logstash opensearch-output?

mmf · September 8, 2025, 3:29pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 3.0.0

Describe the issue:

Not an issue per se, but I’m trying to understand if there’s any known best-practice regarding which node roles should be the ones to receive data from an opensearch-output logstash output plugin.

The only information I found was this Elastic forum thread, which suggests including the master and coordinator roles in the output host pool, and this Stack Overflow question, which suggests leaving the master role out of the pool so that master nodes only perform the important tasks inherent to their role.

But that leads to a bit of a contradiction. I was wondering if anyone has any advice? As of right now we simply have all nodes in the pool of hosts on the output.

Thank you.

Configuration:

Cluster 1 - 9 Data, 3 Cluster Managers, 3 Coordinators
Cluster 2 - 6 Data, 3 Cluster Managers, 3 Coordinators

stecino · September 9, 2025, 1:19pm

I have a set of coordinator nodes that I send my Logstash data to. Fluentd for some reason always picks the first coordinator node, so I have a load balancer VIP with those coordinator nodes in it. That way, when a request gets to the coordinator nodes, they can do the load balancing between the data nodes. It is the same for clients querying OpenSearch: a separate set of coordinator nodes behind a load balancer VIP with DNS tied to it.

mmf · October 23, 2025, 3:09pm

Hello @stecino,

Thanks for the reply. In my case, our coordinator and master nodes have much lower OS specs than the data nodes. There are also far fewer of them.

Our current output configuration sends to all the nodes in the cluster. It makes sense in theory to send the data to the coordinators so they can fan out to the corresponding nodes as you mentioned, but would we potentially be creating a bottleneck by just sending the output logs to 2 coordinator servers and removing the others, which currently amount to 10 total nodes in the output?

Thank you.

Anthony · October 24, 2025, 12:04pm

@mmf The traffic from Logstash should be directed to the coordinating nodes, whose job is to receive and process incoming requests and then route them to the correct shards on the data nodes. This keeps the data nodes focused on storing and searching efficiently, keeps the cluster managers from processing these requests, and lets operations run smoothly. You should scale the coordinating nodes in accordance with your requirements. Hope this helps.
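
For illustration, a minimal sketch of an output block that lists only the coordinating nodes. The hostnames, port, and index pattern here are assumptions made up for the example, not values from this thread, and security options (user, password, TLS settings) are left out. As far as I know, when `hosts` is given as an array the plugin spreads requests across the listed hosts.

```
output {
  opensearch {
    # Hypothetical hostnames: list only the dedicated coordinating nodes,
    # leaving cluster-manager and data nodes out of the pool.
    hosts => [
      "https://coord-1.example.internal:9200",
      "https://coord-2.example.internal:9200",
      "https://coord-3.example.internal:9200"
    ]
    # Hypothetical daily index pattern; use whatever your pipeline already writes to.
    index => "logs-%{+YYYY.MM.dd}"
  }
}
```

If the coordinating nodes sit behind a load balancer VIP, as described earlier in the thread, `hosts` could instead hold the single VIP address.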

jakabasej5 · October 26, 2025, 10:26am

Your concern about a bottleneck is valid.
