Not an issue per se, but I'm trying to understand whether there is a known best practice for which node roles should receive data from the Logstash OpenSearch output plugin (logstash-output-opensearch).
The only information I found was this Elastic forum thread, which suggests including the master and coordinator roles in the output host pool, and this Stack Overflow question, which suggests leaving the master role out of the pool so that master nodes only perform the important tasks inherent to their role.
Those two suggestions contradict each other, so I was wondering if anyone has any advice. Right now we simply have all nodes in the output's pool of hosts.
I have a set of coordinator nodes that I send my Logstash data to. Fluentd, for some reason, always picks the first coordinator node, so I put those coordinator nodes behind a load balancer VIP. That way, when a request reaches the coordinator nodes, they can balance the load across the data nodes. Clients querying OpenSearch work the same way: a separate set of coordinator nodes sits behind a load balancer VIP with a DNS name tied to it.
Thanks for the reply. In my case, our coordinator and master nodes have much lower specs than the data nodes, and there are far fewer of them as well.
Our current output configuration sends to all the nodes in the cluster. In theory it makes sense to send the data to the coordinators so they can fan out to the corresponding nodes, as you mentioned. But would we potentially be creating a bottleneck by sending the output logs to only 2 coordinator servers and removing the others? The output currently lists 10 nodes in total.
@mmf The traffic from Logstash should be directed to coordinating nodes, whose job is to receive and process incoming traffic and then route it to the correct shards on the data nodes. This ensures that the data nodes stay focused on storing and searching efficiently, that cluster managers are not processing these requests, and that operations run smoothly. You should scale the coordinating nodes in accordance with your requirements. Hope this helps.
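If it helps, here is a minimal Python sketch of how you could build that host list instead of maintaining it by hand. It assumes you have the parsed JSON from the nodes info API (for example `GET _nodes/http`) and that coordinating-only nodes report an empty `roles` list. The function name `coordinating_only_hosts`, the sample node names, and the addresses are all made up for illustration, so treat it as a starting point rather than a definitive approach.

```python
from typing import Any


def coordinating_only_hosts(nodes_info: dict[str, Any], scheme: str = "https") -> list[str]:
    """Return URLs of nodes whose `roles` list is empty (coordinating-only).

    Assumes `http.publish_address` is a plain "host:port" string.
    """
    hosts = []
    for node in nodes_info.get("nodes", {}).values():
        # Any explicit role (data, ingest, cluster_manager, ...) means the
        # node is not coordinating-only, so leave it out of the pool.
        if node.get("roles"):
            continue
        address = node.get("http", {}).get("publish_address")
        if address:
            hosts.append(f"{scheme}://{address}")
    return sorted(hosts)


# Hypothetical, hand-written sample shaped like a nodes info response.
sample = {
    "nodes": {
        "a1": {"name": "manager-1", "roles": ["cluster_manager"],
               "http": {"publish_address": "10.0.0.1:9200"}},
        "b2": {"name": "data-1", "roles": ["data", "ingest"],
               "http": {"publish_address": "10.0.0.11:9200"}},
        "c3": {"name": "coord-1", "roles": [],
               "http": {"publish_address": "10.0.0.21:9200"}},
        "d4": {"name": "coord-2", "roles": [],
               "http": {"publish_address": "10.0.0.22:9200"}},
    }
}

hosts = coordinating_only_hosts(sample)
print(hosts)
```

The resulting list is what would go into the `hosts` setting of the Logstash output, or behind a load balancer VIP as described earlier in the thread.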