How to send logs from Kafka to OpenSearch

mann · October 8, 2021, 1:31pm

I need to understand this: if I send logs through Beats to Kafka, can Kafka then send the logs on to OpenSearch? If that is possible, how do I do it? If not, why not?

searchymcsearchface · October 8, 2021, 1:44pm

I think it’s more typical to also have something like Logstash or Fluentd between Beats/Kafka and OpenSearch. This article describes a setup for ES/Kibana, but you could easily substitute OpenSearch:

mann · October 8, 2021, 2:07pm

@searchymcsearchface Thanks for the reply. Why are you suggesting Logstash between Kafka and OpenSearch? Can we eliminate it?

SearchingForCam · October 10, 2021, 8:57am

The reason is that Kafka can spool your messages on their way to OpenSearch. If you need to take OS offline for maintenance, Kafka can spool the logs until OS is back (among other reasons).
Our configuration is:
Beats and Syslog Devices > Logstash-Ingestors > Kafka > Logstash-Consumers > OpenSearch (OS).
For doco, check the Logstash doco for the Kafka input and output plugins.
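To make the spooling point concrete, here is a minimal in-memory sketch in Python. It is not a real Kafka or OpenSearch client: `InMemoryTopic`, `FlakySink`, and `drain` are hypothetical stand-ins invented for illustration. It only shows the idea that a consumer which commits its offset after a successful write loses nothing while the sink is offline, and picks up where it left off afterwards.

```python
class InMemoryTopic:
    """Hypothetical stand-in for one Kafka partition: an append-only log
    plus the consumer's committed offset."""

    def __init__(self):
        self.log = []
        self.committed = 0

    def append(self, message):
        self.log.append(message)

    def poll(self, max_records):
        # Always read from the last committed position.
        return self.log[self.committed:self.committed + max_records]

    def commit(self, count):
        self.committed += count


class FlakySink:
    """Hypothetical stand-in for OpenSearch that can be taken offline."""

    def __init__(self):
        self.online = True
        self.indexed = []

    def bulk_index(self, batch):
        if not self.online:
            raise ConnectionError("sink is offline")
        self.indexed.extend(batch)


def drain(topic, sink, batch_size=2):
    """Index batches until caught up. Returns False if the sink failed.

    The offset is committed only after a successful write, so a failed
    batch stays in the topic and is read again on the next attempt.
    """
    while True:
        batch = topic.poll(batch_size)
        if not batch:
            return True
        try:
            sink.bulk_index(batch)
        except ConnectionError:
            return False
        topic.commit(len(batch))


if __name__ == "__main__":
    topic, sink = InMemoryTopic(), FlakySink()
    for i in range(5):
        topic.append(f"log line {i}")

    sink.online = False          # maintenance window
    print(drain(topic, sink))    # nothing indexed, nothing committed
    sink.online = True           # sink is back
    print(drain(topic, sink))    # backlog is delivered in order
    print(len(sink.indexed), topic.committed)
```

With a real broker the same pattern means disabling auto-commit and committing offsets only once the bulk request has succeeded; how long the backlog can grow is then bounded by the topic's retention settings.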

amitai · October 10, 2021, 9:13am

Does this relate to this thread?

@kris looks like we are going to need help forking that like we did the clients. It also seems like there are real use cases for this in the community :)

mann · October 10, 2021, 10:18am

Thanks for the reply. I am a bit confused about why exactly Logstash is required in this case. Kafka can also work as an ETL tool like Logstash, so we could completely eliminate Logstash and use only Kafka, right?

If not, what functionality does Logstash provide that Kafka cannot?

amitai · October 10, 2021, 4:41pm

Technically you could use an existing Kafka Connect sink (I don’t know of one for OpenSearch) or write a Kafka consumer that makes the changes you require. At logz.io we have a logging microservice that is the Kafka consumer. We do all our ETL there using, among other internal logic, our open-source library for parsing logs: Sawmill.
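As a rough illustration of the "write a Kafka consumer" option, here is a minimal Python sketch of the transform step only. It is not how logz.io or Sawmill do it. It assumes each Kafka message value is either a JSON object or a plain text line, and the function names and the `logs-demo` index name are made up for the example. The output is a newline-delimited body in the format that the OpenSearch `_bulk` endpoint accepts: one action line followed by one document line per event.

```python
import json
from datetime import datetime, timezone


def parse_log_line(raw: bytes) -> dict:
    """Turn one Kafka message value into a document (the "T" in ETL)."""
    text = raw.decode("utf-8", errors="replace").strip()
    try:
        doc = json.loads(text)
        if not isinstance(doc, dict):
            # Valid JSON but not an object: keep it as a plain message.
            doc = {"message": text}
    except json.JSONDecodeError:
        doc = {"message": text}
    # Add a timestamp only when the event does not already carry one.
    doc.setdefault("@timestamp", datetime.now(timezone.utc).isoformat())
    return doc


def build_bulk_body(values, index="logs-demo"):
    """Build an NDJSON body for a bulk index request.

    `index` is a hypothetical index name used for illustration.
    """
    lines = []
    for raw in values:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(parse_log_line(raw)))
    # A bulk body must end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    batch = [
        b'{"level": "info", "message": "service started"}',
        b"plain text line from a syslog device",
    ]
    print(build_bulk_body(batch), end="")
```

A real consumer would read the batch from Kafka with a client library and send the body to `/_bulk` over HTTP. Both parts are left out here so the sketch runs without a broker or a cluster.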

mann · October 11, 2021, 6:00am

@amitai @searchymcsearchface

I see. So, as you are saying, we can eliminate Logstash, right? For our use case we are building a streaming data architecture, and I am worried about one thing. If we use Logstash as in this architecture:
beats → kafka → logstash → opensearch, will Logstash be able to handle backpressure or not?

Also, if we eliminate Logstash from the streaming data architecture (beats → kafka → opensearch), can we send logs from Kafka to OpenSearch directly, and can we still handle backpressure and all the functionality that Logstash provided as an ETL tool?

So basically I want to eliminate Logstash from the streaming data architecture and handle things with Kafka only, but I need a conclusion on whether I can eliminate Logstash or not, and why.

amitai · October 11, 2021, 8:57am

There is nothing special about ingesting with Logstash. You would need a connector/consumer for ETL. This can be done in many different ways and is highly dependent on your use case.

mann · October 11, 2021, 9:06am

@amitai, thanks for the reply.
So, can Kafka push data directly into OpenSearch?

amitai · October 11, 2021, 9:13am

This I don’t know. In theory, yes, since there is a Kafka Connect sink for Elasticsearch. For OpenSearch you may need to write a new one. This is probably why most opt to use Logstash instead. The link I provided to the thread suggests a connector may be a popular request.
Are you managing Kafka on your own or using Confluent?
A Kafka connector or Logstash or whatever are always just instances that do the ETL. You will always have something there, as OpenSearch doesn’t poll data on its own :)

mann · October 11, 2021, 9:57am

@amitai Thanks for the quick reply.

So far we have not decided, but we will most likely lean towards managing Kafka on our own. I think it’s clear: for now there is no ready-made Kafka connector to use with OpenSearch, which is why Logstash/Fluentd make things easier. Otherwise there is always the option of creating our own Kafka connector for OpenSearch, like the one that exists for Elasticsearch.

kris · October 11, 2021, 10:33pm

great input @amitai - let’s see what can be done

mann · October 12, 2021, 5:47am

@amitai @searchymcsearchface
Can you give a high-level idea of when to use Logstash and when to use Fluentd? Which is the better approach to go with, and why?

nickytd · October 12, 2021, 11:15am

Here is a Helm chart that supports the case you are looking at and can be used for experimentation purposes: K8S Logging stack with Opensearch (featuring kafka and fluentd)

mann · October 12, 2021, 11:39am

Thanks. Speaking of Elastic Beats: does Fluentd fit with Elastic Beats as the lightweight shipper or not?

As per our architecture there are two approaches, and we want to decide on one of them.

elastic beats → kafka → logstash → opensearch.
elastic beats → kafka → fluentD → opensearch.

Which one is more suitable?

searchymcsearchface · October 12, 2021, 4:25pm

If you aren’t already deeply involved in beats + logstash, I would look elsewhere for a few reasons:

Beats seems to be winding down to a degree and is questionable overall with OpenSearch. A) The originators of Beats, Monica and Tudor, have left Elastic to start their own (unrelated) company and there is a non-OSS technology filling the same niche within the proprietary ES stack. So, while this is conjecture on my part, I wouldn’t personally bet on a vibrant future for Beats. B) Currently, the most recent version of Beats has an explicit ES check which blocks OpenSearch. Eliminating this would require an extensive forking process that no one has the appetite for at the moment. The versioning scheme for Beats is in lock step with ES, so there is no telling what will happen or if old versions (compatible with OpenSearch) will be patched.
Logstash is a useful tool and there is a path forward with it (OpenSearch output plugin). However, it’s written in Ruby (a fine language, but not always known for high performance and a bit of an outlier in this part of the stack) and it has the same idiosyncratic versioning that the rest of ES has (breaking changes can and do come in minors and it’s driven by the ES release cycle).

If I were starting out greenfield, I would pick Fluent Bit in place of Beats as it’s lightweight and a CNCF project, so it’s not going anywhere. As an aggregator, think about Fluentd (CNCF, but Ruby) or Data Prepper (under the OpenSearch umbrella, written in Java).

mann · October 13, 2021, 11:35am

@searchymcsearchface Thanks for sharing your point of view. That’s really helpful.

It seems Fluent Bit can work as an alternative to Elastic Beats, but I need some more understanding. Elastic Beats has several beats, such as Filebeat, Metricbeat, Packetbeat, Auditbeat, Heartbeat, and Functionbeat. Is Fluent Bit capable of covering everything that this variety of beats handles? Is it really a complete replacement for all these types of beats?

I just want to know whether Fluent Bit can provide all the features and functionality that Elastic Beats provides. What would be the major decision-making point for choosing Elastic Beats over Fluent Bit? If there isn’t one, then Fluent Bit with Fluentd seems to be the best fit here.

searchymcsearchface · October 13, 2021, 2:07pm

Here are all the inputs for Fluent Bit (it’s pretty comprehensive):

I think the only thing you don’t really get out of the box with Fluent Bit that you do get with the Beats family is built-in dashboards. You can make those dashboards yourself, of course.

mann · October 14, 2021, 7:18am

@searchymcsearchface Thanks again.

Also, is it possible to support third-party data sources in Fluent Bit, as is possible in Elastic Beats?

I am listing a few of them here:
Data sources - Splunk, G Suite, Office 365, GuardDuty, CloudTrail, Azure, Okta, Threat Intel, etc.

How should we approach support for these types of data sources, and many others, in Fluent Bit?
