I’m having trouble getting logs from our servers into OpenSearch. I used to use Logstash, but the logs would arrive one or two weeks after they were generated, so I switched to Filebeat 7.12.1 and the logs started arriving in real time. Recently, however, I stopped getting logs from 3 servers, and all three stopped reaching OpenSearch at the same time. (2 of them still use Logstash, 1 uses Filebeat.) Restarting the Logstash service worked for one server, but the problem persists on another server (I use Filebeat to ship data from that one). It is strange that they all stopped sending(?) data at the same time.
I’ve checked the health of the cluster and the nodes, and they all show green status. OpenSearch storage is not full; there is enough space for new logs. When I run “GET _cat/nodes?v”, it shows RAM usage at 98%.
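A side note on the 98% figure: as far as I know, the RAM column in _cat/nodes includes the operating system's file cache, so a high value there is common and not necessarily a problem, while heap usage is usually the more telling number. Below is a rough Python sketch that flags nodes by heap instead. It assumes you have already fetched “GET _cat/nodes?h=name,heap.percent,ram.percent&format=json” and parsed the JSON; the function name, the threshold of 85, and the sample rows are made up for illustration.

```python
def flag_heap_pressure(nodes, heap_threshold=85):
    """Return (name, heap%) for nodes at or above the threshold.

    `nodes` is the parsed JSON list from _cat/nodes with format=json,
    where every value arrives as a string.
    The threshold of 85 is an arbitrary illustrative value.
    """
    flagged = []
    for node in nodes:
        heap = int(node["heap.percent"])
        if heap >= heap_threshold:
            flagged.append((node["name"], heap))
    return flagged


# Invented sample rows, not taken from the cluster in this thread.
sample = [
    {"name": "node-1", "heap.percent": "91", "ram.percent": "98"},
    {"name": "node-2", "heap.percent": "42", "ram.percent": "98"},
]
print(flag_heap_pressure(sample))
```

In this sample both nodes report 98% RAM, but only node-1 would be flagged, because only its heap is high.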
Where could the problem be? I’ve tried all the solutions I could find online.
I do not receive any error messages.
Hi. Please show us your Logstash, OpenSearch, and Filebeat configurations. You can replace the real IPs, but keep the logic intact.
We found two problems:
- We can’t open the log.txt files because Filebeat keeps them open. (On the 1st server, the 4th log.txt file can’t be opened.)
For this problem I checked the Elasticsearch guide page and my “filebeat.yml” file. The setting close_timeout: 30 is already there.
- Filebeat tries to read from the last “log.txt” file while the system is trying to delete it.
Our rotation works like this: when the first log.txt file is full, it is renamed to log1.txt, and so on down the chain. The last file is deleted, and a new log.txt file is created.
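To make the rotation scheme above concrete, here is a small Python sketch that reproduces it in a temporary directory. It is only an illustration under assumptions: the function names and the number of retained files (4) are hypothetical, and the real rotation is done by the logging system, not by this code. It also shows why a harvester can end up holding a rotated file: as far as I know, Filebeat identifies files by inode and device by default rather than by name, and on a typical Linux filesystem a rename keeps the inode.

```python
import os
import tempfile


def rotate(directory, keep=4):
    """Rotate log.txt -> log1.txt -> ... and start a fresh log.txt.

    `keep` (total number of files retained) is an assumed value.
    """
    def path(i):
        return os.path.join(directory, "log.txt" if i == 0 else f"log{i}.txt")

    # Delete the oldest file first, as in the scheme described above.
    oldest = path(keep - 1)
    if os.path.exists(oldest):
        os.remove(oldest)
    # Shift every remaining file one slot down the chain.
    for i in range(keep - 2, -1, -1):
        if os.path.exists(path(i)):
            os.replace(path(i), path(i + 1))
    # Create the new, empty active file.
    open(path(0), "w").close()


def demo():
    """Return (rotated file kept its inode, new log.txt got a new inode)."""
    with tempfile.TemporaryDirectory() as d:
        active = os.path.join(d, "log.txt")
        with open(active, "w") as f:
            f.write("first generation\n")
        inode_before = os.stat(active).st_ino
        rotate(d)
        rotated = os.path.join(d, "log1.txt")
        return (inode_before == os.stat(rotated).st_ino,
                inode_before != os.stat(active).st_ino)


print(demo())
```

If the same holds on your servers, the file Filebeat still has open after a rotation is the renamed one, which is where options such as close_renamed and close_removed in the log input become relevant.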
"filebeat.yml"
I will check the logstash.yml file too (I will edit this post after I check). But it still seems odd that all the servers stopped sending data to OpenSearch at the same time, and that restarting Logstash worked for one server while restarting Filebeat on the other server did not.
----- Update: I checked Filebeat’s logs and found this error:
“pipeline/output.go:180 failed to publish events: temporary bulk send failure”
I checked the cluster health and it is good. I checked filebeat.yml and it looks okay. (We do not have a hosts: [“your_opensearch_server:9200”] line in our yml file, but it used to work fine with this same file.)
Can you help me with this problem?
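One way to dig into the “temporary bulk send failure” message: as far as I understand it, Filebeat logs this when part of a bulk request was not accepted and will be retried, and the actual reason sits in the per-item errors of the _bulk response. The Python sketch below counts those errors by status and type. It assumes you have captured a _bulk response body as parsed JSON (for example by replaying a small bulk request by hand); the function name and the sample response are invented for illustration, and the error type strings your cluster returns may differ.

```python
from collections import Counter


def summarize_bulk_errors(response):
    """Count failed items in a parsed _bulk response by (status, error type)."""
    counts = Counter()
    if not response.get("errors"):
        return counts  # The whole request succeeded.
    for item in response.get("items", []):
        # Each item has a single key naming the action (index, create, ...).
        for result in item.values():
            error = result.get("error")
            if error:
                counts[(result.get("status"), error.get("type"))] += 1
    return counts


# Invented sample response, not a real log from this thread.
sample_response = {
    "took": 12,
    "errors": True,
    "items": [
        {"index": {"_index": "logs", "status": 201}},
        {"index": {"_index": "logs", "status": 429,
                   "error": {"type": "rejected_execution_exception",
                             "reason": "illustrative only"}}},
        {"index": {"_index": "logs", "status": 429,
                   "error": {"type": "rejected_execution_exception",
                             "reason": "illustrative only"}}},
        {"index": {"_index": "logs", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "illustrative only"}}},
    ],
}

for (status, error_type), count in summarize_bulk_errors(sample_response).most_common():
    print(status, error_type, count)
```

A pile of 429s would point at the cluster pushing back on indexing load, while 400s would point at the documents themselves, such as mapping conflicts.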