We deployed ODFE 1.7.0 (ES 7.6.1) onto a Kubernetes cluster for log monitoring. Logs are collected using Fluent Bit and sent to Elasticsearch. Everything seems to be working fine. But one of the admins recently contacted me to ask about a large number (700,000+ in 10 hours) of org.elasticsearch.index.IndexNotFoundException and sun.nio.fs.UnixException exceptions being detected by Dynatrace. Looking into it, these exceptions happen all the time. I suspect they are triggered with every "chunk" sent from Fluent Bit.
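For reference, our Fluent Bit output section looks roughly like this (host, credentials, and the target index name are placeholders, not our actual values):

```ini
[OUTPUT]
    Name          es
    Match         *
    Host          elasticsearch.logging.svc
    Port          9200
    Index         fluentbit-dummy
    tls           On
    HTTP_User     admin
    HTTP_Passwd   ********
```

Every flushed chunk becomes one `_bulk` request against that configured index, which is why I suspect the exception rate tracks the chunk rate.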
I confirmed that Elasticsearch appears to be working fine and log messages are flowing in as expected, and there are no ERROR (or WARNING) messages in the Elasticsearch log file indicating a problem. But I'm being asked to clarify why the exceptions are occurring.
The IndexNotFoundException exceptions mention a specific index name that I recognize as the index Fluent Bit is sending documents to. However, we have an ingest pipeline that intercepts the incoming load and redirects documents to different indexes based on fields within the message. So no documents are ever written to the dummy index, and it is never created. But I'm confused about why Elasticsearch is attempting to verify that it exists and, since there seems to be a tight correlation between the Elasticsearch exception and the file-system exception, why it is (apparently) making calls to the file system to check for the index.
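Our pipeline is essentially a `set` processor that rewrites `_index` before indexing, something like the following (pipeline name and routing field are illustrative, not our exact definition):

```
PUT _ingest/pipeline/route-logs
{
  "description": "Redirect incoming documents to a per-namespace index",
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "logs-{{kubernetes.namespace_name}}"
      }
    }
  ]
}
```

Because `_index` is rewritten before the document is routed to a shard, the index named in the Fluent Bit config only ever appears as the *requested* target, never as a real index on disk.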
I eventually went ahead and created the "dummy" index, assuming that this would eliminate the exceptions. But I was wrong: it had no meaningful effect on the number of exceptions.
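(For completeness, I created it as an empty index with default settings; the name here is a placeholder for the actual one in the exception messages:)

```
PUT fluentbit-dummy
```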
I thought this was an issue limited to core Elasticsearch, but looking over the stack trace (or rather the pseudo-stack trace I obtained from Dynatrace), I see several ODFE security classes mentioned.
In any case, can anyone clarify why these exceptions are being thrown when nothing is actually being sent to this “dummy” index? Can we assume these are harmless or do they indicate a problem?