Configuration recommendations for dynamic JSON logs: shard failures and field limit errors

I’m currently running a cluster with two data nodes and one master node (OpenSearch v2.9.0) that ingests JSON-formatted custom application logs from various microservices. The structure of these logs is highly dynamic, which leads to frequent shard failures. So far the only workaround has been to purge the affected index, but that isn’t feasible long term, since we’re required to retain the data for six months.

Additionally, the custom logs contain nested JSON structures with multiple key layers that need to be parsed and extracted. I’ve set up a custom pipeline to handle this extraction, but I’m encountering frequent “maximum fields exceeded” errors. Currently, I’m using Logstash for processing, but I’m open to alternative approaches or optimizations.
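
To give a sense of the shape (the keys and values below are placeholders, not our real schema), an event looks roughly like this, with several of the nested key names generated dynamically per request:

```json
{
  "service": "orders-api",
  "timestamp": "2024-11-04T12:00:00Z",
  "context": {
    "request": {
      "headers": { "x-correlation-id": "abc-123" },
      "params": { "customer_id": "42", "region": "us-east-1" }
    },
    "metrics": {
      "db.query.orders_by_customer.duration_ms": 18,
      "db.query.inventory_lookup.duration_ms": 7
    }
  }
}
```

Since dynamic mapping turns every unique key into a new mapped field, key names like the metric entries above are what keep pushing the index toward the field limit.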

Any guidance on enhancing index stability and managing field count limitations would be greatly appreciated.

Hi @jsamuel12,

Have you considered increasing index.mapping.total_fields.limit?
More info here: Mappings and field types - OpenSearch Documentation
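
As a rough sketch, you can raise it dynamically on an existing index (the index name and value below are just examples):

```json
PUT application-logs-000001/_settings
{
  "index.mapping.total_fields.limit": 2000
}
```

For rolling log indices you would typically put the same setting in the index template as well, so that newly created indices pick it up.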

Best,
mj

@Mantas,

I’ve increased the total fields limit to 6K (as of 11/4/2024), but this feels like a dirty fix that will rear its ugly head in the future. I’ve asked the development team to improve the structure of their logging, since this issue only occurs with our custom application.
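
For anyone who lands here later, you can confirm the value that is actually in effect on an index with (index name is a placeholder):

```json
GET application-logs-000001/_settings/index.mapping.total_fields.limit
```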