I am analyzing our Open Distro cluster with the Performance Analyzer dashboards.
I am seeing that 99.99% of the write throughput and about 30% of the CPU are used for merging.
I tried to disable index merging on the warm node, but the numbers stayed the same.
Any idea why Open Distro spends so many resources on merging?
opendistro version: 1.12
With the default configuration, if you write at full throttle, it’s normal that most CPU time goes to merging.
You can tweak the merge policy to control the balance between how expensive writes are on the one hand, and how expensive full-text search is and how large (i.e. un-compacted) the index becomes on the other. Here’s an oldie but goldie on the topic (it’s about Solr, but both use Lucene, so the same applies): Solr: Optimize Is (Not) Bad for You – Video & Slides
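To make the merge-policy tweak above a bit more concrete, here is a minimal Python sketch. It assumes the index-level settings `index.merge.policy.segments_per_tier` and `index.merge.policy.max_merge_at_once` from the Elasticsearch 7.x line that Open Distro 1.12 is built on; check them against your version before relying on them. The function names, the example values, and the URL and index name are hypothetical, and the sketch does not handle the authentication or TLS that an Open Distro cluster with the security plugin enabled would need.

```python
import json
import urllib.request


def build_merge_settings(segments_per_tier=20, max_merge_at_once=20):
    """Build a settings body that trades merge work for more segments.

    A higher segments_per_tier means fewer merges (cheaper writes) but more
    segments per shard (more expensive searches). The values are illustrative.
    """
    # Conservative sanity check chosen for this sketch, not a documented rule.
    if max_merge_at_once > segments_per_tier:
        raise ValueError("max_merge_at_once should not exceed segments_per_tier")
    return {
        "index.merge.policy.segments_per_tier": segments_per_tier,
        "index.merge.policy.max_merge_at_once": max_merge_at_once,
    }


def apply_settings(base_url, index, body):
    """PUT the body to <base_url>/<index>/_settings (no auth or TLS setup)."""
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    # Only print the payload; call apply_settings() yourself once it looks right.
    print(json.dumps(build_merge_settings(), indent=2))
```

Something like `apply_settings("http://localhost:9200", "my-index", build_merge_settings())` would then send it, with a hypothetical host and index name. It is worth changing one value at a time and watching both merge time and search latency afterwards.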
Thank you for your answer.
I also saw the video and learned some new important things.
I didn’t mention that I also tried increasing the index refresh interval from 30s to 60s, but didn’t see a noticeable change.
I appreciate your help.
You’re welcome. With the refresh interval you get diminishing returns, so beyond a certain point it’s normal not to see any improvement. 30s is already quite high, even for high-ingestion use cases.
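A rough way to see the diminishing returns is to count refreshes per hour. The Python sketch below is only back-of-the-envelope arithmetic: it assumes every refresh fires on schedule and costs about the same, which is a simplification, and the function names are made up for illustration.

```python
def refreshes_per_hour(interval_seconds):
    """Scheduled refreshes per hour for a given refresh interval."""
    return 3600 / interval_seconds


def refreshes_saved(old_interval, new_interval):
    """Refreshes per hour avoided by moving from old_interval to new_interval."""
    return refreshes_per_hour(old_interval) - refreshes_per_hour(new_interval)


if __name__ == "__main__":
    for old, new in [(1, 30), (30, 60)]:
        print(f"{old}s -> {new}s saves {refreshes_saved(old, new):.0f} refreshes/hour")
```

Going from 1s to 30s removes 3480 refreshes per hour, while going from 30s to 60s removes only 60, which fits with not seeing a noticeable change.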