We are trying to index a large number of files from a file share. For this, 100 documents are indexed per batch and run through a pipeline that combines ingest-attachment with text embedding for vector embeddings. This works well, with one issue: memory usage increases constantly until the OpenSearch process gets killed by the system.
I already tried increasing RAM and heap space, but this does not solve the issue:
after 19,400 indexed documents, the process gets killed because it uses too much memory, regardless of the total available memory or the configured heap size.
Tested with:
16GB total memory, 4GB heap
32GB total memory, 16GB heap
Both configurations break after exactly the same number of indexed batches: 194 of 451.
Sadly, disabling the refresh interval did not help; the process still runs out of memory. But I added additional logging, and it looks like the non-heap memory keeps growing until it runs out.
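For anyone who wants to collect the same kind of numbers, here is a minimal sketch that reads heap and non-heap usage from the node stats API (`GET _nodes/stats/jvm`). The base URL and the absence of authentication are assumptions, and this is not the logging used in this thread. Note that these counters only cover memory the JVM tracks itself, so the resident size reported by the operating system can be higher.

```python
import json
import urllib.request

MIB = 1024 * 1024


def summarize_jvm_memory(stats: dict) -> dict:
    """Return heap and non-heap usage in MiB, keyed by node name.

    `stats` is the parsed JSON body of GET _nodes/stats/jvm.
    """
    summary = {}
    for node in stats.get("nodes", {}).values():
        mem = node["jvm"]["mem"]
        summary[node["name"]] = {
            "heap_used_mib": mem["heap_used_in_bytes"] // MIB,
            "non_heap_used_mib": mem["non_heap_used_in_bytes"] // MIB,
        }
    return summary


def fetch_jvm_stats(base_url: str = "http://localhost:9200") -> dict:
    # Assumption: plain HTTP without authentication; adapt for a secured cluster.
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/jvm") as resp:
        return json.load(resp)
```

Calling `summarize_jvm_memory(fetch_jvm_stats())` after every batch and logging the result should show whether the non-heap figure climbs while the heap figure stays flat.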
When I tested with a small batch during the day, the memory usage also increased and didn't go back down until I restarted OpenSearch. We could add monitoring and restart OpenSearch every time memory gets low, but that can only be a temporary workaround. I'll investigate further whether the embedding pipeline causes this, but any help is highly appreciated.
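The monitor-and-restart workaround could be sketched roughly as follows. This assumes a Linux host with `/proc/meminfo`, a systemd unit named `opensearch`, and an arbitrary 2 GiB threshold; all three are placeholders rather than settings taken from this setup.

```python
import subprocess
import time


def parse_mem_available_kib(meminfo_text: str) -> int:
    """Extract the MemAvailable value (in KiB) from /proc/meminfo content."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def should_restart(available_kib: int, threshold_kib: int) -> bool:
    """True when available memory has dropped below the threshold."""
    return available_kib < threshold_kib


def watch(threshold_kib: int = 2 * 1024 * 1024,
          interval_s: float = 30.0,
          service: str = "opensearch") -> None:
    """Poll available memory forever and restart the service when it runs low."""
    while True:
        with open("/proc/meminfo") as f:
            available = parse_mem_available_kib(f.read())
        if should_restart(available, threshold_kib):
            # Assumption: the script runs with enough privileges for systemctl.
            subprocess.run(["systemctl", "restart", service], check=True)
        time.sleep(interval_s)
```

A restart in the middle of a bulk request will make that request fail, so the indexing job would need to retry the affected batch.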
So I have now tested a small subset of files, with and without the vector-embedding pipeline. It seems to me that the vector-embedding pipeline might be what causes the increasing memory usage. If I disable it, heap memory usage is similar, but overall memory usage does not increase.
The pipeline in question takes base64-encoded data and first runs it through the ingest-attachment plugin, then chunks the text and runs the resulting text chunks through the text-embedding step. For the test above I only excluded the text-embedding part, and memory stayed fine.
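For readers who want to reproduce the comparison, a pipeline of this shape might look roughly like the sketch below. It builds the request body for `PUT _ingest/pipeline/<name>` with or without the embedding step. The field names, chunking parameters, and model ID are placeholders and not the configuration from this thread; in particular, whether `text_chunking` accepts the dotted source path `attachment.content` is an unverified assumption.

```python
import json
from typing import Optional


def build_pipeline(model_id: Optional[str]) -> dict:
    """Build an ingest pipeline body; pass model_id=None to skip embedding."""
    processors = [
        # ingest-attachment: extract text from the base64-encoded "data" field.
        {"attachment": {"field": "data", "target_field": "attachment"}},
        # Split the extracted text into chunks (parameter values are illustrative).
        {"text_chunking": {
            "algorithm": {"fixed_token_length": {
                "token_limit": 384,
                "overlap_rate": 0.2,
                "tokenizer": "standard",
            }},
            # Assumption: dotted source path; adjust to the real document layout.
            "field_map": {"attachment.content": "chunks"},
        }},
    ]
    if model_id is not None:
        # Embed every chunk with the deployed model.
        processors.append({"text_embedding": {
            "model_id": model_id,
            "field_map": {"chunks": "chunk_embeddings"},
        }})
    return {"description": "attachment + chunking (+ embedding)",
            "processors": processors}


if __name__ == "__main__":
    # "my-model-id" is a hypothetical placeholder.
    print(json.dumps(build_pipeline("my-model-id"), indent=2))
```

Indexing the same subset once through `build_pipeline(None)` and once through the full variant isolates the embedding step as the only difference between the two runs.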
No, sadly I was not able to test whether the issue also exists in a newer version, because we are stuck with OpenSearch 2.15, which is the version available in SLES15, and that's what our server team provides me with. Therefore I am checking memory during indexing and restarting the server if needed…
If you find a solution, I would be grateful if you can post it here. Thanks!
We are going to experiment with tweaking the GC settings. We are using OpenSearch 2.16. Hopefully the maintainers of the plugin can reproduce this, and an issue can be created on GitHub.