How to calculate memory consumption when using the Lucene Engine for KNN vectors?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
AWS OpenSearch version 2.11

Describe the issue:
In our initial implementation of semantic search, we stuck with the default property mappings for storing KNN vectors. We quickly hit memory limits with the KNN circuit breaker triggering, and had to significantly increase the size of our data node instances. We have been tracking memory usage using the API /_plugins/_knn/stats?pretty.

We recently switched from nmslib to the Lucene engine to enable “Efficient Filtering,” and found that the memory usage stats are no longer populated. An existing defect has already been logged for this: [BUG] KNN stats empty on cluster · Issue #1279 · opensearch-project/k-NN · GitHub.

I have also come across an article that suggests that when using the Lucene engine, the graph structure and the actual vectors are stored in separate segments, and only the graph structure is loaded into memory, not the vectors. If this is the case, it could alleviate our concerns over memory consumption and potentially save money on data node instances. Can someone confirm if this is indeed the case?

Configuration:

Relevant Logs or Screenshots:

For Lucene, the KNN stats API doesn’t work. Both the graphs and the vectors are loaded into RAM outside the JVM heap, and that loading is controlled by the operating system via the mmap system call. So, at least for now, there is no way to see how much memory Lucene HNSW is using at a given point in time.

That said, the memory Lucene needs to run HNSW remains the same, and is given by the formula in: k-NN index - OpenSearch Documentation
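For reference, the HNSW estimate on that docs page is 1.1 * (4 * dimension + 8 * M) bytes per vector. Here is a minimal Python sketch of that arithmetic. The function name and the default M = 16 are my own choices for illustration, and the result covers a single copy of the data, so multiply it by (1 + number of replicas) if you use replicas.

```python
def hnsw_memory_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    """Estimate HNSW memory in bytes for one copy of the data.

    Uses the estimate from the OpenSearch k-NN index docs:
    1.1 * (4 * dimension + 8 * M) bytes per vector.
    """
    bytes_per_vector = 1.1 * (4 * dimension + 8 * m)
    return bytes_per_vector * num_vectors


# Example: 1 million vectors, dimension 256, M = 16.
estimate = hnsw_memory_bytes(1_000_000, 256, 16)
print(f"{estimate / 1e9:.3f} GB")  # prints "1.267 GB"
```

Treat the output as a sizing estimate, not a measurement of what is actually resident in memory.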

So a rule of thumb is to make sure your nodes have enough RAM (excluding the JVM heap) to hold these graphs and vectors. However, because the operating system also needs to keep other files in RAM, the graph files may get paged out of memory, which can increase latency.
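A rough way to apply this rule of thumb is sketched below in Python. Everything here is an assumption for illustration: the function name, the idea of spreading the data evenly across data nodes, and the `cache_fraction` parameter (the share of off-heap RAM you are willing to count on for graphs and vectors, since the OS also caches other files). `required_bytes` should be your total estimate, including replicas.

```python
GIB = 1024 ** 3


def fits_off_heap(
    required_bytes: float,
    node_ram_bytes: float,
    jvm_heap_bytes: float,
    num_data_nodes: int = 1,
    cache_fraction: float = 0.5,  # assumed share of off-heap RAM, tune for your cluster
) -> bool:
    """Return True if the estimated graphs + vectors fit in off-heap RAM."""
    # RAM left on each node once the JVM heap is taken out.
    off_heap_per_node = node_ram_bytes - jvm_heap_bytes
    # Only count part of it, because the OS page cache holds other files too.
    usable = off_heap_per_node * cache_fraction * num_data_nodes
    return required_bytes <= usable


# Example: 3 data nodes with 16 GiB RAM and an 8 GiB heap each, 10 GB of vectors.
print(fits_off_heap(10e9, 16 * GIB, 8 * GIB, num_data_nodes=3))
```

If this returns False, expect more page cache misses and higher query latency, not a hard failure.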

This is where engines like nmslib and faiss are a little better: they ensure that the graph files are always loaded in memory.
