Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
AWS OpenSearch version 2.11
Describe the issue:
In our initial implementation of semantic search, we stuck with the default property mappings for storing KNN vectors. We quickly hit memory limits, with the KNN circuit breaker triggering, and had to significantly increase the size of our data node instances. We have been tracking memory usage with the /_plugins/_knn/stats?pretty API.
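For context, this is roughly how we read the stats response. It is a minimal sketch that assumes the documented response shape: a top-level circuit_breaker_triggered flag and a per-node graph_memory_usage value reported in kilobytes. The function name and the sample payload are made up for illustration and are not output from our cluster.

```python
from typing import Any, Dict


def summarize_knn_memory(stats: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize native graph memory from a k-NN stats response body.

    Assumes each entry under "nodes" carries "graph_memory_usage" in KB.
    Nodes that omit the field are counted as 0.
    """
    per_node_kb = {
        node_id: node.get("graph_memory_usage", 0)
        for node_id, node in stats.get("nodes", {}).items()
    }
    return {
        "circuit_breaker_triggered": stats.get("circuit_breaker_triggered", False),
        "per_node_kb": per_node_kb,
        "total_kb": sum(per_node_kb.values()),
    }


# Illustrative payload only (hypothetical node IDs and values).
sample = {
    "circuit_breaker_triggered": False,
    "nodes": {
        "node-a": {"graph_memory_usage": 2048, "cache_capacity_reached": False},
        "node-b": {"graph_memory_usage": 1024, "cache_capacity_reached": False},
        "node-c": {},  # a node reporting no graph memory at all
    },
}

summary = summarize_knn_memory(sample)
print(summary)
```

In practice the dict would come from the JSON body of GET /_plugins/_knn/stats, fetched with whatever HTTP client is already in use.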
We recently switched from nmslib to the Lucene engine to enable “Efficient Filtering,” and found that the memory usage stats are no longer populated. An existing defect has already been logged for this: [BUG] KNN stats empty on cluster · Issue #1279 · opensearch-project/k-NN · GitHub.
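To make the change concrete, here is a sketch of the kind of mapping and filtered query involved. The field names (embedding, category), the dimension, and the HNSW parameter values are placeholders rather than our real configuration; the request structure follows the k-NN plugin's knn_vector mapping and the filter clause nested inside the knn query.

```python
from typing import Any, Dict, List


def lucene_knn_index_body(field: str, dimension: int) -> Dict[str, Any]:
    """Index body with a knn_vector field backed by the Lucene engine."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",  # previously "nmslib"
                        "space_type": "l2",
                        # Placeholder values, not tuned recommendations.
                        "parameters": {"m": 16, "ef_construction": 128},
                    },
                }
            }
        },
    }


def filtered_knn_query(
    field: str, vector: List[float], k: int, filter_clause: Dict[str, Any]
) -> Dict[str, Any]:
    """k-NN query with the filter placed inside the knn clause."""
    return {
        "size": k,
        "query": {
            "knn": {field: {"vector": vector, "k": k, "filter": filter_clause}}
        },
    }


index_body = lucene_knn_index_body("embedding", 384)
query_body = filtered_knn_query(
    "embedding", [0.0] * 384, 10, {"term": {"category": "docs"}}
)
```

The first body would be sent when creating the index and the second to the _search endpoint of that index.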
I have also come across an article that suggests that when using the Lucene engine, the graph structure and the actual vectors are stored in separate segments, and only the graph structure is loaded into memory, not the vectors. If so, it could alleviate our concerns about memory consumption and potentially save money on data node instances. Can someone confirm whether this is indeed the case?
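As a baseline for comparison, this is how we estimated memory for the earlier nmslib setup. The sketch uses the rule of thumb from the k-NN documentation for native HNSW indexes, about 1.1 * (4 * dimension + 8 * M) bytes per vector. Whether anything comparable applies to the Lucene engine is exactly what I am asking, so it should not be read as a Lucene estimate. The vector count, dimension, and M below are hypothetical.

```python
def estimate_native_hnsw_bytes(num_vectors: int, dimension: int, m: int = 16) -> float:
    """Approximate native memory for an HNSW index on nmslib or faiss.

    Rule of thumb: 1.1 * (4 * dimension + 8 * m) bytes per vector, where
    4 * dimension is the float32 vector and 8 * m approximates graph links.
    """
    return 1.1 * (4 * dimension + 8 * m) * num_vectors


# Hypothetical workload: one million 384-dimensional vectors, m = 16.
estimate = estimate_native_hnsw_bytes(1_000_000, 384, m=16)
print(f"{estimate / 1024**3:.2f} GiB")
```

Comparing this figure against the memory available to the k-NN circuit breaker on each data node is how we sized the instances originally.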
Configuration:
Relevant Logs or Screenshots: