OpenSearch k-NN implementation

Hi, I am using the OpenSearch k-NN query as indicated here.

We have around 3M docs, and each doc has one embedding of 1024 dimensions. We are using 5 data nodes, each an r6g.xlarge instance.

I have the following questions based on our usage:

  1. I first tried Lucene as the k-NN engine and noticed that requests still had high latency, around 4s-7s, with the majority of the time spent in the rewrite operation, as seen via the Profile API. Also, graph_memory_usage was always 0. Does this mean that Lucene does not load graphs into native memory?
  2. Then I switched to nmslib as the engine and saw a great performance improvement, down to around ~1s once the graph is loaded in memory. Our current graph memory usage is at 80% on each node. Is this normal?
  3. How often is the graph loaded into memory? Is it a one-time operation, or does the graph get evicted after a while? I do notice cold starts when there is no activity for a while.
  4. Between nmslib and Faiss, which one should we use? Does OpenSearch v2.5 support Faiss?

Hi @cgchinmay

  1. Lucene engine graph usage is not tracked as part of the stats. Lucene graphs are indeed loaded off-heap. Our current graph usage stats cover only the native engines (nmslib, Faiss). Are you using OpenSearch 2.5? The Lucene engine had a performance issue prior to that release.
  2. Memory usage depends on the number of vectors and the dimension of each vector. A memory usage estimate is documented here.
  3. The graph stays in memory as long as cache capacity is not hit. Once cache capacity is hit, graphs are evicted in LRU fashion.
  4. If your use case is plain k-nearest-neighbors search without filters, we recommend our default engine, nmslib. If you have pre-filtering use cases, we recommend Faiss, available from the 2.9 release.
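For a rough sense of scale on point 2, the HNSW memory estimate published in the k-NN documentation can be worked through for the numbers in this thread. This is only a sketch: it assumes the documented formula `1.1 * (4 * dimension + 8 * M) * num_vectors` bytes and the default `M = 16`, both of which may differ by version.

```python
# Rough off-heap graph memory estimate for HNSW (nmslib/Faiss),
# following the estimate formula from the OpenSearch k-NN docs:
#   memory ≈ 1.1 * (4 * dimension + 8 * M) * num_vectors bytes
# Numbers match this thread: 3M docs, 1024-dim vectors; M = 16 assumed.
num_vectors = 3_000_000
dimension = 1024
M = 16  # HNSW max connections per node (assumed default)

bytes_needed = 1.1 * (4 * dimension + 8 * M) * num_vectors
gib = bytes_needed / (1024 ** 3)
print(f"~{gib:.1f} GiB of graph memory across the cluster")
```

Spread over 5 data nodes, roughly 13 GiB of graph memory works out to about 2.6 GiB per node, which gives a sense of why the graph cache can sit near its limit on smaller instances.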
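For reference, the engine choice in point 4 is made in the index mapping. Below is a minimal sketch of such a mapping; the index layout is standard, but the field name `embedding` is a placeholder for this thread's use case.

```python
import json

# Sketch of a k-NN index mapping selecting the engine.
# "embedding" is a placeholder field name; dimension matches the thread.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 1024,
                "method": {
                    "name": "hnsw",
                    "engine": "nmslib",  # or "faiss" (2.9+) for pre-filtering
                    "space_type": "l2",
                },
            }
        }
    },
}
print(json.dumps(index_body, indent=2))
```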

Thanks for the response. We are using OpenSearch v2.5.
We have switched to nmslib for now.

@vamshin Thanks for the response. I have a few more questions about how nmslib works:

  1. It appears that the nmslib engine extracts k neighbors from each segment. Our current refresh interval is 5s, which has resulted in 193 segments. Is there a way to cap the number of segments at a fixed value in v2.5?
  2. Based on our experiments, we get k neighbors from each segment. But if a certain segment has k+1 documents that are more relevant than those in other segments, we would miss out on some important documents. Increasing k fixes this, but it also introduces less relevant documents from other segments. Is my observation correct? If so, how do we address this problem? What options do we have?
  3. All the nmslib k-NN examples in the OpenSearch documentation seem to use the l2 space type. I tried cosinesimil. Is there any recommendation on which space_type to use?
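One commonly used workaround for the per-segment k behavior in question 2 is to oversample: request a larger `k` in the knn clause so each segment contributes more candidates, and let `size` trim the final result. A sketch of such a query body, with a placeholder field name and query vector:

```python
import json

# Oversampling sketch: k controls candidates gathered per segment/graph,
# while "size" controls how many results are finally returned.
# "embedding" and the query vector are placeholders.
k_oversampled = 100  # candidates gathered from each segment
final_size = 10      # results actually returned to the caller

search_body = {
    "size": final_size,
    "query": {
        "knn": {
            "embedding": {
                "vector": [0.1] * 1024,  # placeholder query embedding
                "k": k_oversampled,
            }
        }
    },
}
print(json.dumps(search_body)[:60])
```

For the segment-count concern in question 1, indices that are no longer being written to can also have their segment count reduced with the `_forcemerge` API (`max_num_segments` parameter), which shrinks the number of graphs searched per shard.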
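On question 3, one relevant fact: for unit-length vectors, squared L2 distance and cosine similarity are monotonically related (||a - b||^2 = 2 - 2·cos(a, b)), so if the embeddings are normalized, l2 and cosinesimil produce the same neighbor ranking, and cosinesimil mainly matters for unnormalized vectors. A small self-contained check of that identity:

```python
import math
import random

# Demonstrates: for unit-normalized vectors a, b,
#   ||a - b||^2 == 2 - 2 * cos(a, b)
# so l2 and cosinesimil rank neighbors identically on normalized data.
def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

random.seed(0)
a = normalize([random.gauss(0, 1) for _ in range(8)])
b = normalize([random.gauss(0, 1) for _ in range(8)])

l2_sq = sum((x - y) ** 2 for x, y in zip(a, b))
cos = sum(x * y for x, y in zip(a, b))
assert abs(l2_sq - (2 - 2 * cos)) < 1e-9
print("identity holds:", round(l2_sq, 6), "==", round(2 - 2 * cos, 6))
```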