Hi, as I understand it, the memory consumption of k-NN's HNSW comes from building a connected graph over the vectors, while a routing key reduces the search space by partitioning the data. So what happens if vector data is indexed with a routing key? Will OpenSearch build several smaller, separate HNSW graphs, which would cut the number of edges between them and so reduce memory consumption?
Is the routing key you mention the same thing as the _routing field?
If so, I don't think it is really related to the HNSW algorithm used by k-NN.
When you index or retrieve a document with the routing query parameter, for example
GET sample-index1/_doc/1?routing=JohnDoe1
the OpenSearch cluster routes it to a specific shard (not a segment), chosen by hashing the routing value. This may increase memory use when indexing but decrease it when searching, provided you specify the _routing value in the search request.
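To illustrate the hash rule, here is a minimal Python sketch of how a routing value could be mapped to a shard. It assumes the simplified rule shard = hash(_routing) % number_of_primary_shards. The function name `pick_shard`, the shard count, and the use of `zlib.crc32` as the hash are all illustrative assumptions; OpenSearch uses its own internal hash, so the actual shard numbers will differ.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number (illustrative only)."""
    # Stand-in hash: crc32 is used here only because it is deterministic
    # and in the standard library. It is NOT the hash OpenSearch uses.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % num_primary_shards


num_shards = 5  # hypothetical number of primary shards
for value in ["JohnDoe1", "JohnDoe1", "JaneDoe2"]:
    # The same routing value always lands on the same shard.
    print(value, "->", pick_shard(value, num_shards))
```

The point of the sketch is only that routing picks a shard deterministically; it says nothing about how the HNSW graphs inside that shard are built.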
Thanks @yeonghyeonKo, and yes, I mean the _routing field and memory usage at search time.
According to this AWS document, memory consumption at search time is about “1.1 * (4 * d + 8 * m) * num_vectors” bytes. So I think that if the vector data can be partitioned (using the _routing field), there is no need to create edges between partitions.
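To make the estimate concrete, here is a small Python sketch of the quoted formula. The dimension, `m`, vector count, and the four-way split are hypothetical values picked only for illustration, and `hnsw_memory_bytes` is a helper name introduced here, not an OpenSearch API.

```python
def hnsw_memory_bytes(num_vectors: int, d: int, m: int) -> float:
    """Estimated memory in bytes, per the quoted formula."""
    return 1.1 * (4 * d + 8 * m) * num_vectors


# Hypothetical values, chosen only for illustration.
d, m = 128, 16
total_vectors = 1_000_000

# One graph holding every vector.
single = hnsw_memory_bytes(total_vectors, d, m)

# The same vectors split into four equal partitions.
partitions = [250_000] * 4
split = sum(hnsw_memory_bytes(n, d, m) for n in partitions)

print(f"single graph: {single / 2**20:.1f} MiB")
print(f"4 partitions: {split / 2**20:.1f} MiB")
```

Because the formula is linear in num_vectors, it gives the same total for one graph or for several partitions: it charges a fixed cost per vector based on m and does not model cross-partition edges separately. Whether the real graphs use less memory than this estimate when partitioned is something the formula alone cannot show.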