Approximate k-NN vs. exact k-NN: how can I combine the benefits of both approaches?

esalazar · October 11, 2023, 11:20pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue:
I would like to use approximate k-NN to limit the number of results a vector query returns. However, each time I execute the same query against the same data, the number of hits is different. I assume this is one of the drawbacks of approximate k-NN, but I am wondering whether there is a way to get a consistent number of hits on every execution. The inconsistency causes a bad user experience, because the aggregations for a given query can change from one run to the next.
The problem with exact k-NN, on the other hand, is that it always returns most of the documents, and even setting the min_score parameter doesn't seem to give results as good as approximate k-NN. Any recommendations?

Example approximate knn

GET sample_1/_search
{
  "size": 10,
  "query": {
    "knn": {
      "vector": {
        "vector": vector_1536,
        "k": 10
      }
    }
  }
}

Example exact knn with script score

GET sample_1/_search
{
  "size": 10,
  "track_total_hits": true,
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "vector",
          "query_value": vector_1536,
          "space_type": "cosinesimil"
        }
      }
    }
  }
}
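
For illustration, here is a minimal Python sketch of how a cosine-similarity cutoff could be turned into a min_score for the script score request above. Exact scoring is deterministic, so the same data, vector, and cutoff should give the same hit count on every run. The sketch assumes the knn scoring script reports cosinesimil scores as 1 + cosine similarity, which is how the k-NN documentation appears to describe it and should be checked against the installed version. The helper names min_score_for_cosine and build_exact_knn_query are hypothetical, and the cutoff value itself would need tuning per dataset.

```python
from typing import Any, Dict, List


def min_score_for_cosine(cosine_threshold: float) -> float:
    """Map a cosine-similarity cutoff in [-1, 1] to a script-score cutoff.

    Assumption: the knn scoring script scores cosinesimil as 1 + cosine,
    so scores fall in [0, 2]. Verify this against your k-NN plugin version.
    """
    if not -1.0 <= cosine_threshold <= 1.0:
        raise ValueError("cosine_threshold must be between -1 and 1")
    return 1.0 + cosine_threshold


def build_exact_knn_query(
    query_vector: List[float],
    cosine_threshold: float,
    size: int = 10,
    field: str = "vector",
) -> Dict[str, Any]:
    """Build the exact k-NN (script score) request body with a min_score cutoff."""
    return {
        "size": size,
        "track_total_hits": True,
        # Hits scoring below this value are dropped from the results.
        "min_score": min_score_for_cosine(cosine_threshold),
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    "source": "knn_score",
                    "lang": "knn",
                    "params": {
                        "field": field,
                        "query_value": query_vector,
                        "space_type": "cosinesimil",
                    },
                },
            }
        },
    }
```

The returned dictionary can be serialized with json.dumps and sent as the body of GET sample_1/_search. For example, build_exact_knn_query(vector_1536, 0.5) keeps only documents whose cosine similarity to the query vector is at least 0.5, under the scoring assumption stated above.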

Configuration:
PUT _template/test_template
{
  "index_patterns": [
    "sample*"
  ],
  "settings": {
    "index": {
      "knn": true,
      "knn.space_type": "cosinesimil",
      "knn.algo_param.ef_search": 512,
      "mapping.total_fields.limit": 10000
    }
  },
  "mappings": {
    "dynamic": "true",
    "properties": {
      "title": {
        "type": "text"
      },
      "vector": {
        "method": {
          "engine": "nmslib",
          "space_type": "cosinesimil",
          "name": "hnsw",
          "parameters": {
            "ef_construction": 512,
            "m": 24
          }
        },
        "type": "knn_vector",
        "dimension": 1536
      }
    }
  }
}

Relevant Logs or Screenshots:
