I’ve created an index with two vectors, using the following schema:
ID, feature_vector1, feature_vector2
Here feature_vector1 is a list of 500 floats and feature_vector2 is a list of 100 floats.
I want to do the following:
- Perform a search on the entire index using feature_vector1.
- Using the results of that search, rerank them based on their similarity with feature_vector2 for the corresponding IDs.
I was able to perform k-NN on feature_vector1, but I am not sure how to apply post_filter on top of the results obtained from feature_vector1. Is there a way to achieve this?
Hi @navmarri,
Could you help me understand what you mean by reranking the results with feature_vector2?
A post filter only trims away (filters out) documents returned by the original query; it does not rescore them.
@vamshin
First we obtain the k-NN results from feature_vector1. Then, just for the IDs that are returned, I want to apply k-NN using feature_vector2. You can think of this as a chained query: a k-NN performed on top of the results of the first k-NN.
Does that make sense?
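To make the chaining idea concrete, here is a minimal client-side sketch in Python. It assumes the first k-NN search has already returned candidate documents that include their feature_vector2 values, and it uses Euclidean distance as the similarity measure. Both are assumptions for illustration; the helper names `l2_distance` and `rerank_by_second_vector` are hypothetical and not part of the plugin.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rerank_by_second_vector(hits, query_vector2, top_n=None):
    """Reorder first-stage hits by distance on feature_vector2.

    `hits` is assumed to be a list of dicts that each carry an "ID"
    and the stored "feature_vector2" of that document.
    """
    ranked = sorted(
        hits,
        key=lambda h: l2_distance(h["feature_vector2"], query_vector2),
    )
    return ranked if top_n is None else ranked[:top_n]


# Toy stand-in for the candidates returned by the k-NN on feature_vector1.
candidates = [
    {"ID": "a", "feature_vector2": [0.0, 0.0]},
    {"ID": "b", "feature_vector2": [1.0, 1.0]},
    {"ID": "c", "feature_vector2": [5.0, 5.0]},
]

reranked = rerank_by_second_vector(candidates, [1.0, 0.9], top_n=2)
print([h["ID"] for h in reranked])  # closest on feature_vector2 first
```

This only reorders the candidates from the first stage, so the quality of the final list depends on how large k was in the first search.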
@navmarri I see what you mean. This is more like prefilter support for k-NN, which is currently not available. We are working on this feature: Support custom scoring function for vectors. Using k-NN scores in a script_score query · Issue #50 · opendistro-for-elasticsearch/k-NN · GitHub.
As a workaround, you could do a boolean AND operation (an intersection between the results from feature_vector1 and feature_vector2). It is not a complete solution, but it should work. You might need to provide a large k to get results from the intersection.
Here is an example query for the workaround. You can also choose the weight for each query to reflect the scoring among the matched documents.
curl -X POST "localhost:9200/myindex/_search" -H 'Content-Type: application/json' -d'
{
  "size": 2,
  "query": {
    "bool": {
      "must": [
        {
          "function_score": {
            "query": {
              "knn": {
                "my_vector": {
                  "vector": [7, 8],
                  "k": 2
                }
              }
            },
            "weight": 0.5
          }
        },
        {
          "function_score": {
            "query": {
              "knn": {
                "my_vector": {
                  "vector": [3, 4],
                  "k": 2
                }
              }
            },
            "weight": 0.5
          }
        }
      ]
    }
  }
}
'
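For the two-field case from the original question, the same request body could be built programmatically. The following Python sketch only constructs and prints the JSON; it does not send anything to the cluster. The field names feature_vector1 and feature_vector2 come from the question, while the helper names `knn_clause` and `build_intersection_query` and the default values for k and size are assumptions made for illustration.

```python
import json


def knn_clause(field, vector, k, weight):
    """One weighted k-NN clause, mirroring the function_score shape above."""
    return {
        "function_score": {
            "query": {"knn": {field: {"vector": vector, "k": k}}},
            "weight": weight,
        }
    }


def build_intersection_query(vec1, vec2, k=100, w1=0.5, w2=0.5, size=10):
    """Bool/must body that keeps only docs matched by both k-NN clauses."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    knn_clause("feature_vector1", vec1, k, w1),
                    knn_clause("feature_vector2", vec2, k, w2),
                ]
            }
        },
    }


# Short toy vectors; real ones would have 500 and 100 dimensions.
body = build_intersection_query([7, 8], [3, 4], k=100, size=2)
print(json.dumps(body, indent=2))
```

Keeping k as a parameter makes it easy to experiment with larger values when the intersection comes back empty or too small.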
@vamshin Thanks for the suggestion.
What is the default score_mode? Is it the summation of the weights from the two functions, picking the max?
Yes. It is the sum of the scores, and the documents with the highest sums are picked. This lets you give more weight to the docs from the first k-NN query or from the second. In the example I mentioned, we are giving equal weight to both. Note: choose a very large k (you might want to experiment) to get better results.
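As a rough illustration of that scoring, the Python sketch below mimics the combination on toy data: only documents returned by both k-NN clauses survive, each clause's score is multiplied by its weight, and the weighted scores are summed. The scores and the helper name `combine` are made up for the example; the real scores come from the plugin.

```python
def combine(scores1, scores2, w1=0.5, w2=0.5, size=2):
    """Weighted sum of scores over docs present in both result sets."""
    common = scores1.keys() & scores2.keys()  # the intersection (bool must)
    combined = {
        doc: w1 * scores1[doc] + w2 * scores2[doc] for doc in common
    }
    # Highest combined score first, truncated to the requested size.
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:size]


# Hypothetical per-clause scores keyed by document ID.
scores_vector1 = {"a": 1.0, "b": 0.5, "c": 0.25}
scores_vector2 = {"b": 0.5, "c": 1.0, "d": 0.75}

top = combine(scores_vector1, scores_vector2)
print([doc for doc, _ in top])
```

Documents "a" and "d" drop out because each appears in only one result set, which is why a small k can leave the intersection nearly empty.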