Versions: OpenSearch 2.10 + FAISS Engine + HNSW
Describe the issue:
My question is mainly about the difference between the two queries in the picture. One query uses the FAISS efficient filter. The other puts the k-NN query in the should clause of a bool query and specifies a filter on the bool query. The OpenSearch documentation for filter in a bool query says:
"... and operator that is applied first to reduce your dataset before applying the queries."
How does this work if a k-NN query is part of the bool query? Is the filter applied before or after the k-NN search? Does it work similarly to the FAISS efficient filter?
Relevant Logs or Screenshots:
@Alireza Thanks for reaching out. Please find my answer below.
There is a big difference between the two queries. In the first query, the docs matching the filter and the docs returned by the knn query are intersected, and that intersection is the final result. Think of it as a post filter. This can return fewer than K documents.
In the second query, the vector search itself only selects documents that evaluate to true on the provided filter. Think of this as filter-while-search. This ensures that if at least K documents match the filter, K documents are returned.
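To make the difference concrete, here is a minimal Python sketch of the two behaviors on a toy corpus. It is an illustration only: the documents, the "color" attribute, and the function names are made up, and it uses brute-force distance sorting rather than the HNSW graph traversal that FAISS actually performs.

```python
import math

# Toy corpus of (doc_id, vector, color). All values are hypothetical.
DOCS = [
    (1, (0.0, 0.0), "red"),
    (2, (0.1, 0.0), "red"),
    (3, (0.2, 0.0), "blue"),
    (4, (0.3, 0.0), "red"),
    (5, (0.9, 0.0), "blue"),
    (6, (1.0, 0.0), "blue"),
]


def post_filter_knn(query, k, keep):
    # Pick the k nearest documents first, then drop those failing the filter.
    nearest = sorted(DOCS, key=lambda d: math.dist(d[1], query))[:k]
    return [d[0] for d in nearest if keep(d)]


def filter_while_search_knn(query, k, keep):
    # Consider only documents that pass the filter, then pick the k nearest.
    candidates = [d for d in DOCS if keep(d)]
    nearest = sorted(candidates, key=lambda d: math.dist(d[1], query))[:k]
    return [d[0] for d in nearest]


def is_blue(doc):
    return doc[2] == "blue"


query = (0.0, 0.0)
post = post_filter_knn(query, 3, is_blue)
eff = filter_while_search_knn(query, 3, is_blue)
print(post)  # [3]
print(eff)   # [3, 5, 6]
```

With k = 3, the post filter returns a single document because two of the three nearest neighbors are not blue, while filter-while-search returns three blue documents.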
I hope this clarifies.
Documentation: k-NN search with filters - OpenSearch documentation
The first query is a post filter and the second query uses efficient filtering.
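Since the screenshots are not reproduced here, the following is a rough sketch of what the two request bodies could look like, written as Python dicts. The field name "my_vector", the "color" term filter, the query vector, and k are placeholders and are not taken from the original queries.

```python
import json

# Hypothetical placeholders, not the values from the original screenshots.
QUERY_VECTOR = [0.1, 0.2, 0.3]
TERM_FILTER = {"term": {"color": "blue"}}
K = 10

# Shape 1: k-NN clause inside a bool query with a sibling filter clause.
# Per the answer above, this behaves like a post filter.
post_filter_query = {
    "size": K,
    "query": {
        "bool": {
            "filter": [TERM_FILTER],
            "should": [
                {"knn": {"my_vector": {"vector": QUERY_VECTOR, "k": K}}}
            ],
        }
    },
}

# Shape 2: filter nested inside the knn clause (efficient filtering).
efficient_filter_query = {
    "size": K,
    "query": {
        "knn": {
            "my_vector": {
                "vector": QUERY_VECTOR,
                "k": K,
                "filter": TERM_FILTER,
            }
        }
    },
}

print(json.dumps(post_filter_query, indent=2))
print(json.dumps(efficient_filter_query, indent=2))
```

The only structural difference is where the filter lives: next to the knn clause in the bool query, or inside the knn clause itself.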
As always, thank you very much @Navneet.
I was a little confused by the documentation, which says this about filter in bool queries:
"... and operator that is applied first to reduce your dataset before applying the queries."
I will try to open a PR against the documentation to improve this, since the behavior is different for k-NN queries inside bool queries.