Boosting a bool that wraps a knn query not working

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
"version" : {
  "distribution" : "opensearch",
  "number" : "2.3.0",
  "lucene_version" : "9.3.0",
},

Describe the issue:
We are trying to weight the individual component scores of our production query as we add a knn search component to what is otherwise a normal lexical matching query. We’ve noticed that wrapping a knn query in a bool > should structure and then boosting the bool doesn’t affect the _score produced by that component. This approach does work for a normal match type query.
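For concreteness, the structure described above might look roughly like the following sketch, written as a Python dict that mirrors the JSON request body. The field names `title` and `embedding`, the query text, the vector, `k`, and the boost value are placeholders for illustration, not the actual production query.

```python
import json


def build_boosted_hybrid_query(text, vector, knn_boost=5.0, k=10):
    """Build a lexical match clause plus a knn clause wrapped in a boosted bool > should.

    The field names "title" and "embedding" are hypothetical placeholders.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Normal lexical component.
                    {"match": {"title": {"query": text}}},
                    # knn component wrapped in bool > should, with the boost on the bool.
                    {
                        "bool": {
                            "should": [
                                {"knn": {"embedding": {"vector": vector, "k": k}}}
                            ],
                            "boost": knn_boost,
                        }
                    },
                ]
            }
        }
    }


body = build_boosted_hybrid_query("wireless headphones", [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

The reported symptom is that changing `knn_boost` in a request shaped like this leaves the knn contribution to `_score` unchanged.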

A few questions:

  • Is this expected behavior?
  • Is using a function_score with a weight query similar in performance to this approach?

Cheers!

Relevant Logs or Screenshots:

image

Hi!

Regarding your issue with boosting the bool query containing the knn component, it is possible that this behavior is expected. A bool query scores a document by adding together the scores of its matching “must” and “should” clauses; taking only the maximum clause score is what the separate dis_max query does. A boost on the bool query is passed down to its clauses, so it changes the score only for clauses that actually apply the boost they receive. Since the match clause responds to the boost and the knn clause does not, the knn query may simply not apply an inherited boost in this version, in which case boosting the bool will not have the desired effect on the knn component’s score.
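As a rough numeric sketch of the scoring involved (made-up scores, not output from OpenSearch): a bool query adds up the scores of its matching clauses, while dis_max keeps the best one. The `honors_boost` flag below is an assumption used to model a clause that ignores the boost handed down by its parent, which is what the knn clause appears to do in the question.

```python
def bool_should_score(clauses, boost=1.0):
    """Sum the scores of matching should clauses under a parent boost.

    Each clause is a (raw_score, honors_boost) pair. A clause that honors the
    boost contributes raw_score * boost; one that ignores it contributes raw_score.
    """
    return sum(score * boost if honors else score for score, honors in clauses)


def dis_max_score(scores, tie_breaker=0.0):
    """Score of a dis_max query: best clause plus tie_breaker times the rest."""
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)


# A match clause that honors the parent boost is scaled as expected.
print(bool_should_score([(2.0, True)], boost=4.0))   # 8.0

# Assumption: the knn clause ignores the inherited boost, so nothing changes.
print(bool_should_score([(0.5, False)], boost=4.0))  # 0.5

# bool sums clause scores; dis_max keeps only the maximum.
print(bool_should_score([(2.0, True), (0.5, True)])) # 2.5
print(dis_max_score([2.0, 0.5]))                     # 2.0
```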

Regarding your second question, wrapping the knn query in a function_score query with a weight function can be a valid alternative to boosting the bool query. function_score wraps a query and modifies its score with one or more functions; by default, the weight function multiplies the wrapped query’s score by a constant, so it can apply a specific weight to a specific component of the query. Other functions can factor in document fields such as recency or popularity, which makes this approach more flexible and customizable than boosting the bool query.
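A minimal sketch of that alternative, again as a Python dict mirroring the request body. The field names `title` and `embedding` and all values are placeholders, and whether this behaves as intended with a knn query on your version is something to verify against your own index.

```python
import json


def weighted_knn_clause(field, vector, k, weight):
    """Wrap a knn query in function_score so its score is multiplied by `weight`."""
    return {
        "function_score": {
            "query": {"knn": {field: {"vector": vector, "k": k}}},
            "functions": [{"weight": weight}],
            # "multiply" is the default boost_mode; spelled out here for clarity.
            "boost_mode": "multiply",
        }
    }


def build_hybrid_query(text, vector, knn_weight=5.0, k=10):
    """Combine a lexical match clause with a weighted knn clause in bool > should.

    The field names "title" and "embedding" are hypothetical placeholders.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"title": {"query": text}}},
                    weighted_knn_clause("embedding", vector, k, knn_weight),
                ]
            }
        }
    }


print(json.dumps(build_hybrid_query("wireless headphones", [0.1, 0.2, 0.3]), indent=2))
```

Comparing `_score` with `"explain": true` for a few values of `knn_weight` should show quickly whether the weight reaches the knn component.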

However, the performance of the function_score query may depend on the complexity of the weighting function and the size of the index. Therefore, it is recommended to test and compare the performance of both approaches in your specific use case.

Hi @orazaly1508

My understanding is that in a hybrid query the boost is applied only to the other components, not to the knn component.

What I’m unsure about is that, depending on the boost configured for other components such as “match”, the knn score could even become irrelevant if those boosts are high.

Am I understanding correctly?