Normalisation in Hybrid Search

What is the best way to normalise the scores of BM25 and ANN results? I am trying to build a hybrid search system where the results are ranked by linearly combining the scores of keyword search and neural search.

To combine them, the scores need to be normalised to the same scale. I tried min-max normalisation of the BM25 scores, but that requires an additional query to find the max score first, followed by a second query that uses script scoring to normalise the BM25 scores (score / max-score) and sum them with the respective KNN cosine similarity scores. Is there a better way to achieve this?
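
For illustration, here is a minimal sketch of the combination step done on the client side, assuming the BM25 and KNN results have already been fetched as two separate top-k lists. It is not an OpenSearch API: the function names, the `weight` parameter, and the choice to score a missing document as 0.0 are all assumptions made for this example. Because the min and max are taken from the returned hits, no extra query is needed to find the max score, at the cost of normalising only over the top-k of each list.

```python
from typing import Dict, List, Tuple


def min_max_normalise(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]. If all scores are equal, map them all to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def combine(
    bm25: Dict[str, float],
    knn: Dict[str, float],
    weight: float = 0.5,  # hypothetical weight given to the BM25 side
) -> List[Tuple[str, float]]:
    """Linearly combine normalised BM25 and KNN scores, best first.

    A document missing from one result list contributes 0.0 for that list
    (an assumption; other policies are possible).
    """
    bm25_n = min_max_normalise(bm25)
    knn_n = min_max_normalise(knn)
    doc_ids = set(bm25_n) | set(knn_n)
    combined = {
        d: weight * bm25_n.get(d, 0.0) + (1.0 - weight) * knn_n.get(d, 0.0)
        for d in doc_ids
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Made-up scores keyed by document id, purely for demonstration.
    bm25_hits = {"a": 12.0, "b": 6.0, "c": 3.0}
    knn_hits = {"b": 0.9, "c": 0.7, "d": 0.5}
    for doc_id, score in combine(bm25_hits, knn_hits):
        print(doc_id, round(score, 3))
```

Note that min-max over a top-k list always maps the lowest returned hit to 0, so the result depends on k; dividing by the max score only, as in the script-scoring approach above, avoids that but leaves the lower bound unanchored.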


Hi Praveen. We are actively working on this problem and recently put out an RFC: [RFC] High Level Approach and Design For Normalization and Score Combination · Issue #126 · opensearch-project/neural-search · GitHub

Your input would be helpful. Please feel free to provide feedback on the RFC above.

