Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.17
Describe the issue:
I queried the same index in three modes: BM25, KNN, and hybrid. For the hybrid query the weights were bm25_weight=0.3 and knn_weight=0.7, and every query used a size of 1. I expected the document returned in hybrid mode to be one of the documents returned by the BM25 or KNN search, depending on the scores. However, the document returned in hybrid mode has a different ID from both.
This is the pipeline for the hybrid search:
search_pipeline_settings = {
    "description": "Post processor for hybrid search (combine keyword and vector)",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.3, 0.7]},
                },
            }
        }
    ],
}
Below are the results. The documents with IDs 32 and 98 were returned by the BM25 and KNN searches, respectively, but the hybrid search returned 358.
{
    "query": "How to compute the deductible?",
    "bm25": [
        {
            "_index": "Index_0",
            "_id": "32",
            "_score": 26.64432
        }
    ],
    "knn": [
        {
            "_index": "Index_0",
            "_id": "98",
            "_score": 0.7990964
        }
    ],
    "hybrid": [
        {
            "_index": "Index_0",
            "_id": "358",
            "_score": 0.79635715
        }
    ]
}
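To make the expectation above easier to check, here is a minimal sketch of what a min_max normalization followed by a weighted arithmetic_mean combination computes. It is a simplified textbook version, not the OpenSearch implementation: the function names, the document IDs (doc_a to doc_e), and all scores are made up for illustration, and it assumes a document missing from one sub-query contributes 0 for that sub-query. The real processor may differ in details such as how it treats the minimum score or per-shard results.

```python
def min_max(scores):
    """Rescale one sub-query's scores to [0, 1] (textbook min-max)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: a single distinct score is mapped to 1.0.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def weighted_mean(per_query, weights):
    """Combine normalized scores; a missing score counts as 0 (assumption)."""
    docs = set().union(*per_query)
    total = sum(weights)
    return {
        doc: sum(w * q.get(doc, 0.0) for q, w in zip(per_query, weights)) / total
        for doc in docs
    }


# Hypothetical candidate lists, deeper than size=1, for each sub-query.
bm25 = {"doc_a": 26.6, "doc_c": 25.0, "doc_d": 10.0}
knn = {"doc_b": 0.80, "doc_c": 0.79, "doc_e": 0.50}

hybrid = weighted_mean([min_max(bm25), min_max(knn)], weights=[0.3, 0.7])
ranked = sorted(hybrid.items(), key=lambda kv: kv[1], reverse=True)

print("BM25 top:", max(bm25, key=bm25.get))
print("KNN top:", max(knn, key=knn.get))
print("Hybrid top:", ranked[0][0])
```

With these made-up numbers, doc_a tops BM25 and doc_b tops KNN, yet doc_c ranks first after combination because it scores close to the top in both lists. Under the stated assumptions, a hybrid top-1 that matches neither individual top-1 is therefore possible whenever the sub-queries consider more candidates than the final size. Whether that is what happens in the results above is not established here.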