Hybrid score explanation processor does not work as expected

crow_of_judgement · July 31, 2025, 6:38pm

Versions:
AWS Managed OpenSearch upgraded to 2.19.
Working in Dashboards v2.19.0 Dev Console.

Describe the issue:
I am trying to use the Hybrid Score Explanation Processor and replicate the example in the docs. My goal is to get the pre-normalized scores from the KNN phase. Please correct me if I am wrong, but as I understand it, I should be able to get those with this processor.

I am working through the Dashboards Dev Console. I created a sample index, populated it with sample data, created the same processor as in the docs, and ran a similar query (knn instead of neural).

Unfortunately, the response does not contain an explanation for the KNN search. There is a full explanation for the match phase, but for knn I get this:

{
  "value": 1,
  "description": "min_max normalization of:",
  "details": [
    {
      "value": 1,
      "description": "No Explanation",
      "details": []
    }
  ]
}

In the docs it should look something like this:

{
  "value": 0.8503647,
  "description": "min_max normalization of:",
  "details": [
    {
      "value": 0.015177966,
      "description": "within top 5",
      "details": []
    }
  ]
}

The query and document vectors are different, so the pre-normalized score cannot be 1, and instead of "description": "within top 5" it says "description": "No Explanation".
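As a quick sanity check of that claim, the cosine similarity between the query vector and the top hit's vector (document 3 from the sample data in this post) can be computed by hand. This is only a sketch: it computes the raw cosine similarity, not the k-NN plugin's score, because how the plugin maps similarity to a score depends on the engine and version and is not reproduced here.

```python
import math

# Vectors copied from this post: the query vector and document "3".
QUERY = [0.0, 0.15, 0.23, 0.25, 0.30, 0.35, 0.40, 0.45]
DOC_3 = [0.12, 0.22, 0.32, 0.42, 0.52, 0.62, 0.72, 0.82]


def cosine_similarity(a, b):
    """Plain cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cos = cosine_similarity(QUERY, DOC_3)
print(f"cosine similarity(query, doc 3) = {cos:.4f}")
```

The value is close to 1 but not equal to it, so a reported raw value of exactly 1 alongside "No Explanation" looks like a placeholder rather than a real score.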

Please confirm whether it is possible to get the pre-normalized scores for the neural or KNN phase with a hybrid query and this explanation processor and, if so, please help me configure it correctly.

Also, do the vector engine and method affect this issue? Here I used faiss, but in our working environment we used nmslib because of the requirement to use the cosinesimil space.

… and please excuse me for possible rookie mistakes or misunderstandings. I am relatively new to OpenSearch…

Configuration:

These are the steps that I performed.

Test index creation:

PUT test-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector1": {
        "type": "knn_vector",
        "dimension": 8,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "faiss"
        }
      },
      "my_text1": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

Population with sample data:

POST test-index/_bulk
{ "index": { "_id": "1" } }
{ "my_text1": "The quick brown fox jumps over the lazy dog", "my_vector1": [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45] }
{ "index": { "_id": "2" } }
{ "my_text1": "OpenSearch makes vector search easy",    "my_vector1": [0.50, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15] }
{ "index": { "_id": "3" } }
{ "my_text1": "Sample document for KNN vector testing", "my_vector1": [0.12, 0.22, 0.32, 0.42, 0.52, 0.62, 0.72, 0.82] }

Hybrid search explainer processor in hybrid pipeline:

PUT /_search/pipeline/test-hse-pipeline
{
  "description": "Post processor for hybrid search",
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": {
          "technique": "min_max"
        },
        "combination": {
          "technique": "arithmetic_mean"
        }
      }
    }
  ],
  "response_processors": [
    {
      "hybrid_score_explanation": {}
    }
  ]
}
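To make the pipeline above concrete, here is a minimal sketch of what min_max normalization followed by arithmetic_mean combination does to sub-query scores. It is textbook math, not the plugin's implementation: the function names are illustrative, the sample scores are made up, and returning 1.0 when all scores are equal is an assumption of this sketch.

```python
def min_max(scores):
    """Rescale scores to [0, 1] as (s - min) / (max - min)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption for this sketch: identical scores all map to 1.0.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def arithmetic_mean(values):
    """Combine the normalized sub-query scores of one document."""
    return sum(values) / len(values)


# Hypothetical raw scores for three documents from each sub-query.
match_scores = [1.0, 2.0, 3.0]
knn_scores = [0.90, 0.95, 0.99]

match_norm = min_max(match_scores)
knn_norm = min_max(knn_scores)

# The document that ranks first in both sub-queries gets (1 + 1) / 2 = 1.
combined = [arithmetic_mean(pair) for pair in zip(match_norm, knn_norm)]
print(combined)
```

This is why a top hit can show a combined score of 1 even though neither raw score is 1: each sub-query's best document normalizes to 1 before the mean is taken.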

Test search query:

GET test-index/_search?search_pipeline=test-hse-pipeline&explain=true
{
  "size": 1,
  "_source": false,
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "my_text1": {
              "query": "quick fox vector testing"
            }
          }
        },
        {
          "knn": {
            "my_vector1": {
              "vector": [0.0, 0.15, 0.23, 0.25, 0.30, 0.35, 0.40, 0.45],
              "min_score": 0.2
            }
          }
        }
      ]
    }
  }
}

Relevant Logs or Screenshots:

Response of test search query with explainer:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_shard": "[test-index][0]",
        "_node": "x6xPcNjgTFiCJNXv0-yTjw",
        "_index": "test-index",
        "_id": "3",
        "_score": 1,
        "_explanation": {
          "value": 1,
          "description": "arithmetic_mean combination of:",
          "details": [
            {
              "value": 1,
              "description": "min_max normalization of:",
              "details": [
                {
                  "value": 0.5753642,
                  "description": "sum of:",
                  "details": [
                    {
                      "value": 0.2876821,
                      "description": "weight(my_text1:vector in 0) [PerFieldSimilarity], result of:",
                      "details": [
                        {
                          "value": 0.2876821,
                          "description": "score(freq=1.0), computed as boost * idf * tf from:",
                          "details": [
                            {
                              "value": 2.2,
                              "description": "boost",
                              "details": []
                            },
                            {
                              "value": 0.2876821,
                              "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                              "details": [
                                {
                                  "value": 1,
                                  "description": "n, number of documents containing term",
                                  "details": []
                                },
                                {
                                  "value": 1,
                                  "description": "N, total number of documents with field",
                                  "details": []
                                }
                              ]
                            },
                            {
                              "value": 0.45454544,
                              "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                              "details": [
                                {
                                  "value": 1,
                                  "description": "freq, occurrences of term within document",
                                  "details": []
                                },
                                {
                                  "value": 1.2,
                                  "description": "k1, term saturation parameter",
                                  "details": []
                                },
                                {
                                  "value": 0.75,
                                  "description": "b, length normalization parameter",
                                  "details": []
                                },
                                {
                                  "value": 6,
                                  "description": "dl, length of field",
                                  "details": []
                                },
                                {
                                  "value": 6,
                                  "description": "avgdl, average length of field",
                                  "details": []
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    },
                    {
                      "value": 0.2876821,
                      "description": "weight(my_text1:testing in 0) [PerFieldSimilarity], result of:",
                      "details": [
                        {
                          "value": 0.2876821,
                          "description": "score(freq=1.0), computed as boost * idf * tf from:",
                          "details": [
                            {
                              "value": 2.2,
                              "description": "boost",
                              "details": []
                            },
                            {
                              "value": 0.2876821,
                              "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                              "details": [
                                {
                                  "value": 1,
                                  "description": "n, number of documents containing term",
                                  "details": []
                                },
                                {
                                  "value": 1,
                                  "description": "N, total number of documents with field",
                                  "details": []
                                }
                              ]
                            },
                            {
                              "value": 0.45454544,
                              "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                              "details": [
                                {
                                  "value": 1,
                                  "description": "freq, occurrences of term within document",
                                  "details": []
                                },
                                {
                                  "value": 1.2,
                                  "description": "k1, term saturation parameter",
                                  "details": []
                                },
                                {
                                  "value": 0.75,
                                  "description": "b, length normalization parameter",
                                  "details": []
                                },
                                {
                                  "value": 6,
                                  "description": "dl, length of field",
                                  "details": []
                                },
                                {
                                  "value": 6,
                                  "description": "avgdl, average length of field",
                                  "details": []
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            },
            {
              "value": 1,
              "description": "min_max normalization of:",
              "details": [
                {
                  "value": 1,
                  "description": "No Explanation",
                  "details": []
                }
              ]
            }
          ]
        }
      }
    ]
  }
}
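Once the explanation is populated, the pre-normalized values can be read from the children of each "min_max normalization of:" node. The sketch below walks a trimmed copy of the `_explanation` object from the response above; the helper name is hypothetical, and it assumes the node layout shown in this post.

```python
def pre_normalized_scores(explanation):
    """Return (description, value) for the node under each normalization node.

    Assumes the layout shown in this thread: the top-level node is the
    combination, and each child is a "<technique> normalization of:" node
    whose own children carry the raw sub-query score.
    """
    results = []
    for node in explanation.get("details", []):
        if "normalization of:" in node.get("description", ""):
            for raw in node.get("details", []):
                results.append((raw["description"], raw["value"]))
    return results


# Trimmed copy of the "_explanation" from the response in this post.
explanation = {
    "value": 1,
    "description": "arithmetic_mean combination of:",
    "details": [
        {
            "value": 1,
            "description": "min_max normalization of:",
            "details": [
                {"value": 0.5753642, "description": "sum of:", "details": []}
            ],
        },
        {
            "value": 1,
            "description": "min_max normalization of:",
            "details": [
                {"value": 1, "description": "No Explanation", "details": []}
            ],
        },
    ],
}

for description, value in pre_normalized_scores(explanation):
    print(f"{description!r}: {value}")
```

The first entry is the raw match score; the second is the "No Explanation" placeholder returned for the knn sub-query.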

pablo · August 11, 2025, 4:58pm

@crow_of_judgement Your output is correct for 2.19. The explain parameter for the Faiss engine was implemented in version 3.0.0.
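A minimal way to encode that cutoff in client code is to compare the cluster version against 3.0.0 before relying on the knn explanation. This is a sketch with hypothetical helper names; it assumes a plain "major.minor.patch" version string such as "2.19.0" and does not handle suffixes or managed-service compatibility modes that report a different number.

```python
def parse_version(number):
    """Turn a string like '2.19.0' into a comparable tuple (2, 19, 0)."""
    return tuple(int(part) for part in number.split(".")[:3])


def faiss_explain_available(number):
    """True if the version is at or above 3.0.0, where Faiss explain was added."""
    return parse_version(number) >= (3, 0, 0)


for version in ("2.19.0", "3.0.0"):
    print(version, faiss_explain_available(version))
```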

github.com/opensearch-project/opensearch-build

release-notes/opensearch-release-notes-3.0.0.md

main

# OpenSearch and OpenSearch Dashboards 3.0.0 Release Notes

## Release Highlights

OpenSearch 3.0 delivers significant upgrades for performance, data management, security, vector database functionality, and more to help you build and deploy powerful, flexible solutions for search, analytics, observability, and other use cases.

### New and Updated Features

* Among the significant performance improvements included in OpenSearch 3.0 is an update to range queries. Applying smarter strategies for numeric and date fields, OpenSearch can now answer range filters with far fewer I/O operations to deliver 25% faster performance in Big5 benchmarks.
* New optimization features for high-cardinality queries introduce execution hints for cardinality aggregation, enabling users to better balance precision and performance. This enhancement achieves a 75% reduction in p90 latency in benchmark testing compared to the previous release.
* Concurrent segment search is now enabled by default for k-NN, delivering up to 2.5x faster query performance. Additionally, improvements to the floor segment size setting help improve tail latencies by up to 20%.
* Date histogram aggregations now benefit from enhanced filter rewrite optimization that supports sub-aggregations, offering significant performance gains for real-world use cases requiring multi-level aggregations.
* Derived source for k-NN vectors is production ready in this release, optimizing vector search performance with up to 30x improvement in cold start query latencies. This feature can also reduce storage requirements by 3x across the Faiss, Lucene, and NMSLIB libraries.
* Semantic sentence highlighting introduces context-aware highlighting to identify and highlight relevant sentences based on meaning, not just keyword matches, working seamlessly with traditional search as well as neural and hybrid search. This feature includes a pre-trained model for basic semantic highlighting use cases.
* The new explain parameter for Faiss engine queries provides detailed insights into k-NN query scoring processes. This enhancement helps users understand and optimize their query results by providing a comprehensive view into search result scores.
* This release changes the default BM25 scoring function from LegacyBM25Similarity to BM25Similarity. This provides better compatibility with the latest Apache Lucene optimizations and removes unnecessary legacy code, leading to cleaner, more maintainable implementations while preserving search result quality.
* Piped Processing Language (PPL) receives powerful new capabilities with lookup, join, and subsearch commands, improving log correlation and filtering capabilities. These enhancements, backed by Apache Calcite, enable better query planning and execution for interactive data exploration.
* Query insights see major improvements with a new live queries API for real-time monitoring and a verbose parameter for optimized dashboard performance. Dynamic columns in the query insights dashboard support efficient query analysis.
* The observability experience is enhanced with contextual launch for anomaly detection, allow you to launch an anomaly detector from the main dashboard and automatically populating relevant logs in the Discover view. This streamlined workflow can significantly accelerate the task of investigating anomalies.


github.com/opensearch-project/k-NN

Explain API for Exact/ANN/Radial/Disk based KNN search on Faiss

main ← neetikasinghal:explain

opened 07:40PM - 17 Jan 25 UTC

neetikasinghal

+1704 -143

### Description

Add support for explain for Exact/ANN/Radial/Disk/Filtering k-n…n search. Score calculation explanation is currently added only for ANN search. Proposal for explain is given here: https://github.com/opensearch-project/k-NN/issues/875#issuecomment-2611349466

### Related Issues

Resolves #875

### Check List

- [x] New functionality includes testing.
- [ ] New functionality has been documented.
- [ ] API changes companion pull request [created](https://github.com/opensearch-project/opensearch-api-specification/blob/main/DEVELOPER_GUIDE.md).
- [x] Commits are signed per the DCO using `--signoff`.
- [x] Public documentation issue/PR [created](https://github.com/opensearch-project/documentation-website/issues/new/choose).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/k-NN/blob/main/CONTRIBUTING.md#developer-certificate-of-origin).

crow_of_judgement · August 11, 2025, 5:19pm

Oh I see, explanation for neural/knn search was not yet implemented in 2.19 for faiss.
Thank you!
