Semantic highlighting results not great

osman · August 1, 2025, 1:35pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 3

Describe the issue: Semantic highlighting not great

Excited to see the semantic highlighting feature, some great work there. Unfortunately, although I know it benchmarks at ~70%, in my tests it highlights the wrong sentence more often than not. Am I missing something?

For example:

POST _plugins/_ml/models/6iIoV5gBcMCYoFc8ugqg/_predict
{
  "question": "Does this school have a robotics program?",
  "context": "mcclymonds high is a school with a strong emphasis on faith and god. We have a great basketball team and athletics department. Our robotics program ranks second in the country. We have a special program for students who love to code. Christian teachings are at the heart of our curriculum. "
}

returns

{
  "inference_results": [
    {
      "output": [
        {
          "name": "highlights",
          "dataAsMap": {
            "highlights": [
              {
                "start": 0,
                "end": 68,
                "text": "mcclymonds high is a school with a strong emphasis on faith and god.",
                "position": 0
              }
            ]
          }
        }
      ]
    }
  ]
}

or

POST _plugins/_ml/models/6iIoV5gBcMCYoFc8ugqg/_predict
{
  "question": "does this school have a program for autism?",
  "context": "We have a great basketball team and athletics department. We also excel at supporting students with developmental challenges. We have a special program for students with autism. Christian teachings are at the heart of our curriculum. mcclymonds high is a school with a strong emphasis on faith and god."
}

returns

{
  "inference_results": [
    {
      "output": [
        {
          "name": "highlights",
          "dataAsMap": {
            "highlights": [
              {
                "start": 178,
                "end": 233,
                "text": "Christian teachings are at the heart of our curriculum.",
                "position": 3
              }
            ]
          }
        }
      ]
    }
  ]
}
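
For anyone comparing responses like these, here is a minimal Python sketch that pulls the highlights out of a _predict response and checks that each start/end pair really slices the quoted text out of the context. It assumes the response shape shown in this post (inference_results -> output -> dataAsMap -> highlights); the helper name extract_highlights is made up for illustration and is not part of any OpenSearch client.

```python
def extract_highlights(response, context):
    """Return (position, text, offsets_ok) for each highlight in a _predict response.

    Assumes the response shape shown in this thread:
    inference_results -> output -> dataAsMap -> highlights.
    """
    rows = []
    for result in response.get("inference_results", []):
        for output in result.get("output", []):
            for h in output.get("dataAsMap", {}).get("highlights", []):
                # The offsets should slice the highlighted sentence out of the context.
                offsets_ok = context[h["start"]:h["end"]] == h["text"]
                rows.append((h["position"], h["text"], offsets_ok))
    return rows


# Context and response copied from the autism example above.
context = (
    "We have a great basketball team and athletics department. "
    "We also excel at supporting students with developmental challenges. "
    "We have a special program for students with autism. "
    "Christian teachings are at the heart of our curriculum. "
    "mcclymonds high is a school with a strong emphasis on faith and god."
)
response = {
    "inference_results": [
        {
            "output": [
                {
                    "name": "highlights",
                    "dataAsMap": {
                        "highlights": [
                            {
                                "start": 178,
                                "end": 233,
                                "text": "Christian teachings are at the heart of our curriculum.",
                                "position": 3,
                            }
                        ]
                    },
                }
            ]
        }
    ]
}

for position, text, ok in extract_highlights(response, context):
    print(position, ok, text)
```

If the offsets check out, the model's sentence splitting and offsets are fine and the problem is only which sentence gets picked.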

Any insights? I'm using the model suggested in the tutorial, semantic-highlighter-v1.

junqiu · September 16, 2025, 6:32pm

Hi @osman, could you share a bit more about the setup?

What does the model configuration look like?
Have you noticed if the behavior is consistent across different domains of questions or contexts?
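
One rough way to check consistency is to collect a handful of (question, context, expected sentence) cases per domain and measure how often the expected sentence is among the highlights. The sketch below is only an illustration: hit_rate and replay are hypothetical names, the predict argument stands in for whatever wrapper you put around the _predict call, and the "expected" sentences are my reading of which sentence should have been highlighted in the two examples from this thread.

```python
def hit_rate(cases, predict):
    """Fraction of cases where the expected sentence is among the highlighted texts.

    cases: list of (question, context, expected_sentence) tuples.
    predict: hypothetical callable (question, context) -> list of highlighted
             strings, e.g. a wrapper around the _predict call in this thread.
    """
    if not cases:
        return 0.0
    hits = 0
    for question, context, expected in cases:
        highlighted = predict(question, context)
        if any(expected.strip() == h.strip() for h in highlighted):
            hits += 1
    return hits / len(cases)


# The two cases from this thread. Contexts are shortened here because the
# stub below does not inspect them.
cases = [
    ("Does this school have a robotics program?",
     "(robotics context from the first example)",
     "Our robotics program ranks second in the country."),
    ("does this school have a program for autism?",
     "(autism context from the second example)",
     "We have a special program for students with autism."),
]

# Stub that replays the highlights reported in this thread instead of
# calling the model; replace it with a real call to test other domains.
reported = {
    cases[0][0]: ["mcclymonds high is a school with a strong emphasis on faith and god."],
    cases[1][0]: ["Christian teachings are at the heart of our curriculum."],
}


def replay(question, context):
    return reported[question]


print(hit_rate(cases, replay))
```

Running the same harness over cases from a few different domains would show whether the misses are specific to this kind of text.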

Thanks,

Junqiu
