How can I highlight terms from a neural sparse search?

Is it possible to get highlighting to work on a neural sparse search?

The only way I’ve been able to get any kind of highlighting to work on a neural sparse search is to change the highlight_query to use a keyword-type search, but that of course only matches the specific keywords.

What I’d like is this: if a result is returned because the user searched for the word ocean but it hit on sea, the word sea would be highlighted to make it clear why the result was returned.

Is this possible?

Hi @dswitzer2, I’m not an expert on highlight_query. I tried it on my endpoint, but it doesn’t highlight the tokens.
However, if your goal is to make it clear why a result was returned, you can search with explain. Here is an example request and response. It shows the matched tokens and their individual contributions to the score.

# request
POST index_name/_search?explain=true
{
  "_source": false,
  "query": {
    "neural_sparse": {
        "text_sparse":{
            "query_tokens":{
                "hi":1.1,
                "hello":1.2
            }
        }
    }
  }
}
# response
{'took': 16,
 'timed_out': False,
 '_shards': {'total': 3, 'successful': 3, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 2, 'relation': 'eq'},
  'max_score': 4.664844,
  'hits': [{'_shard': '[test][2]',
    '_node': 'KA-CsRWNRaSyXeGgxn3MhA',
    '_index': 'test',
    '_id': 'UIB3YI8BrogF1buXrkbW',
    '_score': 4.664844,
    '_explanation': {'value': 4.664844,
     'description': 'sum of:',
     'details': [{'value': 0.9710938,
       'description': 'Linear function on the text_sparse field for the hi feature, computed as w * S from:',
       'details': [{'value': 1.1,
         'description': 'w, weight of this function',
         'details': []},
        {'value': 0.8828125,
         'description': 'S, feature value',
         'details': []}]},
      {'value': 3.6937501,
       'description': 'Linear function on the text_sparse field for the hello feature, computed as w * S from:',
       'details': [{'value': 1.2,
         'description': 'w, weight of this function',
         'details': []},
        {'value': 3.078125,
         'description': 'S, feature value',
         'details': []}]}]}},
   {'_shard': '[test][2]',
    '_node': 'KA-CsRWNRaSyXeGgxn3MhA',
    '_index': 'test',
    '_id': 'UYB3YI8BrogF1buXskbT',
    '_score': 4.6632814,
    '_explanation': {'value': 4.6632814,
     'description': 'sum of:',
     'details': [{'value': 3.3085938,
       'description': 'Linear function on the text_sparse field for the hi feature, computed as w * S from:',
       'details': [{'value': 1.1,
         'description': 'w, weight of this function',
         'details': []},
        {'value': 3.0078125,
         'description': 'S, feature value',
         'details': []}]},
      {'value': 1.3546876,
       'description': 'Linear function on the text_sparse field for the hello feature, computed as w * S from:',
       'details': [{'value': 1.2,
         'description': 'w, weight of this function',
         'details': []},
        {'value': 1.1289062,
         'description': 'S, feature value',
         'details': []}]}]}}]}}
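
If it helps, here is a minimal Python sketch of pulling the matched tokens out of a response like the one above. It is only an illustration: `matched_tokens` and `FEATURE_RE` are hypothetical names, and the regular expression assumes the description wording stays exactly as it appears in this output, which may differ between versions.

```python
import re

# Assumption: the description text follows the wording seen in the explain
# output above ("Linear function on the <field> field for the <token> feature").
FEATURE_RE = re.compile(r"Linear function on the (\S+) field for the (.+?) feature")


def matched_tokens(explanation):
    """Recursively collect {token: contribution} from an _explanation tree."""
    found = {}
    match = FEATURE_RE.search(explanation.get("description", ""))
    if match:
        found[match.group(2)] = explanation["value"]
    for child in explanation.get("details", []):
        found.update(matched_tokens(child))
    return found


# The _explanation of the first hit in the response above.
sample = {
    "value": 4.664844,
    "description": "sum of:",
    "details": [
        {
            "value": 0.9710938,
            "description": "Linear function on the text_sparse field for the hi feature, computed as w * S from:",
            "details": [
                {"value": 1.1, "description": "w, weight of this function", "details": []},
                {"value": 0.8828125, "description": "S, feature value", "details": []},
            ],
        },
        {
            "value": 3.6937501,
            "description": "Linear function on the text_sparse field for the hello feature, computed as w * S from:",
            "details": [
                {"value": 1.2, "description": "w, weight of this function", "details": []},
                {"value": 3.078125, "description": "S, feature value", "details": []},
            ],
        },
    ],
}

print(matched_tokens(sample))  # {'hi': 0.9710938, 'hello': 3.6937501}
```

You would call it on `hit["_explanation"]` for each entry in `hits.hits`.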

Thanks for the feedback. I’ve looked at the explain results before when troubleshooting, but did not think about trying to use it to extract keywords out for highlighting.

This at least gives me something to play with if it’s determined that we need to get highlighting on the token keywords. It does seem like it’ll be expensive to do it this way, since the explain adds overhead and I’d have to return the vector embeddings with the source of my query.
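
To make the idea concrete, this is roughly what client-side highlighting from a list of matched tokens could look like. It is a rough sketch under assumptions: `highlight` is a hypothetical helper, the tokens are assumed to be whole words (a model vocabulary may contain subword pieces that this would not match), and the `<em>` tags are just a placeholder.

```python
import re


def highlight(text, tokens, pre="<em>", post="</em>"):
    """Wrap whole-word, case-insensitive occurrences of tokens in pre/post tags."""
    # Longest tokens first so a longer token wins over a shorter prefix.
    words = [re.escape(t) for t in sorted(tokens, key=len, reverse=True) if t]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


# Hypothetical matched tokens for a search on "ocean" that hit on "sea".
print(highlight("The sea was calm.", ["sea", "ocean"]))  # The <em>sea</em> was calm.
```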

Too bad there isn’t a way to get it to return just a list of query keywords that a neural search hit on, because I’m using query_text instead of query_tokens in my search:

POST index_name/_search?explain=true
{
  "_source": false,
  "query": {
    "neural_sparse": {
      "query_text": "hello",
      "model_id": "MODEL_ID_HERE"
    }
  }
}

(My real query is way more complex.)

Explain also works with the query_text option and will return the matched expanded tokens. Returning the matched keywords from the original query text is not supported, because the actual Lucene query (a disjunction of FeatureQuery clauses) is no longer tied to the query_text once model inference has run. The matched tokens may not even appear in the original query text, because the ML model expands them. There is also no way to determine that we matched “ocean” because we searched for “sea”: the deep learning model is still a black box to us, and we can’t parse that logic out of the neural network weights.

I checked the implementation of explain here. It runs in the fetch phase, which means we first get the doc IDs of the top N hits, then fetch their _source and build the explanation object only for those top N hits. So the cost depends only on N, not on the index size, and is much smaller than model inference and the search itself (unless N is very large).

“and I’d have to return the vector embeddings with the source of my query”
I don’t quite follow. Do you mean we have to return the sparse vector in _source to use explain? If so, that isn’t needed: explain works without returning it.

If you think the current explanation is too verbose, there are ways to make it more concise. You can create a feature request in the neural-search repo, and we can discuss with other users and maintainers whether to implement it and when it could be released.

Thank you so much for taking the time to provide all this valuable feedback. I really appreciate it.

My understanding of how the neural_sparse search works is obviously wrong. I was thinking that, since it used BM25 under the hood, it was essentially taking the keywords generated by the inference process and storing them in Lucene, then breaking the search query down using the tokenizer inference process and searching against those keywords. That’s why I thought the matched keywords might be something that was known.

We have two working modes for neural sparse: bi-encoder and doc-only. What you described is how doc-only mode works. Bi-encoder mode, however, still needs model inference for the query (to generate a weight for each token and to expand the query with semantically similar tokens). Any design has to cover both modes, so finding the matched tokens in the query text is not very straightforward for neural sparse.
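
As a simplified illustration of the scoring that the explain output describes ("sum of" terms computed as w * S), the sketch below multiplies each query token weight by the stored document feature value and sums over the tokens both sides share. This is not the actual Lucene implementation; `sparse_score` is a hypothetical name, and the "greeting" entry is made up to show that document tokens missing from the query contribute nothing.

```python
def sparse_score(query_tokens, doc_tokens):
    """Return (score, per-token contributions) for the overlapping tokens.

    query_tokens: {token: w}, the query-side weights
    doc_tokens:   {token: S}, the feature values stored for the document
    """
    contributions = {
        token: weight * doc_tokens[token]
        for token, weight in query_tokens.items()
        if token in doc_tokens
    }
    return sum(contributions.values()), contributions


# Weights and feature values taken from the first hit in the explain example;
# "greeting" is a hypothetical extra document token.
query = {"hi": 1.1, "hello": 1.2}
doc = {"hi": 0.8828125, "hello": 3.078125, "greeting": 0.5}

score, parts = sparse_score(query, doc)
print(round(score, 4))  # 4.6648
print(sorted(parts))    # ['hello', 'hi']
```

The keys of `parts` are the matched tokens. In doc-only mode the query side would come from the tokenizer, and in bi-encoder mode from model inference, but the overlap is computed the same way in this sketch.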

I’m using doc-only (doc and tokenizer models), so at least now I know my mental model wasn’t wrong!

I do think it would be useful for matched tokens to be returned from the search query, but I might be in the minority.