Highlighting not working for fields excluded from _source using term_vector + fvh

Hi everyone,

We have a requirement where we need searchable plaintext fields, but we do not want those fields to be returned in _source responses.

Our mapping looks similar to this:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "conversation_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "stop",
            "snowball"
          ]
        }
      }
    }
  },
  "mappings": {
    "_source": {
      "excludes": [
        "rawQuestion",
        "rawAnswer"
      ]
    },
    "properties": {
      "rawQuestion": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      },
      "rawAnswer": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      }
    }
  }
}

We verified the following:

  • term vectors are generated correctly

  • positions and offsets are present

  • search works correctly on these fields

We are trying to enable highlighting on these fields even though they are excluded from _source.

Sample highlight configuration:

"highlight": {
  "fields": {
    "rawQuestion": {
      "type": "fvh"
    },
    "rawAnswer": {
      "type": "fvh"
    }
  },
  "pre_tags": ["<em>"],
  "post_tags": ["</em>"]
}

However, highlighting is not being returned for these fields.

We also tried:

  • querying only a single field

  • removing fuzziness

  • using require_field_match: false

but highlights are still missing.

We checked the _termvectors API and confirmed that offsets and positions exist correctly.

Could someone please confirm whether highlighting should work in this setup using only term_vector + fvh, or whether store: true is required for reliable highlighting when fields are excluded from _source?

Also, are there any known limitations/issues with fvh highlighting on analyzed/stemmed fields when _source excludes are used?

Any suggestions or guidance would be greatly appreciated. Thanks!

@sudheerN According to the latest OpenSearch documentation, highlighting works only on fields whose original text is available, either from _source or as a stored field. The highlighter needs the field’s content to build the highlighted fragments. "term_vector": "with_positions_offsets" provides term positions and offsets, but not the field text itself. Since these fields are excluded from _source, they must be stored ("store": true) for highlighting to work.
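
For illustration, here is a minimal sketch of how the index definition from the question might look with the two fields stored. It assumes the same field and analyzer names as in the original post and only adds "store": true. The "mappings" wrapper is the standard create-index body layout and may differ from what you actually have in place.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "conversation_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "snowball"]
        }
      }
    }
  },
  "mappings": {
    "_source": {
      "excludes": ["rawQuestion", "rawAnswer"]
    },
    "properties": {
      "rawQuestion": {
        "type": "text",
        "store": true,
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      },
      "rawAnswer": {
        "type": "text",
        "store": true,
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      }
    }
  }
}
```

Note that "store" cannot be changed on an existing field, so this would likely mean creating a new index and re-ingesting the documents from the original data. The excluded fields are not in _source, so a plain reindex from the current index would not carry them over. Stored fields are only returned in search hits when explicitly requested via stored_fields, so the text should still stay out of _source in responses.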

Thanks a lot for the clarification. This really helped us understand the issue better.

Really appreciate your help and guidance on this!