Hi everyone,
We have a requirement where we need searchable plaintext fields, but we do not want those fields to be returned in _source responses.
Our mapping looks similar to this:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "conversation_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "stop",
            "snowball"
          ]
        }
      }
    }
  },
  "mappings": {
    "_source": {
      "excludes": [
        "rawQuestion",
        "rawAnswer"
      ]
    },
    "properties": {
      "rawQuestion": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      },
      "rawAnswer": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "similarity": "BM25",
        "analyzer": "conversation_analyzer"
      }
    }
  }
}
We verified the following:
- term vectors are generated correctly
- positions and offsets are present
- search works correctly on these fields
We are trying to enable highlighting on these fields even though they are excluded from _source.
Sample highlight configuration:
"highlight": {
  "fields": {
    "rawQuestion": {
      "type": "fvh"
    },
    "rawAnswer": {
      "type": "fvh"
    }
  },
  "pre_tags": ["<em>"],
  "post_tags": ["</em>"]
}
However, highlighting is not being returned for these fields.
We also tried:
- querying only a single field
- removing fuzziness
- using require_field_match: false
but highlights are still missing.
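For reference, the simplest of these variations can be sketched as the request body below. The match query and the query text are assumptions for illustration, since the exact query is not shown in this post; only the highlight settings are taken from the configuration above.

```json
{
  "query": {
    "match": {
      "rawQuestion": {
        "query": "PLACEHOLDER search text"
      }
    }
  },
  "highlight": {
    "require_field_match": false,
    "pre_tags": ["<em>"],
    "post_tags": ["</em>"],
    "fields": {
      "rawQuestion": {
        "type": "fvh"
      }
    }
  }
}
```

This version queries a single field, uses no fuzziness, and sets require_field_match to false at the top level of the highlight block.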
We checked the _termvectors API and confirmed that offsets and positions exist correctly.
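A check of this kind can be sketched as the body of a GET <index>/_termvectors/<doc_id> request, where the index name and document ID are placeholders. The flags are standard _termvectors options, but which of them were set in the actual check is not stated here, so treat the values as illustrative.

```json
{
  "fields": ["rawQuestion", "rawAnswer"],
  "positions": true,
  "offsets": true,
  "payloads": false,
  "term_statistics": false,
  "field_statistics": false
}
```

With positions and offsets enabled, each returned term should list its tokens with position, start_offset, and end_offset values.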
Could someone please confirm whether highlighting should work in this setup using only term_vector + fvh, or whether store: true is required for reliable highlighting when fields are excluded from _source?
Also, are there any known limitations/issues with fvh highlighting on analyzed/stemmed fields when _source excludes are used?
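For clarity, the store: true variant asked about above would look roughly like the field mapping below. This is a sketch of the option being considered, not a confirmed fix: the only change from the mapping shown earlier is the added "store": true on each field, on the assumption that the highlighter needs the original field text from either _source or a stored field.

```json
{
  "properties": {
    "rawQuestion": {
      "type": "text",
      "store": true,
      "term_vector": "with_positions_offsets",
      "similarity": "BM25",
      "analyzer": "conversation_analyzer"
    },
    "rawAnswer": {
      "type": "text",
      "store": true,
      "term_vector": "with_positions_offsets",
      "similarity": "BM25",
      "analyzer": "conversation_analyzer"
    }
  }
}
```

Changing store on an existing field would presumably require creating a new index and reindexing, so we would like to confirm the behavior before going down that path.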
Any suggestions or guidance would be greatly appreciated. Thanks!