Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.11
Describe the issue:
I set the ignore_above option to 32000 for the message field in the mapping of a specific index in OpenSearch.
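For reference, a mapping of this kind might look like the sketch below. The post does not state the field type or the index name, so the keyword type and the placeholder index name my-logs are assumptions made only for illustration.

```python
import json

# Hypothetical mapping body. The "keyword" type is an assumption:
# the post only says that ignore_above was set to 32000 on "message".
mapping_body = {
    "mappings": {
        "properties": {
            "message": {
                "type": "keyword",
                "ignore_above": 32000,
            }
        }
    }
}

# JSON body one would send when creating the index,
# e.g. PUT /my-logs (the index name is a placeholder).
print(json.dumps(mapping_body, indent=2))
```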
However, when I tried to insert certain data into the index, I encountered the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Document contains at least one immense term in field=\"message\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[60, 49, 51, 52, 62, 49, 32, 50, 48, 50, 52, 45, 49, 50, 45, 49, 56, 32, 49, 53, 58, 50, 48, 58, 49, 48, 32, 49, 55, 50]...', original message: bytes can be at most 32766 in length; got 34255"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Document contains at least one immense term in field=\"message\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[60, 49, 51, 52, 62, 49, 32, 50, 48, 50, 52, 45, 49, 50, 45, 49, 56, 32, 49, 53, 58, 50, 48, 58, 49, 48, 32, 49, 55, 50]...', original message: bytes can be at most 32766 in length; got 34255",
    "caused_by": {
      "type": "max_bytes_length_exceeded_exception",
      "reason": "max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 34255"
    }
  },
  "status": 400
}
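The "prefix of the first immense term" in the error is a list of byte values. Decoding it can help confirm which log line triggered the failure. A minimal sketch, assuming the bytes are UTF-8 as the error text suggests:

```python
# Byte values copied from the "prefix of the first immense term"
# in the error response above.
prefix_bytes = [
    60, 49, 51, 52, 62, 49, 32, 50, 48, 50, 52, 45, 49, 50, 45,
    49, 56, 32, 49, 53, 58, 50, 48, 58, 49, 48, 32, 49, 55, 50,
]

# Interpret the values as UTF-8 encoded text.
prefix_text = bytes(prefix_bytes).decode("utf-8")
print(prefix_text)  # <134>1 2024-12-18 15:20:10 172
```

The decoded prefix looks like the start of a syslog-style line, so the whole log line appears to have been treated as a single term.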
According to the error message, the value of the message field is 34255 bytes long, which exceeds the 32766-byte limit and caused the error.
However, I had explicitly set ignore_above to 32000, so why was this value not ignored, and why did OpenSearch still try to index it?
The value in the message field is 37,695 characters long including spaces and 37,322 characters excluding spaces. Either way, the character count exceeds 32,000.
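One thing that may be worth checking is how the length was measured. The error reports a length in UTF-8 bytes, while ignore_above is, as far as I understand it, compared against a character count, and the two can differ a lot for non-ASCII text. The sketch below uses made-up sample strings (not the actual log) to show the gap. Note that Python's len counts code points, which may not match exactly how OpenSearch counts characters internally.

```python
from typing import Tuple

LUCENE_MAX_TERM_BYTES = 32766  # limit quoted in the error message


def lengths(value: str) -> Tuple[int, int]:
    """Return (character count, UTF-8 byte count) for a string."""
    return len(value), len(value.encode("utf-8"))


# Made-up samples: same character count, different byte counts.
samples = {
    "ascii": "a" * 20000,      # 1 byte per character in UTF-8
    "non_ascii": "€" * 20000,  # 3 bytes per character in UTF-8
}

for name, sample in samples.items():
    chars, nbytes = lengths(sample)
    over_limit = nbytes > LUCENE_MAX_TERM_BYTES
    print(f"{name}: chars={chars} bytes={nbytes} over_byte_limit={over_limit}")
```

Both samples are below a character threshold of 32000, yet only the ASCII one fits under the byte limit.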
A second question: I changed the ignore_above value for the message field to 8000 and tried to insert the same log into the index again. This time, the message field was ignored and the document was indexed successfully. Why did it work this time?
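A possible way to reason about the two settings is worst-case arithmetic. The sketch below assumes that ignore_above is a character count and that a character takes at most 4 bytes in UTF-8; under those assumptions a setting is always safe only if the setting times 4 stays within the 32766-byte limit. The function names are hypothetical helpers, not part of any OpenSearch API.

```python
LUCENE_MAX_TERM_BYTES = 32766  # limit quoted in the error message
MAX_UTF8_BYTES_PER_CHAR = 4    # upper bound for one UTF-8 encoded character


def worst_case_bytes(ignore_above: int) -> int:
    """Largest UTF-8 size a value of ignore_above characters could have."""
    return ignore_above * MAX_UTF8_BYTES_PER_CHAR


def is_always_safe(ignore_above: int) -> bool:
    """True if no value that passes ignore_above can exceed the byte limit."""
    return worst_case_bytes(ignore_above) <= LUCENE_MAX_TERM_BYTES


for setting in (8000, 32000):
    print(setting, worst_case_bytes(setting), is_always_safe(setting))

# Largest setting that is safe under these assumptions.
print(LUCENE_MAX_TERM_BYTES // MAX_UTF8_BYTES_PER_CHAR)  # 8191
```

Under these assumptions, 8000 can never let an oversized term through, whereas 32000 can when the text contains enough multi-byte characters.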