Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.9.0 running in Docker, macOS Ventura, Firefox
Describe the issue:
I followed the guide "How to Use the Synonyms Feature Correctly in Elasticsearch" by Lynn Kwong (Towards Data Science) to configure an analyzer that filters tokens against a synonyms.txt file.
However, when I test the analyzer by sending the same request repeatedly, the response alternates indefinitely between returning only the term I supplied as a token and returning that term plus its synonyms:
First result:
{
  "tokens": [
    {
      "token": "PS",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
Second result:
{
  "tokens": [
    {
      "token": "PlayStation",
      "start_offset": 0,
      "end_offset": 6,
      "type": "SYNONYM",
      "position": 0
    },
    {
      "token": "PS",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
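To make the difference between the two results explicit, here is a minimal Python sketch that diffs the token lists of two analyzer responses. The two dictionaries are the responses quoted above, reduced to the fields that differ; the helper names `token_set` and `diff_responses` are hypothetical and only used for illustration.

```python
# The two responses quoted above, reduced to token, type and position.
first = {"tokens": [{"token": "PS", "type": "<ALPHANUM>", "position": 0}]}
second = {
    "tokens": [
        {"token": "PlayStation", "type": "SYNONYM", "position": 0},
        {"token": "PS", "type": "<ALPHANUM>", "position": 0},
    ]
}


def token_set(response):
    """Collect (token, type, position) triples from an analyzer response."""
    return {(t["token"], t["type"], t["position"]) for t in response["tokens"]}


def diff_responses(a, b):
    """Return the triples found only in a, and the triples found only in b."""
    set_a, set_b = token_set(a), token_set(b)
    return sorted(set_a - set_b), sorted(set_b - set_a)


only_first, only_second = diff_responses(first, second)
print("only in first result:", only_first)
print("only in second result:", only_second)
```

The only difference between the two results is the extra SYNONYM token, so the synonym filter appears to be applied on some requests and skipped on others.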
As a result (I think), search queries behave non-deterministically. Strictly speaking it’s not REALLY non-deterministic, since there is a pattern, but the point is that the same request gives me different results.
I’m a complete newbie to Elastic/OpenSearch, but I can’t imagine this is the expected behavior?
Thank you.
Configuration:
PUT /inventory_synonym_graph_file
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "index_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase"
            ]
          },
          "search_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "synonym_filter"
            ]
          }
        },
        "filter": {
          "synonym_filter": {
            "type": "synonym_graph",
            "synonyms_path": "synonyms.txt",
            "updateable": true
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "index_analyzer",
        "search_analyzer": "search_analyzer"
      }
    }
  }
}
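For anyone trying to reproduce this, the repeated request against the index above could be scripted roughly as follows. This is only a sketch under assumptions: the base URL, the plain-HTTP connection without authentication (for example, with the security plugin disabled), and the sample text are placeholders, and the helper names `build_analyze_request` and `run_analyze` are hypothetical. It uses the `_analyze` endpoint with the `search_analyzer` defined in the configuration.

```python
import json
import urllib.request


def build_analyze_request(index, analyzer, text):
    """Build the path and JSON body for an _analyze call on one index."""
    path = f"/{index}/_analyze"
    body = {"analyzer": analyzer, "text": text}
    return path, body


def run_analyze(base_url, index, analyzer, text, repeats=4):
    """Send the same _analyze request several times; return the tokens per call.

    Assumes the cluster is reachable at base_url over plain HTTP without auth.
    """
    path, body = build_analyze_request(index, analyzer, text)
    data = json.dumps(body).encode("utf-8")
    results = []
    for _ in range(repeats):
        request = urllib.request.Request(
            base_url + path,
            data=data,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        results.append([t["token"] for t in payload["tokens"]])
    return results


# Building the request needs no cluster; "PS" is a placeholder text.
path, body = build_analyze_request(
    "inventory_synonym_graph_file", "search_analyzer", "PS"
)
print(path, json.dumps(body))
```

Calling `run_analyze("http://localhost:9200", "inventory_synonym_graph_file", "search_analyzer", "PS")` against a running cluster would return one token list per request, which makes any alternation easy to spot.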
Relevant Logs or Screenshots: