Hello,
I want to implement search over documents that contain a lot of German terms, and German is full of compound words.
It seems I can use the hyphenation_decompounder token filter for that.
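For reference, a minimal sketch of what index settings using this filter might look like, written as a Python dict that can be sent as the body of an index-creation request. The filter name "german_decompounder", the analyzer name "german_compound", and the three-word word list are hypothetical placeholders; only the filter type and its parameter names come from the token filter itself.

```python
import json


def build_index_settings(patterns_path="analysis/hyphenation_patterns.xml"):
    """Build an index-creation body that wires up a hyphenation_decompounder filter.

    The filter and analyzer names and the word list are illustrative
    assumptions, not values taken from a real cluster.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "german_decompounder": {  # hypothetical filter name
                        "type": "hyphenation_decompounder",
                        # Absolute path, or relative to the node's config directory.
                        "hyphenation_patterns_path": patterns_path,
                        # Subwords the filter is allowed to emit (placeholder list).
                        "word_list": ["kaffee", "tasse", "zucker"],
                        "min_subword_size": 4,
                    }
                },
                "analyzer": {
                    "german_compound": {  # hypothetical analyzer name
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "german_decompounder"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_index_settings(), indent=2))
```

The patterns file is referenced by path, so it has to exist on every node before an index with these settings can be created.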
My problem: I’m using managed OpenDistro from Deutsche Telekom, and I don’t understand how to load hyphenation_patterns.xml onto the cluster.
In this specific case I would recommend that you experiment with this feature locally first. This will help you understand exactly what you need to have in place.
I haven’t used this specific token filter myself (I have not needed to work with German-like languages, which this filter is primarily used for), but I think I see what the issue is.
It is looking for the “analysis/hyphenation_patterns.xml” file, which was not found at “/rds/datastore/elasticsearch/v7.6.2/package/elasticsearch-7.6.2/config/analysis/hyphenation_patterns.xml”.
It seems to expect this file at “<elasticsearch_install_folder>/config/analysis/hyphenation_patterns.xml”, which is why I recommend testing this locally first, so that you better understand which file needs to go where.
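To make the lookup concrete, here is a small Python sketch of how a relative patterns path appears to be resolved. It assumes the config directory is “<elasticsearch_install_folder>/config”, which matches the path in the error above but may differ for other install types. The function name resolve_patterns_path is made up for this illustration.

```python
from pathlib import PurePosixPath


def resolve_patterns_path(install_dir, patterns_path):
    """Return the location a node would check for the patterns file.

    Assumption: the config directory is <install_dir>/config. A relative
    patterns_path is joined onto it; an absolute one is used unchanged.
    """
    return PurePosixPath(install_dir) / "config" / patterns_path


if __name__ == "__main__":
    install_dir = "/rds/datastore/elasticsearch/v7.6.2/package/elasticsearch-7.6.2"
    print(resolve_patterns_path(install_dir, "analysis/hyphenation_patterns.xml"))
```

With the install folder from the error message, this yields the same location that was reported as missing.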
I do not know what “managed OpenDistro from Deutsche Telekom” offers you, but I would guess that the “config/analysis/hyphenation_patterns.xml” file is simply not in place. The reason is that in most cases you need to install it yourself before starting the node. There are several hyphenation files available for download on the web, but they usually come with a specific license, and in many cases that license does not play well with the Apache License 2.0 (AL2), which means the files cannot easily be distributed with OpenDistro (or OpenSearch). That is why users have to download and install them manually.
The problem is not finding the file. I have it.
I also tried this locally via Docker: I can easily copy the file into place, and it works. So having SSH access to the server would solve the problem.
The question is how to copy a config file to the managed version of OpenDistro from Deutsche Telekom.
I don’t have SSH access there, so I can’t just copy the file over. I looked for API endpoints that could do it, but I cannot find a way to extend the cluster the way Elastic Cloud allows via extensions: https://cloud.elastic.co/deployment-features/extensions
I don’t see any API in OpenDistro that allows such things.