Built-in Filters and Tokenizers Available in OpenSearch from ElasticSearch

Hello -
Older versions (7.xx) of Elasticsearch have lots of built-in tokenizers and filters. However, the OpenSearch documentation doesn’t list them. I was wondering whether there is a list of built-in tokenizers and filters for OpenSearch, so that we can decide on our move to OpenSearch.

I am particularly interested in N-Gram and Edge N-Gram.

Thanks,
Manu

So it’s funny, I am literally looking for the same thing now… It seems like there isn’t a comprehensive list at the moment, but here is an example I found of an edge-ngram filter being used.

Yes, all those token filters exist. The documentation is not great, but you can always refer to Elastic’s 7.10 documentation, since 7.10 is the version OpenSearch originated from.

https://www.elastic.co/guide/en/elasticsearch/reference/7.10/analysis-tokenfilters.html
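For illustration, here is a rough sketch in Python of what an edge n-gram token filter definition looks like, following the settings described in the 7.10 docs linked above. The names "autocomplete_filter" and "autocomplete" and the gram sizes are made up for this example, and the edge_ngrams helper is only a local approximation of what the filter emits for a single token, not OpenSearch code. It is worth checking the output against the _analyze API on your own cluster.

```python
import json

# Index settings sketch. "autocomplete_filter" and "autocomplete" are
# hypothetical names; min_gram/max_gram values are arbitrary examples.
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                "autocomplete_filter": {
                    "type": "edge_ngram",  # built-in token filter type
                    "min_gram": 2,
                    "max_gram": 10,
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "autocomplete_filter"],
                }
            },
        }
    }
}


def edge_ngrams(token, min_gram, max_gram):
    """Approximate the filter locally: prefixes of the token from
    min_gram up to max_gram characters (capped at the token length)."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


# JSON body you would send when creating the index.
print(json.dumps(index_body, indent=2))

# Prefixes the helper produces for one token.
print(edge_ngrams("search", 2, 10))
```

The printed JSON would be the request body when creating an index; the plain n-gram filter is configured the same way with "type": "ngram".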

I hope I didn’t step on anyone’s toes, but I took the liberty of filing https://github.com/opensearch-project/documentation-website/issues/790

That’s definitely something we should document. Thanks for letting us know!

Thank you all for your suggestions!

@nateynate thanks for filing the request. It will be great to have this documented.