Term vectors, stemmers, tokenizers, stop words, etc.

Last year I ran a pilot project that used Elasticsearch to power full-text search. I was using 7.7 at the time, on the AWS Elasticsearch Service.

I created custom analyzers as described in “Language analyzers” (Elasticsearch Guide [8.4]), used filters like html_strip, and was planning to use the Term Vectors API (“Term vectors API”, Elasticsearch Guide [8.4]).
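
As a rough sketch of the kind of setup described above (not the poster’s actual configuration), the request bodies could be built like this in Python. The index name `articles`, the field name `body`, and the analyzer name `html_english` are hypothetical. The syntax follows the Elasticsearch 7.x documentation; that OpenSearch accepts it unchanged is an assumption based on its 7.10.2 lineage. Note that `html_strip` is configured as a character filter, not a token filter.

```python
import json

# Hypothetical names, used only for illustration.
INDEX = "articles"
FIELD = "body"


def build_index_body():
    """Build index settings with a custom analyzer and a term-vector mapping."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Stop words and stemming, as in the stock English analyzer.
                    "english_stop": {"type": "stop", "stopwords": "_english_"},
                    "english_stemmer": {"type": "stemmer", "language": "english"},
                },
                "analyzer": {
                    "html_english": {
                        "type": "custom",
                        # Strip HTML markup before tokenizing.
                        "char_filter": ["html_strip"],
                        "tokenizer": "standard",
                        "filter": ["lowercase", "english_stop", "english_stemmer"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                FIELD: {
                    "type": "text",
                    "analyzer": "html_english",
                    # Store term vectors at index time so they need not be recomputed.
                    "term_vector": "with_positions_offsets",
                }
            }
        },
    }


def build_termvectors_request(doc_id):
    """Return the (path, body) pair for a term vectors request on one document."""
    path = f"/{INDEX}/_termvectors/{doc_id}"
    body = {
        "fields": [FIELD],
        "term_statistics": True,
        "positions": True,
        "offsets": True,
    }
    return path, body


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
    path, body = build_termvectors_request("1")
    print(path)
    print(json.dumps(body, indent=2))
```

The first body would be sent with `PUT /articles` when creating the index, and the second with `GET` on the returned path. Running both against a test cluster is a quick way to check whether a given OpenSearch version behaves the same as Elasticsearch 7.10.2 here.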

Now that I’m revisiting the project, I can’t find references to any of these in the OpenSearch documentation, though if it is a 7.10 fork those features should be there. Is this an oversight in the documentation or are there differences between the projects?

In general I’m a bit worried, as all the OpenSearch docs are focused on log ingestion and there aren’t many examples of the text analysis capabilities. Knowing this would help me a lot in deciding whether to base this project on OpenSearch or go with an Elastic Cloud license.

Thanks!

Hello @orestis - welcome to the community. As you mentioned, yes, OpenSearch is derived from 7.10.2. However, we did not fork the documentation at the time. The team is working diligently on building the necessary content for the documentation, and we track that work in the open on the GitHub repository. Here is a direct link to the backlog. I hope this helps.

Thanks for clarifying. I thought it might be a documentation issue. I guess until the docs are rewritten (they weren’t under the same license? huh) I can use the existing Elasticsearch 7.10.2 docs for some things.
