Migrate from Azure Search?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

OpenSearch 2.11 on Ubuntu 22.04. Fresh installation: a single node for testing, and a 3-node cluster for production.

Describe the issue:

I’m trying to migrate a search service currently running on Azure Cognitive Search to OpenSearch 2.11.

My index is for “books” and I have 2 million documents.

I have several concerns:

  • Azure facets outperform OpenSearch aggregations on huge categories (more than 500k docs).
  • Azure scoring is really fast too. I tried script_score and field_value_factor, but both add noticeably to my response time.

Mind you, I’m not saying that everything is bad, because for categories with fewer than 100,000 documents, OpenSearch is faster.

It’s frustrating: two or three entry points in my application would go from 1.1s to 2.7s …

Do you have any ideas?

I’ve already enabled concurrent segment search, and my test platform is a Ryzen 9 5950X (decent single-thread performance).

Thanks

Yathus

Some ideas:

  • Tweak your merge policy for query performance. Some insights here: Solr: Optimize Is (Not) Bad for You – Video & Slides (yes it’s old and for Solr, but 95% of this stuff will apply for you).
  • Check your queries: if you can get rid of scripts, that would be great. If you can move clauses that don’t influence relevance into filters, that would be great too, because filter clauses skip scoring and can be cached.
  • Monitor OpenSearch and check cache hit ratios and evictions. If a cache has a decent hit ratio but lots of evictions, increase its size. Just make sure you have enough heap, too.
  • Metrics should tell you what your bottleneck is. Do you hit 100% CPU when you see those slow queries? If you do, maybe you can add more nodes or bigger CPUs; I assume self-hosting OpenSearch will offset the costs anyway. If you don’t, then it sounds like you’ll have to parallelize more. Without concurrent search, some more shards should help. With concurrent search, check your number of threads and make segments more even (see the merge policy options above): you’ll want to reduce the maximum segment size (index.merge.policy.max_merged_segment) and increase index.merge.policy.floor_segment. This will make the big segments smaller and reduce the number of small segments.
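
To make the query and merge-policy suggestions above more concrete, here is a rough sketch of the two request bodies, built as plain Python dicts. It is only an illustration under assumptions: the field names title, category, publisher and popularity are hypothetical (I don't know your mapping), and the segment sizes are placeholders to experiment with, not recommended values.

```python
import json


def build_search_body(text, category):
    """Search body with the category restriction in filter context.

    Field names (title, category, publisher, popularity) are hypothetical.
    """
    return {
        "size": 20,
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        # Scored clause: only the full-text match affects relevance.
                        "must": [{"match": {"title": text}}],
                        # Non-scoring clause: no score computed, result is cacheable.
                        "filter": [{"term": {"category": category}}],
                    }
                },
                # Built-in function instead of a script_score script.
                "field_value_factor": {
                    "field": "popularity",
                    "modifier": "log1p",
                    "missing": 1,
                },
                "boost_mode": "sum",
            }
        },
        # Facet-style terms aggregation on a keyword field.
        "aggs": {"by_publisher": {"terms": {"field": "publisher", "size": 20}}},
    }


def build_merge_settings(max_merged_segment="2gb", floor_segment="64mb"):
    """Index settings body for more even segments (sizes are placeholders)."""
    return {
        "index": {
            "merge": {
                "policy": {
                    "max_merged_segment": max_merged_segment,
                    "floor_segment": floor_segment,
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("dune", "science-fiction"), indent=2))
    print(json.dumps(build_merge_settings(), indent=2))
```

The first body would go to POST /books/_search and the second to PUT /books/_settings, assuming the index is called books. For the cache check, GET /_nodes/stats/indices/query_cache,request_cache should show hit, miss and eviction counts per node. Change one thing at a time and measure, since smaller maximum segments also mean more segments to search.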