Timeouts during the search load test

PaulNarbe · December 22, 2022, 5:37pm

Versions
OpenSearch 2.3

Describe the issue
We implemented a search-query load test for an OpenSearch cluster with the k-NN plugin installed.
During the load test we get a lot of timeouts shortly after the start.
An index with ~22M items is pre-loaded (144 GB storage, 32 shards, 890 segments). The vectors are 512-dimensional, and Lucene HNSW is used.
ef_construction: 32
M: 64
load test concurrency: 25
What is the expected behavior?
Either latencies are low, or one of the watched metrics clearly shows the cause of the problem (something to be scaled or reconfigured).
What is your host/environment?
The cluster runs on 16 m6g.xlarge.search data nodes and 3 r6g.large.search master nodes.
Do you have any additional context?
We’re trying to find the source of the problem by monitoring CPUUtilization, JVMMemoryPressure, and Free Storage.
None of those get close to their limits during the test.
KNNGraphMemoryUsage is always 0, which differs from the faiss and nmslib HNSW tests.
Can you please give me any guidance on which metrics or potential problems I should look for?

martin.g · January 4, 2023, 9:59pm

Hello,

What is your timeout setting (I assume the timeouts occur on the query load), the value of k, the amount of RAM allocated to the JVM on the data nodes, and the number of replicas?

I can give a few general recommendations based on the description provided:
Try values of m and ef_construction that are closer to the Lucene defaults: m = 16, ef_construction = 100.
I’m not sure whether you merge segments, but with your settings 890 segments / 32 shards ≈ 28 segments per shard, which can be a lot. Try lowering the number of segments to, say, 10 per shard: you can call the force merge API (_forcemerge) with max_num_segments = 10 (the number of segments is per shard). More segments give you better recall, but the tradeoff is higher latency, because the per-segment search times add up and the sum is greater than the search time for a single big segment.
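
To make the two suggestions above concrete, here is a rough Python sketch, not a definitive setup. It only builds the request path and the mapping body and sends nothing to a cluster. The index name my-knn-index, the field name my_vector, and the l2 space type are assumptions made for illustration; none of them come from the original post. Note that m and ef_construction are set when the index is created, so trying new values would most likely mean reindexing.

```python
import json
from urllib.parse import urlencode


def segments_per_shard(total_segments: int, shards: int) -> float:
    """Average number of segments per shard."""
    return total_segments / shards


def forcemerge_path(index: str, max_num_segments: int) -> str:
    """Path and query string for a force merge request (sent with POST)."""
    query = urlencode({"max_num_segments": max_num_segments})
    return f"/{index}/_forcemerge?{query}"


def knn_mapping(field: str, dimension: int, m: int = 16,
                ef_construction: int = 100) -> dict:
    """Index body for a Lucene HNSW knn_vector field.

    The space type "l2" is an assumption; the thread does not state it.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",
                        "engine": "lucene",
                        "space_type": "l2",  # assumed
                        "parameters": {
                            "m": m,
                            "ef_construction": ef_construction,
                        },
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    # Figures from the thread: 890 segments spread over 32 shards.
    print(segments_per_shard(890, 32))
    # Hypothetical index name; POST this path to merge down to 10 segments per shard.
    print("POST", forcemerge_path("my-knn-index", 10))
    # Hypothetical field name; 512 dimensions as in the thread.
    print(json.dumps(knn_mapping("my_vector", 512), indent=2))
```

A force merge on an index of this size can take a long time and is usually best run while the index is not being written to.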
