When performing bulk requests we see a performance difference between `_search?pretty&preference=_replica` and `_search?pretty&preference=_primary`.

OpenSearch 2.19.3

  1. With `preference=_primary_first`, search operations are always fast, taking ~40ms.
  2. With `preference=_replica`, the first query after a short period without queries is slow, taking 2-5s.
  3. This happens while bulk operations are running.
  4. Setting `request_cache=false` does not help.
  5. “First query slow, then fast; stop querying for 1 minute and the first query is slow again” sounds like a cache-miss problem, but growing the data nodes from 32G to 256G does not help, and disabling the query cache, request cache, and fielddata cache is all useless (the two requests we compared are sketched after this list).
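
To make the comparison concrete, these are the two requests side by side (a minimal sketch; the index name `commodity` and the match_all body are placeholders):

# slow path: the first query after a quiet period takes 2-5s
curl -s "localhost:9200/commodity/_search?pretty&preference=_replica&request_cache=false" \
  -H 'Content-Type: application/json' -d '{"query": {"match_all": {}}}'

# fast path: consistently ~40ms
curl -s "localhost:9200/commodity/_search?pretty&preference=_primary" \
  -H 'Content-Type: application/json' -d '{"query": {"match_all": {}}}'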

Running this on an OSS-based cluster makes the effect more obvious; on SSD, the slow query takes 3-5s.

Even more confusing: after `refresh_interval` of the target index was explicitly set to 1s, which is exactly the default value, no slow query appears ???!!!
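
For reference, this is the whole settings change (a minimal sketch; the index name `commodity` is a placeholder). As far as I understand, explicitly setting `index.refresh_interval`, even to its default, opts the index out of the search-idle optimization:

curl -s -X PUT "localhost:9200/commodity/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index": {"refresh_interval": "1s"}}'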

Is this about search idle??? (“Skip shard refreshes if shard is `search idle`” by s1monw, Pull Request #27500, elastic/elasticsearch on GitHub)

It was search idle !!! One week of debugging !!!

Hey @SearchBoy ,

Could you provide a sample of the bulk request you use when you notice the speed differences?

Leeroy.

Hey @SearchBoy ,

From initial testing I don’t see any consistent difference. Could you provide a sample of the bulk request, the size of the index, and the total number of shards and replicas?

Leeroy.

Details

  1. Two data nodes (8c/32G) and one cluster-manager node (8c/32G)

  2. 6 primary shards with 6 replica shards

  3. Every shard: “p STARTED 1553477 1.6gb” (from `_cat/shards`)

  4. The query is just like:

     {
       "from": 0,
       "size": 40,
       "query": {
         "bool": {
           "must": [
             {"term": {"commodity_status": {"value": 20, "boost": 1.0}}},
             {"term": {"commodity_show_status": {"value": 0, "boost": 1.0}}},
             {"terms": {"template_id": …}}
           ],
           "adjust_pure_negative": true,
           "boost": 1.0
         }
       },
       "sort": [{"commodity_update_time": {"order": "desc"}}],
       "track_total_hits": false
     }

  5. The bulk traffic just updates these existing docs at 500 TPS (translog durability: `request`); a representative bulk body is sketched after this list.
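
Since you asked for a sample bulk request, here is a representative sketch; the doc IDs and field values are placeholders, and the index name `commodity` is assumed:

curl -s -X POST "localhost:9200/commodity/_bulk" \
  -H 'Content-Type: application/x-ndjson' --data-binary '
{"update": {"_id": "1001"}}
{"doc": {"commodity_status": 20, "commodity_update_time": 1700000000000}}
{"update": {"_id": "1002"}}
{"doc": {"commodity_show_status": 0, "commodity_update_time": 1700000000001}}
'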

Besides

  1. I think this is related to IO, because in my mirror cluster, which is backed by an OSS file system, searching with `_replica` takes 30-50s while `_primary` takes 40ms.
  2. What confuses me:
    1. I cannot grab a waiting stack with jstack; there seems to be no long waiting action (the exact command is sketched after the profiler loop below).

    2. I printed on-CPU and wall-time flame graphs every 5s, 6 times back to back (30s in total, repeated 3 times), on the OSS-FS cluster to make things easier for jstack/asprof. However:

      1. The first 2 or 3 on-CPU graphs show no running search stack, which seems to mean either no search operation at all or “blocked before reaching the search threadpool” during those 10-15s? {picture: no-search-stack}
      2. The last 3 or 4 on-CPU graphs show “org/opensearch/search/SearchService.executeQueryPhase”, which means the search action really did cost ~20s. {picture: QueryPhase}
      3. The wall-time graphs show no OpenSearch waiting. {picture: wall-time}

# Capture six consecutive 5-second on-CPU profiles (1ms sampling interval) of
# the OpenSearch JVM (PID 1 here); the wall-clock variant is kept for reference.
for ((i=0;i<6;i++)); do
  date
  #./async-profiler/build/bin/asprof --wall 1ms -d 5 -f perf_wall$i.html 1
  ./async-profiler/build/bin/asprof -e cpu -i 1ms -d 5 -f perf_cpu$i.html 1
done
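
For completeness, the jstack attempt mentioned above was just a plain thread dump against the same JVM, e.g.:

# Dump all thread stacks (with lock info) of the OpenSearch JVM during a slow query
jstack -l 1 > jstack.$(date +%s).txt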

I am really confused about which OpenSearch function the search query is waiting on….

2 or 3 consecutively captured graphs show the same stack

5 consecutively captured graphs show the same stack

Sorry, the last 3 graphs do have a search-related stack, but its rate is lower than 0.1%.

I really appreciate you answering my question.

It’s all about search idle. Thanks!

The replica is receiving these updates, but because it was “search idle” it skipped the periodic refreshes, so the first search after an idle period has to wait for a refresh and may not have the new segment data loaded into the OS page cache.
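
For anyone who lands here later: as far as I know, a shard becomes search idle after `index.search.idle.after` (default 30s) without incoming searches, and only when `index.refresh_interval` has not been set explicitly. So the fix is either to set `refresh_interval` explicitly (as above) or to tune the idle threshold; "10m" below is just an illustrative value and `commodity` is still a placeholder index name:

# index.search.idle.after is a dynamic index setting
curl -s -X PUT "localhost:9200/commodity/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.search.idle.after": "10m"}'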