Performance with Parallel API Calls in OpenSearch Serverless

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue:
I’m using AWS OpenSearch Serverless to manage around 1 million documents with a parent-child relationship. I’ve exposed a REST API that builds an aggregation (aggs) query from each user request and runs it against OpenSearch through the _search API.

When testing the API via Postman (single request), the response is fast—returning in just a few milliseconds. However, when I trigger 10-15 parallel API calls from the UI/client-side, the response time increases significantly to around 6-7 seconds.
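To reproduce the single-request versus parallel-request difference outside the UI, a small harness along these lines could help. It is only a sketch: `send_query` is a hypothetical callable standing in for one call to the REST API, and `fake_query` is a placeholder that just sleeps, so the numbers it prints say nothing about OpenSearch itself until it is swapped for a real request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(send_query):
    """Run one request and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    send_query()
    return time.perf_counter() - start


def measure_latencies(send_query, concurrency, total_requests):
    """Issue total_requests calls with at most `concurrency` in flight.

    Returns the per-request durations sorted ascending.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed_call, send_query)
                   for _ in range(total_requests)]
        return sorted(f.result() for f in futures)


def summarize(latencies):
    """Reduce a sorted, non-empty list of durations to min/median/max."""
    n = len(latencies)
    return {
        "min": latencies[0],
        "median": latencies[n // 2],
        "max": latencies[-1],
    }


def fake_query():
    """Placeholder (assumption): replace with a real call to the REST API."""
    time.sleep(0.01)


if __name__ == "__main__":
    # Compare a single in-flight request against 5 and 15 parallel ones.
    for concurrency in (1, 5, 15):
        stats = summarize(measure_latencies(fake_query, concurrency, 15))
        print(concurrency, stats)
```

If the median and max grow roughly in step with the concurrency level while the single-request time stays flat, that would point toward requests queueing on limited search capacity rather than the query itself being slow.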

Since indexing performs well with 0.5 OCU, I’d like to know if there’s a way to increase only the minimum Search OCU (e.g., from 0.5 to 2) to better handle concurrent search requests.

Could the performance drop with parallel queries be due to:

  • Search OCU limitations?
  • Resource throttling or lack of redundancy?

Any guidance on this would be greatly appreciated!

Configuration:

  • Redundancy: Disabled
  • Search OCUs: Set to 0.5 (min)
  • Indexing OCUs: Running well with 0.5 OCU due to autoscaling

Relevant Logs or Screenshots: