### Describe the bug
After upgrading our OpenSearch cluster from 2.18.0 to 3.1.0, we observe a significant increase in disk read operations while write throughput/IOPS remain roughly the same. The workload is ingest-only (searches disabled). On 2.18.0, read IOPS stayed around 200–400 per node; on 3.1.0, under the same conditions, read IOPS jump to 400–3000+ per node.
This looks like a regression in the indexing path or segment lifecycle that triggers substantially more background reads during ingestion (most likely from merge threads).
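To check the merge-thread hypothesis, a minimal diagnostic sketch: poll cumulative merge counters per node and diff them between polls. The endpoint and field names come from the OpenSearch nodes-stats API; the host and the idea of polling for deltas are assumptions for illustration.

```python
# Hypothetical diagnostic for the merge-thread hypothesis above. Host is an
# assumption; the endpoint and field names come from the OpenSearch
# nodes-stats API (_nodes/stats/indices/merges).
import json
import urllib.request

def merge_stats_url(host: str = "localhost:9200") -> str:
    # Cumulative merge counters per node since process start.
    return f"http://{host}/_nodes/stats/indices/merges"

def total_merged_bytes(stats: dict) -> int:
    # Sum merged bytes across all nodes; the delta between two polls
    # approximates background merge volume over the interval.
    return sum(
        node["indices"]["merges"]["total_size_in_bytes"]
        for node in stats["nodes"].values()
    )

# Against a live node:
#   with urllib.request.urlopen(merge_stats_url()) as resp:
#       print(total_merged_bytes(json.load(resp)))
```

If the merged-bytes delta tracks the excess read throughput, merges are the likely source of the extra reads.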
**Requests for guidance**
- Are there changes in 3.1.x that would increase background reads during indexing (e.g., segment lifecycle, merge behavior, replication strategy interactions, compaction/refresh defaults)?
- Are there recommended settings in 3.1.x to restore 2.18-like read profiles for ingest-only use cases?
### Related component
Indexing:Performance
### To Reproduce
- Create a 3-node cluster on t4g.2xlarge with gp3 volumes (500 GB, 125 MB/s, 3000 IOPS) using image public.ecr.aws/opensearchproject/opensearch:3.1.0.
- Apply the cluster settings listed below under Additional Details.
- Create indices via the index template shown below.
- Ingest time-series logs at ~1.6k docs/sec per node; disable searches entirely (ingest-only).
- Retain only 2 days of indices, total size ~160 GB, reaching 1971 shards (650+ per node).
- Observe per-node disk metrics: read throughput 20–40 MB/s, read IOPS 400–3000+, while writes remain 15–60 MB/s, 200–600 IOPS.
- Repeat the same steps with OpenSearch 2.18.0 and note that read IOPS stay around 200–400 per node.
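The per-node disk metrics above can be sampled with any disk monitor; as a sketch, the cumulative counters in `/proc/diskstats` can be diffed over an interval to get read/write IOPS. The device name (`nvme1n1` for the gp3 volume) and the 10-second interval are assumptions.

```python
# Sketch of deriving per-device read/write IOPS from /proc/diskstats:
# field 4 (index 3) is reads completed, field 8 (index 7) is writes
# completed, both cumulative since boot.
def read_write_counts(diskstats_text: str, device: str) -> tuple[int, int]:
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) > 7 and fields[2] == device:
            return int(fields[3]), int(fields[7])
    raise ValueError(f"device {device!r} not found")

def iops(before: tuple[int, int], after: tuple[int, int],
         interval_s: float) -> tuple[float, float]:
    # (read IOPS, write IOPS) over the sampling interval.
    return ((after[0] - before[0]) / interval_s,
            (after[1] - before[1]) / interval_s)

# Usage (assumed device name): read /proc/diskstats twice, 10 s apart,
# and pass both snapshots through read_write_counts() and iops().
```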
### Expected behavior
Read IOPS during ingest-only workload should be comparable to 2.18.0 (approximately 200–400 read IOPS per node), given identical hardware, shard layout, and ingestion rate.
**Actual behavior**
On 3.1.0, read IOPS increase to 400–3000+ per node under the same workload and configuration, while write IOPS remain similar to 2.18.0.
**Impact**
Higher disk reads lead to increased storage load and cost risk, potential saturation of gp3 baseline, and reduced indexing headroom.
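A back-of-envelope check of the saturation risk, using the figures reported above: gp3 volumes enforce a single provisioned IOPS budget shared by reads and writes, so the extra reads eat directly into indexing headroom.

```python
# gp3 provisioning from the setup above; reads and writes share one budget.
PROVISIONED_IOPS = 3000

def iops_headroom(read_iops: int, write_iops: int,
                  provisioned: int = PROVISIONED_IOPS) -> int:
    # Remaining IOPS before the volume is throttled.
    return provisioned - (read_iops + write_iops)

# 2.18.0 worst case: ~400 read + ~600 write -> ~2000 IOPS spare.
# 3.1.0 peaks:      ~3000 read + ~600 write -> over budget.
```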
### Additional Details
**Environment**
Test cluster setup:
**OpenSearch version:** 3.1.0 (Docker image public.ecr.aws/opensearchproject/opensearch:3.1.0)
**Previous version (baseline):** 2.18.0
**Cluster size:** 3 data nodes (also cluster-manager/ingest)
**Instance type:** t4g.2xlarge (8 vCPU, 32 GB RAM) on AWS
**Storage:** gp3 500 GB per node, 125 MB/s baseline throughput, 3000 IOPS
**JVM opts:** -Xms16g -Xmx16g -XX:MaxGCPauseMillis=400
**Workload:** time-series log ingestion only (@timestamp field), searches disabled
**Data volume:** indices retained 2 days, total size ~160 GB
**Shards:** 1971 total (~650+ per node)
**Observed metrics (per node)**
**Indexing rate:** ~1.6k docs/sec
**Write throughput:** 15–60 MB/s, 200–600 write IOPS
**Read throughput (problem):** 20–40 MB/s, 400–3000+ peak read IOPS on 3.1.0
(On 2.18.0: typically 200–400 read IOPS)
**Cluster settings:**
```
OPENSEARCH_JAVA_OPTS = "-Xms16g -Xmx16g -XX:MaxGCPauseMillis=400"
node.attr.temp = "hot"
node.roles = "cluster_manager,data,ingest,remote_cluster_client"
plugins.security.ssl.http.enabled = "false"
plugins.security.system_indices.enabled = "false"
plugins.security.ssl.http.clientauth_mode = "NONE"
plugins.security.protected_indices.enabled = "false"
indices.recovery.max_bytes_per_sec = "60mb"
cluster.routing.rebalance.enable = "all"
cluster.routing.allocation.allow_rebalance = "indices_primaries_active"
cluster.routing.allocation.disk.threshold_enabled = "false"
cluster.routing.allocation.node_initial_primaries_recoveries = "2"
cluster.routing.allocation.node_concurrent_recoveries = "16"
cluster.max_shards_per_node = "3000"
cluster.routing.allocation.balance.prefer_primary = "true"
cluster.indices.replication.strategy = "SEGMENT"
```
**Index template:**
```
{
  "replication": { "type": "SEGMENT" },
  "allocation": { "max_retries": "300" },
  "mapping": {
    "total_fields": { "limit": "2000" },
    "depth": { "limit": "20" },
    "ignore_malformed": "true"
  },
  "refresh_interval": "120s",
  "translog": {
    "flush_threshold_size": "1024mb",
    "sync_interval": "120s",
    "durability": "async"
  },
  "unassigned": {
    "node_left": { "delayed_timeout": "5m" }
  },
  "number_of_replicas": "1",
  "merge_on_flush": {
    "enabled": "false",
    "max_full_flush_merge_wait_time": "30s",
    "policy": "default"
  },
  "codec": "default",
  "routing": {
    "allocation": {
      "require": { "temp": "hot" },
      "total_shards_per_node": "3"
    }
  },
  "number_of_shards": "3",
  "use_compound_file": "false",
  "merge": {
    "scheduler": { "max_thread_count": "1" },
    "policy.max_merge_at_once": "10",
    "policy": "log_byte_size"
  }
}
```
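For reproduction, the flat `index.*` settings above map under `template.settings.index` in a composable index template (`PUT _index_template/<name>`). A minimal sketch of building that payload; the template name and index pattern (`logs-*`) are assumptions:

```python
# Sketch: wrap the flat index settings shown above into the body of a
# composable index template (PUT _index_template/<name>). The index
# pattern is an assumption for this ingest-only log workload.
import json

def build_index_template(index_settings: dict,
                         pattern: str = "logs-*") -> dict:
    return {
        "index_patterns": [pattern],
        "template": {"settings": {"index": index_settings}},
    }

# json.dumps(build_index_template({...settings above...})) is the request
# body for PUT _index_template/logs-template.
```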