Slow indexing performance

I moved from ELK 7.x about a month ago, so I’m still learning OpenSearch. So far, though, I’m seeing slower indexing performance. I am using Logstash with the OpenSearch output plugin to push logs to OpenSearch.
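
For context, the Logstash output block is along these lines (the host and index pattern here are placeholders rather than my real values):

    output {
      opensearch {
        hosts => ["https://opensearch:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }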

I see indexing consistently fluctuate between 1,600 and 3,200 EPS. ELK’s EPS was quite a bit higher and more consistent on the same underlying setup. We create three indices per day, and only one of them is large: two are around 100 MB per day, and the other is anywhere from 40 GB to 70 GB per day.

I’ve read what I can find but haven’t figured out how to raise the EPS so far. I’m using the same index template as I did with ELK: refresh interval set to 5 minutes, one shard, and no replicas.
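
The relevant settings boil down to something like this (the template name and index pattern are placeholders):

    PUT _index_template/logs-template
    {
      "index_patterns": ["logs-*"],
      "template": {
        "settings": {
          "index.number_of_shards": 1,
          "index.number_of_replicas": 0,
          "index.refresh_interval": "5m"
        }
      }
    }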

I am running in Docker via Docker Swarm; Docker Compose did not perform very well. It’s just one OpenSearch v1.3.0 node and an OpenSearch Dashboards container, running on an AWS memory-optimized instance, as ELK was.

Any thoughts / tips?

thanks


See [BUG] Indexing Performance Degraded in OpenSearch 1.3.+ · Issue #2916 · opensearch-project/OpenSearch · GitHub. Enable G1GC and see if it helps.


Thanks. How do I enable G1GC for the Docker version of OpenSearch?

I’ve tried the options below, but OpenSearch will not start with these set as OPENSEARCH_JAVA_OPTS in the docker-compose.yml file.

-XX:+UseG1GC -XX:InitialHeapSize=32g -XX:MaxHeapSize=32g -XX:MaxGCPauseMillis=500 -XX:+DisableExplicitGC

Got this from reading: JVM Tuning with G1 GC. A Garbage-First Garbage Collector… | by Mark Nienaber | Medium

OPENSEARCH_JAVA_OPTS is correct; you should probably try something like:

-Xms30g
-Xmx30g
-XX:-UseConcMarkSweepGC
-XX:-UseCMSInitiatingOccupancyOnly
-XX:+UseG1GC
-XX:G1ReservePercent=25
-XX:InitiatingHeapOccupancyPercent=30
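
In docker-compose these are passed through the environment section, roughly like this sketch (the heap size and image tag are examples, adjust them to your instance):

    services:
      opensearch:
        image: opensearchproject/opensearch:1.3.0
        environment:
          - "OPENSEARCH_JAVA_OPTS=-Xms30g -Xmx30g -XX:-UseConcMarkSweepGC -XX:-UseCMSInitiatingOccupancyOnly -XX:+UseG1GC -XX:G1ReservePercent=25 -XX:InitiatingHeapOccupancyPercent=30"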

We observed similar performance degradation when upgrading from 1.2.4 to 1.3. This OpenSearch cluster is deployed via the Helm chart on a k8s environment. We’ll try +UseG1GC.
How do we enable G1GC for a Helm-chart-managed cluster?

Thanks for the thread.

[opensearch@opensearch-cluster-client-57 ~]$ ./jdk/bin/java --version
openjdk 11.0.14.1 2022-02-08
OpenJDK Runtime Environment Temurin-11.0.14.1+1 (build 11.0.14.1+1)
OpenJDK 64-Bit Server VM Temurin-11.0.14.1+1 (build 11.0.14.1+1, mixed mode)
/usr/share/opensearch/jdk/bin/java -Xshare:auto -Dopensearch.networkaddress.cache.ttl=60 -Dopensearch.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.locale.providers=SPI,COMPAT -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.io.tmpdir=/tmp/opensearch-14041072601323771892 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Dclk.tck=100 -Djdk.attach.allowAttachSelf=true -Djava.security.policy=/usr/share/opensearch/plugins/opensearch-performance-analyzer/pa_config/opensearch_security.policy -Dopensearch.cgroups.hierarchy.override=/ -Xmx8g -Xms8g -XX:MaxDirectMemorySize=4294967296 -Dopensearch.path.home=/usr/share/opensearch -Dopensearch.path.conf=/usr/share/opensearch/config -Dopensearch.distribution.type=tar -Dopensearch.bundled_jdk=true -cp /usr/share/opensearch/lib/* org.opensearch.bootstrap.OpenSearch -Ecluster.name=opensearch-cluster -Ediscovery.seed_hosts=opensearch-cluster-master-headless -Enode.roles=ingest,remote_cluster_client, -Enode.name=opensearch-cluster-client-0 -Enetwork.host=0.0.0.0

We observed the performance degradation with OpenSearch 1.3 as well. @mrweber, your bug report helps a lot. Our OpenSearch cluster is deployed with the Helm chart and runs on top of k8s (managed by Rancher).

Applying the G1GC collector is a bit tricky with the Helm chart. Adding a ConfigMap may or may not work; we updated the Helm chart to obtain the config/jvm.options. I’m not a Java person, so it took me a while to apply the change. One key point is the number at the front of each line: it indicates which JDK versions the line applies to. The default for the G1GC lines is JDK 14, but OpenSearch 1.3 uses JDK 11, so you need to change the version range for the G1GC lines; otherwise it’ll fall back to SerialGC, which has very poor performance.

    ## GC configuration
    8-13:-XX:+UseConcMarkSweepGC
    8-13:-XX:CMSInitiatingOccupancyFraction=75
    8-13:-XX:+UseCMSInitiatingOccupancyOnly
    ## G1GC Configuration
    # NOTE: G1 GC is only supported on JDK version 10 or later
    # to use G1GC, uncomment the next two lines and update the version on the
    # following three lines to your version of the JDK
    10-13:-XX:-UseConcMarkSweepGC
    10-13:-XX:-UseCMSInitiatingOccupancyOnly
    10-14:-XX:+UseG1GC
    10-14:-XX:G1ReservePercent=25
    10-14:-XX:InitiatingHeapOccupancyPercent=30

As you can see from the OpenSearch exporter, the indexing rate is much higher and the GC latency and times are way lower. This change, along with increasing the refresh_interval, brings performance back to where it was on OpenSearch 1.2.4, or maybe even better, since we had refresh_interval at 5s before; it’s 60s now.
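
If it helps anyone following along, the refresh interval can be changed on an existing index through the settings API; a quick sketch (the index name and credentials are placeholders for your own):

    curl -k -u admin:admin -XPUT "https://localhost:9200/my-index-2022.06.01/_settings" \
      -H 'Content-Type: application/json' \
      -d '{ "index": { "refresh_interval": "60s" } }'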

@mrweber @theflakes I’m curious, how do you measure indexing performance in general?

First, thanks @mrweber for reporting this and finding the root cause.

If anyone is interested in how this regression slipped through, or interested in contributing to improving performance benchmarking, please see [BUG] Performance tests for 1.3 unable to detect indexing performance degradation · Issue #2985 · opensearch-project/OpenSearch · GitHub.


Hello @hugok,
I’m using the same environment as you (k8s + Rancher). I tried putting this in my Helm values file:

opensearchJavaOpts: "-Xmx10G -Xms10G -XX:-UseConcMarkSweepGC -XX:-UseCMSInitiatingOccupancyOnly -XX:+UseG1GC -XX:G1ReservePercent=25 -XX:InitiatingHeapOccupancyPercent=30"

But it did not work.

I’m using OpenSearch Helm chart version 1.0.4. Which chart version are you using?
Thank you.

Hey @louzadod, you may need to update the Helm chart to obtain the config/jvm.options, as per this comment: Slow indexing performance - #6 by hugok. Hope it helps.
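
Depending on the chart version, you may also be able to mount a custom jvm.options through the config: map in your values file instead of forking the chart. This is only a sketch and assumes your chart version mounts arbitrary files listed under config: into the config directory (check the chart’s values.yaml):

    config:
      # Paste the full stock jvm.options here, adjusting the GC section so
      # G1GC applies to the bundled JDK 11, for example:
      jvm.options: |
        10-13:-XX:-UseConcMarkSweepGC
        10-13:-XX:-UseCMSInitiatingOccupancyOnly
        10-14:-XX:+UseG1GC
        10-14:-XX:G1ReservePercent=25
        10-14:-XX:InitiatingHeapOccupancyPercent=30
        # ...keep the rest of the stock jvm.options unchanged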

Thanks a lot @reta and @hugok.
After running the updated Helm chart, the configuration was updated as below:

cd config

[opensearch@logs-corporativos-data-3 config]$ cat jvm.options
## GC configuration
8-13:-XX:+UseConcMarkSweepGC
8-13:-XX:CMSInitiatingOccupancyFraction=75
8-13:-XX:+UseCMSInitiatingOccupancyOnly
## G1GC Configuration
# NOTE: G1 GC is only supported on JDK version 10 or later
# to use G1GC, uncomment the next two lines and update the version on the
# following three lines to your version of the JDK
10-13:-XX:-UseConcMarkSweepGC
10-13:-XX:-UseCMSInitiatingOccupancyOnly
10-14:-XX:+UseG1GC
10-14:-XX:G1ReservePercent=25
10-14:-XX:InitiatingHeapOccupancyPercent=30

Is this the expected result?
Thank you again.

Regards.

Hi there - I know some time has passed since the last comment on this thread.

I would also be interested to know, in general, how indexing performance might be measured, and how I could identify a lag/buildup in documents and possibly alert on it.

I’d also be interested to know where that Grafana dashboard comes from; I don’t recognise it from the OpenSearch Prometheus exporter.

Thanks, Will.