OpenSearch Benchmark - Most important metrics?

Versions
1.3.5

Describe the issue:
I have started using OpenSearch Benchmark to better understand the performance of OpenSearch, so that we can then try different configurations and instance types (we are running OpenSearch in AWS on EC2 machines) and reduce costs if possible.

The issue is that we don’t know which metrics we should keep an eye on (index-append is probably one).

Configuration:
We have each component running on a separate EC2 instance (master, coordinating, and data nodes).

Relevant Logs or Screenshots:
This is the result of a test with 1 coordinating node, 1 master node, and 1 data node running in AWS.
But I don’t know which of these metrics we need to pay the most attention to.

Metric Task Value Unit
Cumulative indexing time of primary shards 16.56 min
Min cumulative indexing time across primary shards 0.0031 min
Median cumulative indexing time across primary shards 3.2661 min
Max cumulative indexing time across primary shards 3.4066 min
Cumulative indexing throttle time of primary shards 0 min
Min cumulative indexing throttle time across primary shards 0 min
Median cumulative indexing throttle time across primary shards 0 min
Max cumulative indexing throttle time across primary shards 0 min
Cumulative merge time of primary shards 0.199183 min
Cumulative merge count of primary shards 30
Min cumulative merge time across primary shards 0 min
Median cumulative merge time across primary shards 0.0338333 min
Max cumulative merge time across primary shards 0.0646333 min
Cumulative merge throttle time of primary shards 0 min
Min cumulative merge throttle time across primary shards 0 min
Median cumulative merge throttle time across primary shards 0 min
Max cumulative merge throttle time across primary shards 0 min
Cumulative refresh time of primary shards 1.96083 min
Cumulative refresh count of primary shards 342
Min cumulative refresh time across primary shards 0.0174167 min
Median cumulative refresh time across primary shards 0.340917 min
Max cumulative refresh time across primary shards 0.543833 min
Cumulative flush time of primary shards 1.1289 min
Cumulative flush count of primary shards 12
Min cumulative flush time across primary shards 0.0000833333 min
Median cumulative flush time across primary shards 0.236033 min
Max cumulative flush time across primary shards 0.272483 min
Total Young Gen GC time 1.487 s
Total Young Gen GC count 107
Total Old Gen GC time 0 s
Total Old Gen GC count 0
Store size 2.90879 GB
Translog size 0.000000358559 GB
Heap used for segments 0.594326 MB
Heap used for doc values 0.0189705 MB
Heap used for terms 0.469574 MB
Heap used for norms 0.0643311 MB
Heap used for points 0 MB
Heap used for stored fields 0.0414505 MB
Segment count 81
Min Throughput index-append 81865.6 docs/s
Mean Throughput index-append 82936.1 docs/s
Median Throughput index-append 83063.4 docs/s
Max Throughput index-append 83666.4 docs/s
50th percentile latency index-append 362.276 ms
90th percentile latency index-append 403.487 ms
99th percentile latency index-append 448.818 ms
100th percentile latency index-append 473.381 ms
50th percentile service time index-append 362.276 ms
90th percentile service time index-append 403.487 ms
99th percentile service time index-append 448.818 ms
100th percentile service time index-append 473.381 ms
error rate index-append 0 %
100th percentile latency refresh-after-index 10867.8 ms
100th percentile service time refresh-after-index 10867.8 ms
error rate refresh-after-index 100 %
Min Throughput wait-until-merges-finish 36.12 ops/s
Mean Throughput wait-until-merges-finish 36.12 ops/s
Median Throughput wait-until-merges-finish 36.12 ops/s
Max Throughput wait-until-merges-finish 36.12 ops/s
100th percentile latency wait-until-merges-finish 27.2992 ms
100th percentile service time wait-until-merges-finish 27.2992 ms
error rate wait-until-merges-finish 0 %
Min Throughput index-stats 89.86 ops/s
Mean Throughput index-stats 89.92 ops/s
Median Throughput index-stats 89.92 ops/s
Max Throughput index-stats 89.95 ops/s
50th percentile latency index-stats 5.2308 ms
90th percentile latency index-stats 5.67666 ms
99th percentile latency index-stats 6.10312 ms
99.9th percentile latency index-stats 12.4938 ms
100th percentile latency index-stats 18.4893 ms
50th percentile service time index-stats 4.51572 ms
90th percentile service time index-stats 4.75969 ms
99th percentile service time index-stats 5.15814 ms
99.9th percentile service time index-stats 5.47606 ms
100th percentile service time index-stats 5.59424 ms
error rate index-stats 0 %
Min Throughput node-stats 89.61 ops/s
Mean Throughput node-stats 89.85 ops/s
Median Throughput node-stats 89.89 ops/s
Max Throughput node-stats 89.93 ops/s
50th percentile latency node-stats 5.28912 ms
90th percentile latency node-stats 5.7878 ms
99th percentile latency node-stats 7.80665 ms
99.9th percentile latency node-stats 9.05109 ms
100th percentile latency node-stats 12.6956 ms
50th percentile service time node-stats 4.56531 ms
90th percentile service time node-stats 4.86757 ms
99th percentile service time node-stats 7.09032 ms
99.9th percentile service time node-stats 8.37496 ms
100th percentile service time node-stats 11.7565 ms
error rate node-stats 0 %
Min Throughput default 49.25 ops/s
Mean Throughput default 49.56 ops/s
Median Throughput default 49.6 ops/s
Max Throughput default 49.73 ops/s
50th percentile latency default 11.6869 ms
90th percentile latency default 13.3666 ms
99th percentile latency default 14.3299 ms
99.9th percentile latency default 14.7419 ms
100th percentile latency default 15.8083 ms
50th percentile service time default 11.1059 ms
90th percentile service time default 12.2242 ms
99th percentile service time default 12.8185 ms
99.9th percentile service time default 13.4524 ms
100th percentile service time default 14.705 ms
error rate default 0 %
Min Throughput term 87.75 ops/s
Mean Throughput term 88.26 ops/s
Median Throughput term 88.25 ops/s
Max Throughput term 88.78 ops/s
50th percentile latency term 1315.74 ms
90th percentile latency term 1765.02 ms
99th percentile latency term 1851.59 ms
99.9th percentile latency term 1859.7 ms
100th percentile latency term 1861.49 ms
50th percentile service time term 11.4914 ms
90th percentile service time term 12.2079 ms
99th percentile service time term 12.6647 ms
99.9th percentile service time term 12.9717 ms
100th percentile service time term 19.0611 ms
error rate term 0 %
Min Throughput phrase 87.15 ops/s
Mean Throughput phrase 87.48 ops/s
Median Throughput phrase 87.51 ops/s
Max Throughput phrase 87.79 ops/s
50th percentile latency phrase 2326.27 ms
90th percentile latency phrase 3211.9 ms
99th percentile latency phrase 3407.36 ms
99.9th percentile latency phrase 3426.07 ms
100th percentile latency phrase 3426.67 ms
50th percentile service time phrase 11.3894 ms
90th percentile service time phrase 12.4448 ms
99th percentile service time phrase 13.2777 ms
99.9th percentile service time phrase 16.0467 ms
100th percentile service time phrase 27.9762 ms
error rate phrase 0 %
Min Throughput country_agg_uncached 3.58 ops/s
Mean Throughput country_agg_uncached 3.59 ops/s
Median Throughput country_agg_uncached 3.59 ops/s
Max Throughput country_agg_uncached 3.59 ops/s
50th percentile latency country_agg_uncached 233.557 ms
90th percentile latency country_agg_uncached 235.408 ms
99th percentile latency country_agg_uncached 237.622 ms
100th percentile latency country_agg_uncached 243.659 ms
50th percentile service time country_agg_uncached 232.712 ms
90th percentile service time country_agg_uncached 234.323 ms
99th percentile service time country_agg_uncached 236.966 ms
100th percentile service time country_agg_uncached 242.573 ms
error rate country_agg_uncached 0 %
Min Throughput country_agg_cached 97.91 ops/s
Mean Throughput country_agg_cached 98.47 ops/s
Median Throughput country_agg_cached 98.52 ops/s
Max Throughput country_agg_cached 98.86 ops/s
50th percentile latency country_agg_cached 10.0041 ms
90th percentile latency country_agg_cached 10.9815 ms
99th percentile latency country_agg_cached 14.0725 ms
99.9th percentile latency country_agg_cached 16.1659 ms
100th percentile latency country_agg_cached 16.6761 ms
50th percentile service time country_agg_cached 9.48344 ms
90th percentile service time country_agg_cached 10.1225 ms
99th percentile service time country_agg_cached 10.8447 ms
99.9th percentile service time country_agg_cached 12.8469 ms
100th percentile service time country_agg_cached 15.866 ms
error rate country_agg_cached 0 %
Min Throughput scroll 20.03 pages/s
Mean Throughput scroll 20.04 pages/s
Median Throughput scroll 20.04 pages/s
Max Throughput scroll 20.05 pages/s
50th percentile latency scroll 531.602 ms
90th percentile latency scroll 538.218 ms
99th percentile latency scroll 549.317 ms
100th percentile latency scroll 562.911 ms
50th percentile service time scroll 529.465 ms
90th percentile service time scroll 536.145 ms
99th percentile service time scroll 547.158 ms
100th percentile service time scroll 560.738 ms
error rate scroll 0 %
Min Throughput expression 1.5 ops/s
Mean Throughput expression 1.5 ops/s
Median Throughput expression 1.5 ops/s
Max Throughput expression 1.5 ops/s
50th percentile latency expression 416.483 ms
90th percentile latency expression 418.424 ms
99th percentile latency expression 421.599 ms
100th percentile latency expression 424.739 ms
50th percentile service time expression 415.42 ms
90th percentile service time expression 417.281 ms
99th percentile service time expression 420.517 ms
100th percentile service time expression 423.541 ms
error rate expression 0 %
Min Throughput painless_static 1.5 ops/s
Mean Throughput painless_static 1.5 ops/s
Median Throughput painless_static 1.5 ops/s
Max Throughput painless_static 1.5 ops/s
50th percentile latency painless_static 502.002 ms
90th percentile latency painless_static 504.741 ms
99th percentile latency painless_static 512.938 ms
100th percentile latency painless_static 519.533 ms
50th percentile service time painless_static 501.212 ms
90th percentile service time painless_static 503.798 ms
99th percentile service time painless_static 512.416 ms
100th percentile service time painless_static 518.387 ms
error rate painless_static 0 %
Min Throughput painless_dynamic 1.5 ops/s
Mean Throughput painless_dynamic 1.5 ops/s
Median Throughput painless_dynamic 1.5 ops/s
Max Throughput painless_dynamic 1.5 ops/s
50th percentile latency painless_dynamic 478.191 ms
90th percentile latency painless_dynamic 479.996 ms
99th percentile latency painless_dynamic 484.324 ms
100th percentile latency painless_dynamic 491.608 ms
50th percentile service time painless_dynamic 477.305 ms
90th percentile service time painless_dynamic 478.799 ms
99th percentile service time painless_dynamic 482.913 ms
100th percentile service time painless_dynamic 490.957 ms
error rate painless_dynamic 0 %
Min Throughput decay_geo_gauss_function_score 1 ops/s
Mean Throughput decay_geo_gauss_function_score 1 ops/s
Median Throughput decay_geo_gauss_function_score 1 ops/s
Max Throughput decay_geo_gauss_function_score 1 ops/s
50th percentile latency decay_geo_gauss_function_score 455.099 ms
90th percentile latency decay_geo_gauss_function_score 456.336 ms
99th percentile latency decay_geo_gauss_function_score 456.901 ms
100th percentile latency decay_geo_gauss_function_score 459.396 ms
50th percentile service time decay_geo_gauss_function_score 453.792 ms
90th percentile service time decay_geo_gauss_function_score 454.763 ms
99th percentile service time decay_geo_gauss_function_score 455.521 ms
100th percentile service time decay_geo_gauss_function_score 458.388 ms
error rate decay_geo_gauss_function_score 0 %
Min Throughput decay_geo_gauss_script_score 1 ops/s
Mean Throughput decay_geo_gauss_script_score 1 ops/s
Median Throughput decay_geo_gauss_script_score 1 ops/s
Max Throughput decay_geo_gauss_script_score 1 ops/s
50th percentile latency decay_geo_gauss_script_score 454.019 ms
90th percentile latency decay_geo_gauss_script_score 455.213 ms
99th percentile latency decay_geo_gauss_script_score 458.038 ms
100th percentile latency decay_geo_gauss_script_score 461.409 ms
50th percentile service time decay_geo_gauss_script_score 452.711 ms
90th percentile service time decay_geo_gauss_script_score 453.708 ms
99th percentile service time decay_geo_gauss_script_score 456.396 ms
100th percentile service time decay_geo_gauss_script_score 459.691 ms
error rate decay_geo_gauss_script_score 0 %
Min Throughput field_value_function_score 1.5 ops/s
Mean Throughput field_value_function_score 1.5 ops/s
Median Throughput field_value_function_score 1.5 ops/s
Max Throughput field_value_function_score 1.5 ops/s
50th percentile latency field_value_function_score 174.385 ms
90th percentile latency field_value_function_score 176.118 ms
99th percentile latency field_value_function_score 177.909 ms
100th percentile latency field_value_function_score 178.351 ms
50th percentile service time field_value_function_score 173.177 ms
90th percentile service time field_value_function_score 174.626 ms
99th percentile service time field_value_function_score 176.32 ms
100th percentile service time field_value_function_score 176.706 ms
error rate field_value_function_score 0 %
Min Throughput field_value_script_score 1.5 ops/s
Mean Throughput field_value_script_score 1.5 ops/s
Median Throughput field_value_script_score 1.5 ops/s
Max Throughput field_value_script_score 1.5 ops/s
50th percentile latency field_value_script_score 220.912 ms
90th percentile latency field_value_script_score 221.965 ms
99th percentile latency field_value_script_score 223.945 ms
100th percentile latency field_value_script_score 226.025 ms
50th percentile service time field_value_script_score 219.467 ms
90th percentile service time field_value_script_score 220.774 ms
99th percentile service time field_value_script_score 222.286 ms
100th percentile service time field_value_script_score 225.226 ms
error rate field_value_script_score 0 %
Min Throughput large_terms 1.1 ops/s
Mean Throughput large_terms 1.1 ops/s
Median Throughput large_terms 1.1 ops/s
Max Throughput large_terms 1.1 ops/s
50th percentile latency large_terms 606.171 ms
90th percentile latency large_terms 608.44 ms
99th percentile latency large_terms 611.637 ms
100th percentile latency large_terms 619.257 ms
50th percentile service time large_terms 598.301 ms
90th percentile service time large_terms 600.373 ms
99th percentile service time large_terms 603.796 ms
100th percentile service time large_terms 611.463 ms
error rate large_terms 0 %
Min Throughput large_filtered_terms 1.1 ops/s
Mean Throughput large_filtered_terms 1.1 ops/s
Median Throughput large_filtered_terms 1.1 ops/s
Max Throughput large_filtered_terms 1.1 ops/s
50th percentile latency large_filtered_terms 609.564 ms
90th percentile latency large_filtered_terms 612.481 ms
99th percentile latency large_filtered_terms 616.127 ms
100th percentile latency large_filtered_terms 619.682 ms
50th percentile service time large_filtered_terms 602.022 ms
90th percentile service time large_filtered_terms 604.688 ms
99th percentile service time large_filtered_terms 608.773 ms
100th percentile service time large_filtered_terms 612.379 ms
error rate large_filtered_terms 0 %
Min Throughput large_prohibited_terms 1.1 ops/s
Mean Throughput large_prohibited_terms 1.1 ops/s
Median Throughput large_prohibited_terms 1.1 ops/s
Max Throughput large_prohibited_terms 1.1 ops/s
50th percentile latency large_prohibited_terms 607.44 ms
90th percentile latency large_prohibited_terms 609.881 ms
99th percentile latency large_prohibited_terms 612.648 ms
100th percentile latency large_prohibited_terms 616.866 ms
50th percentile service time large_prohibited_terms 599.814 ms
90th percentile service time large_prohibited_terms 602.215 ms
99th percentile service time large_prohibited_terms 604.533 ms
100th percentile service time large_prohibited_terms 609.535 ms
error rate large_prohibited_terms 0 %
Min Throughput desc_sort_population 1.5 ops/s
Mean Throughput desc_sort_population 1.51 ops/s
Median Throughput desc_sort_population 1.51 ops/s
Max Throughput desc_sort_population 1.51 ops/s
50th percentile latency desc_sort_population 17.1126 ms
90th percentile latency desc_sort_population 17.931 ms
99th percentile latency desc_sort_population 18.8698 ms
100th percentile latency desc_sort_population 19.4662 ms
50th percentile service time desc_sort_population 15.8801 ms
90th percentile service time desc_sort_population 16.1396 ms
99th percentile service time desc_sort_population 16.9675 ms
100th percentile service time desc_sort_population 17.8969 ms
error rate desc_sort_population 0 %
Min Throughput asc_sort_population 1.5 ops/s
Mean Throughput asc_sort_population 1.51 ops/s
Median Throughput asc_sort_population 1.51 ops/s
Max Throughput asc_sort_population 1.51 ops/s
50th percentile latency asc_sort_population 15.3997 ms
90th percentile latency asc_sort_population 16.4551 ms
99th percentile latency asc_sort_population 16.7365 ms
100th percentile latency asc_sort_population 17.722 ms
50th percentile service time asc_sort_population 14.3106 ms
90th percentile service time asc_sort_population 14.6205 ms
99th percentile service time asc_sort_population 15.3431 ms
100th percentile service time asc_sort_population 16.1823 ms
error rate asc_sort_population 0 %
Min Throughput asc_sort_with_after_population 1.5 ops/s
Mean Throughput asc_sort_with_after_population 1.51 ops/s
Median Throughput asc_sort_with_after_population 1.51 ops/s
Max Throughput asc_sort_with_after_population 1.51 ops/s
50th percentile latency asc_sort_with_after_population 18.7667 ms
90th percentile latency asc_sort_with_after_population 19.6079 ms
99th percentile latency asc_sort_with_after_population 19.9175 ms
100th percentile latency asc_sort_with_after_population 22.507 ms
50th percentile service time asc_sort_with_after_population 17.491 ms
90th percentile service time asc_sort_with_after_population 17.8833 ms
99th percentile service time asc_sort_with_after_population 18.4936 ms
100th percentile service time asc_sort_with_after_population 20.5504 ms
error rate asc_sort_with_after_population 0 %
Min Throughput desc_sort_geonameid 6.02 ops/s
Mean Throughput desc_sort_geonameid 6.02 ops/s
Median Throughput desc_sort_geonameid 6.02 ops/s
Max Throughput desc_sort_geonameid 6.03 ops/s
50th percentile latency desc_sort_geonameid 16.0802 ms
90th percentile latency desc_sort_geonameid 17.4 ms
99th percentile latency desc_sort_geonameid 17.7661 ms
100th percentile latency desc_sort_geonameid 17.8499 ms
50th percentile service time desc_sort_geonameid 15.0023 ms
90th percentile service time desc_sort_geonameid 16.2232 ms
99th percentile service time desc_sort_geonameid 16.6905 ms
100th percentile service time desc_sort_geonameid 16.8908 ms
error rate desc_sort_geonameid 0 %
Min Throughput desc_sort_with_after_geonameid 5.99 ops/s
Mean Throughput desc_sort_with_after_geonameid 5.99 ops/s
Median Throughput desc_sort_with_after_geonameid 5.99 ops/s
Max Throughput desc_sort_with_after_geonameid 6 ops/s
50th percentile latency desc_sort_with_after_geonameid 52.3744 ms
90th percentile latency desc_sort_with_after_geonameid 53.2922 ms
99th percentile latency desc_sort_with_after_geonameid 53.8648 ms
100th percentile latency desc_sort_with_after_geonameid 54.5537 ms
50th percentile service time desc_sort_with_after_geonameid 51.7369 ms
90th percentile service time desc_sort_with_after_geonameid 52.2084 ms
99th percentile service time desc_sort_with_after_geonameid 52.806 ms
100th percentile service time desc_sort_with_after_geonameid 53.7037 ms
error rate desc_sort_with_after_geonameid 0 %
Min Throughput asc_sort_geonameid 6.02 ops/s
Mean Throughput asc_sort_geonameid 6.02 ops/s
Median Throughput asc_sort_geonameid 6.02 ops/s
Max Throughput asc_sort_geonameid 6.03 ops/s
50th percentile latency asc_sort_geonameid 14.9471 ms
90th percentile latency asc_sort_geonameid 16.5713 ms
99th percentile latency asc_sort_geonameid 16.995 ms
100th percentile latency asc_sort_geonameid 17.0488 ms
50th percentile service time asc_sort_geonameid 14.2477 ms
90th percentile service time asc_sort_geonameid 15.3667 ms
99th percentile service time asc_sort_geonameid 15.7857 ms
100th percentile service time asc_sort_geonameid 15.9123 ms
error rate asc_sort_geonameid 0 %
Min Throughput asc_sort_with_after_geonameid 6.02 ops/s
Mean Throughput asc_sort_with_after_geonameid 6.02 ops/s
Median Throughput asc_sort_with_after_geonameid 6.02 ops/s
Max Throughput asc_sort_with_after_geonameid 6.03 ops/s
50th percentile latency asc_sort_with_after_geonameid 16.4933 ms
90th percentile latency asc_sort_with_after_geonameid 17.7143 ms
99th percentile latency asc_sort_with_after_geonameid 18.4687 ms
100th percentile latency asc_sort_with_after_geonameid 19.438 ms
50th percentile service time asc_sort_with_after_geonameid 15.487 ms
90th percentile service time asc_sort_with_after_geonameid 16.518 ms
99th percentile service time asc_sort_with_after_geonameid 17.4808 ms
100th percentile service time asc_sort_with_after_geonameid 18.7512 ms
error rate asc_sort_with_after_geonameid 0 %
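
One rough way to skim a report like the one above is to parse the per-task rows and flag the tasks that look unusual. The sketch below is only an illustration, not an OpenSearch Benchmark feature: the helper names `parse_report` and `flag_tasks` and the `ratio=2.0` threshold are assumptions made up for this example. It flags tasks with a non-zero error rate, and tasks whose median latency is far above their median service time. As far as I understand, latency includes the time a request waits before being sent, so a large gap usually means the target throughput is higher than the cluster can sustain. The sample rows are copied from the results above.

```python
import re
from collections import defaultdict

# Matches per-task rows such as "50th percentile latency term 1315.74 ms".
# Cluster-wide rows without a task name (e.g. "Segment count 81") are skipped.
ROW = re.compile(
    r"^(?P<metric>(?:Min|Mean|Median|Max) Throughput"
    r"|[\d.]+th percentile (?:latency|service time)"
    r"|error rate) "
    r"(?P<task>\S+) (?P<value>[\d.]+) (?P<unit>\S+)$"
)


def parse_report(text):
    """Return {task: {metric name: value}} for every per-task row."""
    tasks = defaultdict(dict)
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            tasks[m["task"]][m["metric"]] = float(m["value"])
    return dict(tasks)


def flag_tasks(tasks, ratio=2.0):
    """Flag tasks with errors or with p50 latency > ratio * p50 service time.

    The ratio is an arbitrary threshold chosen for this sketch.
    """
    flags = {}
    for task, metrics in tasks.items():
        reasons = []
        if metrics.get("error rate", 0.0) > 0:
            reasons.append("errors")
        latency = metrics.get("50th percentile latency")
        service = metrics.get("50th percentile service time")
        if latency is not None and service and latency > ratio * service:
            reasons.append("latency >> service time")
        if reasons:
            flags[task] = reasons
    return flags


# A few rows copied from the report above.
SAMPLE = """\
Median Throughput index-append 83063.4 docs/s
50th percentile latency index-append 362.276 ms
50th percentile service time index-append 362.276 ms
error rate index-append 0 %
100th percentile latency refresh-after-index 10867.8 ms
error rate refresh-after-index 100 %
50th percentile latency default 11.6869 ms
50th percentile service time default 11.1059 ms
error rate default 0 %
50th percentile latency term 1315.74 ms
50th percentile service time term 11.4914 ms
error rate term 0 %
"""

flags = flag_tasks(parse_report(SAMPLE))
for task, reasons in flags.items():
    print(task, reasons)
```

On these sample rows the sketch flags refresh-after-index (100 % error rate) and term (median latency of about 1.3 s against a service time of about 11 ms). Those two would be worth investigating before comparing instance types.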

@Jon-AtAWS - would you be able to assist on this? @joan.romero is looking for some guidance on reducing costs on AWS / EC2 instances. Thanks!