[Performance Tuning] Checking the KNN threads

According to the performance tuning guide, the k-NN plugin supports multi-threaded indexing (see Performance tuning - OpenSearch documentation).

We have configured the cluster with an index thread quantity of 3 and set knn to true for the index, as shown below. How can I validate that this index is actually using 3 threads?

{
  "filebeat-reindex-2022.04.18" : {
    "settings" : {
      "index" : {
        "mapping" : {
          "total_fields" : {
            "limit" : "10000"
          }
        },
        "refresh_interval" : "60s",
        "number_of_shards" : "50",
        "provided_name" : "filebeat-reindex-2022.04.18",
        "max_docvalue_fields_search" : "200",
        "query" : {
          "default_field" : [
            "message",
            "msg",
            "sysloghost",
            "severity",
            "programname",
            "request_method",
            "request_path",
            "protocoli",
            "status_int",
            "referer",
            "user_agent",
            "auth_token",
            "client_etag",
            "transaction_id",
            "source",
            "log_info",
            "start_time",
            "end_time",
            "policy_index",
            "account",
            "container",
            "object",
            "wire_status_int",
            "reserve1",
            "reserve2",
            "additional_info",
            "host.name",
            "account_keyword",
            "container_keyword",
            "object_keyword",
            "fields.*"
          ]
        },
        "knn" : "true",
        "creation_date" : "1650362701743",
        "number_of_replicas" : "1",
        "uuid" : "GQk-py0DRdqT3TAYpONTrA",
        "version" : {
          "created" : "135247927"
        }
      }
    }
  }
}
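As a first sanity check, it may help to confirm which thread quantity the cluster actually has in effect. The sketch below assumes the setting in question is `knn.algo_param.index_thread_qty` and that the response was fetched with `GET _cluster/settings?include_defaults=true&flat_settings=true`. The helper name and the sample payload are illustrative, not output from this cluster. Note that this only confirms the configured value, not that the threads are being used.

```python
import json
from typing import Optional

# Assumed name of the k-NN indexing thread setting.
SETTING = "knn.algo_param.index_thread_qty"


def effective_thread_qty(cluster_settings: dict) -> Optional[str]:
    """Return the value in effect: transient beats persistent beats defaults."""
    for scope in ("transient", "persistent", "defaults"):
        value = cluster_settings.get(scope, {}).get(SETTING)
        if value is not None:
            return value
    return None


# Illustrative payload shaped like a flat_settings cluster settings response.
sample = json.loads(
    '{"persistent": {"knn.algo_param.index_thread_qty": "3"},'
    ' "transient": {},'
    ' "defaults": {"knn.algo_param.index_thread_qty": "1"}}'
)
print(effective_thread_qty(sample))
```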

Hi @hugok,

The threads are spawned by the nmslib library, outside the OpenSearch heap.

One way to validate this is to compare indexing latency and CPU utilization. As you increase the number of threads (assuming your instance or machine has multiple cores), you should see indexing time drop and CPU utilization rise.
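A minimal sketch of that comparison, assuming you can wrap your own bulk-indexing run in a Python callable. The helpers `timed` and `speedups` are hypothetical names, and the durations are placeholders rather than real measurements. Index the same documents once per thread setting, record the wall-clock time, and compare.

```python
import time
from typing import Callable, Dict


def timed(run: Callable[[], None]) -> float:
    """Return the wall-clock seconds taken by one indexing run."""
    start = time.perf_counter()
    run()
    return time.perf_counter() - start


def speedups(seconds_by_threads: Dict[int, float]) -> Dict[int, float]:
    """Speedup of each run relative to the run with the fewest threads."""
    baseline = seconds_by_threads[min(seconds_by_threads)]
    return {qty: baseline / secs for qty, secs in sorted(seconds_by_threads.items())}


# Placeholder durations in seconds, keyed by thread quantity (not measured data).
measured = {1: 120.0, 3: 60.0}
for qty, factor in speedups(measured).items():
    print(f"{qty} thread(s): {factor:.2f}x vs baseline")
```

A factor close to 1.0 at the higher thread quantity, together with flat CPU utilization, would suggest the extra threads are not being used.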

Thanks for the suggestion. I’ll measure how long it takes to index the same number of documents with each setting and compare.