k-NN Queries Are Slow and Not Cached

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch Docker image: opensearchproject/opensearch:2.17.0

Describe the issue:
We have built a vector index, and k-NN queries are quite slow (around 20 seconds per query). We checked the k-NN stats and the graph cache appears to be empty, although “knn_query_with_filter_requests” increments as more queries are executed.
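For reference, our queries look roughly like this sketch (only the “embedding” field name comes from the mapping below; the index name, filter clause, vector values, and k are illustrative placeholders, and the real vector has 768 dimensions):

GET /my-index/_search
{
  "size": 10,
  "query": {
    "knn": {
      "embedding": {
        "vector": [0.12, -0.45, 0.33],
        "k": 10,
        "filter": {
          "term": {
            "status": "active"
          }
        }
      }
    }
  }
}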

Configuration:
docker compose:

services:
  opensearch:
    restart: unless-stopped
    image: opensearchproject/opensearch:2.17.0
    ports:
      - 9200:9200 # REST API
      - 9600:9600 # Performance Analyzer
    environment:
      - discovery.type=single-node
      - cluster.name=opensearch-cluster
      - bootstrap.memory_lock=true # Disable JVM heap memory swapping  
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=XYZ
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    volumes:
      - opensearch-data:/usr/share/opensearch/data

docker compose override:

services:
  opensearch:
    environment:
      OPENSEARCH_JAVA_OPTS: -Xms32g -Xmx32g
    deploy:
      resources:
        limits:
          memory: 34g
          cpus: "20"

Index settings and mapping:

settings:
  default:
    index:
        number_of_shards: 2
        number_of_replicas: 2
        max_result_window: 3000000
        knn: true
    mappings:
      default:
        dynamic: false
        properties:
          "embedding":
            type: "knn_vector"
            dimension: 768
            method:
              engine: lucene
              name: hnsw
              space_type: l2
              parameters:
                ef_construction: 128
                m: 16

OpenSearch has 32 GB of heap within a 34 GB container memory limit. There are around 4 million documents distributed across several indices.

Relevant Logs or Screenshots:
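
Warmup was triggered per index via the warmup endpoint, along these lines (index names are placeholders):

GET /_plugins/_knn/warmup/my-index-1,my-index-2?pretty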

After warmup (k-NN plugin API - OpenSearch Documentation), the stats look like this:

GET /_plugins/_knn/stats?pretty

{
  "_nodes": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "cluster_name": "opensearch-cluster",
  "circuit_breaker_triggered": false,
  "model_index_status": null,
  "nodes": {
    "xyz...": {
      "max_distance_query_with_filter_requests": 0,
      "graph_memory_usage_percentage": 0,
      "graph_query_requests": 0,
      "graph_memory_usage": 0,
      "cache_capacity_reached": false,
      "load_success_count": 0,
      "training_memory_usage": 0,
      "indices_in_cache": {},
      "script_query_errors": 0,
      "hit_count": 0,
      "knn_query_requests": 2174,
      "total_load_time": 0,
      "miss_count": 0,
      "min_score_query_requests": 0,
      "knn_query_with_filter_requests": 819,
      "training_memory_usage_percentage": 0,
      "max_distance_query_requests": 0,
      "lucene_initialized": true,
      "graph_index_requests": 0,
      "faiss_initialized": false,
      "load_exception_count": 0,
      "training_errors": 0,
      "min_score_query_with_filter_requests": 0,
      "eviction_count": 0,
      "nmslib_initialized": false,
      "script_compilations": 0,
      "script_query_requests": 0,
      "graph_stats": {
        "merge": {
          "current": 0,
          "total": 0,
          "total_time_in_millis": 0,
          "current_docs": 0,
          "total_docs": 0,
          "total_size_in_bytes": 0,
          "current_size_in_bytes": 0
        },
        "refresh": {
          "total": 0,
          "total_time_in_millis": 0
        }
      },
      "graph_query_errors": 0,
      "indexing_from_model_degraded": false,
      "graph_index_errors": 0,
      "training_requests": 0,
      "script_compilation_errors": 0
    }
  }
}

If we understand the docs correctly, the cache should not be empty after warmup:

The k-NN plugin builds a native library index of the vectors for each knn-vector field/Lucene segment pair during indexing, which can be used to efficiently find the k-nearest neighbors to a query vector during search. To learn more about Lucene segments, see the Apache Lucene documentation. These native library indexes are loaded into native memory during search and managed by a cache. To learn more about preloading native library indexes into memory, refer to the warmup API. Additionally, you can see which native library indexes are already loaded in memory. To learn more about this, see the stats API section.
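
Based on that, we would expect “indices_in_cache” and “graph_memory_usage” to be populated after warmup. If we read the stats API correctly, those stats can also be requested directly with the stat-filtering form of the endpoint (syntax as we understand it from the docs):

GET /_plugins/_knn/stats/indices_in_cache,graph_memory_usage,hit_count,miss_count?pretty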

I wanted to try the Performance Analyzer plugin, but it is currently broken: [BUG] Performance Analyzer webserver on port 9600 not responding to any API calls (caused by JDK upgrade?) · Issue #545 · opensearch-project/performance-analyzer-rca · GitHub

Thanks for any hints.