Creating index for only exact KNN search

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 3.3

Describe the issue:

I want to create an index that will only be used for exact KNN search. While I plan to index millions of documents, at any given time I’ll only be searching over a subset of 5,000–10,000 documents (pre-filtering). Given this access pattern, I believe exact KNN search is sufficient — and by avoiding the overhead of an HNSW graph, I expected to reduce both index size and cost.

To test this, I created an index with the knn setting explicitly set to false:

PUT /test-exact-knn-index
{
  "settings": {
    "index": {
      "knn": false,
      "number_of_shards": 5,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword",
        "doc_values": false
      },
      "text_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "doc_values": false
      },
      "text": {
        "type": "text"
      }
    }
  }
}

Contrary to my expectation, the resulting index is larger than before. All the values below exclude replica size.

  • Size of index with knn setting true and in_memory mode: 164GB
  • Size of index with knn setting true and on_disk mode (with 32x compression): 105GB
  • Size of index with knn setting as false: 222GB

I initially noticed that the doc_values setting for the text_embedding field was true and thought that was why the index size went up. I then tried to explicitly set it to false, as shown above, but it still shows as true after the index is created.

Here are the index settings:

{
  "test-exact-knn-index": {
    "aliases": {},
    "mappings": {
      "properties": {
        "id": {
          "type": "keyword",
          "doc_values": false
        },
        "text": {
          "type": "text"
        },
        "text_embedding": {
          "type": "knn_vector",
          "doc_values": true,
          "dimension": 768
        }
      }
    },
    "settings": {
      "index": {
        "replication": {
          "type": "DOCUMENT"
        },
        "number_of_shards": "5",
        "provided_name": "test-exact-knn-index",
        "knn": "false",
        "creation_date": "1771958779417",
        "number_of_replicas": "1",
        "uuid": "ARKqYReGRgsEGZcTNhQfEg",
        "version": {
          "created": "137247498"
        }
      }
    }
  }
}

Questions:

  • Why does the index size increase when knn is set to false? Shouldn’t removing the HNSW graph reduce storage overhead?

  • Is this the correct approach for configuring an index for only exact KNN search?

  • Why is doc_values being forced to true on the knn_vector field even when explicitly set to false?

  • I also observed that enabling on_disk mode (with 32x compression) reduces the index size from 164 GB to 105 GB, with all other settings being the same. My understanding of on_disk mode is that it stores quantized vectors in memory and full-precision vectors on disk. If that’s the case, I would expect the total storage footprint to be slightly larger than in_memory mode, since it now stores both the quantized and full-precision copies. Why does on_disk mode result in a smaller index size compared to in_memory mode? Am I misunderstanding how on_disk mode manages vector storage?

@qpurtmin Thank you for the question, I’ll try to address these one at a time:

  1. Why does index size INCREASE when knn: false?
    When index.knn: false, OpenSearch uses FlatVectorFieldMapper.java, which stores vectors as binary doc values (full-precision float32 for every document):
this.fieldType.setDocValuesType(DocValuesType.BINARY);

With HNSW (knn: true), the graph stores compressed connectivity info plus quantized vectors. With knn: false, every vector is stored at full precision in doc values. For 768-dim float32 vectors that is ~3 KB/doc × millions of docs, which easily exceeds the HNSW graph size.
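A quick back-of-envelope sketch of that full-precision footprint (the 50 M document count is a hypothetical for illustration, and this ignores Lucene's own overhead and compression):

```python
# Rough storage estimate for full-precision float32 doc values.
DIM = 768            # dimension from the mapping above
BYTES_PER_FLOAT = 4  # float32

bytes_per_doc = DIM * BYTES_PER_FLOAT  # 3072 bytes, ~3 KB per document

def raw_vector_gb(num_docs: int) -> float:
    """Raw vector bytes only; real segments add Lucene overhead on top."""
    return num_docs * bytes_per_doc / 1024**3

print(bytes_per_doc)                        # 3072
print(round(raw_vector_gb(50_000_000), 1))  # 143.1 GB of raw vectors for 50M docs
```

At that scale the raw vectors alone dominate the index, which is why a second full-precision copy in doc values moves the total so much.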

  2. Why is doc_values: false ignored on knn_vector fields?
    In KNNVectorFieldMapper.java, doc_values is forced to true for indices created on OpenSearch ≥ 3.0.0 when FlatVectorFieldMapper is used:
if (indexCreatedVersion.onOrAfter(Version.V_3_0_0) && hasDocValues.isConfigured() == false) {
    hasDocValues = Parameter.docValuesParam(m -> toType(m).hasDocValues, true);
}

The FlatVectorFieldMapper needs doc values because without an HNSW graph, the only way to retrieve and compare vectors is from doc values storage.

  3. Why does on_disk 32x compression only go from 164 GB to 105 GB (not ~5 GB)?
    on_disk mode stores both quantized and full-precision vectors for rescoring. From CompressionLevel.java:
x32(32, "32x", new RescoreContext(3.0f, false, true), Set.of(Mode.ON_DISK)),

In RescoreContext(3.0f, false, true), the final true flag enables automatic rescoring. This means:

  • ~5 GB of quantized vectors for HNSW traversal
  • ~100 GB of full-precision vectors kept for rescoring (accuracy improvement)
  • Total ≈ 105 GB

The rescore step re-ranks candidates from the quantized HNSW search using full-precision distances, which is why full-precision data must still be stored on disk.
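As a sanity check on those numbers (a rough sketch: the ~100 GB raw-vector share is an assumption, and real segment sizes include other Lucene structures):

```python
# Reconciling the observed on_disk total with the 32x compression level.
in_memory_gb = 164.0              # observed index size, in_memory mode
quantized_gb = in_memory_gb / 32  # 32x-compressed copy used for HNSW traversal
full_precision_gb = 100.0         # assumed raw-vector share kept for rescoring

on_disk_total_gb = quantized_gb + full_precision_gb
print(round(on_disk_total_gb, 1))  # 105.1 -- close to the observed 105 GB
```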

To achieve what you are looking for, you could use the automatic fallback: set index.knn.advanced.filtered_exact_search_threshold to a large value so that filtered queries always fall back to brute-force exact search via ExactSearcher.java. See the following example:

Create index:

PUT /my-exact-knn-index
{
  "settings": {
    "index.knn": true,
    "index.knn.advanced.filtered_exact_search_threshold": 2147483647
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 128,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2"
        }
      },
      "category": {
        "type": "keyword"
      }
    }
  }
}

Index documents:

POST /my-exact-knn-index/_bulk
{ "index": { "_id": "1" } }
{ "my_vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2], "category": "A" }
{ "index": { "_id": "2" } }
{ "my_vector": [0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1, 0.8, 0.3, 0.6, 0.2, 0.9, 0.4, 0.7, 0.5, 0.1], "category": "B" }
{ "index": { "_id": "3" } }
{ "my_vector": [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.9, 0.8], "category": "A" }
{ "index": { "_id": "4" } }
{ "my_vector": [0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.7, 0.8, 0.9, 0.4, 0.3], "category": "A" }

Query with filter:

GET /my-exact-knn-index/_search
{
  "query": {
    "knn": {
      "my_vector": {
        "vector": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.1, 0.2],
        "k": 3,
        "filter": {
          "term": {
            "category": "A"
          }
        }
      }
    }
  }
}

The key tradeoff: setting the threshold to Integer.MAX_VALUE means every filtered query is exact, which is great for accuracy, but query latency scales linearly with the size of your filtered subset rather than with the roughly O(log n) HNSW traversal.
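For intuition on that linear cost, here is a toy pure-Python version of what the exact fallback does conceptually: apply the filter first, then score every surviving document. The names and structure are illustrative only, not the plugin's actual code:

```python
import heapq
import math

def l2_distance(a, b):
    """Euclidean (l2) distance, matching the space_type in the mapping above."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_knn(query, docs, k, predicate):
    """docs: (doc_id, vector, metadata) tuples; predicate is the filter."""
    candidates = ((doc_id, vec) for doc_id, vec, meta in docs if predicate(meta))
    # Linear scan over the filtered subset -- cost grows with subset size.
    return heapq.nsmallest(k, ((l2_distance(query, vec), doc_id)
                               for doc_id, vec in candidates))

docs = [
    ("1", [0.1, 0.2, 0.3], {"category": "A"}),
    ("2", [0.9, 0.8, 0.7], {"category": "B"}),  # filtered out, never scored
    ("3", [0.2, 0.2, 0.2], {"category": "A"}),
]
hits = exact_knn([0.1, 0.2, 0.3], docs, k=2,
                 predicate=lambda meta: meta["category"] == "A")
print([doc_id for _, doc_id in hits])  # ['1', '3']
```

With a pre-filtered subset of 5,000–10,000 documents, that scan stays cheap even though it is exact, which is exactly the access pattern you described.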

Hope this helps