Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 3.3.2
Describe the issue:
I built a vector index (Faiss engine, HNSW + BQ, 1024 dimensions per vector) with memory_optimized_search enabled, holding over 10 billion vectors in total. The index settings are as follows:
"settings": {
"index": {
"knn": true,
"knn.memory_optimized_search": true,
"replication.type": "SEGMENT",
"number_of_shards": 232,
"number_of_replicas": 0
}
}
After inserting all the data, I ran a force_merge to reduce the index to one segment per shard (approx. 43 million documents per shard). After that, searching the index with a query vector failed with an "integer overflow" error.
The full stack trace is:
Caused by: org.opensearch.core.common.io.stream.NotSerializableExceptionWrapper: arithmetic_exception: integer overflow
at java.lang.Math.toIntExact(Math.java:1374) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.MonotonicIntegerSequenceEncoder.encode(MonotonicIntegerSequenceEncoder.java:63) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissHNSW.load(FaissHNSW.java:77) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.binary.FaissBinaryHnswIndex.doLoad(FaissBinaryHnswIndex.java:51) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissIndex.load(FaissIndex.java:55) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissIdMapIndex.doLoad(FaissIdMapIndex.java:58) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissIndex.load(FaissIndex.java:55) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissMemoryOptimizedSearcher.<init>(FaissMemoryOptimizedSearcher.java:49) ~[?:?]
at org.opensearch.knn.memoryoptsearch.faiss.FaissMemoryOptimizedSearcherFactory.createVectorSearcher(FaissMemoryOptimizedSearcherFactory.java:39) ~[?:?]
at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsReader.lambda$getVectorSearcherSupplier$0(NativeEngines990KnnVectorsReader.java:372) ~[?:?]
at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsReader.loadMemoryOptimizedSearcherIfRequired(NativeEngines990KnnVectorsReader.java:325) ~[?:?]
at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsReader.trySearchWithMemoryOptimizedSearch(NativeEngines990KnnVectorsReader.java:252) ~[?:?]
at org.opensearch.knn.index.codec.KNN990Codec.NativeEngines990KnnVectorsReader.search(NativeEngines990KnnVectorsReader.java:196) ~[?:?]
at org.apache.lucene.codecs.perfield.PerFieldKnnVectorsFormat$FieldsReader.search(PerFieldKnnVectorsFormat.java:324) ~[lucene-core-10.3.1.jar:10.3.1 51190f35a16d2ce433139abfe0fd8365791b352a - 2025-10-02 09:50:16]
at org.opensearch.knn.index.query.memoryoptsearch.MemoryOptimizedKNNWeight.queryIndex(MemoryOptimizedKNNWeight.java:202) ~[?:?]
at org.opensearch.knn.index.query.memoryoptsearch.MemoryOptimizedKNNWeight.doANNSearch(MemoryOptimizedKNNWeight.java:100) ~[?:?]
at org.opensearch.knn.index.query.KNNWeight.approximateSearch(KNNWeight.java:505) ~[?:?]
at org.opensearch.knn.index.query.KNNWeight.searchLeaf(KNNWeight.java:336) ~[?:?]
at org.opensearch.knn.index.query.nativelib.NativeEngineKnnVectorQuery.searchLeaf(NativeEngineKnnVectorQuery.java:438) ~[?:?]
at org.opensearch.knn.index.query.nativelib.NativeEngineKnnVectorQuery.lambda$doSearch$0(NativeEngineKnnVectorQuery.java:272) ~[?:?]
The failure comes from this line in the MonotonicIntegerSequenceEncoder.encode function, which narrows each long read from the Faiss file to an int (and then widens it back to long), so any value above Integer.MAX_VALUE throws an ArithmeticException:
final long value = Math.toIntExact(input.readLong());
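For context, here is a back-of-envelope illustration of why the shard size plausibly trips this check. This is not the actual encoder code; it assumes the encoded longs are byte offsets into the binary-quantized vector storage (my guess from the class names), and the per-shard figures are the approximate numbers from this report:

```java
public class ToIntExactOverflowDemo {
    public static void main(String[] args) {
        // 1024-dim binary-quantized vectors occupy 1024 / 8 = 128 bytes each.
        final long bytesPerVector = 1024 / 8;
        // Approx. documents per shard after force_merge, from this report.
        final long docsPerShard = 43_000_000L;

        // A cumulative byte offset near the end of such a segment:
        final long offset = docsPerShard * bytesPerVector; // 5_504_000_000

        System.out.println(offset > Integer.MAX_VALUE);    // the offset no longer fits in an int

        try {
            // Same call as the failing encoder line.
            Math.toIntExact(offset);
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage());            // matches the exception in the stack trace
        }
    }
}
```

So once any encoded value in a merged segment exceeds 2^31 - 1 (~2.1 billion), Math.toIntExact throws, regardless of how the value is used afterwards.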
Is there any workaround for this? It appears to be a bug triggered when indexing very large numbers of vectors with memory_optimized_search enabled.