I’ve been using a 1024-dim embedding model (bge-m3) with OpenSearch 2.14, and now I want to test OpenAI’s text-embedding-3-small (1536 dims).
I tried to switch by only changing the vector dimension, but the Lucene engine errors out, saying it only supports dimensions ≤ 1024.
I then tried the FAISS engine. I indexed new vectors produced by the OpenAI small model, but all searches return no results, even when I query with a vector that’s exactly the same as one I ingested.
My hunch is this might be because the search algorithm is using L2 while my use case expects cosine similarity.
I attempted to create a new index with space_type = cosinesimil and name = hnsw, but OpenSearch reported that this combination isn’t supported.
Question: Is this limitation due to being on 2.14? If so, what’s the correct vector_field configuration to test 1536-dim vector search on this version?
Configuration (1. mapping options, 2. Python client code):
There were no results when I attempted the vector search.
@ji99999 According to the OpenSearch release notes, cosine similarity for the Faiss engine was introduced in version 2.19.0.
Cosine similarity support in the Faiss engine for k-NN and radial search eliminates the need for manual data normalization, offering benefits for use cases such as recommendation systems, fraud detection, and content-based search applications.
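For reference, on 2.19+ the field mapping can request cosine similarity from Faiss directly. A minimal sketch of such an index body is below; the index and field names (`openai-small`, `embedding`) are placeholders, not anything from your setup:

```python
# Hypothetical knn_vector mapping for OpenSearch 2.19+, where the Faiss
# engine accepts space_type "cosinesimil". Field/index names are examples.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,  # matches text-embedding-3-small
                "method": {
                    "name": "hnsw",
                    "engine": "faiss",
                    "space_type": "cosinesimil",  # supported from 2.19.0
                },
            }
        }
    },
}

# With a running cluster and the opensearch-py client, you would then call:
# client.indices.create(index="openai-small", body=index_body)
```

On 2.14 this exact body is what fails with the "combination isn’t supported" error you saw, because `cosinesimil` + `faiss` only became valid in 2.19.0.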
@ji99999 I think what you want is to use cosine similarity with Faiss in 2.14. Currently there is no native way. You would need to either switch to the nmslib/Lucene engine, or, if you want to keep Faiss, normalize the vectors before ingestion and then use the l2 or innerproduct space type.
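The normalization workaround works because, for unit-length vectors, L2 distance is a monotonic function of cosine similarity (||a − b||² = 2 − 2·cos(a, b)), so ranking by L2 on normalized vectors gives the same order as ranking by cosine. A minimal sketch (the `normalize` helper is illustrative, not part of any OpenSearch API):

```python
# Workaround sketch for OpenSearch 2.14: normalize every vector to unit
# length before ingestion AND before querying, then use space_type "l2"
# (or "innerproduct") with the Faiss engine.
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit L2 norm (zero vectors are returned as-is)."""
    norm = np.linalg.norm(v)
    return v if norm == 0 else v / norm

# Demonstration of the identity that makes this work:
#   ||a - b||^2 = 2 - 2 * cos(a, b)   when ||a|| = ||b|| = 1,
# so a smaller L2 distance always means a higher cosine similarity.
rng = np.random.default_rng(0)
a = normalize(rng.random(1536))
b = normalize(rng.random(1536))
cos = float(np.dot(a, b))
l2_sq = float(np.linalg.norm(a - b) ** 2)
assert np.isclose(l2_sq, 2 - 2 * cos)
```

Remember to apply the same normalization to the query vector at search time; if only the indexed side is normalized, the L2 ordering no longer tracks cosine.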