Inconsistent similarity scores using L2 space type and larger embedding model

alilafzi · October 17, 2024, 7:35pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue:

Hello,

I have been experimenting with two embedding models, one with 384 dimensions and the other with 768, for some vectors that we store in our OpenSearch cluster. The 384-dimension model gives the results I expect, but the results become unexpected when I switch from 384 to 768.

As a test, I am using a text that already exists in the index as my query. Consequently, as you can see in the screenshot below, the first retrieved record has a perfect similarity score of 1.0 for both embedding models, since an exact match is present in the index. However, with the larger embedding model the subsequent scores suddenly drop to very low values such as 0.003. I cannot map these numbers to any meaningful value. For example, I calculated the normalized Euclidean distance between the embedding vector of the second record and that of the query, then applied the formula 1/(1+distance) from the documentation (Exact k-NN with scoring script - OpenSearch Documentation) to get the expected score, which came out to 0.96.
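For reference, a manual check of this kind can be sketched roughly as below. This is a minimal illustration and not the exact code I ran: the helper names (`l2_distance`, `unit`, `score_from_distance`) and the two-dimensional toy vectors are made up for the example. It also rests on an assumption worth verifying against the docs, namely that for the l2 space type the distance plugged into 1/(1+distance) may be the squared Euclidean distance on the stored (not normalized) vectors, so the sketch prints every combination.

```python
import math


def l2_distance(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def unit(v):
    """Scale a vector to unit length (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def score_from_distance(distance):
    """The 1 / (1 + distance) formula quoted in the post."""
    return 1.0 / (1.0 + distance)


# Toy stand-ins for the query embedding and the second hit's embedding.
query = [3.0, 4.0]
doc = [8.0, 6.0]

raw = l2_distance(query, doc)
normalized = l2_distance(unit(query), unit(doc))

# Compare the candidate conventions side by side.
candidates = {
    "raw vectors, plain distance": raw,
    "raw vectors, squared distance (assumed convention)": raw ** 2,
    "normalized vectors, plain distance": normalized,
    "normalized vectors, squared distance": normalized ** 2,
}
for label, d in candidates.items():
    print(f"{label}: distance={d:.4f} score={score_from_distance(d):.4f}")
```

Running this on the real query and second-hit embeddings should show which convention, if any, reproduces the score the cluster returns. An exact match scores 1.0 under all of them, so the first hit cannot distinguish between them.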

I would greatly appreciate it if someone could share any insight on this.

Thanks,

Configuration:

Relevant Logs or Screenshots:
