Getting different results in vector scores?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
using latest docker image: opensearchproject/opensearch:latest

Describe the issue:
We’re testing vector search and so far we’re not happy with the results, so we’re wondering what we have done wrong.

I have a test document with the following content:

“Ebola Virus Disease (EVD) and encourage U.S. hospitals to prepare for managing patients with\r\nEbola and other infectious diseases. Every hospital should ensure that it can detect a patient with\r\nEbola, protect healthcare workers so they can safely care for the patient, and respond in a coordinated fashion. Many of the signs and symptoms of Ebola are non-specific and similar to those of many common”

and use the OpenAI embedding model “text-embedding-3-large” to generate the embedding vectors.
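For context, here is roughly how such an embedding can be fetched; this is a minimal C# sketch against the standard OpenAI /v1/embeddings REST endpoint (the GetEmbeddingAsync helper is illustrative, not from any SDK):

using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

async Task<float[]> GetEmbeddingAsync(string text, string apiKey)
{
    using var http = new HttpClient();
    http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

    // Request body for the embeddings endpoint; text-embedding-3-large returns 3072 dimensions.
    var body = JsonSerializer.Serialize(new { model = "text-embedding-3-large", input = text });
    var response = await http.PostAsync(
        "https://api.openai.com/v1/embeddings",
        new StringContent(body, Encoding.UTF8, "application/json"));
    response.EnsureSuccessStatusCode();

    // The vector is at data[0].embedding in the JSON response.
    using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
    return json.RootElement.GetProperty("data")[0].GetProperty("embedding")
        .EnumerateArray().Select(e => e.GetSingle()).ToArray();
}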

The problem is that if I search for an unrelated word, I get a score very similar to that of a word that actually exists in the document:

“dog Rex”: score 0.5197706
“virus”: score 0.5711079

When I compute cosine similarity manually in C# code, I get quite different values:

“dog Rex”: score 0.07607428956253202
“virus”: score 0.24901766327076158

While not perfect, that’s roughly a 3x difference, compared to a ~10% difference in OpenSearch.

My manual cosine similarity function looks like this:

double CalculateCosineSimilarity(float[] vector1, float[] vector2)
{
    if (vector1.Length != vector2.Length)
    {
        throw new ArgumentException("Vectors must be of equal length.");
    }

    // Accumulate in double to avoid float round-off on long (3072-dim) vectors.
    double dotProduct = vector1.Zip(vector2, (a, b) => (double)a * b).Sum();
    double magnitude1 = Math.Sqrt(vector1.Sum(a => (double)a * a));
    double magnitude2 = Math.Sqrt(vector2.Sum(b => (double)b * b));

    return dotProduct / (magnitude1 * magnitude2);
}
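As a quick sanity check, CalculateCosineSimilarity(new float[] { 0, 1 }, new float[] { 0, -1 }) returns -1, as expected for opposite vectors.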

Configuration:

I’ve created the index as follows (initially tried with ef_construction 128, later increased it to 500):

PUT /my-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      },
      "content_vector": {
        "type": "knn_vector",
        "dimension": 3072,
        "method": {
          "name": "hnsw",
          "space_type": "cosinesimil",
          "engine": "nmslib",
          "parameters": {
            "ef_construction": 500,
            "m": 16
          }
        }
      }
    }
  }
}

post the document:

POST /my-index/_doc/1
{
  "content": "Ebola Virus Disease (EVD) and encourage U.S. hospitals to prepare for managing patients with\r\nEbola and other infectious diseases. Every hospital should ensure that it can detect a patient with\r\nEbola, protect healthcare workers so they can safely care for the patient, and respond in a coordinated fashion. Many of the signs and symptoms of Ebola are non-specific and similar to those of many common",
  "content_vector": [  ...omitted for brevity...  ]
}

search:

POST /my-index/_search
{
  "size": 10,
  "query": {
    "knn": {
      "content_vector": {
        "vector": [  ...omitted for brevity...  ],
        "k": 10
      }
    }
  }
}

Relevant Logs or Screenshots:

Seems that there’s not much traffic, or the question is too complex 🙂

I simplified the problem to the minimum - I indexed the two simplest vectors, [0, 1] and [0, -1]. If cosine similarity is the cosine of the angle between the vectors, the angle here is 180 degrees, so the cosine should be -1, but I’m getting 0.33:
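The two documents were indexed like this (my-index2 uses the same mapping as above, just with "dimension": 2):

POST /my-index2/_doc/1
{
  "content": "up",
  "content_vector": [0, 1]
}

POST /my-index2/_doc/2
{
  "content": "down",
  "content_vector": [0, -1]
}

and then I search: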

POST /my-index2/_search
{
  "size": 10,
  "query": {
    "knn": {
      "content_vector": {
        "vector": [0, 1],
        "k": 10
      }
    }
  }
}

result:

 "hits": [
      {
        "_index": "my-index2",
        "_id": "1",
        "_score": 0.9999999,
        "_source": {
          "content": "up",
          "content_vector": [
            0,
            1
          ]
        }
      },
      {
        "_index": "my-index2",
        "_id": "2",
        "_score": 0.33333334,
        "_source": {
          "content": "down",
          "content_vector": [
            0,
            -1
          ]
        }
      }
    ]

So either the score is not a cosine similarity, or I’m missing something very obvious.

@dziedrius let me try to answer this.

In OpenSearch, the cosinesimil space type is a distance rather than a similarity: cosine distance = 1 - cos(θ). You can read about this here: Approximate k-NN search - OpenSearch Documentation

Once we have the cosine distance, we then convert it to an OpenSearch score; the same documentation describes how the scores are generated. In your case, the score of 0.33333 is calculated as follows:

Query vector: [0, 1]
Document vector: [0, -1]

cos(θ) = -1 / (1 × 1) = -1

cosine distance = 1 - (-1) = 2

OpenSearch score = 1 / (1 + cosine distance) = 1/3 = 0.33333
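
Here is a minimal C# sketch of that conversion, reusing the CalculateCosineSimilarity function from the original post (OpenSearchCosineScore is just an illustrative helper name):

double OpenSearchCosineScore(float[] query, float[] doc)
{
    double cosine = CalculateCosineSimilarity(query, doc); // cos(theta), in [-1, 1]
    double distance = 1 - cosine;                          // cosinesimil distance, in [0, 2]
    return 1 / (1 + distance);                             // nmslib score, in [1/3, 1]
}

// OpenSearchCosineScore(new float[] { 0, 1 }, new float[] { 0, -1 }) -> 0.33333
// OpenSearchCosineScore(new float[] { 0, 1 }, new float[] { 0, 1 })  -> 1.0
// It also explains your original scores: 1 / (2 - 0.0761) = 0.5198 and
// 1 / (2 - 0.2490) = 0.5711, exactly what OpenSearch returned.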

I hope this clarifies your doubt.

@Navneet - thanks, that explains the resulting scores.

Where I still struggle is the last normalization step. It seems that Lucene’s approach ((2 - d) / 2) would have a better dynamic range, [0, 1], while nmslib’s 1 / (1 + d) only covers [0.333, 1], so I’m wondering why they chose their approach.
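
To see the difference concretely, here is a small sketch applying both normalizations to the cosine distances d = 1 - cos(θ) from my two test queries (Lucene’s formula taken from the same k-NN docs):

double LuceneScore(double d) => (2 - d) / 2; // maps d in [0, 2] onto the full [0, 1] range
double NmslibScore(double d) => 1 / (1 + d); // maps d in [0, 2] onto [1/3, 1]

// "dog Rex": cos = 0.0761, d = 0.9239 -> Lucene 0.5380, nmslib 0.5198
// "virus":   cos = 0.2490, d = 0.7510 -> Lucene 0.6245, nmslib 0.5711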

Why dynamic range could be important: it seems that at least some embedding models produce similarities only in a limited range (there is a discussion about this here: Some questions about text-embedding-ada-002’s embedding - API - OpenAI Developer Forum).