Less Relevant Result is getting higher score in neural search

Agnivesh · August 11, 2024, 3:26pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 2.13

Describe the issue:
I was using neural search and querying Shoes, but Jeans gets a higher score than Women’s Shoes. When I computed the similarity score with the model directly, I got a higher score for Women’s Shoes than for Jeans. The same thing happens with other searches: shorter sentences mostly get a higher score than full sentences.

Configuration:
Model: cohere.embed-multilingual-v3
Embedding fields:

"description_embedding": {
  "type": "knn_vector",
  "dimension": 1024,
  "method": {
    "engine": "lucene",
    "space_type": "l2",
    "name": "hnsw",
    "parameters": {}
  }
}

"name_embedding": {
  "type": "knn_vector",
  "dimension": 1024,
  "method": {
    "engine": "lucene",
    "space_type": "l2",
    "name": "hnsw",
    "parameters": {}
  }
}

I also ran a couple of experiments and found that the results are good with a small data set, but with a larger data set I run into this issue. Is there any way to handle it?
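
One thing that may be worth checking is whether the direct comparison with the model used cosine similarity while the index uses `l2`. The sketch below is only an illustration under assumptions: the 2-d vectors are made up (they are not Cohere embeddings), the helper names are hypothetical, and the score formula `1 / (1 + d)` with `d` the squared L2 distance is the one the k-NN documentation gives for the `l2` space type. It shows that L2 scores and cosine similarity can rank the same documents differently when vectors are not unit-length, and that they agree once the vectors are normalized.

```python
import math

def l2_squared(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def l2_score(a, b):
    # Score as documented for the "l2" space type: 1 / (1 + d),
    # where d is the squared L2 distance.
    return 1.0 / (1.0 + l2_squared(a, b))

def cosine(a, b):
    # Plain cosine similarity, assumed here to be the "direct" model comparison.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def normalize(v):
    # Scale a vector to unit length.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Made-up toy vectors, chosen only to show the effect.
query = [1.0, 0.0]
doc_a = [3.0, 0.3]  # nearly the same direction as the query, larger norm
doc_b = [0.6, 0.6]  # different direction, norm closer to the query's

print("cosine:", cosine(query, doc_a), cosine(query, doc_b))
print("l2 score (raw):", l2_score(query, doc_a), l2_score(query, doc_b))

# After normalization, L2 ranking follows cosine ranking,
# because for unit vectors d = 2 - 2 * cosine.
q_n, a_n, b_n = normalize(query), normalize(doc_a), normalize(doc_b)
print("l2 score (normalized):", l2_score(q_n, a_n), l2_score(q_n, b_n))
```

If this turns out to be the cause, the two options to try would be normalizing the embeddings before indexing and querying, or using the `cosinesimil` space type instead of `l2`. Whether either applies here depends on how the direct similarity was computed and on whether the model's vectors are unit-length, which the post does not state.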
