Is it possible to index an array of knn_vectors in a single document?

spullara · March 29, 2022, 12:21am

I would like all the face embeddings that are in the document to be indexed.

jmazane · March 30, 2022, 8:41pm

I think https://forum.opensearch.org/t/getting-0-search-results-for-a-knn-query-and-score-0-results-from-script/9062/1 got deleted.

Could you repost?

spullara · March 31, 2022, 9:09pm

That one I figured out was an error on my part.

jmazane · April 6, 2022, 5:23pm

It is possible to have multiple knn_vector fields in a single document, but not an array of knn_vectors in a single field.
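
As a rough sketch of that distinction, the Python snippet below builds an index body with two separate knn_vector fields in one document mapping. The field names (face_1_embedding, face_2_embedding), the dimension of 4, and the helper name build_multi_vector_mapping are hypothetical and only for illustration; the thread confirms only the knn_vector type itself.

```python
import json


def build_multi_vector_mapping(field_names, dimension):
    """Return an index body with one knn_vector field per name.

    Field names and dimension are illustrative assumptions, not
    values taken from the thread.
    """
    properties = {
        name: {"type": "knn_vector", "dimension": dimension}
        for name in field_names
    }
    return {
        # k-NN has to be enabled on the index for knn_vector fields
        "settings": {"index": {"knn": True}},
        "mappings": {"properties": properties},
    }


body = build_multi_vector_mapping(["face_1_embedding", "face_2_embedding"], 4)
print(json.dumps(body, indent=2))
```

Each vector gets its own named field here, so the number of vectors per document is fixed by the mapping rather than by the data.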

yeonghyeonKo · November 15, 2024, 5:59am

Yes. As of the latest version (2.18), we still cannot ingest knn_vector values into multi-fields.

For example, we can define an index template like the one below:

"TITLE_VECTOR_768": {
  "method": {
    "engine": "faiss",
    "space_type": "l2",
    "name": "hnsw",
    "parameters": {
      "ef_construction": 128,
      "m": 24
    }
  },
  "fields": {
    "compression_32x": {
      "method": {
        "engine": "faiss",
        "space_type": "l2",
        "name": "hnsw",
        "parameters": {
          "ef_construction": 128,
          "m": 24
        }
      },
      "type": "knn_vector",
      "dimension": 768
    }
  },
  "dimension": 768,
  "type": "knn_vector"
},

But at indexing time, it fails:

"type": "mapper_parsing_exception",
"reason": "failed to parse field [TITLE_VECTOR_768.compression_32x] of type [knn_vector] in document with id '4416856'. Preview of field's value: 'null'",
"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "Vector dimension mismatch. Expected: 768, Given: 0"
}

If you use an ingest pipeline with text_embedding processors like this:

"processors": [
    {
      "text_embedding": {
        "model_id": "zjQEJJMByqw6QjR9eFPm",
        "field_map": {
          "DETAIL": "DETAIL_VECTOR_768",
          "TITLE": "TITLE_VECTOR_768"
        }
      }
    },
    {
      "text_embedding": {
        "model_id": "zjQEJJMByqw6QjR9eFPm",
        "field_map": {
          "DETAIL": "DETAIL_VECTOR_768.compression_32x",
          "TITLE": "TITLE_VECTOR_768.compression_32x"
        }
      }
    },

it also fails:

"root_cause": [
  {
    "type": "class_cast_exception",
    "reason": "class_cast_exception: class java.lang.Float cannot be cast to class java.util.Map (java.lang.Float and java.util.Map are in module java.base of loader 'bootstrap')"
  }
],
"type": "class_cast_exception",
"reason": "class_cast_exception: class java.lang.Float cannot be cast to class java.util.Map (java.lang.Float and java.util.Map are in module java.base of loader 'bootstrap')"

So if I were you, I would set up multiple text_embedding processors and map them to multiple separate fields, not “multi-fields”.
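
A minimal sketch of that workaround, written in Python so the pipeline body can be checked before sending it: each text_embedding processor writes to its own top-level field instead of a multi-field sub-path. The target names ending in _COMPRESSED and the helper text_embedding_processor are hypothetical; the model id and the other field names are the ones quoted above. Each target would also need its own knn_vector entry in the index mapping.

```python
import json

MODEL_ID = "zjQEJJMByqw6QjR9eFPm"  # model id quoted in the pipeline above


def text_embedding_processor(model_id, field_map):
    """Build one text_embedding processor entry (hypothetical helper)."""
    return {"text_embedding": {"model_id": model_id, "field_map": dict(field_map)}}


pipeline = {
    "processors": [
        text_embedding_processor(
            MODEL_ID,
            {"DETAIL": "DETAIL_VECTOR_768", "TITLE": "TITLE_VECTOR_768"},
        ),
        # Second copy of each embedding goes to a separate top-level field
        # (assumed names), not to a "<field>.compression_32x" multi-field.
        text_embedding_processor(
            MODEL_ID,
            {
                "DETAIL": "DETAIL_VECTOR_768_COMPRESSED",
                "TITLE": "TITLE_VECTOR_768_COMPRESSED",
            },
        ),
    ]
}

# Collect every target field and verify none is a dotted sub-field path.
targets = [
    target
    for processor in pipeline["processors"]
    for target in processor["text_embedding"]["field_map"].values()
]
assert all("." not in target for target in targets)

print(json.dumps(pipeline, indent=2))
```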

yeonghyeonKo · November 15, 2024, 8:23am

Still, we can create an index mapping with the “nested” type.
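
To illustrate, here is a hedged Python sketch of what such a mapping, a document with several vectors, and a nested k-NN query might look like. It assumes an OpenSearch version whose k-NN plugin supports knn_vector inside nested fields; the names faces and embedding, the dimension of 3, and both helper functions are hypothetical. The method settings are copied from the template earlier in the thread.

```python
import json

DIM = 3  # hypothetical dimension, kept small for readability

index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "faces": {  # assumed field name
                "type": "nested",
                "properties": {
                    "embedding": {
                        "type": "knn_vector",
                        "dimension": DIM,
                        "method": {
                            "engine": "faiss",
                            "space_type": "l2",
                            "name": "hnsw",
                        },
                    }
                },
            }
        }
    },
}


def face_document(embeddings):
    """Wrap each vector in its own nested object under 'faces'."""
    for vec in embeddings:
        if len(vec) != DIM:
            raise ValueError(f"expected {DIM} values, got {len(vec)}")
    return {"faces": [{"embedding": list(vec)} for vec in embeddings]}


def nested_knn_query(vector, k):
    """Build a nested query that runs k-NN against faces.embedding."""
    return {
        "query": {
            "nested": {
                "path": "faces",
                "query": {
                    "knn": {"faces.embedding": {"vector": list(vector), "k": k}}
                },
            }
        }
    }


doc = face_document([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
query = nested_knn_query([0.1, 0.2, 0.3], k=2)
print(json.dumps(doc))
print(json.dumps(query))
```

With this layout a document can carry any number of vectors, one per nested object, which is closer to the original question than a fixed set of named fields.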

lucyy300 · November 16, 2024, 9:27am

Yes, you can have multiple knn_vector fields in a single document.
