Is it possible to index an array of knn_vectors in a single document?

I would like all the face embeddings that are in the document to be indexed.

Hi @spullara

I think https://forum.opensearch.org/t/getting-0-search-results-for-a-knn-query-and-score-0-results-from-script/9062/1 got deleted.

Could you repost?

That one I figured out was an error on my part.

It is possible to have multiple knn_vector fields in a single document, but not an array of knn_vectors in a single field.
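To make that concrete, here is a rough sketch of what "multiple knn_vector fields" could look like for the face-embedding case. It only builds the request bodies as plain Python dicts and does not talk to a cluster. The field names (`face_embedding_0`, `face_embedding_1`, ...), the slot count, and the dimension of 128 are all assumptions for illustration, not something from the original posts.

```python
import json


def build_multi_vector_mapping(num_faces: int, dimension: int) -> dict:
    """Build an index-creation body with one knn_vector field per face slot."""
    if num_faces < 1 or dimension < 1:
        raise ValueError("num_faces and dimension must be positive")
    properties = {
        # Hypothetical field names: face_embedding_0, face_embedding_1, ...
        f"face_embedding_{i}": {"type": "knn_vector", "dimension": dimension}
        for i in range(num_faces)
    }
    return {
        "settings": {"index": {"knn": True}},  # enable k-NN on the index
        "mappings": {"properties": properties},
    }


def build_document(embeddings: list, num_faces: int) -> dict:
    """Put each embedding in its own numbered field; unused slots are simply omitted."""
    if len(embeddings) > num_faces:
        raise ValueError("more embeddings than mapped knn_vector fields")
    return {f"face_embedding_{i}": vec for i, vec in enumerate(embeddings)}


if __name__ == "__main__":
    print(json.dumps(build_multi_vector_mapping(num_faces=3, dimension=128), indent=2))
```

The obvious downside is that the number of faces per document is capped by the number of fields in the mapping, and a search has to query each field separately.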


Yes. As of the latest version (2.18), we cannot ingest documents into a knn_vector field that is defined as a multi-field.

For example, we can define an index template like the one below:

"TITLE_VECTOR_768": {
  "method": {
    "engine": "faiss",
    "space_type": "l2",
    "name": "hnsw",
    "parameters": {
      "ef_construction": 128,
      "m": 24
    }
  },
  "fields": {
    "compression_32x": {
      "method": {
        "engine": "faiss",
        "space_type": "l2",
        "name": "hnsw",
        "parameters": {
          "ef_construction": 128,
          "m": 24
        }
      },
      "type": "knn_vector",
      "dimension": 768
    }
  },
  "dimension": 768,
  "type": "knn_vector"
},

But indexing a document then fails:

"type": "mapper_parsing_exception",
"reason": "failed to parse field [TITLE_VECTOR_768.compression_32x] of type [knn_vector] in document with id '4416856'. Preview of field's value: 'null'",
"caused_by": {
  "type": "illegal_argument_exception",
  "reason": "Vector dimension mismatch. Expected: 768, Given: 0"
}

If you use an ingest pipeline with text_embedding processors like:

"processors": [
    {
      "text_embedding": {
        "model_id": "zjQEJJMByqw6QjR9eFPm",
        "field_map": {
          "DETAIL": "DETAIL_VECTOR_768",
          "TITLE": "TITLE_VECTOR_768"
        }
      }
    },
    {
      "text_embedding": {
        "model_id": "zjQEJJMByqw6QjR9eFPm",
        "field_map": {
          "DETAIL": "DETAIL_VECTOR_768.compression_32x",
          "TITLE": "TITLE_VECTOR_768.compression_32x"
        }
      }
    },

it also fails:

"root_cause": [
  {
    "type": "class_cast_exception",
    "reason": "class_cast_exception: class java.lang.Float cannot be cast to class java.util.Map (java.lang.Float and java.util.Map are in module java.base of loader 'bootstrap')"
  }
],
"type": "class_cast_exception",
"reason": "class_cast_exception: class java.lang.Float cannot be cast to class java.util.Map (java.lang.Float and java.util.Map are in module java.base of loader 'bootstrap')"


So if I were you, I would set up multiple text_embedding processors and map them to multiple separate fields, not “multi-fields”.
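As a rough sketch of that suggestion, the helper below builds a pipeline body in which every text_embedding target is its own top-level field. The model id is the one quoted above. The second target name (`TITLE_VECTOR_768_COMPRESSION_32X`) and the helper itself are hypothetical, and the "no dots" check is just a deliberately strict guard for this example, not a rule enforced by OpenSearch.

```python
import json

MODEL_ID = "zjQEJJMByqw6QjR9eFPm"  # model id quoted earlier in this thread


def build_embedding_pipeline(model_id: str, field_maps: list) -> dict:
    """Build an ingest pipeline body with one text_embedding processor per field map."""
    processors = []
    for field_map in field_maps:
        for target in field_map.values():
            # Guard for this sketch: a dotted target would point back at a sub-field.
            if "." in target:
                raise ValueError(f"target field {target!r} must be a top-level field")
        processors.append(
            {"text_embedding": {"model_id": model_id, "field_map": dict(field_map)}}
        )
    return {"processors": processors}


if __name__ == "__main__":
    pipeline = build_embedding_pipeline(
        MODEL_ID,
        [
            {"TITLE": "TITLE_VECTOR_768"},
            # Hypothetical separate field instead of TITLE_VECTOR_768.compression_32x
            {"TITLE": "TITLE_VECTOR_768_COMPRESSION_32X"},
        ],
    )
    print(json.dumps(pipeline, indent=2))
```

Each target field would then need its own knn_vector entry in the index mapping.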

Still, we can create an index mapping with the “nested” type, so that each nested object holds its own knn_vector.
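For the original face-embedding question, a nested mapping might look roughly like the sketch below. It only builds request bodies as Python dicts. The names `faces` and `embedding` and the dimension are assumptions made up for the example, and whether k-NN search on nested fields works for you depends on your OpenSearch version and engine, so please check the docs for your release.

```python
import json


def build_nested_mapping(dimension: int) -> dict:
    """Index body where each element of the hypothetical 'faces' array holds one vector."""
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "faces": {
                    "type": "nested",
                    "properties": {
                        "embedding": {"type": "knn_vector", "dimension": dimension}
                    },
                }
            }
        },
    }


def build_document(embeddings: list) -> dict:
    """One nested object per face embedding, so the count can vary per document."""
    return {"faces": [{"embedding": vec} for vec in embeddings]}


def build_nested_knn_query(query_vector: list, k: int) -> dict:
    """k-NN query wrapped in a nested query on the 'faces' path."""
    return {
        "query": {
            "nested": {
                "path": "faces",
                "query": {
                    "knn": {"faces.embedding": {"vector": query_vector, "k": k}}
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_nested_mapping(dimension=128), indent=2))
```

Unlike separate numbered fields, this keeps a single field path to search and does not fix the number of vectors per document in the mapping.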

Yes, you can have multiple knn_vector fields in a single document.