Regarding storing vectors

Anusha64 · November 21, 2023, 11:03am

GET /my-nlp-index-new/_doc/1

Below is the response:
{
  "_index": "my-nlp-index-new",
  "_id": "1",
  "_version": 3,
  "_seq_no": 2,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
    "id": "4319130149.jpg"
  }
}

I'm following the blog post above, in which embeddings are created and stored with the document:
{
  "_index": "my-nlp-index",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "passage_embedding": [
      0.04491629,
      -0.34105563,
      0.036822468,
      -0.14139028,
      …
    ],
    "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
    "id": "4319130149.jpg"
  }
}

But when I try it, I don't get the "passage_embedding" field in my document.

gaobinlong · November 23, 2023, 5:57am

Check that the default_pipeline of your index has been set correctly. The pipeline is required to generate the text embedding:

GET /my-nlp-index-new/_settings
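
If you would rather check this from a script, here is a minimal Python sketch. It assumes the settings response has the usual shape, with the index name at the top level and the values nested under "settings" and "index". The helper name and the sample response are made up for illustration.

```python
from typing import Optional


def default_pipeline_of(settings_response: dict, index_name: str) -> Optional[str]:
    """Return the index's default_pipeline, or None if it is not set."""
    index_settings = (
        settings_response.get(index_name, {})
        .get("settings", {})
        .get("index", {})
    )
    return index_settings.get("default_pipeline")


# Hypothetical, trimmed response from GET /my-nlp-index-new/_settings.
sample = {
    "my-nlp-index-new": {
        "settings": {
            "index": {
                "knn": "true",
                "default_pipeline": "nlp-ingest-pipeline",
            }
        }
    }
}

pipeline = default_pipeline_of(sample, "my-nlp-index-new")
print(pipeline or "no default_pipeline set, so no embeddings will be generated")
```

If the helper returns None, the index has no default pipeline and documents are stored exactly as sent.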

varun4996 · December 8, 2023, 8:11pm

Hi Anusha64,

The problem occurs because the ingest processor is not triggered when you add a document to the index.

Below is a condensed version of the documentation you are following, which should fix your problem.

Create a model group and note its ID.
Below is an example:
POST /_plugins/_ml/model_groups/_register
{
  "name": "test_model_group_public",
  "description": "This is a public model group"
}

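Upload the model.
Below is an example (the request uses POST; recent OpenSearch versions also expose this endpoint as _register):
POST /_plugins/_ml/models/_upload?pretty
{
  "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT",
  "model_group_id": "<model group id>"
}

Get the model ID. The upload call returns a task ID; query that task, and the response contains the model ID once the task has completed.
Below is an example:
GET /_plugins/_ml/tasks/<task id>

The model also has to be deployed before the ingest pipeline can use it.
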
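Create the ingest pipeline and add a text_embedding processor to it. This processor converts text to an embedding using the model ID provided in it. Note that each key in field_map is the name of the text field in your documents. Your document stores its text under "text", so either use "text" as the key or ingest documents with a passage_text field, as in this example.
Below is an example: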

PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "A text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model Id>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

nlp-ingest-pipeline is the ingest pipeline name.
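
A quick way to catch a field-name mismatch before ingesting is to compare the pipeline's field_map keys with the fields of a document. The sketch below is only an illustration: the helper name is made up, it looks at top-level fields only, and the two documents are shortened versions of the ones in this thread.

```python
def unmatched_source_fields(pipeline: dict, document: dict) -> list:
    """List field_map source fields that the document does not contain."""
    missing = []
    for processor in pipeline.get("processors", []):
        config = processor.get("text_embedding")
        if config is None:
            continue  # not a text_embedding processor
        for source_field in config.get("field_map", {}):
            if source_field not in document:
                missing.append(source_field)
    return missing


pipeline = {
    "description": "A text embedding pipeline",
    "processors": [
        {
            "text_embedding": {
                "model_id": "<model Id>",  # placeholder, as in the request above
                "field_map": {"passage_text": "passage_embedding"},
            }
        }
    ],
}

doc_from_question = {"text": "A West Virginia university ...", "id": "4319130149.jpg"}
doc_from_answer = {"passage_text": "A West Virginia university ...", "id": "s1"}

print(unmatched_source_fields(pipeline, doc_from_question))  # ['passage_text']
print(unmatched_source_fields(pipeline, doc_from_answer))    # []
```

A non-empty list means the processor has no text to embed for that document, so no passage_embedding would be written.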

Create an index for ingestion. Set default_pipeline to the ingest pipeline name, and make sure the dimension of the knn_vector field matches the model's output size (all-MiniLM-L12-v2 produces 384-dimensional vectors).
Below is an example:

PUT /my-nlp-index-new
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "engine": "lucene",
          "space_type": "l2",
          "name": "hnsw",
          "parameters": {}
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
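
The dimension declared for the knn_vector field has to equal the length of the vectors the model returns (384 for all-MiniLM-L12-v2, as far as I know), otherwise indexing the embedding will fail. Here is a small Python sketch of that check. The function name is hypothetical, and the index body is reduced to the one field that matters.

```python
def dimension_matches(index_body: dict, field: str, vector: list) -> bool:
    """True if the vector length equals the dimension declared for the field."""
    declared = index_body["mappings"]["properties"][field]["dimension"]
    return len(vector) == declared


# Reduced index body with only the knn_vector field.
index_body = {
    "mappings": {
        "properties": {
            "passage_embedding": {"type": "knn_vector", "dimension": 384}
        }
    }
}

# Stand-in for an embedding returned by a 384-dimensional model.
vector = [0.0] * 384

print(dimension_matches(index_body, "passage_embedding", vector))
```

In practice you would pass in an embedding returned by your model instead of the stand-in list of zeros.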

Ingest the document.
Below is an example:

PUT /my-nlp-index-new/_doc/1
{
  "passage_text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
  "id": "s1"
}

Now, when you retrieve or search for the document, you will find passage_embedding in the result.
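
To confirm this from a script, you can inspect the response of GET /my-nlp-index-new/_doc/1. The sketch below assumes the response shape shown earlier in this thread. The helper name and the shortened sample responses are only for illustration.

```python
def embedding_status(get_response: dict, field: str = "passage_embedding") -> str:
    """Describe whether a GET _doc response carries the embedding field."""
    if not get_response.get("found"):
        return "document not found"
    vector = get_response.get("_source", {}).get(field)
    if vector is None:
        return f"'{field}' is missing"
    return f"'{field}' is present with {len(vector)} values"


# Shortened sample responses; the real vector is much longer.
without_embedding = {
    "found": True,
    "_source": {"text": "A West Virginia university ...", "id": "4319130149.jpg"},
}
with_embedding = {
    "found": True,
    "_source": {
        "passage_embedding": [0.04491629, -0.34105563, 0.036822468, -0.14139028],
        "passage_text": "A West Virginia university ...",
        "id": "s1",
    },
}

print(embedding_status(without_embedding))
print(embedding_status(with_embedding))
```

Keep in mind that an ingest pipeline only runs at index time, so documents added before the pipeline was attached have to be indexed again to get an embedding.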

P.S. You can verify the index's default pipeline setting by calling the endpoint below:
GET /my-nlp-index-new/_settings

Let me know if you have any questions.

system · February 6, 2024, 8:11pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
