Regarding storing vectors

GET /my-nlp-index-new/_doc/1
Below is the response:
{
  "_index": "my-nlp-index-new",
  "_id": "1",
  "_version": 3,
  "_seq_no": 2,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
    "id": "4319130149.jpg"
  }
}

I'm following the blog above, in which embeddings are created:
{
  "_index": "my-nlp-index",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "passage_embedding": [
      0.04491629,
      -0.34105563,
      0.036822468,
      -0.14139028,

    ],
    "text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
    "id": "4319130149.jpg"
  }
}

But when I try it, I don't get the "passage_embedding" field.


Check that the default_pipeline of your index is set correctly. The pipeline is required to generate the text embedding:

GET /my-nlp-index-new/_settings
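
If it helps, here is a minimal Python sketch for reading the result of that request. It assumes the `_settings` response is nested as index name, then `settings`, then `index`; the function name and the two sample responses are hypothetical and only illustrate the check.

```python
from typing import Optional


def get_default_pipeline(settings_response: dict, index_name: str) -> Optional[str]:
    """Return the index's default ingest pipeline, or None if none is set."""
    index_settings = (
        settings_response.get(index_name, {})
        .get("settings", {})
        .get("index", {})
    )
    pipeline = index_settings.get("default_pipeline")
    # "_none" is the reserved value meaning "no default pipeline".
    if pipeline in (None, "_none"):
        return None
    return pipeline


# Hypothetical response for an index created without a default pipeline.
without_pipeline = {"my-nlp-index-new": {"settings": {"index": {"knn": "true"}}}}

# Hypothetical response for an index created with the pipeline from this thread.
with_pipeline = {
    "my-nlp-index-new": {
        "settings": {
            "index": {"knn": "true", "default_pipeline": "nlp-ingest-pipeline"}
        }
    }
}
```

If the function returns None for your index, no ingest processor runs when you add a document, which would explain the missing embedding.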

Hi Anusha64,

The problem you are facing occurs because the ingest processor is not triggered when you add a document to the index.

Below is the complete gist of the documentation you are following, which should fix your problem.

Create a model group and note its ID.
Below is an example:
POST /_plugins/_ml/model_groups/_register
{
  "name": "test_model_group_public",
  "description": "This is a public model group"
}

  1. Upload the model.
    Below is an example:
    POST /_plugins/_ml/models/_upload
    {
      "name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
      "version": "1.0.1",
      "model_format": "TORCH_SCRIPT",
      "model_group_id": "<model group id>"
    }

  2. Get the model ID. The upload request returns a task ID; query that task to obtain the model ID.
    Below is an example:
    GET /_plugins/_ml/tasks/<task id>
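
A rough Python sketch of how the task response could be interpreted. It assumes the response carries a `state` field and, once the task is `COMPLETED`, a `model_id` field; the `error` field and the function name are assumptions for illustration.

```python
from typing import Optional


def model_id_from_task(task_response: dict) -> Optional[str]:
    """Return the model ID once the upload task has completed, else None."""
    state = task_response.get("state")
    if state == "FAILED":
        # "error" is assumed to hold the failure reason reported by the task.
        raise RuntimeError(f"model upload failed: {task_response.get('error')}")
    if state != "COMPLETED":
        # Task is still in progress; poll the task again later.
        return None
    return task_response.get("model_id")
```

Use the returned value as the model ID in the ingest pipeline in the next step.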

  3. Create the ingest pipeline and add a text_embedding processor to it. This processor converts text to an embedding using the model ID provided in it.
    Below is an example:

PUT /_ingest/pipeline/nlp-ingest-pipeline
{
  "description": "A text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<model Id>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}

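nlp-ingest-pipeline is the ingest pipeline name. The key in field_map (passage_text here) must match the name of the text field in the documents you ingest, and the value (passage_embedding) is the field where the vector is stored.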
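
To see why a document might come back without an embedding, here is a small Python sketch that compares a pipeline's field_map against a document. As far as I understand, the processor only embeds fields that are present in the document, so a mismatch in field names yields no vector. The function name is hypothetical; the pipeline and documents are the ones from this thread.

```python
def unmapped_source_fields(pipeline: dict, document: dict) -> list:
    """List field_map source fields that the document does not contain."""
    missing = []
    for processor in pipeline.get("processors", []):
        config = processor.get("text_embedding")
        if config is None:
            continue  # not a text_embedding processor
        for source_field in config.get("field_map", {}):
            if source_field not in document:
                missing.append(source_field)
    return missing


pipeline = {
    "description": "A text embedding pipeline",
    "processors": [
        {
            "text_embedding": {
                "model_id": "<model Id>",
                "field_map": {"passage_text": "passage_embedding"},
            }
        }
    ],
}

# Document shaped like the one in the question (field is named "text").
question_doc = {"text": "A West Virginia university ...", "id": "4319130149.jpg"}
# Document shaped like the one in the ingest step of this answer.
answer_doc = {"passage_text": "A West Virginia university ...", "id": "s1"}
```

Here `unmapped_source_fields(pipeline, question_doc)` reports `passage_text` as missing, while the second document matches the pipeline.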

  4. Create an index for ingestion.
    Below is an example:
PUT /my-nlp-index-new
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 384,
        "method": {
          "engine": "lucene",
          "space_type": "l2",
          "name": "hnsw",
          "parameters": {}
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}
  5. Ingest the document.
    Below is an example:
PUT /my-nlp-index-new/_doc/1
{
  "passage_text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
  "id": "s1"
}

Now, when you retrieve the document, you will find passage_embedding in the result.
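
A quick way to confirm this from a script, sketched in Python. It takes the parsed JSON of a `GET /<index>/_doc/<id>` response, like the two responses shown in the question; the function name is hypothetical.

```python
def embedding_length(get_response: dict, field: str = "passage_embedding") -> int:
    """Return the number of values in the stored vector, or 0 if it is absent."""
    vector = get_response.get("_source", {}).get(field)
    if not isinstance(vector, list):
        return 0
    return len(vector)


# Shaped like the response in the question: no embedding was generated.
no_embedding = {
    "_index": "my-nlp-index-new",
    "_id": "1",
    "found": True,
    "_source": {"text": "A West Virginia university ...", "id": "4319130149.jpg"},
}

# Shaped like the blog's response, with the vector shortened to the four values quoted above.
with_embedding = {
    "_index": "my-nlp-index",
    "_id": "1",
    "found": True,
    "_source": {
        "passage_embedding": [0.04491629, -0.34105563, 0.036822468, -0.14139028],
        "text": "A West Virginia university ...",
        "id": "4319130149.jpg",
    },
}
```

On a real document, a non-zero length should equal the dimension configured for the knn_vector field in the index mapping.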

P.S. You can verify the index's default pipeline setting by calling the API endpoint below:
GET /my-nlp-index-new/_settings
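
If the setting turns out to be missing on an index that already exists, one option is to set it through the index settings API instead of recreating the index. Below is a hedged Python sketch that only builds the request; the base URL is an assumption (a local, unsecured cluster), and the function name is hypothetical. Documents indexed before the change would still need to be ingested again to get an embedding.

```python
import json
import urllib.request


def build_set_default_pipeline_request(
    base_url: str, index_name: str, pipeline_name: str
) -> urllib.request.Request:
    """Build a PUT /<index>/_settings request that sets index.default_pipeline."""
    body = {"index": {"default_pipeline": pipeline_name}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_name}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed local endpoint; adjust host, port, and authentication for your cluster.
request = build_set_default_pipeline_request(
    "http://localhost:9200", "my-nlp-index-new", "nlp-ingest-pipeline"
)
# To actually send it: urllib.request.urlopen(request)
```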

Let me know if you have any questions.
