Hi Anusha64,
The problem you are facing occurs because the ingest processor is not triggered when you add a document to the index.
Below is a summary of the documentation you are following, which should fix your problem.
- Create a model group.
Below is an example:
POST /_plugins/_ml/model_groups/_register
{
"name": "test_model_group_public",
"description": "This is a public model group"
}
- Upload the model.
Below is an example:
POST /_plugins/_ml/models/_upload
{
"name": "huggingface/sentence-transformers/all-MiniLM-L12-v2",
"version": "1.0.1",
"model_format": "TORCH_SCRIPT",
"model_group_id": "<model group id>"
}
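- Get the model ID. The upload call returns a task ID; query that task, and the response contains the model ID once the task has completed.
Below is an example:
GET /_plugins/_ml/tasks/<task id>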
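If you script this step, the model ID can be read from the task response once the task has finished. The sketch below is only an illustration: it assumes the task response is a JSON object with `state` and `model_id` fields, and the helper name `extract_model_id` and the sample values are made up for the example.

```python
from typing import Optional


def extract_model_id(task_response: dict) -> Optional[str]:
    """Return the model ID from a task response, or None if it is not ready.

    Assumes the response has "state" and "model_id" keys (an assumption
    about the response shape; check it against your cluster's output).
    """
    if task_response.get("state") != "COMPLETED":
        return None  # still running, or failed: no usable model ID yet
    return task_response.get("model_id")


# Hypothetical responses, for illustration only.
pending = {"task_type": "UPLOAD_MODEL", "state": "CREATED"}
done = {"task_type": "UPLOAD_MODEL", "state": "COMPLETED", "model_id": "abc123"}

print(extract_model_id(pending))  # None
print(extract_model_id(done))     # abc123
```

In practice you would call the tasks endpoint repeatedly with a short pause until this helper returns a value.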
- Create the ingest pipeline and add a text_embedding processor to it. This processor converts the text in the mapped field to an embedding, using the model ID provided in it.
Below is an example:
PUT /_ingest/pipeline/nlp-ingest-pipeline
{
"description": "A text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "<model Id>",
"field_map": {
"passage_text": "passage_embedding"
}
}
}
]
}
nlp-ingest-pipeline is the ingest pipeline name.
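To make the role of field_map concrete, here is a rough Python illustration of what the processor does to a document, conceptually. It is not the plugin's code: `fake_embed` is a stand-in for the real model, `apply_field_map` is a hypothetical helper, and skipping documents that lack the source field is a choice made for this sketch only.

```python
from typing import Callable, Dict, List


def fake_embed(text: str) -> List[float]:
    """Stand-in for the model: returns a tiny made-up vector."""
    return [float(len(text)), float(text.count(" "))]


def apply_field_map(
    doc: dict,
    field_map: Dict[str, str],
    embed: Callable[[str], List[float]],
) -> dict:
    """Copy the document and add one embedding field per mapped text field."""
    out = dict(doc)
    for source_field, target_field in field_map.items():
        if source_field in doc:  # in this sketch, missing fields are skipped
            out[target_field] = embed(doc[source_field])
    return out


doc = {"id": "s1", "passage_text": "hello world"}
result = apply_field_map(doc, {"passage_text": "passage_embedding"}, fake_embed)
print(result)
```

The key point is the direction of the mapping: the key (passage_text) is the field that is read, and the value (passage_embedding) is the field that is written.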
- Create an index for ingestion.
Below is an example:
PUT /my-nlp-index-new
{
"settings": {
"index.knn": true,
"default_pipeline": "nlp-ingest-pipeline"
},
"mappings": {
"properties": {
"id": {
"type": "text"
},
"passage_embedding": {
"type": "knn_vector",
"dimension": 384,
"method": {
"engine": "lucene",
"space_type": "l2",
"name": "hnsw",
"parameters": {}
}
},
"passage_text": {
"type": "text"
}
}
}
}
- Ingest the document.
Below is an example:
PUT /my-nlp-index-new/_doc/1
{
"passage_text": "A West Virginia university women 's basketball team , officials , and a small gathering of fans are in a West Virginia arena .",
"id": "s1"
}
Now, when you search for the document, you will find passage_embedding in the result.
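If you want to check this programmatically, something like the following could be run against the parsed JSON of a search response. It is a minimal sketch: `docs_missing_embedding` is a hypothetical helper, and the sample response is invented and heavily trimmed.

```python
from typing import List


def docs_missing_embedding(
    search_response: dict, field: str = "passage_embedding"
) -> List[str]:
    """Return the _id of every hit whose _source lacks the embedding field."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits if field not in hit.get("_source", {})]


# Hypothetical, trimmed search response for illustration.
sample_search = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"id": "s1", "passage_text": "...",
                                     "passage_embedding": [0.1, 0.2]}},
            {"_id": "2", "_source": {"id": "s2", "passage_text": "..."}},
        ]
    }
}

print(docs_missing_embedding(sample_search))  # ['2']
```

An empty list means every returned document went through the pipeline; any ID that shows up was indexed without the embedding.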
P.S. You can verify the index's default pipeline setting by calling the API endpoint below:
GET /my-nlp-index-new/_settings
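If you would rather check this from a script, the pipeline name can be pulled out of the settings response. A small sketch, assuming the response is keyed by index name with the value nested under settings and then index; `get_default_pipeline` is a hypothetical helper and the sample response is trimmed.

```python
from typing import Optional


def get_default_pipeline(settings_response: dict, index_name: str) -> Optional[str]:
    """Read index.default_pipeline from a GET /<index>/_settings response."""
    index_settings = (
        settings_response.get(index_name, {}).get("settings", {}).get("index", {})
    )
    return index_settings.get("default_pipeline")


# Hypothetical, trimmed settings response for illustration.
sample_settings = {
    "my-nlp-index-new": {
        "settings": {
            "index": {"knn": "true", "default_pipeline": "nlp-ingest-pipeline"}
        }
    }
}

print(get_default_pipeline(sample_settings, "my-nlp-index-new"))  # nlp-ingest-pipeline
```

If this returns None for your index, the default pipeline was never set, which would explain why documents are indexed without embeddings.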
Let me know if you have any questions.