Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.13.0
Describe the issue:
I've run into problems configuring default_model_id together with the text chunking processor. My text field is split into chunks by the text_chunking processor, and those chunks are converted into vectors by the text_embedding processor. I now want to run a neural search against that nested vector field without having to specify model_id in every search request.
Configuration:
My ingest pipeline (ml-pipeline):
PUT /_ingest/pipeline/ml-pipeline
{
  "processors": [
    {
      "text_chunking": {
        "algorithm": {
          "fixed_token_length": {
            "token_limit": 384,
            "max_chunk_limit": -1
          }
        },
        "field_map": {
          "text": "passage_chunk"
        }
      }
    },
    {
      "text_embedding": {
        "field_map": {
          "passage_chunk": "chunk_passage_embedding"
        },
        "model_id": "{model_id}"
      }
    }
  ]
}
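To sanity-check the chunking and embedding steps, the pipeline can be simulated against a sample document before indexing anything (the text value below is only an illustrative example):

POST /_ingest/pipeline/ml-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "text": "A cat sat on the mat."
      }
    }
  ]
}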
My index settings and mappings:
PUT /test-ml-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "ml-pipeline"
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "text"
      },
      "chunk_passage_embedding": {
        "type": "nested",
        "properties": {
          "knn": {
            "type": "knn_vector",
            "dimension": 384,
            "method": {
              "engine": "lucene",
              "space_type": "l2",
              "name": "hnsw",
              "parameters": {}
            }
          }
        }
      },
      "text": {
        "type": "text"
      }
    }
  }
}
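Documents are indexed with only id and text; the default pipeline is expected to populate passage_chunk and chunk_passage_embedding automatically (the document below is only an illustrative example):

PUT /test-ml-index/_doc/1
{
  "id": "1",
  "text": "A cat sat on the mat."
}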
How I try to define the default search model:
PUT /_search/pipeline/default_model_search_pipeline
{
  "request_processors": [
    {
      "neural_query_enricher": {
        "default_model_id": "{{model_id}}"
      }
    }
  ]
}
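The search pipeline can be fetched back to confirm it was created:

GET /_search/pipeline/default_model_search_pipeline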
and then I set it as the index's default search pipeline:
PUT /test-ml-index/_settings
{
  "index.search.default_pipeline": "default_model_search_pipeline"
}
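The applied setting can be checked as shown below (alternatively, the pipeline could be passed per request via the search_pipeline query parameter, but the whole point is to avoid per-request configuration):

GET /test-ml-index/_settings/index.search.default_pipeline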
Relevant Logs or Screenshots:
When I perform the following search request:
{
  "query": {
    "nested": {
      "score_mode": "max",
      "path": "chunk_passage_embedding",
      "query": {
        "neural": {
          "chunk_passage_embedding.knn": {
            "query_text": "cat",
            "k": 100
          }
        }
      }
    }
  }
}
I get the error:
{
  "error": {
    "root_cause": [
      {
        "type": "null_pointer_exception",
        "reason": "modelId is marked non-null but is null"
      }
    ],
    "type": "null_pointer_exception",
    "reason": "modelId is marked non-null but is null"
  },
  "status": 500
}
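For reference, this is the equivalent query with the model_id supplied inline in the neural clause, which is exactly the per-request repetition I am trying to avoid:

GET /test-ml-index/_search
{
  "query": {
    "nested": {
      "score_mode": "max",
      "path": "chunk_passage_embedding",
      "query": {
        "neural": {
          "chunk_passage_embedding.knn": {
            "query_text": "cat",
            "model_id": "{model_id}",
            "k": 100
          }
        }
      }
    }
  }
}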