Versions:
OpenSearch Version: 3.2.0
OpenSearch Dashboards Version: 3.2.0
OpenSearch Security Version: 3.2.0.0
Docker container on Docker Desktop 4.36.0 (175267)
Windows 10 (10.0.19045 N/A Build 19045)
Describe the issue:
I’m trying to combine two processors inside a search pipeline:
- ML inference request processor - rewrites the user’s natural question into a more search-friendly query.
- RAG response processor - uses the search results as context and generates a final answer with an LLM.
The reasoning is:
- When users ask a question, the query text isn’t always ideal for document retrieval (for example, “What were the latest film nominations?” vs. a keyword query like film nominations 2024).
- In a typical RAG setup, you want the agent/LLM to see the original question for generation, but the search engine to see a refined query for better recall.
- So the ML inference processor takes the original question, rewrites it into a compact search query, and replaces the query_text field.
- The RAG processor should then pick up the retrieved “sources” and generate the final answer based on them.
In principle this would let OpenSearch handle the whole flow: user question → query rewriting → document retrieval → answer generation.
However, when I try to run both processors in the same pipeline, the RAG step fails with runtime errors.
With both processors active, I see this:
{
"error": {
"root_cause": [
{
"type": "class_cast_exception",
"reason": "class org.opensearch.searchpipelines.questionanswering.generative.ext.GenerativeQAParameters cannot be cast to class java.util.Map (org.opensearch.searchpipelines.questionanswering.generative.ext.GenerativeQAParameters is in unnamed module of loader java.net.URLClassLoader @3609b8f2; java.util.Map is in module java.base of loader 'bootstrap')"
}
],
"type": "class_cast_exception",
"reason": "class org.opensearch.searchpipelines.questionanswering.generative.ext.GenerativeQAParameters cannot be cast to class java.util.Map (org.opensearch.searchpipelines.questionanswering.generative.ext.GenerativeQAParameters is in unnamed module of loader java.net.URLClassLoader @3609b8f2; java.util.Map is in module java.base of loader 'bootstrap')"
},
"status": 500
}
With ?verbose_pipeline=true I also get JSON generation errors:
{
"error": {
"root_cause": [
{
"type": "exception",
"reason": "com.fasterxml.jackson.core.JsonGenerationException: Can not start an object, expecting field name (context: Object)"
}
],
"type": "exception",
"reason": "com.fasterxml.jackson.core.JsonGenerationException: Can not start an object, expecting field name (context: Object)",
"caused_by": {
"type": "json_generation_exception",
"reason": "Can not start an object, expecting field name (context: Object)",
"suppressed": [
{
"type": "illegal_state_exception",
"reason": "Failed to close the XContentBuilder",
"caused_by": {
"type": "i_o_exception",
"reason": "Unclosed object or array found"
}
}
]
}
},
"status": 500
}
The combined (RAG + ML inference) search pipeline looks like this:
{
"vector_enhanced_rag_pipeline": {
"request_processors": [
{
"ml_inference": {
"model_id": "crVrY5kBiXJL4rgg9_KU",
"function_name": "remote",
"model_input": """{ "parameters": { "messages": [ { "role": "system", "content": "This is the users question, return me ONLY the keyword-like query string for Google search. YOU RETURN ONLY THE QUERY STRING." }, { "role": "user", "content": "${input_map.user_query}" } ] } }""",
"input_map": [
{
"user_query": "query.neural.text_vector.query_text"
}
],
"output_map": [
{
"query.neural.text_vector.query_text": "inference_results[0].output[0].dataAsMap.choices[0].message.content"
}
],
"full_response_path": true,
"tag": "enhanced_rag_pipeline",
"description": "RAG pipeline with query rewriting",
"ignore_missing": false
}
}
],
"response_processors": [
{
"retrieval_augmented_generation": {
"tag": "enhanced_rag_pipeline",
"description": "RAG pipeline with query rewriting",
"model_id": "crVrY5kBiXJL4rgg9_KU",
"context_field_list": [
"text_string"
],
"system_prompt": "You are a helpful assistant.",
"user_instructions": "Answer concisely in less than 100 words."
}
}
]
}
}
And the query is this:
GET /vector_test_index/_search?search_pipeline=vector_enhanced_rag_pipeline&verbose_pipeline=true
{
"query": {
"neural": {
"text_vector": {
"query_text": "What were the latest film nominations?",
"model_id": "bbVrY5kBiXJL4rgg5fLn",
"min_score": 0.6
}
}
},
"ext": {
"generative_qa_parameters": {
"llm_model": "gpt-5-nano",
"llm_question": "What were the latest film nominations?"
}
},
"_source": "text_string",
"size": 5
}
What I expected
The ML inference processor would intercept the query and rewrite query.neural.text_vector.query_text into a keyword-like search query; the “sources” search would then run, after which the RAG processor would generate the answer as usual, now with better retrieval context.
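To make that concrete: after the rewrite step, the effective search request should conceptually look like this (the rewritten query string below is only an illustration of what the LLM might return; everything else is unchanged from the original request):

{
  "query": {
    "neural": {
      "text_vector": {
        "query_text": "film nominations 2024",
        "model_id": "bbVrY5kBiXJL4rgg5fLn",
        "min_score": 0.6
      }
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-5-nano",
      "llm_question": "What were the latest film nominations?"
    }
  },
  "_source": "text_string",
  "size": 5
}

So the neural search sees the rewritten keywords, while generative_qa_parameters still carries the original question for answer generation.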
What happens instead
The ML inference processor works on its own, and the RAG processor works on its own. But as soon as the RAG response processor and the ML inference request processor are combined in the same pipeline, execution fails with the casting or JSON serialization errors shown above.
Questions
- Is this combination of request + response processors supported in OpenSearch 3.2.0?
- Is this a limitation of the current RAG implementation?
- Or does it look like a bug where the RAG processor cannot consume the pipeline output after the ML inference processor rewrites the query (or is there some other cause)?
Any guidance would help. I’d like to know whether there is a problem with my implementation (or the idea itself), whether this is a bug, or whether I should treat it as an unsupported workflow for now.
Configuration:
Here is the “vanilla” RAG pipeline that works fine:
{
"vector_rag_pipeline": {
"response_processors": [
{
"retrieval_augmented_generation": {
"tag": "rag_pipeline",
"description": "Retrieval Augmented Generation pipeline",
"model_id": "P7VTY5kBiXJL4rggY-w1",
"context_field_list": [
"text_string"
],
"system_prompt": "You are a helpful assistant",
"user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
}
}
]
}
}
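A request against this pipeline has the same shape as the combined one, just without the rewrite step, e.g.:

GET /vector_test_index/_search?search_pipeline=vector_rag_pipeline
{
  "query": {
    "neural": {
      "text_vector": {
        "query_text": "What were the latest film nominations?",
        "model_id": "bbVrY5kBiXJL4rgg5fLn",
        "min_score": 0.6
      }
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "gpt-5-nano",
      "llm_question": "What were the latest film nominations?"
    }
  },
  "_source": "text_string",
  "size": 5
}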
Here is the “query_rewrite” pipeline, which uses only the ML inference processor to update the query; it also works as expected:
{
"query_rewrite_pipeline": {
"request_processors": [
{
"ml_inference": {
"model_id": "crVrY5kBiXJL4rgg9_KU",
"function_name": "remote",
"model_input": """{ "parameters": { "messages": [ { "role": "system", "content": "This is the users question, return me ONLY the keyword-like query string for Google search. YOU RETURN ONLY THE QUERY STRING." }, { "role": "user", "content": "${input_map.user_query}" } ] } }""",
"input_map": [
{
"user_query": "query.neural.text_vector.query_text"
}
],
"output_map": [
{
"query.neural.text_vector.query_text": "inference_results[0].output[0].dataAsMap.choices[0].message.content"
}
],
"full_response_path": true,
"tag": "query_rewriter",
"description": "Pipeline with ML inference query rewriting",
"ignore_missing": false
}
}
]
}
}
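This one can be tested on its own with a plain neural query (no generative_qa_parameters ext needed), e.g.:

GET /vector_test_index/_search?search_pipeline=query_rewrite_pipeline
{
  "query": {
    "neural": {
      "text_vector": {
        "query_text": "What were the latest film nominations?",
        "model_id": "bbVrY5kBiXJL4rgg5fLn",
        "min_score": 0.6
      }
    }
  },
  "_source": "text_string",
  "size": 5
}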
And these are the working index mappings:
{
"vector_test_index": {
"mappings": {
"properties": {
"guid": {
"type": "keyword"
},
"text_string": {
"type": "text"
},
"text_vector": {
"type": "knn_vector",
"dimension": 384,
"method": {
"engine": "faiss",
"space_type": "cosinesimil",
"name": "hnsw",
"parameters": {}
}
}
}
}
}
}