IllegalStateException when executing RAG agent

Versions (relevant): OpenSearch 2.19.0, Dashboards 2.19.0, Server OS: Windows, Browser: Chrome

Describe the issue:
I’m trying to integrate Vertex AI’s Gemini 2.5 Flash Lite model with OpenSearch’s ML plugin and use it with the Assistant Toolkit (for chat). I’ve successfully registered the model and it returns responses, but the response_filter doesn’t seem to apply when using function_name: "remote". The full response is returned instead of just the expected string in dataAsMap.response, causing agent/RAG flows to fail with a JSON parsing error.

Has anyone faced this or found a clean workaround?

{
  "status": 500,
  "error": {
    "type": "IllegalStateException",
    "reason": "System Error",
    "details": "Expected BEGIN_ARRAY but was STRING at line 1 column 1 path $\nSee https://github.com/google/gson/blob/main/Troubleshooting.md#unexpected-json-structure"
  }
}

Configuration:
The following are the steps I used to deploy my ML models using Google Vertex AI:

Whitelisting URLs

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://.*-aiplatform\\.googleapis\\.com/.*$",
      "^https://.*\\.googleapis\\.com/.*$"
    ]
  }
}
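
(Optional sanity check: the setting can be read back afterwards to confirm it took effect.)

GET /_cluster/settings?flat_settings=true&filter_path=persistent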

Register Model group

POST /_plugins/_ml/model_groups/_register
{
    "name": "remote_model_group_gcp",
    "description": "This is an example group for embedding models"
}
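
The response contains the model_group_id that the model registration below references:

{
  "model_group_id": "8RArPZgBfc6G9xlsMOnf",
  "status": "CREATED"
}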

Register Text embedding model

POST /_plugins/_ml/models/_register
{
    "name": "vertexAI: model to generate embeddings",
    "function_name": "remote",
    "model_group_id": "8RArPZgBfc6G9xlsMOnf",
    "description": "test vertexAI model",
    "connector": {
        "name": "VertexAI Connector",
        "description": "The connector to public vertexAI model service for text embedding",
        "version": 1,
        "protocol": "http",
        "parameters": {
            "project": "id",
            "model_id": "text-embedding-004"
        },
        "credential": {
            "vertexAI_token": "TOKEN"
        },
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://us-central1-aiplatform.googleapis.com/v1/projects/${parameters.project}/locations/us-central1/publishers/google/models/${parameters.model_id}:predict",
                "headers": {
                    "Authorization": "Bearer ${credential.vertexAI_token}"
                },
                "request_body": "{\"instances\": [{ \"content\": \"${parameters.prompt}\"}]}"
            }
        ]
    }
}
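
Registering the model returns the model_id used by the deploy and predict calls that follow (abridged; the full response also includes a task_id):

{
  "task_id": "…",
  "status": "CREATED",
  "model_id": "mgabQJgBS0JZCYs7gaWy"
}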

Deploy text embedding model

POST /_plugins/_ml/models/mgabQJgBS0JZCYs7gaWy/_deploy

Test text embedding model (successful)

POST /_plugins/_ml/models/mgabQJgBS0JZCYs7gaWy/_predict
{
  "parameters": {
    "prompt": "Hello World form vertex AI!"
  }
}
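
For reference, a successful embedding response is shaped roughly like this (heavily truncated; the shape follows the Vertex AI text-embedding response, wrapped by ML Commons the same way as the LLM response shown under Relevant Logs). The ingest pipeline's output_map below depends on the predictions[0].embeddings.values path inside dataAsMap:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "predictions": [
              {
                "embeddings": {
                  "values": ["… 768 floats …"]
                }
              }
            ]
          }
        }
      ],
      "status_code": 200
    }
  ]
}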

Create an index

PUT /my_documents_index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "text_content": {
        "type": "text"
      },
      "text_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "nmslib"
        }
      }
    }
  }
}

Create ingest pipeline based on text embedding

PUT /_ingest/pipeline/vertexai_embedding_pipeline
{
  "description": "An ingest pipeline to generate Vertex AI embeddings for text_content",
  "processors": [
    {
      "ml_inference": {
        "model_id": "mgabQJgBS0JZCYs7gaWy",
        "input_map": [
          {
            "prompt": "text_content" 
          }
        ],
        "output_map": [
          {
            "text_embedding": "predictions[0].embeddings.values"
          }
        ]
      }
    }
  ]
}
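
The pipeline can be dry-run with the simulate API before indexing anything (optional sanity check):

POST /_ingest/pipeline/vertexai_embedding_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "text_content": "The quick brown fox jumps over the lazy dog."
      }
    }
  ]
}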

Add data into the index

PUT /my_documents_index/_doc/1?pipeline=vertexai_embedding_pipeline
{
  "text_content": "The quick brown fox jumps over the lazy dog."
}
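
Fetching the document back confirms the pipeline populated the text_embedding field:

GET /my_documents_index/_doc/1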

Register LLM model

POST /_plugins/_ml/models/_register
{
  "name": "vertexai-gemini-2.5-flash-lite-chat",
  "function_name": "remote",
  "description": "Vertex AI Gemini 2.5 Flash Lite via streamGenerateContent",
  "connector": {
    "name": "vertexai-gemini-connector-v2",
    "description": "Connector to Vertex AI Gemini 2.5 Flash Lite",
    "version": 1,
    "protocol": "http",
    "parameters": {
      "project": "quiet-vector-466915-r8",
      "model_id": "gemini-2.5-flash-lite",
      "region": "global"
    },
    "credential": {
      "vertexAI_token": "ya29.a0AS3H6NwVkkJWEwikLeOCjVYzBpK6MYe_3R6vf88nnvjrJMcsJN0zkMKO8BzIgxVYWh1_wukgLiIAo-5oDDoCYBoy_6ywDlq_lH03t3ADPaolsXB6YgtvtkK55tfx9hM7s9fOI5ZR1pE7f2iOPfj-Gc4dPI6aWppO4_euUrPK32ucAwaCgYKATQSARYSFQHGX2Mi8jkdCIxBqa-KT_q15IOGgw0181"
    },
    "input_mapping": {
      "prompt": "$.parameters.input"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://aiplatform.googleapis.com/v1/projects/${parameters.project}/locations/${parameters.region}/publishers/google/models/${parameters.model_id}:generateContent",
        "headers": {
          "Authorization": "Bearer ${credential.vertexAI_token}",
          "Content-Type": "application/json"
        },
        "request_body": """{
          "contents": [
            {
              "role": "user",
              "parts": [
                { "text": "${parameters.prompt}" }
              ]
            }
          ],
          "generationConfig": {
            "temperature": 1,
            "maxOutputTokens": 65535,
            "topP": 0.95,
            "thinkingConfig": {
              "thinkingBudget": 0
            }
          },
          "safetySettings": [
            {
              "category": "HARM_CATEGORY_HATE_SPEECH",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_HARASSMENT",
              "threshold": "OFF"
            }
          ]
        }""",
    
        "post_process_function": "return params.candidates != null && params.candidates.length > 0 && params.candidates[0].content.parts != null && params.candidates[0].content.parts.length > 0 && params.candidates[0].content.parts[0].text != null ? params.candidates[0].content.parts[0].text : null;"
      }
    ]
  }
}

Deploy LLM model

POST /_plugins/_ml/models/ngagQJgBS0JZCYs7zKUo/_deploy
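
I then called _predict on the LLM directly; this is the call that produced the response shown under Relevant Logs below (the exact prompt is my reconstruction from that response):

POST /_plugins/_ml/models/ngagQJgBS0JZCYs7zKUo/_predict
{
  "parameters": {
    "prompt": "What is the capital of France?"
  }
}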

Register agent

POST _plugins/_ml/agents/_register
{
  "name": "Chat Agent with Gemini",
  "type": "conversational",
  "description": "this is a test agent",
  "app_type": "os_chat",
  "llm": {
    "model_id": "QCPfQpgBw3GTzPbPu0LX",
    "parameters": {
      "max_iteration": 5,
      "response_filter": "$",
      "message_history_limit": 5,
      "disable_trace": false
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "PPLTool",
      "parameters": {
        "model_id": "QCPfQpgBw3GTzPbPu0LX",
        "model_type": "CLAUDE",
        "execute": true
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VisualizationTool",
      "parameters": {
        "index": ".kibana"
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VectorDBTool",
      "name": "population_data_knowledge_base",
      "description": "This tool provide population data of US cities.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_population_data",
        "source_field": [
          "population_description"
        ],
        "model_id": "QyPhQpgBw3GTzPbPzELR",
        "embedding_field": "population_description_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "VectorDBTool",
      "name": "stock_price_data_knowledge_base",
      "description": "This tool provide stock price data.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_stock_price_data",
        "source_field": [
          "stock_price_history"
        ],
        "model_id": "QyPhQpgBw3GTzPbPzELR",
        "embedding_field": "stock_price_history_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "CatIndexTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size). \nIt takes 2 optional arguments named `index` which is a comma-delimited list of one or more indices to get information from (default is an empty list meaning all indices), and `local` which means whether to return information from the local node only instead of the cluster manager node (default is false)."
    },
    {
      "type": "SearchAnomalyDetectorsTool"
    },
    {
      "type": "SearchAnomalyResultsTool"
    },
    {
      "type": "SearchMonitorsTool"
    },
    {
      "type": "SearchAlertsTool"
    }
  ]
}
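
The response contains the agent_id used in the next two steps:

{
  "agent_id": "qAalQJgBS0JZCYs7F6WZ"
}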

Set it as root agent using id

PUT .plugins-ml-config/_doc/os_chat
{
    "type":"os_chat_root_agent",
    "configuration":{
        "agent_id": "qAalQJgBS0JZCYs7F6WZ"
    }
}
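
The config document can be read back to confirm the root agent is set:

GET .plugins-ml-config/_doc/os_chat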

Test agent using _execute (failed)

POST /_plugins/_ml/agents/qAalQJgBS0JZCYs7F6WZ/_execute
{
  "parameters": {
    "question": "What is the quick brown fox?"
  }
}

Relevant Logs or Screenshots:

Response from the LLM model's _predict endpoint (the agent _execute call itself returns the IllegalStateException shown at the top, while this direct call succeeds):

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "candidates": [
              {
                "content": {
                  "role": "model",
                  "parts": [
                    {
                      "text": "The capital of France is **Paris**."
                    }
                  ]
                },
                "finishReason": "STOP",
                "avgLogprobs": -0.009188520722091198
              }
            ],
            "usageMetadata": {
              "promptTokenCount": 7,
              "candidatesTokenCount": 8,
              "totalTokenCount": 15,
              "trafficType": "ON_DEMAND",
              "promptTokensDetails": [
                {
                  "modality": "TEXT",
                  "tokenCount": 7
                }
              ],
              "candidatesTokensDetails": [
                {
                  "modality": "TEXT",
                  "tokenCount": 8
                }
              ]
            },
            "modelVersion": "gemini-2.5-flash-lite",
            "createTime": "2025-07-25T14:13:26.009420Z",
            "responseId": "BpGDaMxJwYfowQ-NoZK5Dw"
          }
        }
      ],
      "status_code": 200
    }
  ]
}

The problem always seems to be with the response_filter (set in the agent's llm parameters above): no matter how many times I modify it, the output remains the same.
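
For example, given the _predict response above, the filter I would expect to work points at the text leaf (a sketch of the llm block only; the rest of the agent registration is unchanged), yet the agent still fails with the same error:

"llm": {
  "model_id": "QCPfQpgBw3GTzPbPu0LX",
  "parameters": {
    "max_iteration": 5,
    "response_filter": "$.candidates[0].content.parts[0].text",
    "message_history_limit": 5,
    "disable_trace": false
  }
}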