IllegalStateException when executing RAG agent

Versions (relevant): OpenSearch 2.19.0, Dashboards 2.19.0, Server OS: Windows, Browser: Chrome

Describe the issue:
I’m trying to integrate Vertex AI’s Gemini 2.5 Flash Lite model with OpenSearch’s ML plugin and use it with the Assistant Toolkit (for chat). I’ve successfully registered the model and it returns responses, but the response_filter doesn’t seem to apply when using function_name: "remote". The full response is returned instead of just the expected string in dataAsMap.response, causing agent/RAG flows to fail with a JSON parsing error.

Has anyone faced this or found a clean workaround?

{
  "status": 500,
  "error": {
    "type": "IllegalStateException",
    "reason": "System Error",
    "details": "Expected BEGIN_ARRAY but was STRING at line 1 column 1 path $\nSee https://github.com/google/gson/blob/main/Troubleshooting.md#unexpected-json-structure"
  }
}

Configuration:
The following are the steps I used to deploy my ML models using Google Vertex AI:

Whitelisting URLs

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://.*-aiplatform\\.googleapis\\.com/.*$",
      "^https://.*\\.googleapis\\.com/.*$"
    ]
  }
}
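
(Optional sanity check: the setting can be read back afterwards to confirm it took effect.)

GET /_cluster/settings?flat_settings=true&filter_path=persistent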

Register Model group

POST /_plugins/_ml/model_groups/_register
{
    "name": "remote_model_group_gcp",
    "description": "This is an example group for embedding models"
}
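
The response contains the model_group_id that the model registration below references:

{
  "model_group_id": "8RArPZgBfc6G9xlsMOnf",
  "status": "CREATED"
}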

Register Text embedding model

POST /_plugins/_ml/models/_register
{
    "name": "vertexAI: model to generate embeddings",
    "function_name": "remote",
    "model_group_id": "8RArPZgBfc6G9xlsMOnf",
    "description": "test vertexAI model",
    "connector": {
        "name": "VertexAI Connector",
        "description": "The connector to public vertexAI model service for text embedding",
        "version": 1,
        "protocol": "http",
        "parameters": {
            "project": "id",
            "model_id": "text-embedding-004"
        },
        "credential": {
            "vertexAI_token": "TOKEN"
        },
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://us-central1-aiplatform.googleapis.com/v1/projects/${parameters.project}/locations/us-central1/publishers/google/models/${parameters.model_id}:predict",
                "headers": {
                    "Authorization": "Bearer ${credential.vertexAI_token}"
                },
                "request_body": "{\"instances\": [{ \"content\": \"${parameters.prompt}\"}]}"
            }
        ]
    }
}
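
Registering the model returns the model_id used by the deploy and predict calls that follow (abridged; the full response also includes a task_id):

{
  "task_id": "…",
  "status": "CREATED",
  "model_id": "mgabQJgBS0JZCYs7gaWy"
}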

Deploy text embedding model

POST /_plugins/_ml/models/mgabQJgBS0JZCYs7gaWy/_deploy

Test text embedding model (successful)

POST /_plugins/_ml/models/mgabQJgBS0JZCYs7gaWy/_predict
{
  "parameters": {
    "prompt": "Hello World form vertex AI!"
  }
}
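
For reference, a successful embedding response is shaped roughly like this (heavily truncated; the shape follows the Vertex AI text-embedding response, wrapped by ML Commons the same way as the LLM response shown under Relevant Logs). The ingest pipeline's output_map below depends on the predictions[0].embeddings.values path inside dataAsMap:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "predictions": [
              {
                "embeddings": {
                  "values": ["… 768 floats …"]
                }
              }
            ]
          }
        }
      ],
      "status_code": 200
    }
  ]
}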

Create an index

PUT /my_documents_index
{
  "settings": {
    "index.knn": true
  },
  "mappings": {
    "properties": {
      "text_content": {
        "type": "text"
      },
      "text_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "space_type": "l2",
          "engine": "nmslib"
        }
      }
    }
  }
}

Create ingest pipeline based on text embedding

PUT /_ingest/pipeline/vertexai_embedding_pipeline
{
  "description": "An ingest pipeline to generate Vertex AI embeddings for text_content",
  "processors": [
    {
      "ml_inference": {
        "model_id": "mgabQJgBS0JZCYs7gaWy",
        "input_map": [
          {
            "prompt": "text_content" 
          }
        ],
        "output_map": [
          {
            "text_embedding": "predictions[0].embeddings.values"
          }
        ]
      }
    }
  ]
}
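
The pipeline can be dry-run with the simulate API before indexing anything (optional sanity check):

POST /_ingest/pipeline/vertexai_embedding_pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "text_content": "The quick brown fox jumps over the lazy dog."
      }
    }
  ]
}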

Add data into the index

PUT /my_documents_index/_doc/1?pipeline=vertexai_embedding_pipeline
{
  "text_content": "The quick brown fox jumps over the lazy dog."
}
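
Fetching the document back confirms the pipeline populated the text_embedding field:

GET /my_documents_index/_doc/1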

Register LLM model

POST /_plugins/_ml/models/_register
{
  "name": "vertexai-gemini-2.5-flash-lite-chat",
  "function_name": "remote",
  "description": "Vertex AI Gemini 2.5 Flash Lite via streamGenerateContent",
  "connector": {
    "name": "vertexai-gemini-connector-v2",
    "description": "Connector to Vertex AI Gemini 2.5 Flash Lite",
    "version": 1,
    "protocol": "http",
    "parameters": {
      "project": "quiet-vector-466915-r8",
      "model_id": "gemini-2.5-flash-lite",
      "region": "global"
    },
    "credential": {
      "vertexAI_token": "ya29.a0AS3H6NwVkkJWEwikLeOCjVYzBpK6MYe_3R6vf88nnvjrJMcsJN0zkMKO8BzIgxVYWh1_wukgLiIAo-5oDDoCYBoy_6ywDlq_lH03t3ADPaolsXB6YgtvtkK55tfx9hM7s9fOI5ZR1pE7f2iOPfj-Gc4dPI6aWppO4_euUrPK32ucAwaCgYKATQSARYSFQHGX2Mi8jkdCIxBqa-KT_q15IOGgw0181"
    },
    "input_mapping": {
      "prompt": "$.parameters.input"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://aiplatform.googleapis.com/v1/projects/${parameters.project}/locations/${parameters.region}/publishers/google/models/${parameters.model_id}:generateContent",
        "headers": {
          "Authorization": "Bearer ${credential.vertexAI_token}",
          "Content-Type": "application/json"
        },
        "request_body": """{
          "contents": [
            {
              "role": "user",
              "parts": [
                { "text": "${parameters.prompt}" }
              ]
            }
          ],
          "generationConfig": {
            "temperature": 1,
            "maxOutputTokens": 65535,
            "topP": 0.95,
            "thinkingConfig": {
              "thinkingBudget": 0
            }
          },
          "safetySettings": [
            {
              "category": "HARM_CATEGORY_HATE_SPEECH",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
              "threshold": "OFF"
            },
            {
              "category": "HARM_CATEGORY_HARASSMENT",
              "threshold": "OFF"
            }
          ]
        }""",
    
        "post_process_function": "return params.candidates != null && params.candidates.length > 0 && params.candidates[0].content.parts != null && params.candidates[0].content.parts.length > 0 && params.candidates[0].content.parts[0].text != null ? params.candidates[0].content.parts[0].text : null;"
      }
    ]
  }
}

Deploy LLM model

POST /_plugins/_ml/models/ngagQJgBS0JZCYs7zKUo/_deploy
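
I then called _predict on the LLM directly; this is the call that produced the response shown under Relevant Logs below (the exact prompt is my reconstruction from that response):

POST /_plugins/_ml/models/ngagQJgBS0JZCYs7zKUo/_predict
{
  "parameters": {
    "prompt": "What is the capital of France?"
  }
}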

Register agent

POST _plugins/_ml/agents/_register
{
  "name": "Chat Agent with Gemini",
  "type": "conversational",
  "description": "this is a test agent",
  "app_type": "os_chat",
  "llm": {
    "model_id": "QCPfQpgBw3GTzPbPu0LX",
    "parameters": {
      "max_iteration": 5,
      "response_filter": "$",
      "message_history_limit": 5,
      "disable_trace": false
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
    {
      "type": "PPLTool",
      "parameters": {
        "model_id": "QCPfQpgBw3GTzPbPu0LX",
        "model_type": "CLAUDE",
        "execute": true
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VisualizationTool",
      "parameters": {
        "index": ".kibana"
      },
      "include_output_in_agent_response": true
    },
    {
      "type": "VectorDBTool",
      "name": "population_data_knowledge_base",
      "description": "This tool provide population data of US cities.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_population_data",
        "source_field": [
          "population_description"
        ],
        "model_id": "QyPhQpgBw3GTzPbPzELR",
        "embedding_field": "population_description_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "VectorDBTool",
      "name": "stock_price_data_knowledge_base",
      "description": "This tool provide stock price data.",
      "parameters": {
        "input": "${parameters.question}",
        "index": "test_stock_price_data",
        "source_field": [
          "stock_price_history"
        ],
        "model_id": "QyPhQpgBw3GTzPbPzELR",
        "embedding_field": "stock_price_history_embedding",
        "doc_size": 3
      }
    },
    {
      "type": "CatIndexTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size). \nIt takes 2 optional arguments named `index` which is a comma-delimited list of one or more indices to get information from (default is an empty list meaning all indices), and `local` which means whether to return information from the local node only instead of the cluster manager node (default is false)."
    },
    {
      "type": "SearchAnomalyDetectorsTool"
    },
    {
      "type": "SearchAnomalyResultsTool"
    },
    {
      "type": "SearchMonitorsTool"
    },
    {
      "type": "SearchAlertsTool"
    }
  ]
}
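
The response contains the agent_id used in the next two steps:

{
  "agent_id": "qAalQJgBS0JZCYs7F6WZ"
}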

Set it as root agent using id

PUT .plugins-ml-config/_doc/os_chat
{
    "type":"os_chat_root_agent",
    "configuration":{
        "agent_id": "qAalQJgBS0JZCYs7F6WZ"
    }
}
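
The config document can be read back to confirm the root agent is set:

GET .plugins-ml-config/_doc/os_chat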

Test agent using _execute (failed)

POST /_plugins/_ml/agents/qAalQJgBS0JZCYs7F6WZ/_execute
{
  "parameters": {
    "question": "What is the quick brown fox?"
  }
}

Relevant Logs or Screenshots:

Response from the LLM model's _predict endpoint (the agent _execute call itself returns the IllegalStateException shown at the top, while this direct call succeeds):

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "candidates": [
              {
                "content": {
                  "role": "model",
                  "parts": [
                    {
                      "text": "The capital of France is **Paris**."
                    }
                  ]
                },
                "finishReason": "STOP",
                "avgLogprobs": -0.009188520722091198
              }
            ],
            "usageMetadata": {
              "promptTokenCount": 7,
              "candidatesTokenCount": 8,
              "totalTokenCount": 15,
              "trafficType": "ON_DEMAND",
              "promptTokensDetails": [
                {
                  "modality": "TEXT",
                  "tokenCount": 7
                }
              ],
              "candidatesTokensDetails": [
                {
                  "modality": "TEXT",
                  "tokenCount": 8
                }
              ]
            },
            "modelVersion": "gemini-2.5-flash-lite",
            "createTime": "2025-07-25T14:13:26.009420Z",
            "responseId": "BpGDaMxJwYfowQ-NoZK5Dw"
          }
        }
      ],
      "status_code": 200
    }
  ]
}

The problem always seems to be with the response_filter (set in the agent's llm parameters above): no matter how many times I modify it, the output remains the same.
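
For example, given the _predict response above, the filter I would expect to work points at the text leaf (a sketch of the llm block only; the rest of the agent registration is unchanged), yet the agent still fails with the same error:

"llm": {
  "model_id": "QCPfQpgBw3GTzPbPu0LX",
  "parameters": {
    "max_iteration": 5,
    "response_filter": "$.candidates[0].content.parts[0].text",
    "message_history_limit": 5,
    "disable_trace": false
  }
}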