[Issue] NullPointerException when executing agent tool with remote LLM connector on OCI OpenSearch 2.18

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

  • Cluster running on: Oracle Cloud (OCI)
  • OpenSearch: 2.18
  • Tested with: OpenSearch Dashboards (Dev Tools)
  • Browser: Edge

Describe the issue:

When testing OpenSearch ML agents on Oracle Cloud (OCI) with OpenSearch version 2.18, executing the TransferQuestionToPPLAndExecuteTool agent tool results in a 500 error with a NullPointerException. The error message is:

{"status": 500, "error": { "type": "NullPointerException", "reason": "System Error", "details": "Cannot invoke \"java.lang.CharSequence.length()\" because \"this.text\" is null" }}

Configuration:

To Reproduce

  1. Configure the cluster and ML Commons plugin with the following persistent settings:
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable":"true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}
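
As an optional sanity check, the persistent settings can be read back to confirm they were applied (not required for the reproduction):

GET _cluster/settings?flat_settings=true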
  2. Register a model group, a remote connector, and a model using the API calls below.
POST /_plugins/_ml/model_groups/_register
{
  "name": "leeuw_model_group",
  "description": "Leeuw model group"
}
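
The register call returns a model_group_id that is referenced in the model registration later (the ID shown here is illustrative; use the value from your own response):

{
  "model_group_id": "B7jC6JYBnWn4H7zgW6Yh",
  "status": "CREATED"
}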

Create a remote connector

POST _plugins/_ml/connectors/_create
{
   "name": "llam_connector",
   "description": "llam_connector",
   "version": 2,
   "protocol": "oci_sigv1",
   "parameters": {
       "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
       "auth_type": "resource_principal"
   },
   "credential": {
   },
   "actions": [
     {
       "action_type": "predict",
       "method": "POST",
       "url": "https://${parameters.endpoint}/20231130/actions/chat",
       "request_body": "{\"compartmentId\":\"ocid1.compartment.oc1..aaaaaaaay47stvmccb7qfj6eh354divek7muoj3qne2xggh3gowf3z42m2ua\",\"servingMode\":{\"modelId\":\"meta.llama-3.1-70b-instruct\",\"servingType\":\"ON_DEMAND\"},\"chatRequest\":{\"maxTokens\":600,\"temperature\":1,\"frequencyPenalty\":0,\"presencePenalty\":0,\"topP\":0.75,\"topK\":-1,\"isStream\":false,\"apiFormat\":\"GENERIC\",\"messages\":[{\"role\":\"USER\",\"content\":[{\"type\":\"TEXT\",\"text\":\"${parameters.prompt}\"}]}]}}",
       
        "post_process_function": "def text = params['chatResponse']['choices'][0]['message']['content'][0]['text'].replace('\n', '\\\\n').replace('\"','');\n return '{\"name\":\"response\",\"dataAsMap\":{\"inferenceResponse\":{\"generatedTexts\":[{\"text\":\"' + text + '\"}]}}}'"
     }
   ]
 }
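
The response contains a connector_id, which is then referenced when registering the model (illustrative value shown):

{
  "connector_id": "Qbh_6ZYBnWn4H7zgBaaX"
}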

Register the model

POST /_plugins/_ml/models/_register
{
  "name": "llam_model",
  "function_name": "remote",
  "model_group_id": "B7jC6JYBnWn4H7zgW6Yh",
  "description": "llama genai model",
  "connector_id": "Qbh_6ZYBnWn4H7zgBaaX"
}
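
Registering a remote model is asynchronous and returns a task_id; the resulting model_id can be picked up from the task once it completes (the task ID below is a placeholder):

GET /_plugins/_ml/tasks/<task_id>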
  3. Deploy the model, run a quick predict test, and register an agent with the LLM and tool configuration shown below.
POST /_plugins/_ml/models/Q7h_6ZYBnWn4H7zgIabh/_deploy
POST /_plugins/_ml/models/CrjC6JYBnWn4H7zg_Kai/_predict
{
  "parameters": {
    "prompt": "how are you?"
  }
}
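
Before moving on, it is worth confirming via the Get Model API that the model actually reached the deployed state (a diagnostic check; the response should show "model_state": "DEPLOYED"):

GET /_plugins/_ml/models/Q7h_6ZYBnWn4H7zgIabh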

Register the agent

POST _plugins/_ml/agents/_register
{
  "name": "Chat Agent with Llama",
  "type": "flow",
  "description": "this is a test agent",
  "app_type": "os_chat",
  "llm": {
    "model_id": "CrjC6JYBnWn4H7zg_Kai",
    "parameters": {
      "max_iteration": 5,
      "message_history_limit": 10,
      "disable_trace": false
    }
  },
  "memory": {
    "type": "demo"
  },
  "tools": [
    {
      "type": "PPLTool",
      "name": "TransferQuestionToPPLAndExecuteTool",
      "description": "Use this tool to transfer natural language to generate PPL and execute PPL to query inside. Use this tool after you know the index name, otherwise, call IndexRoutingTool first. The input parameters are: {index:IndexName, question:UserQuestion}.",
      "parameters": {
        "model_id": "CrjC6JYBnWn4H7zg_Kai",
        "model_type": "FINETUNE",
        "execute": true
      }
    },
    {
      "type": "CatIndexTool",
      "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid,primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size). \nIt  takes 2 optional arguments named `index` which is a comma-delimited list of one or more indexes to  get information from (default is an empty list meaning all indexes), and `local` which means whether to return information from the local node only instead of the cluster manager node (default is false)."
    }
  ]
}
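
The register call returns the agent_id used in the execute request below (illustrative value shown):

{
  "agent_id": "H7hh6ZYBnWn4H7zgwKaZ"
}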
  4. Run the following predict and agent execution requests:
POST /_plugins/_ml/models/CrjC6JYBnWn4H7zg_Kai/_predict
{
  "parameters": {
    "prompt": "how are you?"
  }
}

POST /_plugins/_ml/agents/H7hh6ZYBnWn4H7zgwKaZ/_execute
{
  "parameters": {
    "index"   : "opensearch_dashboards_sample_data_flights",
    "question": "how many flights are there?"
  }
}
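
When the execute call fails, it can also help to dump the stored agent definition with the Get Agent API to verify the tool and model wiring (same agent ID as above):

GET /_plugins/_ml/agents/H7hh6ZYBnWn4H7zgwKaZ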

Relevant Logs or Screenshots:

I am seeing a similar issue while executing an agent. What I found is that the ML Commons plugin's memory indexes (.plugins-ml-memory-meta and .plugins-ml-memory-messages) appear to be corrupted and are not accessible to the agent, hence the failure.

[2025-06-20T08:53:07,302][DEBUG][o.o.m.t.MLTaskRunner     ] [ip-10-37-4-61] Execute ML request 32de7f34-eede-4b26-a4b6-045d672d3bae locally on node xpuzY8k1RimsoxphGsYRiQ
[2025-06-20T08:53:07,304][DEBUG][o.o.m.e.a.a.MLAgentExecutor] [ip-10-37-4-61] Completed Get Agent Request, Agent id:ZKxtjJcBI4w65yUkV-_b
[2025-06-20T08:53:07,311][ERROR][o.o.m.e.a.a.MLAgentExecutor] [ip-10-37-4-61] Failed to read conversation memory
java.lang.NullPointerException: null
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903) ~[?:?]
        at org.opensearch.ml.engine.memory.MLMemoryManager.createInteraction(MLMemoryManager.java:110) ~[?:?]
        at org.opensearch.ml.engine.memory.ConversationIndexMemory.save(ConversationIndexMemory.java:116) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.MLAgentExecutor.saveRootInteractionAndExecute(MLAgentExecutor.java:300) ~[?:?]
        at org.opensearch.ml.engine.algorithms.agent.MLAgentExecutor.lambda$execute$2(MLAgentExecutor.java:234) ~[?:?]

Can we delete these indexes to reset the memory? Are there other ways to reset memory for agents?
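
For context, this is roughly what I was considering trying, assuming the standard ML Commons Memory APIs are available on this version (the memory ID is a placeholder, and I have not verified that touching the hidden indexes directly is safe):

# List the hidden memory indexes
GET _cat/indices/.plugins-ml-memory*?v&expand_wildcards=all

# Inspect and delete individual memories through the Memory APIs
GET /_plugins/_ml/memory
DELETE /_plugins/_ml/memory/<memory_id>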

PS: As an observation, it does not matter which agent is used; even after creating new agents, there still seems to be an issue with the memory access process.

Could you please cut a GitHub issue?

Please provide as many details as possible so that we can reproduce the issue. Also, have you tried newer cluster versions, like 2.19 or 3.0?

I am using OpenSearch 2.19.1.