ML _predict endpoint keeps timing out

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.16

Describe the issue:
I have a FastAPI service in front of a self-hosted LLM (Ollama). When I query the endpoint through the ML plugin, I keep getting a timeout. I have tried increasing the values in the connector's client settings, but that does not seem to have any effect.
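For context, the FastAPI service behind the connector looks roughly like this (a simplified sketch; the Ollama URL and model name are placeholders for my actual setup, but the `/llm` route and the `input` field match the connector configuration below):

```python
# Minimal sketch of the FastAPI service the connector calls.
# OLLAMA_URL and MODEL_NAME are placeholders, not my exact values.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"  # default Ollama endpoint
MODEL_NAME = "llama3"                                # placeholder model name


class LLMRequest(BaseModel):
    input: str  # matches the "input" field in the connector's request_body


@app.post("/llm")
async def llm(req: LLMRequest):
    # Forward the prompt to the local Ollama instance; generation can take
    # well over a minute for longer prompts.
    async with httpx.AsyncClient(timeout=None) as client:
        resp = await client.post(
            OLLAMA_URL,
            json={"model": MODEL_NAME, "prompt": req.input, "stream": False},
        )
        resp.raise_for_status()
        return {"output": resp.json().get("response", "")}
```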

Configuration:
My current connector configuration is:

POST /_plugins/_ml/connectors/_create
{
  "name": "FastAPI Assistant",
  "description": "Connects to local FastAPI instance for data analysis",
  "version": 1,
  "protocol": "http",
  "parameters": {
    "host": "127.0.0.1",
    "port": "8000"
  },
  "client_config": {
    "max_connection": 3,
    "connection_timeout": 1600,
    "read_timeout": 1800,
    "retry_backoff_millis": 1800000,
    "retry_timeout_seconds": 1800
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "http://${parameters.host}:${parameters.port}/llm",
      "headers": {
        "Content-Type": "application/json"
      },
      "request_body": "{ \"input\": \"${parameters.input}\" }"
    }
  ]
}

Relevant Logs or Screenshots:
When I run queries such as the one below, I keep getting a timeout error after about 60 seconds:

{
  "error": {
    "root_cause": [
      {
        "type": "status_exception",
        "reason": "Error communicating with remote model: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.\nConsider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.\nIncreasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.\nIf the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests."
      }
    ],

Here is a sample query I run:

POST /_plugins/_ml/models/0TA-xJYBWWVe7dZYCvWz/_predict
{
  "parameters": {
    "input": "hello"
  }
}
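For comparison, I can time the FastAPI endpoint directly (outside of OpenSearch) with something like this sketch, which uses the same host, port, and payload shape as the connector:

```python
# Quick check against the FastAPI endpoint directly, bypassing OpenSearch,
# to see how long the model itself takes to answer.
import time
import requests

start = time.time()
resp = requests.post(
    "http://127.0.0.1:8000/llm",   # same host/port as in the connector parameters
    json={"input": "hello"},       # same payload shape as request_body
    timeout=600,                   # generous client-side timeout for the test
)
print(resp.status_code, resp.json())
print(f"elapsed: {time.time() - start:.1f}s")
```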