Connector for vertex ai models in conversational search

Hello All,

I am trying to create a connector Google Vertex AI models like chat-bison and text-bison. There are connector blueprints for OpenAI, Amazon BedRock etc. Can I invoke Google Vertex AI models using the connector? I tried to create one but the conversational search fails with the following error.

“Error from remote service: {\n "error": {\n "code": 400,\n "message": "1 instance(s) is allowed per prediction. Actual: 6",\n "status": "INVALID_ARGUMENT"\n }\n}\n”

The connector payload is given below:

{
    "name": "Vertex AI Chat Connector",
    "description": "The connector to Google Vertex AI",
    "version": 2,
    "protocol": "http",
    "parameters": {
        "endpoint": "<ENDPOINT>",
        "project": "<PROJECT>",
        "location" : "<LOCATION>",
        "model": "text-bison@002",
        "temperature": 0.2
    },
    "credential": {
        "VertexAI_Key": "<VERTEX_AI_KEY>"
    },
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://${parameters.endpoint}/v1/projects/${parameters.project}/locations/${parameters.location}/publishers/google/models/${parameters.model}:predict",
            "headers": {
                "Authorization": "Bearer ${credential.VertexAI_Key}"
            },
            "request_body": "{\"instances\":${parameters.messages},\"parameters\":{\"temperature\":${parameters.temperature},\"maxOutputTokens\":256,\"topK\":40,\"topP\":0.95}}"
        }
    ]
}

Any help would be highly appreciated.

Thanks.

The error message says that you passed 6 messages (instances), but you can only pass 1 at a time. How are you testing your connector? Using the CLI (curl)?

I am using conversational search feature of OpenSearch. Followed the below steps:

  1. Created a connector to Google Vertex AI PaLM 2 for text foundation model
  2. Registered a new model in OpenSearch with the connector id
  3. Deployed the model in OpenSearch
  4. Created a new search pipeline with response processor retrieval_augmented_generation
  5. Performed a conversational search by specifying the search pipeline created above with the below ext object
"ext": {
        "generative_qa_parameters": {
            "llm_model": "text-bison@002",
            "llm_question": "which modules were loaded?",
            "context_size": 1,
            "timeout": 30
        }
    }

I explicitly specified the context size = 1 so only one document is sent to the model. The conversational search API returns the error below.

“Error from remote service: {\n "error": {\n "code": 400,\n "message": "1 instance(s) is allowed per prediction. Actual: 6",\n "status": "INVALID_ARGUMENT"\n }\n}\n”

Can you share your search payload in details?