Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.9
Describe the issue:
I am trying to use a custom connector so that a model running in SageMaker can do the vector encoding. I can't find documentation anywhere on what the inference endpoint needs to return to OpenSearch, either when creating a vector for a search term or when creating vectors via a pipeline during ingest.
I have had the encoder return a plain list, which fails, and I have also tried a JSON object containing the vector, which fails as well. I am not sure what exactly needs to be returned from the model, and what (if anything) needs to be done at the connector level.
Configuration:
Relevant Logs or Screenshots:
My inference code is:

```python
import json

from sentence_transformers import SentenceTransformer


def model_fn(model_dir):
    # Load model from HuggingFace Hub
    model = SentenceTransformer(model_dir)
    return model


def predict_fn(data, model):
    # Tokenize sentences
    print(data)
    input_texts = data.pop("inputs", data)
    embeddings_sentence_transformer = model.encode(input_texts, normalize_embeddings=True)
    return {"vectors": json.dumps(embeddings_sentence_transformer.tolist())}
```
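One thing I suspect (this is my own guess, not anything I found documented): `json.dumps` on the list means the `"vectors"` value in the endpoint response is a single JSON *string*, not a JSON array of numbers, so whatever parses the connector response may be seeing the wrong token type. A toy sketch of the difference, with a hard-coded list standing in for the real `model.encode(...)` output:

```python
import json

# Toy stand-in for model.encode(...).tolist(): 2 sentences x 3 dims.
embeddings_list = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# What my predict_fn returns today: the vectors are double-encoded,
# so "vectors" serializes as one JSON string.
double_encoded = {"vectors": json.dumps(embeddings_list)}

# The variant I could try instead: plain nested lists of floats,
# which serialize as a JSON array of float arrays.
plain = {"vectors": embeddings_list}

print(type(double_encoded["vectors"]))  # <class 'str'>
print(type(plain["vectors"]))           # <class 'list'>
print(json.dumps(plain))
```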
I am trying to call the model via

```
POST /_plugins/_ml/models/gwq5Go4BcDeh12u6M7Fa/_predict
{
  "parameters": {"inputs": "hello world"},
  "return_number": true,
  "target_response": ["sentence_embedding"]
}
```
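If I understand the remote-model `_predict` API correctly (an assumption on my part, since my connector's `request_body` only references `${parameters.inputs}`), everything the connector substitutes should sit under `"parameters"`, so one variant I could also try is the minimal body:

```
POST /_plugins/_ml/models/gwq5Go4BcDeh12u6M7Fa/_predict
{
  "parameters": {
    "inputs": "hello world"
  }
}
```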
and get:

```
Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_NUMBER]
```
my connector code is:

```json
{
  "name": "multilingual-e5-large",
  "description": "multilingual-e5-large",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "roleArn": "arn:aws:iam::xxxxx:role/opensearch-sagemaker-role"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/xxxxxxx/invocations",
      "headers": {
        "content-type": "application/json"
      },
      "post_process_function": "return params['vectors']",
      "request_body": "{\"inputs\":\"${parameters.inputs}\"}"
    }
  ]
}
```
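For reference, the direction I was considering next (I have seen `connector.pre_process.default.embedding` / `connector.post_process.default.embedding` mentioned for SageMaker blueprints, but I have not confirmed they are available in 2.9): have the endpoint accept a JSON array of strings and return a raw JSON array of float arrays, and let the built-in pre/post-processors do the mapping instead of my custom `post_process_function`:

```json
"actions": [
  {
    "action_type": "predict",
    "method": "POST",
    "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/xxxxxxx/invocations",
    "headers": {
      "content-type": "application/json"
    },
    "request_body": "${parameters.input}",
    "pre_process_function": "connector.pre_process.default.embedding",
    "post_process_function": "connector.post_process.default.embedding"
  }
]
```

If that is the expected contract, then my `predict_fn` would presumably need to return the nested list directly rather than a `json.dumps` string, but that is exactly the part I cannot find documented.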