What does a model have to return from SageMaker for OpenSearch to be able to use it?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.9

Describe the issue:
I am trying to use a custom connector to have a model running in SageMaker do vector encoding. I can't find documented anywhere what the inference endpoint needs to return to OpenSearch, either for creating a vector from a search term or for creating vectors via a pipeline when ingesting data.

I have had the encoder return a list, which fails, and I have also tried a JSON array that holds the vector, which also fails. I am not sure exactly what needs to be returned from the model, or what needs to be done at the connector level.

Configuration:

Relevant Logs or Screenshots:
My inference code is:

import json
from sentence_transformers import SentenceTransformer

def model_fn(model_dir):
    # Load model from HuggingFace Hub
    model = SentenceTransformer(model_dir)
    return model

def predict_fn(data, model):
    # Tokenize sentences
    print(data)
    
    input_texts = data.pop("inputs", data)
    embeddings_sentence_transformer = model.encode(input_texts, normalize_embeddings=True)

    return {"vectors": json.dumps(embeddings_sentence_transformer.tolist())}

I am trying to call the model via

POST /_plugins/_ml/models/gwq5Go4BcDeh12u6M7Fa/_predict
{
  "parameters": {"inputs": "hello world"},
  "return_number": true,
  "target_response": ["sentence_embedding"]
}

and get:

Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_NUMBER]

my connector code is:

{
  "name": "multilingual-e5-large",
  "description": "multilingual-e5-large",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "roleArn": "arn:aws:iam::xxxxx:role/opensearch-sagemaker-role"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/xxxxxxx/invocations",
      "headers": {
        "content-type": "application/json"
      },
      "post_process_function": "return params['vectors']",
      "request_body": "{\"inputs\":\"${parameters.inputs}\"}"
    }
  ]
}

Hi @jtrollin ,

Could you please take a look at this doc and see if that solves your issue?

Thanks
Dhrubo

I think I am close, but now I get this error from opensearch

"reason": "arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object[] to the type of the destination array, java.lang.Number"

I am forcing the array to be float32 in the inference code, but no matter what I do, the connector does not see the array as the right type:

def predict_fn(data, model):
    # data is the list of input texts passed from the connector
    print(data)

    #input_texts = data.pop("inputs", data)
    #data = json.loads(input_texts)

    # Prepend the e5 "query: " prefix to each input text
    data = ["query: " + item for item in data]
    print(data)

    results = model.encode(data, normalize_embeddings=True)
    print(results)
    print(results.__class__)
    print(results.dtype)
    returnVal = results.astype('float32')
    print(returnVal)
    print(returnVal.__class__)
    print(returnVal.dtype)
    print("before return")
    return {"inference_results": returnVal}

The returnVal.dtype is float32 right before the return; any ideas what I am doing wrong?

What do we have in data and model? Sorry, I didn't quite follow your predict_fn. Are you facing any issue with the connector when getting embeddings?

data is what is passed from the connector, e.g. ["hello world", "Goodbye world"]

the model is intfloat/multilingual-e5-large from Hugging Face

The connector passes the values to the inference fine, and I get the embeddings from the model: when I print them out, they match what the doc you sent says they should look like. But when the connector (I am assuming it is the connector) gets the response from the inference, it throws the arraycopy error. I just don't know what is going on there.

This is the actual stack trace. Looking at the code, I think the issue is that the examples you showed me are for 2.12; the 2.9 code doesn't seem to expect more than one encoding coming back, and that code has changed a lot since. Let me upgrade and see what happens.

[2024-03-09T00:21:30,653][WARN ][r.suppressed             ] [0cf0d01da504393454e1bf71adc34484] path: __PATH__ params: {pretty=true, model_id=jQqVII4BcDeh12u6s7He}
java.lang.ArrayStoreException: arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object[] to the type of the destination array, java.lang.Number
	at __PATH__(ArrayList.java:400)
	at org.opensearch.ml.common.connector.MLPostProcessFunction.lambda$buildModelTensorList$0(MLPostProcessFunction.java:50)
	at __PATH__(ArrayList.java:1511)
	at org.opensearch.ml.common.connector.MLPostProcessFunction.lambda$buildModelTensorList$1(MLPostProcessFunction.java:44)
	at org.opensearch.ml.engine.utils.ScriptUtils.executeBuildInPostProcessFunction(ScriptUtils.java:29)
	at org.opensearch.ml.engine.algorithms.remote.ConnectorUtils.processOutput(ConnectorUtils.java:160)
	at org.opensearch.ml.engine.algorithms.remote.AwsConnectorExecutor.invokeRemoteModelInManagedService(AwsConnectorExecutor.java:141)
	at org.opensearch.ml.engine.algorithms.remote.RemoteConnectorExecutor.preparePayloadAndInvokeRemoteModel(RemoteConnectorExecutor.java:79)
	at org.opensearch.ml.engine.algorithms.remote.RemoteConnectorExecutor.executePredict(RemoteConnectorExecutor.java:49)
	at org.opensearch.ml.engine.algorithms.remote.RemoteModel.predict(RemoteModel.java:56)
	at org.opensearch.ml.task.MLPredictTaskRunner.lambda$predict$5(MLPredictTaskRunner.java:219)
	at org.opensearch.ml.model.MLModelManager.trackPredictDuration(MLModelManager.java:1170)
	at org.opensearch.ml.task.MLPredictTaskRunner.predict(MLPredictTaskRunner.java:219)
	at org.opensearch.ml.task.MLPredictTaskRunner.lambda$executeTask$4(MLPredictTaskRunner.java:194)
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:858)
	at __PATH__(ThreadPoolExecutor.java:1136)
	at __PATH__(ThreadPoolExecutor.java:635)
	at __PATH__(Thread.java:833)

Upgraded to 2.11 (the newest version on AWS) and got the connector and model to work; however, querying does not work, nor does using a pipeline to index a document. Both produce:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Invalid JSON in payload"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Invalid JSON in payload"
  },
  "status": 400
}

With no errors in the logs. It doesn't seem to get to the inference at all, as if the connector fails, but there is nothing in the logs as to why. Looking at the code, I don't think it ever worked until 2.12 (not sure if it even works there, as I can't deploy 2.12 to test).

Could you please write in detail how I can reproduce this issue on my end?

I am struggling with the SageMaker connector too, with some errors in common with @jtrollin's.
I have this model (dangvantuan/sentence-camembert-large on Hugging Face) hosted on an endpoint.

The model accepts the following JSON input:
{ "inputs": ["Hello world"] }

The model returns the following JSON output (a 3-dimensional array of 1024 items):
[[[[...], [...], [...]]]]

Performing a predict request works well:

POST /_plugins/_ml/models/xaXuCo8B_2CaR-HdFLr4/_predict
{
  "parameters": {
    "input": "Hello world"
  }
}

and returns this:

{
  "inference_results": [
    {
      "output": [
        {
          "name": "response",
          "dataAsMap": {
            "response": [
              [
                [
                  [...],
                  [...],
                  [...]
                ]
              ]
            ]
          }
        }
      ],
      "status_code": 200
    }
  ]
}

I have tried all the following connector configurations, without success:

  1. Configuration: No pre_process_function | No post_process_function
    Response: illegal_argument_exception → Invalid JSON in payload

  2. Configuration: No pre_process_function | post_process_function (default)
    Response: array_store_exception → arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object to the type of the destination array, java.lang.Number

  3. Configuration: No pre_process_function | post_process_function (custom)
    Response: illegal_argument_exception → Invalid JSON in payload

  4. Configuration: pre_process_function (default) | No post_process_function
    Response: illegal_argument_exception → Invalid JSON in payload

  5. Configuration: pre_process_function (custom) | No post_process_function
    Response: illegal_state_exception → failed while calling model, check error log for details

  6. Configuration: pre_process_function (custom) | post_process_function (custom)
    Response: illegal_state_exception → failed while calling model, check error log for details

  7. Configuration: pre_process_function (default) | post_process_function (default)
    Response: array_store_exception → arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object to the type of the destination array, java.lang.Number

  8. Configuration: pre_process_function (default) | post_process_function (custom)
    Response: illegal_argument_exception → Invalid JSON in payload

  9. Configuration: pre_process_function (custom) | post_process_function (default)
    Response: array_store_exception → arraycopy: element type mismatch: can not cast one of the elements of java.lang.Object to the type of the destination array, java.lang.Number

Legend

  • pre_process_function (default)
    "pre_process_function": "connector.pre_process.default.embedding"
  • post_process_function (default)
    "post_process_function": "connector.post_process.default.embedding"
  • pre_process_function (custom)
"pre_process_function": """
    StringBuilder builder = new StringBuilder();
    builder.append("\"");
    String first = params.text_docs[0];
    builder.append(first);
    builder.append("\"");
    def parameters = "{" +"\"input\":" + builder + "}";
    return  "{" +"\"parameters\":" + parameters + "}";"""
  • post_process_function (custom)
"post_process_function": """
      def name = "sentence_embedding";
      def dataType = "FLOAT32";
      if (params.inference_results == null || params.inference_results.length == 0) {
        return params.message;
      }
      def shape = [params.inference_results[0].output[0].dataAsMap.response[0][0][0][0].length];
      def json = "{" +
                 "\"name\":\"" + name + "\"," +
                 "\"data_type\":\"" + dataType + "\"," +
                 "\"shape\":" + shape + "," +
                 "\"data\":" + params.inference_results[0].output[0].dataAsMap.response[0][0][0][0] +
                 "}";
      return json;
    """

What version of OpenSearch are you using? I am not sure it supports the return being an array of arrays before 2.13, when document chunking was added.

So, for example, your return data should look like
[[xx.x, xx.x, xx.x, xx.x]], i.e. a single array of float32 values that make up a single vector.
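To make that concrete, here is a small sketch (with made-up values) of how a float32 numpy array becomes that JSON shape; `.tolist()` is what turns it into plain number arrays rather than objects the post-processor can't cast:

```python
import json
import numpy as np

# Stand-in for model.encode(...) output: a (1, 4) float32 array
embeddings = np.array([[0.25, -0.5, 0.75, 1.0]], dtype=np.float32)

# tolist() converts to nested Python lists of plain floats, which
# JSON-encode as a single array of number arrays
payload = embeddings.tolist()
body = json.dumps(payload)  # '[[0.25, -0.5, 0.75, 1.0]]'
```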

I'm on 2.11, the latest version available on AWS.
Is it possible to modify the SageMaker model output?

I would change your model to return a single vector then; that should work. My model returns:

[[0.22301740944385529, -0.247028186917305, -0.03395825996994972, 0.0799521654844284, ...]]

and my inference code returns that as a float32 and it all works.
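For reference, a minimal predict_fn along those lines might look like this (a sketch assuming the SageMaker HuggingFace inference toolkit's handler signature; the "query: " prefix is specific to e5-style models):

```python
def predict_fn(data, model):
    # e5-style models expect a task prefix on each input text
    texts = ["query: " + t for t in data]
    # encode() returns a numpy array; astype + tolist() yields plain
    # nested lists like [[0.223, -0.247, ...]] that the default
    # post-process function can consume
    embeddings = model.encode(texts, normalize_embeddings=True)
    return embeddings.astype("float32").tolist()
```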


Yes, returning [[x, x, x]] will work. You can use the sample from the blog "Deploy BGE Embedding Models via AWS Sagemaker" by Dominik Müller on Medium and change the code to return just the embeddings:

POST /_plugins/_ml/connectors/_create
{
  "name": "use-case-default-parameters",
  "description": "description",
  "version": 1,
  "protocol": "aws_sigv4",
  "credential": {
    "roleArn": "arn:aws:iam::XXXXXXX:role/XXXX"
  },
  "parameters": {
    "region": "us-east-1",
    "service_name": "sagemaker"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "headers": {
        "content-type": "application/json"
      },
      "url": "https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/bge-base-en/invocations",
      "request_body": """{ "inputs": "${parameters.inputText}" }""",
      "pre_process_function": """
        StringBuilder builder = new StringBuilder();
        builder.append("\"");
        String first = params.text_docs[0];
        builder.append(first);
        builder.append("\"");
        def parameters = "{" + "\"inputText\":" + builder + "}";
        return "{" + "\"parameters\":" + parameters + "}";""",
      "post_process_function": "connector.post_process.default.embedding"
    }
  ]
}