Error when deploying pretrained model: Unknown builtin op: aten::scaled_dot_product_attention

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.17.1

Describe the issue:
I am trying to load a pretrained model into OpenSearch, but I am getting this error:

Unknown builtin op: aten::scaled_dot_product_attention.

I am using PyTorch 2.5.1.

This has also been reported by @jdomkline here: Pretrained Model Download/Register Fails for TorchScript

Relevant Logs or Screenshots:

2025-04-07 11:38:24 [2025-04-07T09:38:24,477][ERROR][o.o.m.e.a.DLModel        ] [opensearch-node1] Failed to deploy model IkGdD5YBDWLL4mueWK9o
2025-04-07 11:38:24 ai.djl.engine.EngineException: 
2025-04-07 11:38:24 Unknown builtin op: aten::scaled_dot_product_attention.
2025-04-07 11:38:24 Here are some suggestions: 
2025-04-07 11:38:24     aten::_scaled_dot_product_attention
2025-04-07 11:38:24 
2025-04-07 11:38:24 The original call is:
2025-04-07 11:38:24   File "code/__torch__/transformers/models/bert/modeling_bert.py", line 165
2025-04-07 11:38:24     x1 = torch.view(_36, [_38, int(_39), 12, 32])
2025-04-07 11:38:24     value_layer = torch.permute(x1, [0, 2, 1, 3])
2025-04-07 11:38:24     attn_output = torch.scaled_dot_product_attention(query_layer, key_layer, value_layer, attention_mask)
2025-04-07 11:38:24                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
2025-04-07 11:38:24     attn_output0 = torch.transpose(attn_output, 1, 2)
2025-04-07 11:38:24     input = torch.reshape(attn_output0, [_26, _27, 384])
2025-04-07 11:38:24 
2025-04-07 11:38:24     at ai.djl.pytorch.jni.PyTorchLibrary.moduleLoad(Native Method) ~[pytorch-engine-0.28.0.jar:?]
2025-04-07 11:38:24     at ai.djl.pytorch.jni.JniUtils.loadModule(JniUtils.java:1756) ~[pytorch-engine-0.28.0.jar:?]
2025-04-07 11:38:24     at ai.djl.pytorch.engine.PtModel.load(PtModel.java:99) ~[pytorch-engine-0.28.0.jar:?]
2025-04-07 11:38:24     at ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:166) ~[api-0.28.0.jar:?]
2025-04-07 11:38:24     at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174) ~[api-0.28.0.jar:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:217) ~[opensearch-ml-algorithms-2.17.1.0.jar:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) [opensearch-ml-algorithms-2.17.1.0.jar:?]
2025-04-07 11:38:24     at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) [?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.loadModel(DLModel.java:252) [opensearch-ml-algorithms-2.17.1.0.jar:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.initModel(DLModel.java:142) [opensearch-ml-algorithms-2.17.1.0.jar:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.MLEngine.deploy(MLEngine.java:125) [opensearch-ml-algorithms-2.17.1.0.jar:?]
2025-04-07 11:38:24     at org.opensearch.ml.model.MLModelManager.lambda$deployModel$52(MLModelManager.java:1084) [opensearch-ml-2.17.1.0.jar:2.17.1.0]
2025-04-07 11:38:24     at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.ml.model.MLModelManager.lambda$retrieveModelChunks$73(MLModelManager.java:1710) [opensearch-ml-2.17.1.0.jar:2.17.1.0]
2025-04-07 11:38:24     at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1005) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
2025-04-07 11:38:24     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
2025-04-07 11:38:24     at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2025-04-07 11:38:24 [2025-04-07T09:38:24,485][ERROR][o.o.m.m.MLModelManager   ] [opensearch-node1] Failed to retrieve model IkGdD5YBDWLL4mueWK9o
2025-04-07 11:38:24 org.opensearch.ml.common.exception.MLException: Failed to deploy model IkGdD5YBDWLL4mueWK9o
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:300) ~[?:?]
2025-04-07 11:38:24     at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.loadModel(DLModel.java:252) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.initModel(DLModel.java:142) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.MLEngine.deploy(MLEngine.java:125) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.model.MLModelManager.lambda$deployModel$52(MLModelManager.java:1084) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.ml.model.MLModelManager.lambda$retrieveModelChunks$73(MLModelManager.java:1710) [opensearch-ml-2.17.1.0.jar:2.17.1.0]
2025-04-07 11:38:24     at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1005) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.17.1.jar:2.17.1]
2025-04-07 11:38:24     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
2025-04-07 11:38:24     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
2025-04-07 11:38:24     at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2025-04-07 11:38:24 Caused by: ai.djl.engine.EngineException: 
2025-04-07 11:38:24 Unknown builtin op: aten::scaled_dot_product_attention.
2025-04-07 11:38:24 Here are some suggestions: 
2025-04-07 11:38:24     aten::_scaled_dot_product_attention
2025-04-07 11:38:24 
2025-04-07 11:38:24 The original call is:
2025-04-07 11:38:24   File "code/__torch__/transformers/models/bert/modeling_bert.py", line 165
2025-04-07 11:38:24     x1 = torch.view(_36, [_38, int(_39), 12, 32])
2025-04-07 11:38:24     value_layer = torch.permute(x1, [0, 2, 1, 3])
2025-04-07 11:38:24     attn_output = torch.scaled_dot_product_attention(query_layer, key_layer, value_layer, attention_mask)
2025-04-07 11:38:24                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
2025-04-07 11:38:24     attn_output0 = torch.transpose(attn_output, 1, 2)
2025-04-07 11:38:24     input = torch.reshape(attn_output0, [_26, _27, 384])
2025-04-07 11:38:24 
2025-04-07 11:38:24     at ai.djl.pytorch.jni.PyTorchLibrary.moduleLoad(Native Method) ~[?:?]
2025-04-07 11:38:24     at ai.djl.pytorch.jni.JniUtils.loadModule(JniUtils.java:1756) ~[?:?]
2025-04-07 11:38:24     at ai.djl.pytorch.engine.PtModel.load(PtModel.java:99) ~[?:?]
2025-04-07 11:38:24     at ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:166) ~[?:?]
2025-04-07 11:38:24     at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:217) ~[?:?]
2025-04-07 11:38:24     at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) ~[?:?]
2025-04-07 11:38:24     ... 14 more

I see here that scaled_dot_product_attention is part of nn.functional, but it gets directly called from torch. Is this something coming from a Java package used by OpenSearch?

Hi @drjz, have you tried downgrading torch to 1.13 or 2.0?

The error message suggests that a newer torch operator is not supported by the older DJL version.
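One way to check this before uploading is to look inside the exported TorchScript file for the operator named in the error. The sketch below assumes the model was saved with torch.jit.save, which (as far as I know) writes a zip archive whose serialized source sits under a code/ directory, matching the "code/__torch__/..." path in the log above. The function name find_op_in_torchscript and the file name model.pt are hypothetical, chosen only for illustration.

```python
import zipfile


def find_op_in_torchscript(archive_path, op_name="torch.scaled_dot_product_attention"):
    """Return (member, line_number, line) tuples where op_name appears
    in the serialized source files of a TorchScript archive.

    Assumption: the archive is a zip file and its serialized code lives
    in .py files under a "code/" directory.
    """
    hits = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Skip weights and metadata; only inspect serialized source files.
            if "code/" not in member or not member.endswith(".py"):
                continue
            text = archive.read(member).decode("utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if op_name in line:
                    hits.append((member, number, line.strip()))
    return hits
```

Calling find_op_in_torchscript("model.pt") on the exported file should return an empty list if the operator is absent. A non-empty result means the export still contains the call that the runtime in the stack trace rejects, so the model would likely need to be re-exported with older torch and transformers versions.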

Fixed by downgrading the PyTorch and transformers versions. Find more details here: [BUG] Custom Model Upload Failure in OpenSearch 2.11 with Sparse Model · Issue #1710 · opensearch-project/ml-commons · GitHub