Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.15
Describe the issue:
Since the original issue is closed and no longer accepts new replies, I am opening this new topic.
```
Failed to deploy model 0UZj05QBIjmJRhPKICkD
ai.djl.engine.EngineException:
Unknown builtin op: aten::scaled_dot_product_attention.
Here are some suggestions:
	aten::_scaled_dot_product_attention

The original call is:
  File "code/__torch__/transformers/models/bert/modeling_bert/___torch_mangle_8118.py", line 38
    x1 = torch.view(_10, [_12, int(_13), 12, 64])
    value_layer = torch.permute(x1, [0, 2, 1, 3])
    attn_output = torch.scaled_dot_product_attention(query_layer, key_layer, value_layer, attention_mask)
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    attn_output0 = torch.transpose(attn_output, 1, 2)
    input = torch.reshape(attn_output0, [_0, _1, 768])
at ai.djl.pytorch.jni.PyTorchLibrary.moduleLoad(Native Method) ~[pytorch-engine-0.28.0.jar:?]
```
I found a similar issue on GitHub, and one of the comments on it (link: RuntimeError: Unknown builtin op: aten::scaled_dot_product_attention. · Issue #134 · YuliangXiu/ECON · GitHub) says:
You shouldn’t use PyTorch 1.13.1; you need at least PyTorch 2.0 or higher because the scaled_dot_product_attention function is only available in version 2.0. Additionally, you need to update your CUDA and PyTorch3D versions so that they are compatible with each other, and that should fix the issue.
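If I understand that comment correctly, the mismatch here is the other way around: the model was traced with PyTorch ≥ 2.0, whose exporter emits the 2.0-only `aten::scaled_dot_product_attention` op, while the libtorch that the engine loads at runtime is older and cannot resolve it. A possible workaround (just a sketch under that assumption; the module names below are illustrative, not from my actual model) is to re-trace the model using the classic "eager" attention arithmetic, so the resulting TorchScript graph avoids the 2.0-only op entirely:

```python
import torch
import torch.nn.functional as F

class SDPABlock(torch.nn.Module):
    """Tracing this emits aten::scaled_dot_product_attention,
    which only libtorch >= 2.0 can load."""
    def forward(self, q, k, v):
        return F.scaled_dot_product_attention(q, k, v)

class EagerBlock(torch.nn.Module):
    """Mathematically equivalent 'eager' attention (no mask, no
    dropout) that uses only ops available in older libtorch."""
    def forward(self, q, k, v):
        # scores scaled by 1/sqrt(head_dim), matching SDPA's default scale
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        return torch.softmax(scores, dim=-1) @ v

# (batch, heads, seq_len, head_dim) — shapes as in the BERT trace above
q = k = v = torch.randn(1, 12, 8, 64)

traced = torch.jit.trace(EagerBlock(), (q, k, v))
# the problematic op is absent from the exported graph
assert "scaled_dot_product_attention" not in str(traced.graph)

# both paths produce the same attention output
ref = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(EagerBlock()(q, k, v), ref, atol=1e-4))  # True
```

For a Hugging Face model, the equivalent model-side switch (in transformers ≥ 4.36, if I'm reading its docs right) would be loading with `AutoModel.from_pretrained(..., attn_implementation="eager")` before `torch.jit.trace`. The alternative, of course, is upgrading to an OpenSearch/DJL build whose bundled libtorch is 2.0 or newer.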