Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch 2.18.0
Describe the issue:
Deploying any model fails with the following exception:
java.lang.NoClassDefFoundError: Could not initialize class ai.djl.onnxruntime.engine.OrtNDManager
Steps to reproduce:
- Start the OpenSearch container
- Send a PUT request with the settings shown below
- Create a model group
- Register a model
- Try to deploy the model
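For reference, the model group, register, and deploy steps map onto the ML Commons REST API roughly as sketched below. This is only an illustration, not the exact requests used here: the group name, model name, version, and format are placeholder assumptions, and the two IDs stand for values returned by the earlier calls. The snippet only builds and prints the requests; it does not contact a cluster.

```python
import json

# Hypothetical placeholders: substitute the values used in your own cluster.
MODEL_GROUP_NAME = "test_model_group"
MODEL_NAME = "huggingface/sentence-transformers/all-MiniLM-L6-v2"
MODEL_VERSION = "1.0.1"
MODEL_FORMAT = "ONNX"


def build_requests(model_group_id, model_id):
    """Return (method, path, body) tuples for the last three reproduction steps."""
    return [
        # Create a model group.
        (
            "POST",
            "/_plugins/_ml/model_groups/_register",
            {"name": MODEL_GROUP_NAME, "description": "Group for reproducing the deploy failure"},
        ),
        # Register a model in that group.
        (
            "POST",
            "/_plugins/_ml/models/_register",
            {
                "name": MODEL_NAME,
                "version": MODEL_VERSION,
                "model_format": MODEL_FORMAT,
                "model_group_id": model_group_id,
            },
        ),
        # Deploy the registered model; this is the call that fails in this report.
        ("POST", f"/_plugins/_ml/models/{model_id}/_deploy", None),
    ]


for method, path, body in build_requests("<model_group_id>", "<model_id>"):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```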
Configuration:
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_url": "true",
    "plugins.ml_commons.allow_registering_model_via_local_file": "true",
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.model_access_control_enabled": "true",
    "plugins.ml_commons.native_memory_threshold": "99",
    "plugins.ml_commons.ml_task_timeout_in_seconds": 86400
  }
}
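These settings go to the cluster settings API (PUT _cluster/settings). A minimal sketch of building that request in Python follows. The values are copied from this post; actually sending the request (host, credentials, TLS) is left out because it depends on how the container is set up.

```python
import json

# Values copied from the configuration in this post.
ML_SETTINGS = {
    "persistent": {
        "plugins.ml_commons.allow_registering_model_via_url": "true",
        "plugins.ml_commons.allow_registering_model_via_local_file": "true",
        "plugins.ml_commons.only_run_on_ml_node": "false",
        "plugins.ml_commons.model_access_control_enabled": "true",
        "plugins.ml_commons.native_memory_threshold": "99",
        "plugins.ml_commons.ml_task_timeout_in_seconds": 86400,
    }
}


def settings_request():
    """Return the method, path, and JSON payload for the settings update."""
    return "PUT", "/_cluster/settings", json.dumps(ML_SETTINGS, indent=2)


method, path, payload = settings_request()
print(method, path)
print(payload)
```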
Relevant Logs or Screenshots:
org.opensearch.ml.common.exception.MLException: Failed to deploy model bmgPH5MBg54l5JEH4HHq
2024-11-12T06:36:06.465261764Z at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:300) ~[?:?]
2024-11-12T06:36:06.465290017Z at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) ~[?:?]
2024-11-12T06:36:06.465298997Z at org.opensearch.ml.engine.algorithms.DLModel.loadModel(DLModel.java:252) ~[?:?]
2024-11-12T06:36:06.465315384Z at org.opensearch.ml.engine.algorithms.DLModel.initModel(DLModel.java:142) ~[?:?]
2024-11-12T06:36:06.465328678Z at org.opensearch.ml.engine.MLEngine.deploy(MLEngine.java:139) ~[?:?]
2024-11-12T06:36:06.465340068Z at org.opensearch.ml.model.MLModelManager.lambda$deployModel$55(MLModelManager.java:1119) ~[?:?]
2024-11-12T06:36:06.465354686Z at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.18.0.jar:2.18.0]
2024-11-12T06:36:06.465366304Z at org.opensearch.ml.model.MLModelManager.lambda$retrieveModelChunks$76(MLModelManager.java:1745) [opensearch-ml-2.18.0.0.jar:2.18.0.0]
2024-11-12T06:36:06.465372894Z at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.18.0.jar:2.18.0]
2024-11-12T06:36:06.465410614Z at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.18.0.jar:2.18.0]
2024-11-12T06:36:06.465420897Z at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1005) [opensearch-2.18.0.jar:2.18.0]
2024-11-12T06:36:06.465426898Z at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.18.0.jar:2.18.0]
2024-11-12T06:36:06.465432256Z at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
2024-11-12T06:36:06.465437439Z at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
2024-11-12T06:36:06.465442602Z at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
2024-11-12T06:36:06.465447681Z Caused by: java.lang.NoClassDefFoundError: Could not initialize class ai.djl.onnxruntime.engine.OrtNDManager
2024-11-12T06:36:06.465453020Z at ai.djl.onnxruntime.engine.OrtEngine.newBaseManager(OrtEngine.java:134) ~[?:?]
2024-11-12T06:36:06.465458265Z at ai.djl.onnxruntime.engine.OrtEngine.newModel(OrtEngine.java:122) ~[?:?]
2024-11-12T06:36:06.465465176Z at ai.djl.Model.newInstance(Model.java:99) ~[?:?]
2024-11-12T06:36:06.465471135Z at ai.djl.repository.zoo.BaseModelLoader.createModel(BaseModelLoader.java:196) ~[?:?]
2024-11-12T06:36:06.465478645Z at ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:159) ~[?:?]
2024-11-12T06:36:06.465486239Z at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174) ~[?:?]
2024-11-12T06:36:06.465493584Z at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:217) ~[?:?]
2024-11-12T06:36:06.465501271Z at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) ~[?:?]
2024-11-12T06:36:06.465509578Z ... 14 more
2024-11-12T06:36:06.465515775Z Caused by: java.lang.ExceptionInInitializerError: Exception ai.djl.engine.EngineException: Failed to save pytorch index file [in thread "opensearch[opensearch-node][opensearch_ml_deploy][T#11]"]
2024-11-12T06:36:06.465531614Z at ai.djl.pytorch.jni.LibUtils.downloadPyTorch(LibUtils.java:429) ~[?:?]
2024-11-12T06:36:06.465537169Z at ai.djl.pytorch.jni.LibUtils.findNativeLibrary(LibUtils.java:314) ~[?:?]
2024-11-12T06:36:06.465541845Z at ai.djl.pytorch.jni.LibUtils.getLibTorch(LibUtils.java:93) ~[?:?]
2024-11-12T06:36:06.465549594Z at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:81) ~[?:?]
2024-11-12T06:36:06.465556029Z at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53) ~[?:?]
2024-11-12T06:36:06.465560281Z at ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41) ~[?:?]
2024-11-12T06:36:06.465564295Z at ai.djl.engine.Engine.getEngine(Engine.java:190) ~[?:?]
2024-11-12T06:36:06.465567895Z at ai.djl.engine.Engine.getInstance(Engine.java:145) ~[?:?]
2024-11-12T06:36:06.465571526Z at ai.djl.onnxruntime.engine.OrtEngine.getAlternativeEngine(OrtEngine.java:75) ~[?:?]
2024-11-12T06:36:06.465575023Z at ai.djl.ndarray.BaseNDManager.<init>(BaseNDManager.java:64) ~[?:?]
2024-11-12T06:36:06.465578727Z at ai.djl.onnxruntime.engine.OrtNDManager.<init>(OrtNDManager.java:42) ~[?:?]
2024-11-12T06:36:06.465582400Z at ai.djl.onnxruntime.engine.OrtNDManager.<init>(OrtNDManager.java:35) ~[?:?]
2024-11-12T06:36:06.465586103Z at ai.djl.onnxruntime.engine.OrtNDManager$SystemManager.<init>(OrtNDManager.java:177) ~[?:?]
2024-11-12T06:36:06.465589900Z at ai.djl.onnxruntime.engine.OrtNDManager.<clinit>(OrtNDManager.java:37) ~[?:?]
2024-11-12T06:36:06.465593945Z at ai.djl.onnxruntime.engine.OrtEngine.newBaseManager(OrtEngine.java:134) ~[?:?]
2024-11-12T06:36:06.465598042Z at ai.djl.onnxruntime.engine.OrtEngine.newModel(OrtEngine.java:122) ~[?:?]
2024-11-12T06:36:06.465602263Z at ai.djl.Model.newInstance(Model.java:99) ~[?:?]
2024-11-12T06:36:06.465610332Z at ai.djl.repository.zoo.BaseModelLoader.createModel(BaseModelLoader.java:196) ~[?:?]
2024-11-12T06:36:06.465617667Z at ai.djl.repository.zoo.BaseModelLoader.loadModel(BaseModelLoader.java:159) ~[?:?]
2024-11-12T06:36:06.465621402Z at ai.djl.repository.zoo.Criteria.loadModel(Criteria.java:174) ~[?:?]
2024-11-12T06:36:06.465624630Z at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:217) ~[?:?]
2024-11-12T06:36:06.465627568Z at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) ~[?:?]
2024-11-12T06:36:06.465630557Z ... 14 more
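Reading the trace from the bottom up, the NoClassDefFoundError looks like a follow-on symptom. The static initializer of OrtNDManager failed with an EngineException ("Failed to save pytorch index file") raised from LibUtils.downloadPyTorch, so the innermost "Caused by" entry is probably the one to investigate. Below is a small sketch for pulling the cause chain out of a log like this one. The helper name cause_chain is made up for illustration, and the sample is abbreviated from the log above.

```python
import re

# Matches lines such as "<timestamp> Caused by: <exception class>: <message>".
CAUSED_BY = re.compile(r"Caused by: (?P<cls>[\w.$]+): (?P<msg>.*)")


def cause_chain(log_text):
    """Return (exception class, message) pairs in the order they appear."""
    chain = []
    for line in log_text.splitlines():
        match = CAUSED_BY.search(line)
        if match:
            chain.append((match.group("cls"), match.group("msg").strip()))
    return chain


# Abbreviated sample taken from the log in this post.
SAMPLE = """\
org.opensearch.ml.common.exception.MLException: Failed to deploy model bmgPH5MBg54l5JEH4HHq
2024-11-12T06:36:06.465447681Z Caused by: java.lang.NoClassDefFoundError: Could not initialize class ai.djl.onnxruntime.engine.OrtNDManager
2024-11-12T06:36:06.465515775Z Caused by: java.lang.ExceptionInInitializerError: Exception ai.djl.engine.EngineException: Failed to save pytorch index file [in thread "opensearch[opensearch-node][opensearch_ml_deploy][T#11]"]
"""

chain = cause_chain(SAMPLE)
# The last entry is the innermost cause, which is where the failure started.
root_class, root_message = chain[-1]
print(root_class)
print(root_message)
```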