@zane_neo Before trying to inject configuration files for the ML model via `spec.nodePools[].env`, it's necessary to check how many and what kind of files should be included. So I tested in two environments:
- one using the public internet, tested at home, and
- the other a restricted environment, because of the closed network in the company.

(There may be a proxy issue when the Java process downloads files.)
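For the proxy case, one option worth trying before copying files manually is passing the standard JVM proxy system properties through `OPENSEARCH_JAVA_OPTS` (a sketch; the proxy host and port are placeholders, and whether the ML download path honors them should be verified in your environment):

```shell
# Standard JVM proxy system properties; replace host/port with your corporate proxy
export OPENSEARCH_JAVA_OPTS="-Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
```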
First, when I registered and deployed an ML model in the public-internet environment, I saw no SSL errors or exceptions, and the model was successfully deployed to all nodes in the OpenSearch cluster. Inside the pod (running as an ML node), two new folders (`pytorch` and `tokenizers`) were added under `opensearch/data/ml_cache`, containing the files below.
```
$ tree
├── pytorch
│   ├── 1.13.1-cpu-precxx11-linux-x86_64
│   │   ├── 0.28.0-libdjl_torch.so
│   │   ├── libc10.so
│   │   ├── libgomp-a34b3233.so.1
│   │   ├── libstdc++.so.6
│   │   ├── libtorch.so
│   │   └── libtorch_cpu.so
│   └── 1.13.1.txt
└── tokenizers
    └── 0.19.1-0.28.0-linux-x86_64
        └── libtokenizers.so
```
So I extracted the two folders with all their files from the container to localhost, and then sent them to the closed-network environment. Finally, I used the commands below to copy them into the containers from localhost.
```
$ docker cp pytorch/ opensearch-node1:/usr/share/opensearch/data/ml_cache
$ docker cp pytorch/ opensearch-node2:/usr/share/opensearch/data/ml_cache
$ docker cp tokenizers/ opensearch-node1:/usr/share/opensearch/data/ml_cache
$ docker cp tokenizers/ opensearch-node2:/usr/share/opensearch/data/ml_cache
```
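For completeness, the earlier extraction step (container → localhost) can be sketched with `docker cp` in the opposite direction (the container name and paths follow the commands above):

```shell
# Copy the cached native libraries out of a running container to the local host
docker cp opensearch-node1:/usr/share/opensearch/data/ml_cache/pytorch ./pytorch
docker cp opensearch-node1:/usr/share/opensearch/data/ml_cache/tokenizers ./tokenizers
```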
With this method, we can also easily pre-bake the torch files into a new Docker image, for example:
```dockerfile
FROM opensearchproject/opensearch:2.16.0

# Switch to the root user for installation
USER root

# Set the working directory
WORKDIR /usr/share/opensearch

# Copy the pre-downloaded native libraries into the ML cache
# (relative destinations resolve under the WORKDIR above)
COPY ./pytorch ./data/ml_cache/pytorch
COPY ./tokenizers ./data/ml_cache/tokenizers

# Make sure the opensearch user owns the cache files
RUN chown -R opensearch:opensearch ./data/ml_cache

# Switch back to the default non-root user
USER opensearch
```
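A minimal build-and-check sketch for that image (the tag `opensearch-ml-offline:2.16.0` is only an example name):

```shell
# Build the image with the pytorch/ and tokenizers/ folders in the build context
docker build -t opensearch-ml-offline:2.16.0 .

# Verify the files ended up in the ML cache inside the image
docker run --rm --entrypoint ls opensearch-ml-offline:2.16.0 data/ml_cache/pytorch
```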
In this scenario, without using `spec.nodePools[].env`, I could deploy ML models in the closed-network environment.
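As a quick sanity check after the copy (a sketch; the node names follow the `docker cp` commands above), you can confirm the libraries are visible inside each container before registering the model:

```shell
docker exec opensearch-node1 ls /usr/share/opensearch/data/ml_cache/pytorch
docker exec opensearch-node2 ls /usr/share/opensearch/data/ml_cache/tokenizers
```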