@zane_neo Before trying to inject configuration files for the ML model via `spec.nodePools[].env`, it's necessary to check how many and what kind of files should be included. So I tested in two environments:
- one using the public internet, tested at home, and
- the other a restricted environment, because of the closed network in the company.

(There may be a proxy issue when the Java process downloads files.)
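For the proxy case, one option worth trying before copying files manually is passing the standard JVM proxy system properties through `OPENSEARCH_JAVA_OPTS` (a sketch; the proxy host and port are placeholders, and whether the ML download path honors them should be verified in your environment):

```shell
# Standard JVM proxy system properties; replace host/port with your corporate proxy
export OPENSEARCH_JAVA_OPTS="-Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
```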
First, when I registered and deployed an ML model in the public-internet environment, I saw no SSL errors or exceptions, and the model was successfully deployed to all nodes in the OpenSearch cluster. Inside the pod (running as an ML node), two new folders (`pytorch` and `tokenizers`) were added under `opensearch/data/ml_cache`, containing the files below.
```
$ tree
├── pytorch
│   ├── 1.13.1-cpu-precxx11-linux-x86_64
│   │   ├── 0.28.0-libdjl_torch.so
│   │   ├── libc10.so
│   │   ├── libgomp-a34b3233.so.1
│   │   ├── libstdc++.so.6
│   │   ├── libtorch.so
│   │   └── libtorch_cpu.so
│   └── 1.13.1.txt
└── tokenizers
    └── 0.19.1-0.28.0-linux-x86_64
        └── libtokenizers.so
```
So I extracted the two folders with all their files from the container to localhost, and then sent them to the closed-network environment. Finally, I used the commands below to copy them into the containers from localhost.
```
$ docker cp pytorch/ opensearch-node1:/usr/share/opensearch/data/ml_cache
$ docker cp pytorch/ opensearch-node2:/usr/share/opensearch/data/ml_cache
$ docker cp tokenizers/ opensearch-node1:/usr/share/opensearch/data/ml_cache
$ docker cp tokenizers/ opensearch-node2:/usr/share/opensearch/data/ml_cache
```
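For completeness, the earlier extraction step (container → localhost) can be sketched with `docker cp` in the opposite direction (the container name and paths follow the commands above):

```shell
# Copy the cached native libraries out of a running container to the local host
docker cp opensearch-node1:/usr/share/opensearch/data/ml_cache/pytorch ./pytorch
docker cp opensearch-node1:/usr/share/opensearch/data/ml_cache/tokenizers ./tokenizers
```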
With this method, we can also easily pre-bake the torch files into a new Docker image, for example:
```dockerfile
FROM opensearchproject/opensearch:2.16.0

# Switch to the root user for installation
USER root

# Set the working directory
WORKDIR /usr/share/opensearch

# Copy the pre-downloaded native libraries into the ML cache
# (relative destinations resolve under the WORKDIR above)
COPY ./pytorch ./data/ml_cache/pytorch
COPY ./tokenizers ./data/ml_cache/tokenizers

# Make sure the opensearch user owns the cache files
RUN chown -R opensearch:opensearch ./data/ml_cache

# Switch back to the default non-root user
USER opensearch
```
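A minimal build-and-check sketch for that image (the tag `opensearch-ml-offline:2.16.0` is only an example name):

```shell
# Build the image with the pytorch/ and tokenizers/ folders in the build context
docker build -t opensearch-ml-offline:2.16.0 .

# Verify the files ended up in the ML cache inside the image
docker run --rm --entrypoint ls opensearch-ml-offline:2.16.0 data/ml_cache/pytorch
```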
In this scenario, without using `spec.nodePools[].env`, I could deploy ML models in the closed-network environment.
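As a quick sanity check after the copy (a sketch; the node names follow the `docker cp` commands above), you can confirm the libraries are visible inside each container before registering the model:

```shell
docker exec opensearch-node1 ls /usr/share/opensearch/data/ml_cache/pytorch
docker exec opensearch-node2 ls /usr/share/opensearch/data/ml_cache/tokenizers
```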