lx9182  
                
                  
                    August 22, 2023,  7:14am
                   
                  1 
               
             
            
Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
Describe the issue:
Configuration: docker-compose.yml
opensearch-dashboards: http://opensearch-node:9200
upload model request: https://github.com/opensearch-project/ml-commons/raw/2.x/ml-algorithms/src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/all-MiniLM-L6-v2_torchscript_sentence-transformer.zip?raw=true
Relevant Logs or Screenshots:
             
            
              
            
           
          
            
              
                ylwu  
              
                  
                    August 22, 2023,  5:24pm
                   
                  2 
               
             
            
I think that error is related to the cluster setup. Can you explain the detailed steps to reproduce the error?
BTW, 2.4 is a very old version. Have you tried the latest version, 2.9, in which this feature was released as GA?
Some useful links
  
  
    
---
version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:2
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true  # along with the memlock settings below, disables swapping
      - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m    # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536  # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    
The model-serving framework (released in 2.4) supports running NLP models inside an OpenSearch cluster.
It currently supports only text embedding NLP models. This document shows some examples of how to upload
and run text embedding models via the ml-commons REST APIs. We use [Hugging Face](https://huggingface.co/) models to build these examples.
Read the [ml-commons doc](https://opensearch.org/docs/latest/ml-commons-plugin/model-serving-framework/) to learn more details.
We build the examples with this Hugging Face sentence transformers model: [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
From 2.5, we support uploading [TorchScript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/) models.
Note:
- This doc doesn't cover how to trace models to TorchScript/ONNX.
- The model-serving framework is an experimental feature. If you see any bug or have any suggestion, feel free to open a GitHub issue.
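As a rough illustration of the kind of upload request these examples are about, the sketch below assembles a request body in Python and prints it as JSON. The endpoint path `/_plugins/_ml/models/_upload`, the field names, and the `model_config` values are assumptions based on the ml-commons examples for this model and may differ between OpenSearch versions. Nothing is sent to a cluster here.

```python
import json

# Assumed upload endpoint; verify against the docs for your OpenSearch version.
UPLOAD_ENDPOINT = "POST /_plugins/_ml/models/_upload"

# Pre-traced TorchScript build of all-MiniLM-L6-v2 hosted in the ml-commons repo.
MODEL_URL = (
    "https://github.com/opensearch-project/ml-commons/raw/2.x/ml-algorithms/"
    "src/test/resources/org/opensearch/ml/engine/algorithms/text_embedding/"
    "all-MiniLM-L6-v2_torchscript_sentence-transformer.zip?raw=true"
)


def build_upload_request(name, version, url):
    """Return an upload request body for a TorchScript text embedding model."""
    return {
        "name": name,
        "version": version,
        "description": "test model",
        "model_format": "TORCH_SCRIPT",  # assumed value; ONNX models would differ
        "model_config": {
            "model_type": "bert",
            "embedding_dimension": 384,  # output size of all-MiniLM-L6-v2
            "framework_type": "sentence_transformers",
        },
        "url": url,
    }


upload_request = build_upload_request("all-MiniLM-L6-v2", "1.0.0", MODEL_URL)
print(UPLOAD_ENDPOINT)
print(json.dumps(upload_request, indent=2))
```

The printed JSON can be pasted into Dev Tools or sent with any HTTP client once the endpoint and fields have been checked against your cluster's version.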
# 0. Prepare cluster
We suggest starting a dedicated ML node to separate ML workloads from data nodes. From 2.5, ml-commons runs ML tasks only on ML nodes by default.
If you want to run some test models on a data node, you can disable the cluster setting `plugins.ml_commons.only_run_on_ml_node`.
```
PUT /_cluster/settings
```
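A minimal sketch of a request body that disables `plugins.ml_commons.only_run_on_ml_node` through `PUT /_cluster/settings` is shown below. Using the `persistent` scope is an assumption made for illustration (`transient` is the other standard scope), and the helper name `build_ml_node_setting` is hypothetical. The code only builds and prints the body; it does not contact a cluster.

```python
import json

SETTINGS_ENDPOINT = "PUT /_cluster/settings"


def build_ml_node_setting(only_run_on_ml_node):
    """Return a cluster-settings body that toggles ML-node-only execution."""
    return {
        # Assumption: apply the change as a persistent cluster setting.
        "persistent": {
            "plugins.ml_commons.only_run_on_ml_node": only_run_on_ml_node,
        }
    }


# False allows ML tasks to run on data nodes, which suits small test setups.
settings_body = build_ml_node_setting(False)
print(SETTINGS_ENDPOINT)
print(json.dumps(settings_body, indent=2))
```

For production clusters, leaving the setting at its default and adding a dedicated ML node is the approach the text recommends.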
   
  
    
    
  
  
 
  
  
    
# Topic
This doc explains how to use the remote inference feature in ml-commons (it applies to OpenSearch 2.9+).
Because remote inference needs credentials for the ML service, always use this feature on a security-enabled cluster to protect your credentials.
# Background
In [OpenSearch 2.4](https://opensearch.org/blog/opensearch-2-4-is-available-today/), ml-commons released an experimental model-serving framework that allows users to [upload a text embedding model](https://github.com/opensearch-project/ml-commons/blob/main/docs/model_serving_framework/text_embedding_model_examples.md) and run it inside the OpenSearch cluster.
We call such models local models.
Because a model generally takes a lot of resources such as memory and CPU, we suggest always running models on a [dedicated ML node](https://opensearch.org/docs/latest/ml-commons-plugin/index/#ml-node) in production environments.
[GPU acceleration](https://github.com/opensearch-project/ml-commons/blob/main/docs/model_serving_framework/GPU_support.md) is also supported.
For some use cases, users may prefer to run a model outside the OpenSearch cluster, for example:
- Users already have an ML model running outside OpenSearch, for example on Amazon SageMaker.
- Users have big ML models that can't run inside the OpenSearch cluster, such as LLMs.
- Users prefer to use a public ML service such as OpenAI, Cohere, or Anthropic.
In [OpenSearch 2.9](https://opensearch.org/blog/introducing-opensearch-2.9.0/), ml-commons introduces a virtual model to represent a model running outside the OpenSearch cluster. We call such a model a remote model.
To support remote models, ml-commons introduces a general connector concept that defines the protocol between ml-commons and an external ML service. A remote model can use a connector to communicate with external ML services.
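To make the connector idea more concrete, here is a hedged sketch of what a connector definition might look like, built as a Python dictionary and printed as JSON. The field names (`protocol`, `parameters`, `credential`, `actions`) and the `${...}` placeholder syntax are assumptions modeled on ml-commons connector examples; the host `api.example.com`, the model name, and the helper `build_connector` are purely hypothetical.

```python
import json


def build_connector(name, endpoint, model, api_key):
    """Return a hypothetical connector definition for an external ML service."""
    return {
        "name": name,
        "description": "Hypothetical connector to an external chat-completion service",
        "version": 1,
        "protocol": "http",
        # Non-secret values that the action templates below can reference.
        "parameters": {"endpoint": endpoint, "model": model},
        # Secret values; this is why a security-enabled cluster is advised.
        "credential": {"api_key": api_key},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://${parameters.endpoint}/v1/chat/completions",
                "headers": {"Authorization": "Bearer ${credential.api_key}"},
                "request_body": (
                    '{ "model": "${parameters.model}", '
                    '"messages": ${parameters.messages} }'
                ),
            }
        ],
    }


connector = build_connector(
    name="example-chat-connector",
    endpoint="api.example.com",   # hypothetical host
    model="example-model",        # hypothetical model name
    api_key="<your-api-key>",     # placeholder, never commit a real key
)
print(json.dumps(connector, indent=2))
```

The connector separates what to call (the action template) from the secrets used to call it, so a remote model only needs to reference the connector rather than carry credentials itself.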
             
            
              
            
           
          
            
              
                system  
              
                  
                    October 21, 2023,  5:25pm
                   
                  3 
               
             
            
              This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.