[Feedback] Neural Search plugin - experimental release

synthetic_queries.zip contains pickle files. Each pickle file holds a list of entries with three fields (probability, query, and passage). An example entry looks like this:

{'probability': 0.4134162664413452, 'query': 'what does skin boils look like', 'passage': '<|startoftext|> A boil is a red, swollen, painful bump under the skin. It often looks like an overgrown pimple. Boils are often caused by infected hair follicles. Bacteria from the infection form an abscess, or pocket of pus. A boil can become large and cause severe pain. Boils most often happen where there is hair and rubbing. QRY:'}

A GPT model generates this kind of synthetic query over the given corpus, and the synthetic queries are then used to fine-tune a BERT model.
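
For reference, here is a minimal sketch of how such a pickle file could be loaded and filtered by probability before fine-tuning; the file name is hypothetical, and it assumes each file really is a list of dicts with exactly those three keys.

import pickle

# Hypothetical file name; each file is assumed to contain a list of dicts
# with "probability", "query", and "passage" keys.
with open("synthetic_queries_part_0.pkl", "rb") as f:
    records = pickle.load(f)

# Keep only the higher-confidence synthetic queries for fine-tuning.
pairs = [
    (r["query"], r["passage"])
    for r in records
    if r["probability"] > 0.3  # arbitrary threshold, tune as needed
]
print(f"kept {len(pairs)} of {len(records)} query/passage pairs")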

If we want to upload a pre-trained model without fine-tuning, we can do that too.

To upload a Hugging Face model (without training it on synthetic queries) from a local file, we can start from here.

upload_model expects the paths of two files:

  1. A zip file containing the TorchScript (trace or script) model file of the transformer model (.pt) and a tokenizer file. See the example NLP pretrained TorchScript model zip file.
  2. A model config file (JSON format) with the required request fields; a hedged example is sketched below.
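
For illustration, a hedged example of what such a model config file might contain for a sentence-transformer model; the values below are assumptions and must match the model that was actually traced.

{
  "name": "all-MiniLM-L6-v2",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers"
  }
}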

@johnt FYI, we released some pre-trained Hugging Face sentence transformer models in OpenSearch 2.6. You can find the model list here: Pretrained models - OpenSearch documentation. Hope that saves you some effort in tracing the model.

@ylwu Thank you. I’ll pass that information to my colleagues. Thank you for all the assistance.

@ylwu Thanks for the new feature in 2.6. It combines model downloading, tracing, and uploading into one call.
I am using the Python client to upload, then load a model.

# Upload a pre-trained model from the model repository.
client.transport.perform_request(
    method="POST",
    url="/_plugins/_ml/models/_upload",
    body={
        "name": "huggingface/sentence-transformers/all-mpnet-base-v2",
        "version": "1.0.0",
        "model_format": "TORCH_SCRIPT",
    },
)
# Check the upload task (task ID copied from the _upload response).
model_upload_task_status_object = client.transport.perform_request(
    method="GET",
    url="/_plugins/_ml/tasks/qnMs64YB6n-KyvL9ORfR",
)
# Load the model once the task reports a model_id.
client.transport.perform_request(
    method="POST",
    url=f"/_plugins/_ml/models/{model_upload_task_status_object['model_id']}/_load",
)
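
As an aside, the snippet above reads the task status only once, with a task ID pasted in by hand. A somewhat safer pattern, sketched below under the assumption that the response of the _upload call is captured (here as a hypothetical upload_response variable), is to poll the task until a model_id appears before calling _load:

import time

# upload_response is assumed to be the captured result of the
# POST /_plugins/_ml/models/_upload call above.
task_id = upload_response["task_id"]
task = {}
for _ in range(60):  # give up after roughly five minutes
    task = client.transport.perform_request(
        method="GET",
        url=f"/_plugins/_ml/tasks/{task_id}",
    )
    if task.get("state") == "COMPLETED" and "model_id" in task:
        break
    time.sleep(5)

client.transport.perform_request(
    method="POST",
    url=f"/_plugins/_ml/models/{task['model_id']}/_load",
)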

However, I get {<task_id>=LOAD_FAILED} in the logs multiple times. It then auto-retries until the model loads, which makes loading really slow.
Is that expected behavior? If not, how can I solve it?
Thanks!

@abdullah-alnahas, do you see any error message from the task, or can you share any error logs? You can use the Get Task API: Get task API - OpenSearch documentation

@dhrubo Can you help take a look? Does opensearch-py-ml support uploading pre-trained models from our model repo?

Edit: @abdullah-alnahas, if version 1.0.0 doesn’t work, can you try 1.0.1?

Hi @abdullah-alnahas,

What error are you seeing when model loading fails? Can you please check this cluster setting to see if anything there applies to you?

Also, here are the pre-trained model lists, where you can see that the model version is 1.0.1. Can you please try that?

You can use the opensearch-py-ml plugin to perform requests against the ml-commons plugin. Here is a notebook.

That being said, opensearch-py-ml doesn’t support uploading pre-trained models from our model repo yet.

The machine learning plugin intermittently fails to load the model. Even when it eventually succeeds, loading is not instantaneous: the plugin attempts to load the model several times before it loads successfully. Logs from a successful load attempt and a failed load attempt are provided below for reference.

[opensearch] Will load model on these nodes: eHXE2qZYTW6y6djX5COymQ
[opensearch] Access denied during loading cudart library.
[opensearch] Downloading https://publish.djl.ai/pytorch/1.12.1/cpu-precxx11/linux-x86_64/native/lib/libgomp-a34b3233.so.1.gz ...
[opensearch] Downloading https://publish.djl.ai/pytorch/1.12.1/cpu-precxx11/linux-x86_64/native/lib/libc10.so.gz ...
[opensearch] Downloading https://publish.djl.ai/pytorch/1.12.1/cpu-precxx11/linux-x86_64/native/lib/libtorch_cpu.so.gz ...
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Running full sweep
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOAD_FAILED}
[opensearch] Refresh model state: {q3Ms64YB6n-KyvL9Phcb=LOADING}
[opensearch] Downloading https://publish.djl.ai/pytorch/1.12.1/cpu-precxx11/linux-x86_64/native/lib/libtorch.so.gz ...
[opensearch] Downloading https://publish.djl.ai/pytorch/1.12.1/cpu-precxx11/linux-x86_64/native/lib/libstdc%2B%2B.so.6.gz ...
opensearch               | OpenJDK 64-Bit Server VM warning: You have loaded library /usr/share/opensearch/data/djl/pytorch/1.12.1-cpu-precxx11-linux-x86_64/libtorch_cpu.so which might have disabled stack guard. The VM will try to fix the stack guard now.
opensearch               | It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
[opensearch] Downloading jni https://publish.djl.ai/pytorch/1.12.1/jnilib/0.19.0/linux-x86_64/cpu-precxx11/libdjl_torch.so to cache ...
[opensearch] Number of inter-op threads is 1
[opensearch] Number of intra-op threads is 1
[opensearch] Extracting native/lib/linux-x86_64/libtokenizers.so to cache ...
[opensearch] Model q3Ms64YB6n-KyvL9Phcb is successfully loaded on 1 devices
[opensearch] load model done with state: LOADED, model id: q3Ms64YB6n-KyvL9Phcb
[opensearch] load model task done rHNc64YB6n-KyvL9YBev

However, most of the time it doesn’t work. I am getting several errors. One of them is as follows.

{'task_type': 'UPLOAD_MODEL',
 'function_name': 'TEXT_EMBEDDING',
 'state': 'FAILED',
 'worker_node': ['5ZWIwnnLRCeduZOow3fBrQ'],
 'create_time': 1679043105053,
 'last_update_time': 1679043105133,
 'error': 'Native Memory Circuit Breaker is open, please check your resources!',
 'is_async': True}

That happens although I have allocated enough memory.
More context:

GET _cat/nodes?v=true&h=name,node*,heap*

Gives

name       id   node.role node.roles            heap.current heap.percent heap.max
opensearch 5ZWI dim       data,ingest,master,ml      661.6mb           16      4gb

Another error I get on a different try is:

Downloading: 100% |========================================| all-mpnet-base-v2.zip   ] [opensearch] 
opensearch               | [2023-03-17T12:21:27,008][ERROR][o.o.m.m.MLModelManager   ] [opensearch] Failed to index chunk file
opensearch               | java.security.PrivilegedActionException: null
opensearch               | 	at java.security.AccessController.doPrivileged(AccessController.java:573) ~[?:?]
opensearch               | 	at org.opensearch.ml.engine.ModelHelper.downloadAndSplit(ModelHelper.java:147) [opensearch-ml-algorithms-2.6.0.0.jar:?]
opensearch               | 	at org.opensearch.ml.model.MLModelManager.uploadModel(MLModelManager.java:268) [opensearch-ml-2.6.0.0.jar:2.6.0.0]
opensearch               | 	at org.opensearch.ml.model.MLModelManager.lambda$uploadModelFromUrl$3(MLModelManager.java:241) [opensearch-ml-2.6.0.0.jar:2.6.0.0]
opensearch               | 	at org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:80) [opensearch-2.6.0.jar:2.6.0]
opensearch               | 	at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.6.0.jar:2.6.0]
opensearch               | 	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-2.6.0.jar:2.6.0]
opensearch               | 	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.6.0.jar:2.6.0]
opensearch               | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
opensearch               | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
opensearch               | 	at java.lang.Thread.run(Thread.java:833) [?:?]
opensearch               | Caused by: java.nio.file.NoSuchFileException: /usr/share/opensearch/data/djl/models_cache/upload/emt474YB_2fbBQq1ySIS/1.0.0/huggingface/sentence-transformers/all-mpnet-base-v2.zip
opensearch               | 	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
opensearch               | 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
opensearch               | 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
opensearch               | 	at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55) ~[?:?]
opensearch               | 	at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148) ~[?:?]
opensearch               | 	at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99) ~[?:?]
opensearch               | 	at java.nio.file.Files.readAttributes(Files.java:1851) ~[?:?]
opensearch               | 	at java.util.zip.ZipFile$Source.get(ZipFile.java:1264) ~[?:?]
opensearch               | 	at java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709) ~[?:?]
opensearch               | 	at java.util.zip.ZipFile.<init>(ZipFile.java:243) ~[?:?]
opensearch               | 	at java.util.zip.ZipFile.<init>(ZipFile.java:172) ~[?:?]
opensearch               | 	at java.util.zip.ZipFile.<init>(ZipFile.java:143) ~[?:?]
opensearch               | 	at org.opensearch.ml.engine.ModelHelper.verifyModelZipFile(ModelHelper.java:174) ~[?:?]
opensearch               | 	at org.opensearch.ml.engine.ModelHelper.lambda$downloadAndSplit$2(ModelHelper.java:154) ~[?:?]
opensearch               | 	at java.security.AccessController.doPrivileged(AccessController.java:569) ~[?:?]
opensearch               | 	... 10 more

How might these issues be addressed?

I did reply, but unfortunately my message got hidden by Akismet!
Basically, I am getting the Native Memory Circuit Breaker is open, please check your resources! error message when I try to load a model, although I have allocated more than enough memory to the cluster.

GET _cat/nodes?v=true&h=name,node*,heap*

name       id   node.role node.roles            heap.current heap.percent heap.max
opensearch 5ZWI dim       data,ingest,master,ml      442.2mb            7      6gb

@abdullah-alnahas Can you try to increase the threshold or disable the native memory circuit breaker (setting the threshold to 100 disables it)? Refer to ml-commons/text_embedding_model_examples.md at 2.x · opensearch-project/ml-commons · GitHub

PUT _cluster/settings
{
  "persistent" : {
    "plugins.ml_commons.native_memory_threshold" : 100 
  }
}

I am using the settings from here, and I did what you recommended. However, I am now getting this error:

Downloading: 100% |========================================| all-MiniLM-L6-v2.zip    ] [opensearch-ml1] 
opensearch-ml1           | [2023-03-20T15:34:30,583][ERROR][o.o.m.m.MLModelManager   ] [opensearch-ml1] Failed to index chunk file
opensearch-ml1           | java.security.PrivilegedActionException: null
opensearch-ml1           | 	at java.security.AccessController.doPrivileged(AccessController.java:573) ~[?:?]
opensearch-ml1           | 	at org.opensearch.ml.engine.ModelHelper.downloadAndSplit(ModelHelper.java:147) [opensearch-ml-algorithms-2.6.0.0.jar:?]
opensearch-ml1           | 	at org.opensearch.ml.model.MLModelManager.uploadModel(MLModelManager.java:268) [opensearch-ml-2.6.0.0.jar:2.6.0.0]
opensearch-ml1           | 	at org.opensearch.ml.model.MLModelManager.lambda$uploadModelFromUrl$3(MLModelManager.java:241) [opensearch-ml-2.6.0.0.jar:2.6.0.0]
opensearch-ml1           | 	at org.opensearch.action.ActionListener$1.onResponse(ActionListener.java:80) [opensearch-2.6.0.jar:2.6.0]
opensearch-ml1           | 	at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.6.0.jar:2.6.0]
opensearch-ml1           | 	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:806) [opensearch-2.6.0.jar:2.6.0]
opensearch-ml1           | 	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.6.0.jar:2.6.0]
opensearch-ml1           | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
opensearch-ml1           | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
opensearch-ml1           | 	at java.lang.Thread.run(Thread.java:833) [?:?]
opensearch-ml1           | Caused by: java.nio.file.NoSuchFileException: /usr/share/opensearch/data/djl/models_cache/upload/YRae_4YBucno7NaswoIO/1.0.1/huggingface/sentence-transformers/all-MiniLM-L6-v2.zip
opensearch-ml1           | 	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) ~[?:?]
opensearch-ml1           | 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106) ~[?:?]
opensearch-ml1           | 	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
opensearch-ml1           | 	at sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55) ~[?:?]
opensearch-ml1           | 	at sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148) ~[?:?]
opensearch-ml1           | 	at sun.nio.fs.LinuxFileSystemProvider.readAttributes(LinuxFileSystemProvider.java:99) ~[?:?]
opensearch-ml1           | 	at java.nio.file.Files.readAttributes(Files.java:1851) ~[?:?]
opensearch-ml1           | 	at java.util.zip.ZipFile$Source.get(ZipFile.java:1264) ~[?:?]
opensearch-ml1           | 	at java.util.zip.ZipFile$CleanableResource.<init>(ZipFile.java:709) ~[?:?]
opensearch-ml1           | 	at java.util.zip.ZipFile.<init>(ZipFile.java:243) ~[?:?]
opensearch-ml1           | 	at java.util.zip.ZipFile.<init>(ZipFile.java:172) ~[?:?]
opensearch-ml1           | 	at java.util.zip.ZipFile.<init>(ZipFile.java:143) ~[?:?]
opensearch-ml1           | 	at org.opensearch.ml.engine.ModelHelper.verifyModelZipFile(ModelHelper.java:174) ~[?:?]
opensearch-ml1           | 	at org.opensearch.ml.engine.ModelHelper.lambda$downloadAndSplit$2(ModelHelper.java:154) ~[?:?]
opensearch-ml1           | 	at java.security.AccessController.doPrivileged(AccessController.java:569) ~[?:?]
opensearch-ml1           | 	... 10 more
opensearch-ml1           | [2023-03-20T15:34:30,633][WARN ][o.o.m.t.MLTaskManager    ] [opensearch-ml1] Can't find task in cache: gAKe_4YBx-QKt29eu4bK
opensearch-ml1           | [2023-03-20T15:38:38,643][INFO ][o.o.j.s.JobSweeper       ] [opensearch-ml1] Running full sweep
opensearch-node2         | [2023-03-20T15:38:38,805][INFO ][o.o.j.s.JobSweeper       ] [opensearch-node2] Running full sweep
opensearch-node1         | [2023-03-20T15:38:38,907][INFO ][o.o.j.s.JobSweeper       ] [opensearch-node1] Running full sweep

How can I solve this?

I think we need to reproduce the error first to dig into why it happens. Can you share the steps to reproduce this error?

Quick guess: according to this config, your JVM heap is set to a maximum of 512 MB.
I haven’t done a deep dive into the neural plugin yet, but I am guessing it uses the heap at some point, and either a breaker is enabled at a percentage of the heap or it simply needs more than 512 MB.

This might all be wrong if it doesn’t use the JVM heap at all, but try increasing the Xmx value (and the minimum as well) and check whether it works.

As a reminder, changing that setting requires at the very least a rolling restart of the nodes, so plan accordingly.

@hagayg Thanks for your reply. Actually, I have increased the heap to 2 GB, but I am still getting the error.
@ylwu I am sharing the steps.
On my Ubuntu 22.04 machine I am executing the following command:
sudo sysctl -w vm.max_map_count=262144
The content of my docker-compose.yml is as follows.

version: '3'
services:
  opensearch-node1:
    image: opensearchproject/opensearch:2
    container_name: opensearch-node1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node1
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping
      - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m # minimum and maximum Java heap size, recommend setting both to 50% of system RAM
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 262144 # maximum number of open files for the OpenSearch user, set to at least 262144 on modern systems
        hard: 262144
    volumes:
      - opensearch-data1:/usr/share/opensearch/data
    ports:
      - 9200:9200
      - 9600:9600 # required for Performance Analyzer
    networks:
      - opensearch-net
  opensearch-node2:
    image: opensearchproject/opensearch:2
    container_name: opensearch-node2
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node2
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 262144
        hard: 262144
    volumes:
      - opensearch-data2:/usr/share/opensearch/data
    networks:
      - opensearch-net
  opensearch-ml1:
    image: opensearchproject/opensearch:2
    container_name: opensearch-ml1
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-ml1
      - node.roles=ml
      - discovery.seed_hosts=opensearch-node1,opensearch-node2
      - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2
      - bootstrap.memory_lock=true
      - OPENSEARCH_JAVA_OPTS=-Xms2048m -Xmx2048m
    security_opt:
      - seccomp:unconfined
      - apparmor:unconfined
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 262144
        hard: 262144
    volumes:
      - opensearch-ml1:/usr/share/opensearch/data
    deploy:
      resources:
        limits:
          memory: 4g
    networks:
      - opensearch-net
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:2
    container_name: opensearch-dashboards
    ports:
      - 5601:5601
    expose:
      - '5601'
    environment:
      OPENSEARCH_HOSTS: https://opensearch-node1:9200
    networks:
      - opensearch-net

volumes:
  opensearch-data1:
  opensearch-data2:
  opensearch-ml1:


networks:
  opensearch-net:

I am running the following Python code after docker-compose up.

from opensearchpy import OpenSearch

cluster_url = 'https://localhost:9200'
username = 'admin'
password = 'admin'

client = OpenSearch(
    hosts=[cluster_url],
    http_auth=(username, password),
    verify_certs=False,
)
client.ping()  # True

# Set the native memory circuit breaker threshold to 95%.
client.transport.perform_request(
    method="PUT",
    url="/_cluster/settings",
    body={
        "persistent": {
            "plugins.ml_commons.native_memory_threshold": 95
        }
    },
)

# Upload a pre-trained model from the model repository.
upload_model_status = client.transport.perform_request(
    method="POST",
    url="/_plugins/_ml/models/_upload",
    body={
        # "name": "huggingface/sentence-transformers/all-mpnet-base-v2",
        "name": "huggingface/sentence-transformers/all-MiniLM-L6-v2",
        "version": "1.0.1",
        "model_format": "TORCH_SCRIPT",
    },
    params={"request_timeout": 1000},
)

# Check the status of the upload task.
model_upload_task_status_object = client.transport.perform_request(
    method="GET",
    url=f"/_plugins/_ml/tasks/{upload_model_status['task_id']}",
)

Then I get the error I mentioned above.

Could you increase it significantly more to rule that setting out? Say, to 32 GB?
Loaded models can take quite a bit of RAM at times. Also, could you share the sizes of your current circuit breakers? They’re found under _cluster/settings.

I don’t have that much memory on my machine.
However, I believe that the error message [ERROR][o.o.m.m.MLModelManager ] [opensearch-ml1] Failed to index chunk file is not connected to this issue.
Also, running GET _cat/nodes?v=true&h=name,node*,heap* gives the following.

name             id   node.role node.roles                                        heap.current heap.percent heap.max
opensearch-ml1   8Nl1 -         ml                                                     416.4mb           20      2gb
opensearch-node1 tNKA dimr      cluster_manager,data,ingest,remote_cluster_client        181mb           35    512mb
opensearch-node2 hX-6 dimr      cluster_manager,data,ingest,remote_cluster_client      204.2mb           39    512mb

I can’t reproduce the error. My guess is that it may be caused by the sync-up job, which regularly syncs up loaded models and cleans up any stale model cache files. Maybe the model file was removed before the upload started. Can you try disabling the sync-up job or increasing its interval? Refer to ML Commons cluster settings - OpenSearch documentation

Setting the interval to 0 will disable the sync-up job:

PUT _cluster/settings
{
  "persistent" : {
    "plugins.ml_commons.sync_up_job_interval_in_seconds": 0
  }
}

Hi,
I have a problem with updates. I uploaded the model to OpenSearch and created the pipeline like this:

{
  "description": "An example neural search pipeline",
  "processors" : [
    {
      "text_embedding": {
        "model_id": model_id,
        "field_map": {
           "search": "search_vector"
        }
      }
    }
  ]
}
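
For context, a hedged sketch of the index mapping that such a pipeline usually feeds into; the index and pipeline names here are hypothetical, and the dimension must match the embedding model (384 in this case):

PUT my-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "neural-pipeline"
  },
  "mappings": {
    "properties": {
      "search": { "type": "text" },
      "search_vector": {
        "type": "knn_vector",
        "dimension": 384
      }
    }
  }
}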

It seems to work for inserts and searches.
The search looks like this:

POST index/_search
{
  "_source": {
    "exclude": ["search_vector"]
  },
  "size": 30,
  "query": {
    "neural": {
      "search_vector": {
        "query_text": keywords,
        "model_id": this.modelId,
        "k": 30
      }
    }
  }
}

The insert looks like this:

POST index/_doc/${uuid()}
{
     ...item,
    "search"
}
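
Concretely, assuming the pipeline is attached to the index (e.g. as its default_pipeline), an insert might look like the following; the index name, document ID, and extra field are hypothetical, and search is the field that the field_map turns into search_vector:

POST my-index/_doc/1
{
  "title": "Example item",
  "search": "text that the pipeline will embed into search_vector"
}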

They both work: on insert the mapping works and search_vector is indexed as a 384-dimensional vector.
Unfortunately, the update fails. I try to update like this:

POST index/_update/${id}
{
    "doc": {
                   ...item, // this is a subset of the entered item
                   "search",
    }
}

I get this error:

{
    "error": {
        "root_cause": [
            {
                "type": "mapper_parsing_exception",
                "reason": "failed to parse field [search_vector] of type [knn_vector] in document with id 'd1ee90fe-1478-4b87-a198-b269be86eec4'. Preview of field's value: 'null'"
            }
        ],
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [search_vector] of type [knn_vector] in document with id 'd1ee90fe-1478-4b87-a198-b269be86eec4'. Preview of field's value: 'null'",
        "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Vector dimension mismatch. Expected: 384, Given: 768"
        }
    },
    "status": 400
}

@alessnid Which version of OpenSearch are you using? Can you try with OpenSearch version >= 2.5? I remember this was a bug in OpenSearch 2.4.

@Navneet I’m already using version 2.5