Versions: OpenSearch 3.2.0
Describe the issue: When running OpenSearch in a Docker container on Windows and trying to register a zipped ML model, the result is inconsistent. Sometimes registration succeeds and the failure occurs later, during deployment; other times it fails outright with the message “Model content changed”. Raising the log verbosity with “logger.org.opensearch.ml : TRACE” did not yield much more useful information. I made sure to include the required “model_content_hash_value” field with a manually calculated SHA-256 hash. I calculated it both in Windows and in a WSL terminal, and both values produced the same error. Note that this is NOT the same as the macOS issue discussed previously. I was hoping for a clean resolution, but I am unsure what the underlying problem is or what the “changed” hash is, because it is not printed in the logs. The zipped model contains the following files:
- config.json
- model.pt
- model_config.json
- sentencepiece.bpe.model
- special_tokens_map.json
- tokenizer.json
- tokenizer_config.json
and the zipped archive contents are flat (no sub-folders).
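For anyone trying to reproduce this, here is a minimal sketch of one way to compute the SHA-256 value of the archive and confirm that its contents are flat. The file name `model.zip` and the helper names `sha256_of_file` and `archive_is_flat` are assumptions made for illustration, and this is not necessarily how the hash values mentioned above were produced.

```python
import hashlib
import zipfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file's raw bytes."""
    digest = hashlib.sha256()
    # Binary mode avoids any newline translation on Windows.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_flat(path):
    """Return True if no entry in the zip archive sits inside a sub-folder."""
    with zipfile.ZipFile(path) as zf:
        return all("/" not in name and "\\" not in name for name in zf.namelist())


# Example usage (hypothetical file name):
#   print(sha256_of_file("model.zip"))
#   print(archive_is_flat("model.zip"))
```

Running the same function on the host and on the copy of the file that the container actually downloads should show whether the bytes differ between the two.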
Configuration:
(jvm.options):
-Xms6g
-Xmx6g
(opensearch.yml):
plugins.security.disabled: true
plugins.security.ssl.http.enabled: false
plugins.security.ssl.transport.enabled: false
(Security has been temporarily disabled for development purposes, and HTTP is used instead of HTTPS.)
plugins.ml_commons.allow_registering_model_via_url: true
plugins.ml_commons.only_run_on_ml_node: false
(ML tasks run on “normal” nodes rather than dedicated ML nodes.)
Relevant Logs or Screenshots:
Docker debug output (TRACE-level logs for the OpenSearch instance):
[2025-09-05T08:57:29,905][ERROR][o.o.m.m.MLModelManager ] [ba1c7cf4301f] Model content hash can’t match original hash value
[2025-09-05T08:57:29,905][DEBUG][o.o.m.m.MLModelCacheHelper] [ba1c7cf4301f] removing model fwELGZkBpgN9ztMCbCq9 from cache
[2025-09-05T08:57:30,074][DEBUG][o.o.n.r.t.AverageCpuUsageTracker] [ba1c7cf4301f] Recording cpu usage: 26%
[2025-09-05T08:57:30,085][DEBUG][o.o.m.m.MLModelCacheHelper] [ba1c7cf4301f] Setting the auto deploying flag for Model fwELGZkBpgN9ztMCbCq9
[2025-09-05T08:57:30,085][WARN ][o.o.m.a.d.TransportDeployModelOnNodeAction] [ba1c7cf4301f] Model deployment failed on local node: H0buCavoTtGs17zU4o6kuw. Sending FAILED message to coordinating node H0buCavoTtGs17zU4o6kuw with error: model content changed
[2025-09-05T08:57:30,086][DEBUG][o.o.t.TransportService ] [ba1c7cf4301f] Action: cluster:admin/opensearch/mlinternal/forward
[2025-09-05T08:57:30,086][DEBUG][o.o.m.a.f.TransportForwardAction] [ba1c7cf4301f] receive forward request: DEPLOY_MODEL_DONE
[2025-09-05T08:57:30,086][DEBUG][o.o.m.t.MLTaskManager ] [ba1c7cf4301f] add task error: taskId: gAEVGZkBpgN9ztMCZipO, workerNodeId: H0buCavoTtGs17zU4o6kuw, error: model content changed
[2025-09-05T08:57:30,086][DEBUG][o.o.m.t.MLTaskManager ] [ba1c7cf4301f] remove ML task from cache gAEVGZkBpgN9ztMCZipO
[2025-09-05T08:57:30,086][ERROR][o.o.m.a.f.TransportForwardAction] [ba1c7cf4301f] deploy model failed on all nodes, model id: fwELGZkBpgN9ztMCbCq9