Hi all,
Could you please provide us with the sparse encoder weights? I remember it was stated in the related issue that the weights would be released. Can you also provide links to existing models we can try with the sparse encoding feature?
Thanks
We apologize for the late delivery. Please wait a few extra days until we finish all the internal open-source release clearance. The weights are on their way.
Are there any other models that we can try out for now? I tried Intel/bert-base-uncased-sparse-90-unstructured-pruneofa from Hugging Face, but I could not register it. The request and the response I got are below:
POST /_plugins/_ml/models/_register
{
  "name": "Intel/bert-large-uncased-sparse-90-unstructured-pruneofa",
  "version": "1.0.1",
  "model_group_id": "XL4wQYsBDV2us3MwXfy9",
  "model_format": "TORCH_SCRIPT"
}

{
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "FAILED",
  "worker_node": [
    "H-LKxNKoT52yZttSBnoLdA"
  ],
  "create_time": 1697608915265,
  "last_update_time": 1697608916467,
  "error": "This model is not in the pre-trained model list, please check your parameters.",
  "is_async": true
}
Thanks for your attention to sparse encoding! We do allow customers to use their own models, but the register request body has to follow the requirements: you need to give the URL of your TorchScript artifact zip file, and you also need to provide the hash value of that artifact. The zip file must contain the TorchScript .pt file and the tokenizer .json file. Judging from the model name, your model is currently a PyTorch .bin file, so you need to convert it to TorchScript first and then register your own model like:
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-v1",
  "version": "1.0.0",
  "description": "This is a neural sparse encoding model: It transfers text into sparse vector, and then extract nonzero index and value to entry and weights. It serves in both ingestion and search",
  "model_format": "TORCH_SCRIPT",
  "function_name": "SPARSE_ENCODING",
  "model_content_hash_value": "9a41adb6c13cf49a7e3eff91aef62ed5035487a6eca99c996156d25be2800a9a",
  "url": "https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1/1.0.0/torch_script/opensearch-neural-sparse-encoding-doc-v1-1.0.0-torch_script.zip"
}
This is one of the pretrained models. We provide a bi-encoder model, a doc-only model, and a tokenizer model.
We have already released the models, so you can use them. We will update our documentation on pretrained models and on how to upload your own sparse encoding model. Thanks.
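For reference, the register call runs asynchronously (note "is_async": true in the response above). A minimal sketch of the follow-up steps, where <task_id> and <model_id> are placeholders for the IDs returned by the APIs:

GET /_plugins/_ml/tasks/<task_id>

Once the task state is COMPLETED, the response contains the model ID, and the model has to be deployed before use:

POST /_plugins/_ml/models/<model_id>/_deploy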
Thank you so much. It works now. Do you have any guide on fine-tuning this model?
I got nice results; however, indexing was quite slow. Is there any way to make it faster?
Currently we do not plan to release any guidance on fine-tuning. We may release it in the future.
What do you mean by indexing? Are you using an ingestion pipeline?
I did use a pipeline as per the documentation.
PUT /_ingest/pipeline/got-pipeline
{
  "description": "A sparse encoding ingest pipeline",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "DtpBSYsBgoSEiZHXptMY",
        "field_map": {
          "content": "embedding"
        }
      }
    }
  ]
}
PUT got_sparse
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text"
      },
      "embedding": {
        "type": "rank_features"
      }
    }
  },
  "settings": {
    "index": {
      "replication": {
        "type": "DOCUMENT"
      },
      "number_of_shards": "1",
      "default_pipeline": "got-pipeline",
      "number_of_replicas": "1"
    }
  }
}
PUT got_sparse/_doc/1
{
  "content": "kills Theon Greyjoy, and prepares to strike down Bran. However, the Night King is ambushed and killed by Arya Stark with the Valyrian steel dagger that Bran had previously given her (\"The Spoils of War\"), which causes both him and the other White Walkers to shatter and results in the complete obliteration of the Army of the Dead."
}
This sometimes takes over 300 seconds and sometimes 500 seconds. Then I tried the same with a dense vector (using the sentence-transformers/msmarco-distilbert-cos-v5 model) as shown below.
PUT got_index_dense_v512_2/_doc/1
{
  "content": "kills Theon Greyjoy, and prepares to strike down Bran. However, the Night King is ambushed and killed by Arya Stark with the Valyrian steel dagger that Bran had previously given her (\"The Spoils of War\"), which causes both him and the other White Walkers to shatter and results in the complete obliteration of the Army of the Dead."
}
The above took 160 seconds!
Also, one other thing I noticed is that if I do direct inference using the ml-commons API, the sparse model gives me results faster! The request below uses the sparse model, and it returned results in 378.
POST /_plugins/_ml/models/DtpBSYsBgoSEiZHXptMY/_predict
{
  "text_docs": ["kills Theon Greyjoy, and prepares to strike down Bran. However, the Night King is ambushed and killed by Arya Stark with the Valyrian steel dagger that Bran had previously given her (\"The Spoils of War\"), which causes both him and the other White Walkers to shatter and results in the complete obliteration of the Army of the Dead."]
}
Then I tried the same with the SBERT model and it returned in 444 seconds!
POST /_plugins/_ml/models/Ado3SYsBgoSEiZHXidNO/_predict
{
  "text_docs": ["kills Theon Greyjoy, and prepares to strike down Bran. However, the Night King is ambushed and killed by Arya Stark with the Valyrian steel dagger that Bran had previously given her (\"The Spoils of War\"), which causes both him and the other White Walkers to shatter and results in the complete obliteration of the Army of the Dead."]
}
Am I doing something wrong above? Is there anything I can do to make it go faster?
Note that this is currently running on a single CPU-only machine.
Can you provide some detailed numbers, such as latency or throughput?
Since the sparse encoder runs deep language model inference at ingestion time, the computation cost is high; we usually employ extra ML nodes for such computation.
BTW, would you share your hardware configuration?
Our encoding model is about twice as large as a BERT base model, so its latency will be higher than the distilbert-cos-v5 model. But it is unusual for inference to take this much time.
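As a rough sketch of the extra-ML-node setup mentioned above (it assumes the cluster actually has nodes started with the ml role), inference can be restricted to those dedicated nodes via a dynamic cluster setting:

PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": true
  }
}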
What is the max text length the OpenSearch sparse model can handle? I know, for instance, that BERT-based models can handle 512 tokens at a time and the remaining tokens are ignored. Is that the case for this model?
Yes. We truncate the input to 512 tokens.
Is that something automatically done behind the scenes? I mean, if I have long text, say 3000 tokens, what would the sparse vector represent in this case? Would it represent the whole 3000 tokens, or only the first 512 with the rest ignored?
Yes, only the first 512; the remaining tokens are ignored. We do the truncation inside our model.
The current test I did was on a laptop, but I am planning to test it on a bigger cluster.
I was expecting some slowness compared to a normal Lucene index, but the difference in ingestion time is too big: Lucene is done in seconds, while sparse fields take several minutes for the same data.
I understand your scenario. Sparse encoding still needs to run model inference on the data, just like text embedding, so it takes a lot of time. But if you choose the amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1 model at ingestion and the amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1 model at search, you can get accuracy comparable to semantic search with text embeddings while keeping search latency small, like Lucene. Please see the model list here: Pretrained models - OpenSearch documentation.
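For example, a minimal sketch of a neural sparse search against the got_sparse index from earlier in this thread, assuming the tokenizer model has been registered and deployed (its model ID below is a placeholder):

GET got_sparse/_search
{
  "query": {
    "neural_sparse": {
      "embedding": {
        "query_text": "who killed the Night King",
        "model_id": "<tokenizer_model_id>"
      }
    }
  }
}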
The amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1 model (https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/opensearch-neural-sparse-tokenizer-v1-1.0.0.zip) failed the deployment. I checked the zip file; it doesn't contain the .pt file.
Can you please check and upload a correct model?
That's because the tokenizer only contains a JSON file for tokenizing. I think you may have used the wrong function name. You should use "SPARSE_TOKENIZE" when registering the tokenizer model.
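As a sketch, the register request for the tokenizer would then look like the following, where the model group ID and the hash value (the SHA-256 of the zip artifact) are placeholders:

POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1",
  "version": "1.0.0",
  "model_group_id": "<your_model_group_id>",
  "model_format": "TORCH_SCRIPT",
  "function_name": "SPARSE_TOKENIZE",
  "model_content_hash_value": "<sha256 of the zip>",
  "url": "https://artifacts.opensearch.org/models/ml-models/amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1/1.0.0/torch_script/opensearch-neural-sparse-tokenizer-v1-1.0.0.zip"
}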
You are right. I used the wrong function name. Thanks.