Is there a way to create Sparse Neural index that uses raw vectors?

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
OpenSearch v2.12
CentOS 7

Describe the issue:
Is it possible to create an index for sparse neural searching, but without using the pipeline to generate the vectors at index time?

The inference step when indexing content makes indexing very slow.

I’m trying to figure out if there’s a way I can cache the vectors so that when I need to re-build an index I can just use the cached vectors. That way I can quickly re-build or re-index our data.

Here’s the workflow I’m curious if I can achieve:

  1. Run text through a sparse model to generate the sparse vectors (such as with the _simulate endpoint; see the sketch after this list).
  2. Take the vectors generated and store them outside of OpenSearch along with our original data.
  3. When indexing an article, pass in the pre-generated vectors (along with all the other index information), skipping the pipeline inference step.
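For step 1, one option is the ingest pipeline _simulate API mentioned above. This is just a sketch; the pipeline name my-sparse-pipeline and the passage_text field are placeholders for illustration:

```
POST /_ingest/pipeline/my-sparse-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "passage_text": "hello demo"
      }
    }
  ]
}
```

The simulated documents in the response should contain the generated token/weight map, which could then be stored alongside the original source data.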

This workflow would provide a few benefits:

  • We often have to update a document in an index, and not all of the information being converted to vectors has changed. If we handle the vectorization outside of the index step, we can skip creating vectors for data that has not changed.
  • We often re-build indices. With non-ML indexes, we can generally re-build all of our indexes in minutes (we’re not dealing with millions of docs, fewer than 200,000). However, it generally takes us 2-4 seconds per document once we enable the ML pipeline, which makes re-building an index impractical. If we could just cache the vectors with our main document source data, we believe the re-index time would be much closer to what it is now.
  • It gives us a strategy for migrating models. If we split the vectorization process from indexing, then when we want to switch models we can run a job to create new vectors based on the new model, and once all the documents are completed, we can re-index using the new vector information. While we could do this by versioning our indices, we like being able to do this outside of the index process.

I’ve looked through the documentation and I’ve been searching, but I don’t see any examples of how this could be done.

I did find that someone wanted to do the same kind of thing for queries; that seems to be targeted for OpenSearch 2.14.

Is there a way to accomplish this?

To answer my own question, it appears you can use the Predict or _simulate endpoint APIs to generate the embeddings and then just pass the resulting vector values in when indexing your document instead of using the pipeline.
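For example, the ML Commons Predict API can be called directly against a deployed sparse encoding model. This is only a sketch, with <model_id> standing in for a deployed model ID; the exact request and response format may vary by version:

```
POST /_plugins/_ml/models/<model_id>/_predict
{
  "text_docs": ["hello demo"]
}
```

The token/weight map returned in the response can be stored externally and later written straight into a rank_features field, as shown in the reply below.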

Is there any reason this might be a bad idea?

Hi @dswitzer2, if you want to ingest data without a pipeline, you can do it like this.
Create the index without a pipeline:

```
PUT /demo_index/
{
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      },
      "passage_sparse": {
        "type": "rank_features"
      }
    }
  }
}
```

Index data:

```
POST /demo_index/_doc/1
{
  "passage_text": "hello demo",
  "passage_sparse": {"demo": 0.5, "hello": 1.0}
}
```

Then you can search this index using a neural_sparse query clause.
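For example, a query against the index above might look like this (a sketch, assuming a sparse encoding model has been registered and deployed with ID <model_id>):

```
GET /demo_index/_search
{
  "query": {
    "neural_sparse": {
      "passage_sparse": {
        "query_text": "hello",
        "model_id": "<model_id>"
      }
    }
  }
}
```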

Yes, I can confirm this works.

For your idea here, this is actually what we do in the ingestion pipeline. If you want to store a copy outside OpenSearch, this is a good solution.

Thanks for confirming!

When we re-index content, a lot of the time the data being embedded has not changed, so being able to cache our embeddings will give us a boost. This will be especially useful when migrating between environments, since we can skip the inference process completely when we re-index content into a fresh environment.
