[RFC] Search Pipelines

We have an experimental release of Search Pipelines in OpenSearch 2.8. A search pipeline starts at query submission and ends at result fetching. In between, we see any number of possibilities, such as query rewriting, analyzers, and reranking, that can be integrated as processors in the pipeline. We want to hear more from the community as we start to evolve the code in the repo where this issue lives into the pipeline: [RFC] Search pipelines · Issue #80 · opensearch-project/search-processor · GitHub.
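To make the idea concrete, here is a toy sketch in Python of a pipeline that runs request processors before the search and response processors after it. This is not the OpenSearch API: `ToyPipeline`, `lowercase_query`, `rerank_by_score`, and `fake_backend` are all hypothetical names made up for illustration, and the real implementation is the Java code linked in this thread.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; not the OpenSearch API.
Request = Dict[str, Any]
Response = Dict[str, Any]


@dataclass
class ToyPipeline:
    """Runs request processors, then the search, then response processors."""
    request_processors: List[Callable[[Request], Request]] = field(default_factory=list)
    response_processors: List[Callable[[Response], Response]] = field(default_factory=list)

    def search(self, request: Request, backend: Callable[[Request], Response]) -> Response:
        # Query-side steps (for example, query rewriting) run in order.
        for processor in self.request_processors:
            request = processor(request)
        response = backend(request)
        # Result-side steps (for example, reranking) run in order.
        for processor in self.response_processors:
            response = processor(response)
        return response


def lowercase_query(request: Request) -> Request:
    # A stand-in for query rewriting: normalize the query text.
    return {**request, "q": str(request["q"]).lower()}


def rerank_by_score(response: Response) -> Response:
    # A stand-in for reranking: sort hits by descending score.
    hits = sorted(response["hits"], key=lambda hit: hit["score"], reverse=True)
    return {**response, "hits": hits}


def fake_backend(request: Request) -> Response:
    # Assumption: a fake search backend that returns two fixed hits.
    return {
        "query_seen": request["q"],
        "hits": [{"id": "a", "score": 0.2}, {"id": "b", "score": 0.9}],
    }


pipeline = ToyPipeline([lowercase_query], [rerank_by_score])
result = pipeline.search({"q": "OpenSearch Pipelines"}, fake_backend)
```

In this sketch the backend sees the rewritten query, and the caller sees the reranked hits; neither side needs to know which processors ran.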

If you want to go directly to the code, the search pipeline plugin is here: OpenSearch/server/src/main/java/org/opensearch/search/pipeline at main · opensearch-project/OpenSearch · GitHub, and the core processors are here: OpenSearch/modules/search-pipeline-common at main · opensearch-project/OpenSearch · GitHub.

Looking forward to good discussion and more possibilities with this feature.

Looks like we’ve received some input - hopefully the community continues to add useful feedback.

@markcohen - how long will the RFC be open?

I’d like to keep this open until mid-March. We are planning to build and release the initial set of features in 2.7.0 in mid-April. Also, the project is beginning to be tracked here: [META] Search Pipelines · Issue #6278 · opensearch-project/OpenSearch · GitHub for additional input/insight for the community.

@kris, the first PR was just merged into OpenSearch/main and we closed the RFC issue. If it makes sense to close out this RFC now, please go ahead.

Thanks!

thanks @markcohen - I’ll close this thread

@markcohen - opened it back up at your request

Thanks @kris. For folks interested, currently the best place to see most work happening in Search Pipelines is here: Search Applications Vertical · GitHub.

Also, [RFC]: Search Phase Injector Processor · Issue #152 · opensearch-project/neural-search · GitHub is a great example of another phase being included in search pipelines.

There’s more we can do here and we want input/questions/PRs/issues for Search Pipelines. Documentation is coming soon.

The base plugin is a good place to start: OpenSearch/SearchPipelinePlugin.java at 69f4ac189c83270265a0c178a642e8d0de3edd90 · opensearch-project/OpenSearch · GitHub

We’re aiming to close this forum post on the original search pipelines implementation by 2023-07-15.

I did a quick demo in a lightning talk at Haystack 2023. It starts around 24:30 at Haystack US 2023 - Lightning Talks - YouTube.

The scripting processor that we shipped in 2.8 doesn’t work the way it did in that demo (since that was a hacky implementation that only worked with query_string queries).

Still, the demo (I hope) conveys the general idea of the search pipelines feature and its motivation.

We are keeping this open for ideas and input after the OpenSearch 2.9 release which includes Search Pipelines. Here are some processors on our roadmap. We would be happy to take contributions of any kind (PRs for new processors, feedback on the ones we have in the roadmap, and anything in between):

If you do want to create or request a new processor, please create an issue here: Sign in to GitHub · GitHub.

Looking forward to more ideas and discussion.

@markcohen what about Processors that modify both request and response ("bracket processor") · Issue #6722 · opensearch-project/OpenSearch · GitHub?