[Feedback] Conversational Search and Retrieval Augmented Generation Using Search Pipeline - Experimental Release

In OpenSearch 2.10, we launched two new features that bring generative AI capabilities to OpenSearch. The first is Memory, a building block that lets search applications and agents store and retrieve conversational history. The second is a new search processor that handles RAG (Retrieval-Augmented Generation), combining search results, large language models, and conversational memory to answer users’ questions. RAG in OpenSearch relies on the remote inference framework and the connector feature. When you put all of these pieces together to have conversations over your data, we also recommend trying Hybrid Search, which combines BM25 and k-NN, to get the most out of OpenSearch.
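
Both features are experimental in 2.10 and are disabled by default; as a quick sketch, you can turn them on with cluster settings like the following (the memory flag name here follows the ml-commons documentation):

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.rag_pipeline_feature_enabled": true,
    "plugins.ml_commons.memory_feature_enabled": true
  }
}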

We are looking forward to getting the community’s feedback on these features. We are excited to make them available in the 2.10 release and to have people try out conversational search. We think this new mode of interacting with your data enables you to get better search results. Please try it out and help us make it even better.

For a more detailed discussion on this, you can check out our RFC - https://github.com/opensearch-project/ml-commons/issues/1150.

How can I build a RAG pipeline with a model other than OpenAI, Cohere, or SageMaker?
Can I use a Hugging Face transformer or BERT model for predicting sentences, without having a Hugging Face key?

I am using version 2.10.

If yes, how do I build an HTTP connector for it? Or how can I load the model?

I tried the request below:

POST /_plugins/_ml/models/_upload

{
  "name": "huggingface/TheBloke/vicuna-13B-1.1-GPTQ",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}

But after creating the model, I get the error below. Can I load any other LLM model for RAG?

{
    "error": {
        "root_cause": [
            {
                "type": "m_l_exception",
                "reason": "plugins.ml_commons.rag_pipeline_feature_enabled is not enabled."
            }
        ],
        "type": "m_l_exception",
        "reason": "plugins.ml_commons.rag_pipeline_feature_enabled is not enabled."
    },
    "status": 500
}

I have already enabled RAG, but I still get the above error:

{
  "persistent": {
    "plugins.ml_commons.rag_pipeline_feature_enabled": "true"
  }
}

This is fixed by adding the endpoint to the trusted endpoints list.
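
For reference, a rough sketch of adding a trusted endpoint via the plugins.ml_commons.trusted_connector_endpoints_regex cluster setting (the regex below is only an example for the OpenAI endpoint; use a pattern that matches your own endpoint):

PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.trusted_connector_endpoints_regex": [
      "^https://api\\.openai\\.com/.*$"
    ]
  }
}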

Some examples are wrong in the conversational search documentation; please correct them.

  1. Extra space in the OpenAI connector request_body, in the temperature parameter

  2. The model group creation endpoint is wrong; it should be:

POST /_plugins/_ml/model_groups/_register

https://www.awesomescreenshot.com/image/44260054?key=33f05fb84a8c6a60a640407a927edb23

It would be nice if RAG could use the highlight results as well as the _source.


@sribalajivelan, these two are fixed now.

I’m attempting to use the Anthropic Claude via Bedrock connector with Conversational Search.
The connector blueprint specifies that the prompt is populated from ${parameters.inputs} (ml-commons/docs/remote_inference_blueprints/bedrock_connector_anthropic_claude_blueprint.md at 2.x · opensearch-project/ml-commons · GitHub).

I have defined my pipeline:

PUT /_search/pipeline/rag_pipeline2
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "claude_rag",
        "description": "Demo pipeline Using Claude Connector",
        "model_id": "eFbAQo4BZvVG6ToieVK8",
        "context_field_list": ["text"],
  			"llm_model": "anthropic.claude-v2",
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}

When using the RAG pipeline with a simple search I get the error:

GET /tweet-index/_search?search_pipeline=rag_pipeline2
{
  "query": {
    "match": {
      "text": "simple"
    }
  },
  "ext": {
    "generative_qa_parameters": {
      "llm_question": "Was Abraham Lincoln a good politician"
    }
  }
}

    "type": "illegal_argument_exception",
    "reason": "Some parameter placeholder not filled in payload: inputs"

I cannot find how to instruct the search pipeline to populate the ‘inputs’ field.
I have also tried changing the Bedrock connector to use ${parameters.messages}, as the OpenAI connector does, but this then returns:

    "type": "illegal_argument_exception",
    "reason": "Invalid JSON in payload"

How do you customise a search or search pipeline so that the prompt is formatted to match the input parameters of the connector?

I tried it with my own internal server, which offers an OpenAI-compatible HTTP interface to a model. I added it to the trusted connector endpoints, so the server URL is accepted now. Nevertheless, I now get an error saying that it has a private IP address:
[ERROR][o.o.m.e.h.MLHttpClientFactory] [port-4106] Remote inference host name has private ip address: serv-3329
[ERROR][o.o.m.e.a.r.HttpJsonConnectorExecutor] [port-4106] Fail to execute http connector
java.lang.IllegalArgumentException: serv-3329

How can I connect to my model server now?

Hi,

I had (almost) no problem using this from the OpenSearch Dashboards web UI / Dev Tools. But now I am trying it out via the JavaScript SDK and I am running into a problem once I try to send a query:

GenerativeQAResponseProcessor failed in precessing response

In my server logs I see:

java.lang.IllegalArgumentException: Invalid payload: "{ "model": "gpt-3.5-turbo", "messages": [{"role":"system","content":"You are a helpful assistant"},{"role":"user","content":"Generate a concise and informative answer in less than 100 words for the given question"},{"role":"user","content":"QUESTION: Tell me about 10Fold"},{"role":"user","content":"ANSWER:"}], "temperature": 0 }"

I can see I have something wrong in my setup; I just can’t figure out what.

Are you saying this happens with a connector that works when you use it from the OS dashboards?

Can you comment on this issue - [BUG] externally hosted model can not have a private ip address · Issue #2142 · opensearch-project/ml-commons · GitHub

For now, you can specify “bedrock” as a model prefix when creating a search pipeline:
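
Something like this (a sketch reusing the pipeline definition from above, with the "bedrock/" prefix added to llm_model):

PUT /_search/pipeline/rag_pipeline2
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "claude_rag",
        "description": "Demo pipeline Using Claude Connector",
        "model_id": "eFbAQo4BZvVG6ToieVK8",
        "context_field_list": ["text"],
        "llm_model": "bedrock/anthropic.claude-v2",
        "system_prompt": "You are a helpful assistant",
        "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question"
      }
    }
  ]
}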

You won’t have to do that starting with OpenSearch 2.13.

Sorry for the confusion; no, I am recreating the same setup as the tutorial. Here is my code:

export async function recreateIndex() {
  console.log("--- Recreating index ---");
  const client = getClient();

  try {
    const response1 = await client.indices.delete({
      index,
    });
    console.log("response1 :>>", JSON.stringify(response1.body, null, 2));
  } catch (error) {
    // Ignore if index does not exist
  }

  const response2 = await client.http.post({
    path: "/_plugins/_ml/connectors/_create",
    body: {
      name: "OpenAI Chat Connector",
      description: "The connector to public OpenAI model service for GPT 3.5",
      version: 2,
      protocol: "http",
      parameters: {
        endpoint: "api.openai.com",
        model: "gpt-3.5-turbo",
        temperature: 0,
      },
      credential: {
        openAI_key: "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",
      },
      actions: [
        {
          action_type: "predict",
          method: "POST",
          url: "https://${parameters.endpoint}/v1/chat/completions",
          headers: {
            Authorization: "Bearer ${credential.openAI_key}",
          },
          request_body:
            '"{ "model": "${parameters.model}", "messages": ${parameters.messages}, "temperature": ${parameters.temperature} }"',
        },
      ],
    },
  });
  const { connector_id } = response2.body;
  console.log("connector_id :>> ", connector_id);

  const response3 = await client.http.post({
    path: "/_plugins/_ml/models/_register",
    body: {
      name: "openAI-gpt-3.5-turbo",
      function_name: "remote",
      description: "test model",
      connector_id,
    },
  });
  const { model_id } = response3.body;
  console.log("model_id :>> ", model_id);

  const response4 = await client.http.post({
    path: `/_plugins/_ml/models/${model_id}/_deploy`,
  });
  console.log("response4 :>> ", response4.body);

  const response5 = await client.http.put({
    path: "/_search/pipeline/rag_pipeline",
    body: {
      response_processors: [
        {
          retrieval_augmented_generation: {
            tag: "openai_pipeline_demo",
            description: "Demo pipeline Using OpenAI Connector",
            model_id,
            context_field_list: ["textContent"],
            system_prompt: "You are a helpful assistant",
            user_instructions:
              "Generate a concise and informative answer in less than 100 words for the given question",
          },
        },
      ],
    },
  });
  console.log("response5 :>>", JSON.stringify(response5.body, null, 2));

  const response6 = await client.http.put({
    path: `/${index}`,
    body: {
      settings: {
        "index.search.default_pipeline": "rag_pipeline",
      },
      mappings: {
        properties: {
          textContent: {
            type: "text",
          },
        },
      },
    },
  });
  console.log("response6 :>>", JSON.stringify(response6.body, null, 2));
}