400 'illegal_argument_exception', 'explicit index in bulk is not allowed'

Folks, I am trying to use OpenSearch with LangChain and am getting the following error.

Error

RequestError(400, 'illegal_argument_exception', 'explicit index in bulk is not allowed')

Setup

OpenSearch Versions tested: 1.3.17, 2.18.0, 2.9.0

Code

from langchain_community.vectorstores.opensearch_vector_search import OpenSearchVectorSearch
from langchain_core.documents import Document
from langchain_openai import AzureOpenAIEmbeddings

data = [
    {
        "category": "category",
        "sub_category": "sub_category",
        "intent": "intent",
        "intent_description": "intent_description",
        "sample_queries": "sample_queries",
        "sop": "sop",
        "functions": "functions"
    }
]

documents = []
for item in data:

    # Text to embed: intent, description and sample queries, one per line
    embedding_content = f"{item['intent']}\n{item['intent_description']}\n{item['sample_queries']}"
    doc = Document(page_content=embedding_content)

    doc.metadata = item

    documents.append(doc)

embedding = AzureOpenAIEmbeddings(
    openai_api_key=settings.open_ai.key,
    deployment=settings.open_ai.deployment,
    model=settings.open_ai.model,
    azure_endpoint=settings.open_ai.api_base
)
esHost = settings.es.host + ":" + settings.es.port

OpenSearchVectorSearch.from_documents(documents=documents,
    embedding=embedding,
    opensearch_url=esHost,
    index_name="some_index")

I am using the following version of langchain_community:

langchain_community===0.0.27


Stack Trace:

File "/app/commands/migrations/sop_elastic_migration.py", line 137, in __migrate_from_csv
  OpenSearchVectorSearch.from_documents(documents=documents,
File "/usr/local/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 550, in from_documents
  return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/opensearch_vector_search.py", line 843, in from_texts
  return cls.from_embeddings(
         ^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/opensearch_vector_search.py", line 967, in from_embeddings
  _bulk_ingest_embeddings(
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/opensearch_vector_search.py", line 138, in _bulk_ingest_embeddings
  bulk(client, requests, max_chunk_bytes=max_chunk_bytes)
File "/usr/local/lib/python3.11/site-packages/opensearchpy/helpers/actions.py", line 407, in bulk
  for ok, item in streaming_bulk(
File "/usr/local/lib/python3.11/site-packages/opensearchpy/helpers/actions.py", line 327, in streaming_bulk
  for data, (ok, info) in zip(
File "/usr/local/lib/python3.11/site-packages/opensearchpy/helpers/actions.py", line 263, in _process_bulk_chunk
  for item in gen:
File "/usr/local/lib/python3.11/site-packages/opensearchpy/helpers/actions.py", line 204, in _process_bulk_chunk_error
  raise error
File "/usr/local/lib/python3.11/site-packages/opensearchpy/helpers/actions.py", line 247, in _process_bulk_chunk
  resp = client.bulk("\n".join(bulk_actions) + "\n", *args, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/opensearchpy/client/utils.py", line 179, in _wrapped
  return func(*args, params=params, headers=headers, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/opensearchpy/client/__init__.py", line 411, in bulk
  return self.transport.perform_request(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/opensearchpy/transport.py", line 409, in perform_request
  raise e
File "/usr/local/lib/python3.11/site-packages/opensearchpy/transport.py", line 370, in perform_request
  status, headers_response, data = connection.perform_request(
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/opensearchpy/connection/http_urllib3.py", line 266, in perform_request
  self._raise_error(
File "/usr/local/lib/python3.11/site-packages/opensearchpy/connection/base.py", line 301, in _raise_error
  raise HTTP_EXCEPTIONS.get(status_code, TransportError)(
sop_tests_1   | opensearchpy.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'explicit index in bulk is not allowed')

Logs from Opensearch Container

opensearch_1  | [2025-01-28T11:49:56,281][INFO ][o.o.p.PluginsService     ] [ce040dadc9a7] PluginService:onIndexModule index:[some_index/3uAdT1nxT8KAnYo3Wb_ozw]
opensearch_1  | [2025-01-28T11:49:56,292][INFO ][o.o.c.m.MetadataCreateIndexService] [ce040dadc9a7] [some_index] creating index, cause [api], templates [], shards [1]/[1]
opensearch_1  | [2025-01-28T11:49:56,307][INFO ][o.o.p.PluginsService     ] [ce040dadc9a7] PluginService:onIndexModule index:[some_index/3uAdT1nxT8KAnYo3Wb_ozw]
opensearch_1  | [2025-01-28T11:49:56,315][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [ce040dadc9a7] Detected cluster change event for destination migration
opensearch_1  | [2025-01-28T11:49:56,357][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [ce040dadc9a7] Detected cluster change event for destination migration

I solved this by adding "rest.action.multi.allow_explicit_index=true" as an environment variable on the OpenSearch container.

docker-compose.yml:

version: '3'
services:
  redis:
    image: redis:6.2-alpine
    restart: always
    ports:
      - '6379:6379'
    command: redis-server --loglevel warning
    volumes:
      - cache:/data
  opensearch:
    image: opensearchproject/opensearch:2.9.0
    environment:
      - "network.host=0.0.0.0"
      - "http.cors.enabled=true"
      - "http.cors.allow-origin=*"
      - "rest.action.multi.allow_explicit_index=true"
      - "discovery.type=single-node"
      - "plugins.security.disabled=true"
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "DISABLE_SECURITY_DASHBOARDS_PLUGIN=true"
      - "OPENSEARCH_INITIAL_ADMIN_PASSWORD=Razorp@y123"
      - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - 9200:9200
      - 9600:9600

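You probably want to set the configuration option rest.action.multi.allow_explicit_index to true. The _bulk API takes newline-delimited JSON in which each action line can name its target index as part of the payload, and that is presumably what the bulk ingestion in your stack trace is doing. When the option is false, OpenSearch rejects any bulk request that names an index in the request body, which produces exactly this error.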
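To make the difference concrete, here is a minimal sketch of the two payload shapes. It only builds the newline-delimited JSON and does not talk to a cluster. The helper name build_bulk_body is hypothetical (it is not part of opensearch-py or LangChain), and the sample document is made up for illustration.

```python
import json


def build_bulk_body(docs, index_name=None):
    """Build a newline-delimited JSON body for the _bulk API.

    Hypothetical helper for illustration only. If index_name is given,
    every action line carries an explicit "_index" in the payload.
    """
    lines = []
    for doc in docs:
        action = {"index": {}}
        if index_name is not None:
            action["index"]["_index"] = index_name
        lines.append(json.dumps(action))  # action/metadata line
        lines.append(json.dumps(doc))     # document source line
    # The bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


docs = [{"intent": "intent", "sop": "sop"}]

# Explicit index in the payload: would be sent to POST /_bulk.
# This is the shape that is rejected when allow_explicit_index is false.
explicit_body = build_bulk_body(docs, index_name="some_index")

# No index in the payload: would be sent to POST /some_index/_bulk instead.
url_scoped_body = build_bulk_body(docs)

print(explicit_body)
print(url_scoped_body)
```

With rest.action.multi.allow_explicit_index set to false, only the second shape (index taken from the URL) is accepted, so a client that always writes "_index" into the action line needs the option set to true.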