OpenSearch 2.13 text chunking test error

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
opensearch version : 2.13

Describe the issue:
I am working through text chunking by following the link below.
(Text chunking - OpenSearch Documentation)

An error occurs during Step 3.

Step 1: Create a pipeline

PUT _ingest/pipeline/text-chunking-embedding-ingest-pipeline
{
  "description": "A text chunking and embedding ingest pipeline",
  "processors": [
    {
      "text_chunking": {
        "algorithm": {
          "fixed_token_length": {
            "token_limit": 10,
            "overlap_rate": 0.2,
            "tokenizer": "standard"
          }
        },
        "field_map": {
          "passage_text": "passage_chunk"
        }
      }
    },
    {
      "text_embedding": {
        "model_id": "GkyElI8BKGaMwVo6PeZn",
        "field_map": {
          "passage_chunk": "passage_chunk_embedding"
        }
      }
    }
  ]
}

Step 2: Create an index for ingestion

PUT testindex
{
  "settings": {
    "index": {
      "knn": true,
      "default_pipeline": "text-chunking-embedding-ingest-pipeline"
    }
  },
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      },
      "passage_chunk_embedding": {
        "type": "nested",
        "properties": {
          "knn": {
            "type": "knn_vector",
            "dimension": 768
          }
        }
      }
    }
  }
}

Step 3: Ingest documents into the index

POST testindex/_doc?pipeline=text-chunking-embedding-ingest-pipeline
{
  "passage_text": "This is an example document to be chunked. The document contains a single paragraph, two sentences and 24 tokens by standard tokenizer in OpenSearch."
}

The following error occurs:

{
  "error": {
    "root_cause": [
      {
        "type": "index_not_found_exception",
        "reason": "no such index [testindex]",
        "index": "testindex",
        "index_uuid": "lE1DNT22ShW-eD_0gSkxUg"
      }
    ],
    "type": "index_not_found_exception",
    "reason": "no such index [testindex]",
    "index": "testindex",
    "index_uuid": "lE1DNT22ShW-eD_0gSkxUg"
  },
  "status": 404
}

The model used, “GkyElI8BKGaMwVo6PeZn”, is a TEXT_EMBEDDING model.

Is there something I did wrong?

Hi, can you show me the list of indices?

ex. GET _cat/indices?v&s=index:desc

There isn’t any error I could find. Your version (2.13) is recent enough to use the text chunking processor, which requires OpenSearch 2.13 or later.

Hi

Running GET _cat/indices?v&s=index:desc returns:

health status index                                       uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   testindex                                   lE1DNT22ShW-eD_0gSkxUg   1   1          0            0       416b           208b

This happens even though the index exists.

This is the result of GET testindex/_mapping:

{
  "testindex": {
    "mappings": {
      "properties": {
        "passage_chunk_embedding": {
          "type": "nested",
          "properties": {
            "knn": {
              "type": "knn_vector",
              "dimension": 768
            }
          }
        },
        "passage_text": {
          "type": "text"
        }
      }
    }
  }
}

Did you have any problems following the document in the link below?
(Text chunking - OpenSearch Documentation)

Help me, please.

How many nodes do you have in your OpenSearch cluster? From your cat indices result, the index has two shards: 1 primary and 1 replica. If your cluster has 3 nodes, there may be a bug in how the text chunking processor retrieves the index settings in 2.13. You can solve it by either:

  1. Upgrading your OpenSearch to any version starting from 2.14 (recommended), or
  2. Properly setting the number of primary shards. For example, the number of primary shards should be divisible by the number of cluster nodes.
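As a sketch of the second option, the index from Step 2 could be recreated with an explicit primary shard count. The value of 3 here is an assumption for a 3-node cluster; adjust it to match your topology:

DELETE testindex

PUT testindex
{
  "settings": {
    "index": {
      "number_of_shards": 3,
      "knn": true,
      "default_pipeline": "text-chunking-embedding-ingest-pipeline"
    }
  },
  "mappings": {
    "properties": {
      "passage_text": {
        "type": "text"
      },
      "passage_chunk_embedding": {
        "type": "nested",
        "properties": {
          "knn": {
            "type": "knn_vector",
            "dimension": 768
          }
        }
      }
    }
  }
}

Note that number_of_shards cannot be changed on an existing index, so the index has to be deleted and recreated before re-ingesting the documents.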