Cannot create index with HNSW + PQ

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.9

Describe the issue:
I am trying to train a model so that I can create an index using HNSW and PQ:

{
    "training_index": "training-data-index",
    "training_field": "embedding_vector",
    "dimension": 512,
    "description": "HNSW with PQ",
    "method": {
        "name": "hnsw",
        "engine": "faiss",
        "space_type": "l2",
        "parameters": {
            "encoder": {
                "name": "pq",
                "parameters": {
                    "code_size": 8,
                    "m": 8
                }
            }
        }
    }
}

When I check the model status, I get an error saying that training may have failed because of insufficient memory or invalid parameters. According to the logs, it is probably the latter:

[2023-08-23T21:17:09,996][ERROR][o.o.k.t.TrainingJob      ] [fa696978b96b] Failed to run training job for model "hnsw_pq_m8": Error in std::unique_ptr<faiss::Index> faiss::{anonymous}::index_factory_sub(int, std::string, faiss::MetricType) at /tmp/tmpzmr1s8yb/k-NN/jni/external/faiss/faiss/index_factory.cpp:781: Error: 'index' failed: could not parse HNSW code description PQ8x8 in HNSW16,PQ8x8

Any ideas what may be wrong with the parameters?

EDIT: I dug into the faiss source code, starting from the line number given in the log.
In this case the code description PQ8x8 is matched sequentially against a handful of regular expressions, and I can confirm it does not match any of them. PQ8x8 would be valid for IVF, but not for HNSW.
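
To illustrate the mismatch described above, here is a minimal Python sketch. The two patterns are simplified assumptions written for this example, not copied from faiss: the HNSW one accepts only `PQ<M>` with an optional `np` suffix, while the IVF one also accepts an `x<nbits>` part. The names `HNSW_PQ_PATTERN`, `IVF_PQ_PATTERN` and `accepted` are hypothetical.

```python
import re

# Assumed, simplified stand-ins for the patterns the faiss index factory
# tries. They are illustrative only and may differ from the real source.
HNSW_PQ_PATTERN = re.compile(r"PQ([0-9]+)(np)?")            # no "x<nbits>" part
IVF_PQ_PATTERN = re.compile(r"PQ([0-9]+)(x[0-9]+)?(np)?")   # optional "x<nbits>"


def accepted(pattern: re.Pattern, code: str) -> bool:
    """Return True if the whole code description matches the pattern."""
    return pattern.fullmatch(code) is not None


if __name__ == "__main__":
    for code in ("PQ8", "PQ8x8"):
        print(code, "HNSW:", accepted(HNSW_PQ_PATTERN, code),
              "IVF:", accepted(IVF_PQ_PATTERN, code))
```

Under these assumed patterns, `PQ8x8` is rejected for HNSW but accepted for IVF, which is consistent with the "could not parse HNSW code description PQ8x8" message in the log.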

Thanks @Tomex. You are correct, this is a bug. I created an issue in faiss to ask whether this can be parametrized further: "Can HNSWPQ nbits parameter be made configurable?" (Issue #3027, facebookresearch/faiss). I'll create an issue in the k-NN repo to track it as well.

Was this issue resolved? It seems to have been fixed in "Update Faiss engine to allow PQ and HNSW (#1074)" (opensearch-project/k-NN@a67e987) and merged in version 2.10.

I just tried the following on version 2.11:

POST /_plugins/_knn/models/hnsw_pq_model4/_train
{
  "training_index": "train-data-index",
  "training_field": "main_vector",
  "dimension": 768,
  "method": {
      "name":"hnsw",
      "engine":"faiss",
      "parameters":{
        "encoder":{
            "name":"pq",
            "parameters":{
              "m": 8,
              "code_size": 8
            }
        }
      }
  }
}

And I’m still seeing this error:

"Failed to execute training. May be caused by an invalid method definition or not enough memory to perform training."

@jmazane, can you check what the issue could be here?

Yes, I will do some debugging on this.

Is there any way to enable PQ with HNSW in the faiss library in OpenSearch 2.9? Whenever I set the encoder to “pq”, I keep getting this error: “Failed to execute training. May be caused by an invalid method definition or not enough memory to perform training.”

Not with OpenSearch 2.9. You can do this with 2.10 or above.
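
As a rough sketch of that rule, the Python below guards an HNSW + PQ training request behind a version check. The 2.10 threshold comes from the reply above; the helper names (`parse_version`, `supports_hnsw_pq`, `build_training_body`) are hypothetical, and the body just mirrors the requests posted earlier in this thread. It only builds the JSON and does not contact a cluster.

```python
import json
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Turn "2.10" or "2.11.0" into (major, minor) integers."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def supports_hnsw_pq(version: str) -> bool:
    # Threshold taken from this thread: HNSW + PQ needs 2.10 or above.
    # Compare integers, since "2.9" > "2.10" when compared as strings.
    return parse_version(version) >= (2, 10)


def build_training_body(training_index: str, training_field: str,
                        dimension: int, m: int, code_size: int) -> Dict:
    """Build a training request body shaped like the ones in this thread."""
    return {
        "training_index": training_index,
        "training_field": training_field,
        "dimension": dimension,
        "method": {
            "name": "hnsw",
            "engine": "faiss",
            "parameters": {
                "encoder": {
                    "name": "pq",
                    "parameters": {"m": m, "code_size": code_size},
                }
            },
        },
    }


if __name__ == "__main__":
    version = "2.9"  # hypothetical cluster version
    if supports_hnsw_pq(version):
        body = build_training_body("train-data-index", "main_vector", 768, 8, 8)
        print(json.dumps(body, indent=2))
    else:
        print(f"OpenSearch {version}: HNSW + PQ training is not expected to work")
```

Passing the check does not guarantee that training succeeds; the 2.11 report earlier in this thread still hit the same generic error.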

@chaitujil
