Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.9
Describe the issue:
I am trying to train a model so that I can create an index using HNSW with PQ:
{
  "training_index": "training-data-index",
  "training_field": "embedding_vector",
  "dimension": 512,
  "description": "HNSW with PQ",
  "method": {
    "name": "hnsw",
    "engine": "faiss",
    "space_type": "l2",
    "parameters": {
      "encoder": {
        "name": "pq",
        "parameters": {
          "code_size": 8,
          "m": 8
        }
      }
    }
  }
}
When I try to read the model status, I get an error saying that training probably ran out of memory or was given invalid parameters. According to the logs, it is likely the latter:
[2023-08-23T21:17:09,996][ERROR][o.o.k.t.TrainingJob ] [fa696978b96b] Failed to run training job for model "hnsw_pq_m8": Error in std::unique_ptr<faiss::Index> faiss::{anonymous}::index_factory_sub(int, std::string, faiss::MetricType) at /tmp/tmpzmr1s8yb/k-NN/jni/external/faiss/faiss/index_factory.cpp:781: Error: 'index' failed: could not parse HNSW code description PQ8x8 in HNSW16,PQ8x8
Any ideas what may be wrong with the parameters?
EDIT: I dug into the faiss source code, starting from the line number given in the log.
In this case the code description PQ8x8 is parsed by matching it sequentially against a handful of regular expressions, and I can confirm that it does not match any of them. PQ8x8 would be valid for IVF, but not for HNSW.
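To illustrate what I mean, here is a minimal Python sketch of that matching step. The two regular expressions are my own stand-ins and are not copied from faiss: I am assuming the HNSW branch only accepts PQ<m> without an "x<bits>" suffix, while the IVF branch also accepts PQ<m>x<bits>. The helper names are hypothetical, and the way the request parameters map onto the description string is inferred only from the log line above.

```python
import re

# Assumed patterns, for illustration only (not the actual faiss source).
HNSW_PQ_PATTERN = re.compile(r"PQ(\d+)(np)?")          # assumption: no "x<bits>" part
IVF_PQ_PATTERN = re.compile(r"PQ(\d+)(x(\d+))?(np)?")  # assumption: optional "x<bits>"


def build_description(hnsw_m: int, pq_m: int, code_size: int) -> str:
    """Build a factory-style string shaped like the one in the log.

    Hypothetical helper: the mapping (encoder m -> PQ<m>, code_size -> x<bits>)
    is inferred from the logged string "HNSW16,PQ8x8".
    """
    return f"HNSW{hnsw_m},PQ{pq_m}x{code_size}"


def hnsw_accepts(code: str) -> bool:
    """True if the code description matches the assumed HNSW pattern."""
    return HNSW_PQ_PATTERN.fullmatch(code) is not None


def ivf_accepts(code: str) -> bool:
    """True if the code description matches the assumed IVF pattern."""
    return IVF_PQ_PATTERN.fullmatch(code) is not None


description = build_description(hnsw_m=16, pq_m=8, code_size=8)
code = description.split(",", 1)[1]  # the part after "HNSW16,"

print(description)
print("accepted for HNSW:", hnsw_accepts(code))
print("accepted for IVF: ", ivf_accepts(code))
```

Under these assumptions, the string built from my request is rejected on the HNSW path and accepted on the IVF path, which is consistent with the error in the log.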