Versions:
OpenSearch 2.11
Describe the issue:
My IVF-PQ model has been training for four days now, with no sense of when, or if, it will finish. How can I gauge the timeframe, and which inputs drive training speed? Should I scale up memory?
Configuration:
Model is generated with:
POST /_plugins/_knn/models/my_ifpq/_train
{
  "training_index": "opinions",
  "training_field": "embedding",
  "dimension": 1024,
  "description": "embedding model",
  "method": {
    "name": "ivf",
    "engine": "faiss",
    "parameters": {
      "nlist": 3000,
      "encoder": {
        "name": "pq",
        "parameters": {
          "code_size": 8,
          "m": 64
        }
      }
    }
  }
}
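For context on why this might take so long, here is my back-of-envelope for the coarse-quantizer k-means alone. The iteration count and subsampling are Faiss defaults (niter=25, at most 256 training points per centroid); whether the plugin keeps those defaults is my assumption.

```python
# Rough cost of training the IVF coarse quantizer: k-means with
# nlist=3000 centroids over 1024-dim vectors.
# Assumed Faiss defaults: niter=25, sample capped at 256 pts/centroid.
nlist, dim, niter = 3000, 1024, 25
train_points = min(8_000_000, 256 * nlist)  # Faiss subsamples to ~768k
# one multiply-add per (point, centroid, dimension) per iteration
madds = train_points * nlist * dim * niter
print(f"{madds:.1e} multiply-adds")  # ~5.9e13
```

And that is before the PQ step, which trains 64 more small k-means problems (256 centroids each) for the codebooks. On two Graviton vCPUs this makes me suspect CPU, not memory, is the bottleneck, but I'd like confirmation.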
Current status:
GET /_plugins/_knn/models/my_ifpq
{
  "model_id": "my_ifpq",
  "model_blob": "",
  "state": "training",
  "timestamp": "2024-02-16T06:07:27.059481730Z",
  "description": "embedding model",
  "error": "",
  "space_type": "l2",
  "dimension": 1024,
  "engine": "faiss"
}
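In case it helps anyone reproduce, this is how I script the state check instead of re-reading the whole response. The response body is inlined here for illustration; in practice it comes from curl against the same GET endpoint.

```shell
# Pull just the "state" field out of the model GET response.
# Inlined sample body; normally: response=$(curl -s <model endpoint>)
response='{"model_id":"my_ifpq","state":"training","engine":"faiss"}'
state=$(printf '%s' "$response" | sed -n 's/.*"state" *: *"\([^"]*\)".*/\1/p')
echo "$state"
```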
I have roughly 8 million records.
Instance type/quantity is one r6g.large.search
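On the memory question, here is my estimate of the finished index footprint, using the IVF-PQ estimation formula from the k-NN docs as I read it (the per-vector overhead constant is approximate):

```python
# Approximate memory for a Faiss IVF-PQ index, per the k-NN docs'
# estimation formula (24-byte per-vector overhead is approximate).
num_vectors = 8_000_000
dim, nlist = 1024, 3000
m, code_size = 64, 8                       # PQ params from the _train call
per_vector = (code_size / 8) * m + 24      # PQ codes + per-vector overhead
centroids = 4 * nlist * dim                # coarse-quantizer table (float32)
codebooks = (2 ** code_size) * 4 * dim     # PQ codebooks (float32)
bytes_est = 1.1 * (per_vector * num_vectors + centroids + codebooks)
print(f"{bytes_est / 2**30:.2f} GiB")  # ~0.73 GiB
```

If that math is right, the finished index fits easily in the node's 16 GiB, so I'd expect more memory to help only if the node is actually swapping during training.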
Relevant Logs or Screenshots:
Resource utilization is moderate to high, so something does seem to be happening.