Reproducibility of Faiss model

andrey · April 29, 2022, 1:27pm

Hi all,

I am working on a Faiss IVF flat model. These are my model settings:

{
    "training_index": self.train_index_name,
    "training_field": self.train_vector_name,
    "dimension": 24,
    "description": "My models description",
    "method": {
        "name": "ivf",
        "space_type": "l2",
        "engine": "faiss",
        "parameters": {
            "nlist": 400,
            "encoder": {"name": "flat"},
        },
    },
}

The training index includes 200,000 vectors. I trained the model several times on this data and each time the trained model gave different results for the same test vector.

At the same time, when I train models with the same parameters through the faiss Python library, I get models that give exactly the same results every time.

Is it possible to train reproducible models with OpenSearch?

My code for the model in Python:

self.quantiser = faiss.IndexFlatL2(features.shape[1])
self.index = faiss.IndexIVFFlat(
    self.quantiser, features.shape[1], self.nlist, faiss.METRIC_L2
)
self.index.train(features.astype(np.float32))
self.index.add(features.astype(np.float32))
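
For context, training an IVF index is a k-means clustering of the training vectors into nlist centroids, so reproducibility depends on the clustering seeing the same data, in the same order, with the same random seed. The sketch below is a toy pure-Python k-means, not faiss or OpenSearch code; the function name, the seed value, and the random 2-D data are all made up for illustration. It only shows that two runs on identical input with an identical seed give identical centroids, and lets you compare against a run on shuffled input.

```python
import random


def kmeans(points, k, seed, iters=10):
    """Toy Lloyd's k-means on tuples of floats (illustration only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids depend on seed and input order
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid (squared L2 distance).
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


# Hypothetical training data: 200 random 2-D points.
data_rng = random.Random(0)
points = [(data_rng.random(), data_rng.random()) for _ in range(200)]

run_a = kmeans(points, k=5, seed=1234)
run_b = kmeans(points, k=5, seed=1234)

shuffled = points[:]
random.Random(1).shuffle(shuffled)
run_c = kmeans(shuffled, k=5, seed=1234)

print(run_a == run_b)  # True: same data, same order, same seed
print(run_a == run_c)  # same data in a different order; may or may not match
```

Whether the OpenSearch training job feeds the vectors to faiss in a stable order or with a fixed seed is not something I have verified; the toy code only illustrates which inputs would have to be held constant.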

dtaivpp · April 29, 2022, 8:57pm

How much variance are you getting when you are testing? Also, what is your setup like? Are you running several nodes or in single node mode?
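
One rough way to put a number on the variance is the overlap between the top-k neighbour ids that two trained models return for the same test vector. A minimal sketch, assuming you already have the result ids from each model as lists; the function name and the example ids are hypothetical:

```python
def overlap_at_k(ids_a, ids_b, k):
    """Fraction of the top-k ids from one model that also appear in the other's top-k."""
    top_a, top_b = set(ids_a[:k]), set(ids_b[:k])
    return len(top_a & top_b) / k


# Hypothetical neighbour ids returned for the same test vector by two trained models.
model_1_ids = [17, 4, 92, 35, 8]
model_2_ids = [17, 92, 4, 61, 8]

print(overlap_at_k(model_1_ids, model_2_ids, k=5))  # 0.8
```

A value of 1.0 means both models returned the same set of neighbours (possibly in a different order); lower values mean the result sets differ.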
