Reproducibility of Faiss model

Hi all,

I am working on a ‘faiss ivf flat’ model. These are my model settings:

{
    "training_index": self.train_index_name,
    "training_field": self.train_vector_name,
    "dimension": 24,
    "description": "My models description",
    "method": {
        "name": "ivf",
        "space_type": "l2",
        "engine": "faiss",
        "parameters": {
            "nlist": 400,
            "encoder": {"name": "flat"},
        },
    },
}

The training index includes 200,000 vectors. I trained the model several times on this data and each time the trained model gave different results for the same test vector.

At the same time, when I train models with the same parameters through the Faiss Python library, I get models that give exactly the same results.

Is it possible to train reproducible models with OpenSearch?

My code for the model in Python:

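import faiss
import numpy as np

# Inside the model class: a flat L2 index acts as the coarse quantiser for the IVF index.
self.quantiser = faiss.IndexFlatL2(features.shape[1])
self.index = faiss.IndexIVFFlat(
    self.quantiser, features.shape[1], self.nlist, faiss.METRIC_L2
)
# Faiss expects float32 input for both training and adding vectors.
self.index.train(features.astype(np.float32))
self.index.add(features.astype(np.float32))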

How much variance are you getting when you are testing? Also, what is your setup like? Are you running several nodes or in single node mode?
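
One way to put a number on that variance, as a minimal sketch: compare the IDs that two separately trained models return for the same test vector and report how much of the top-k they share. The helper name topk_overlap and the ID lists below are hypothetical placeholders, not output from a real cluster; in practice the IDs would come from the k-NN search responses.

```python
def topk_overlap(ids_a, ids_b):
    """Fraction of IDs shared by two top-k result lists (1.0 means identical sets)."""
    set_a, set_b = set(ids_a), set(ids_b)
    if not set_a and not set_b:
        return 1.0  # two empty result lists are trivially identical
    return len(set_a & set_b) / max(len(set_a), len(set_b))


# Hypothetical document IDs returned for the same test vector
# by two models trained on the same data with the same settings.
run_1 = ["17", "42", "88", "301", "5"]
run_2 = ["42", "17", "88", "12", "5"]

overlap = topk_overlap(run_1, run_2)
print(f"top-{len(run_1)} overlap: {overlap:.2f}")
```

This measure ignores ranking order and only looks at set membership, so it separates "same neighbours, different order" from "different neighbours". Averaging it over many test vectors gives a rough picture of how far two training runs diverge.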