On-demand role evaluation

Greetings all,

After taking a close look at the source code of the OpenSearch Security plugin, I noticed that all roles and role mappings get loaded into memory on node start, as shown in the method of the ConfigurationRepository class below. That can be expensive in memory (and could lead to a crash) if we are replicating roles from a source database.

Question: Is there a way to force role evaluation to read directly from the security index (on demand) instead of pulling roles from a cache? Basically, I want to avoid loading all roles and their mappings into memory. Querying directly from an index should be very fast anyway.

private void reloadConfiguration0(Collection<CType> configTypes, boolean acceptInvalid) {
    // Read every requested configuration type from the security index
    final Map<CType, SecurityDynamicConfiguration<?>> loaded = getConfigurationsFromIndex(configTypes, false, acceptInvalid);
    // Keep everything that was read in the in-memory cache
    configCache.putAll(loaded);
    notifyAboutChanges(loaded);
}

Thanks

That is a really interesting find. I am not very familiar with Java memory management. Do you have a rough idea of how many roles it would take to fill memory in the default config? Another thought: if you have enough roles to fill memory and the plugin instead pulls them from the index, there may actually be a noticeable performance hit.

Really just thinking out loud here as I am not an expert in Java. What are your thoughts?

It depends on how much memory you have and how many users are mapped to each role before the node reaches a crashing state. I tried it once on a single machine and stress-tested a scenario that crashed the node on startup: around 100K roles, each mapped to 10K users.
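For a sense of scale, that stress test can be turned into a rough calculation. This is only a sketch: the 50 bytes per user entry is an assumed figure (I did not measure it), and the class and method names are made up for illustration.

```java
// Back-of-envelope estimate of how much memory cached role mappings might need.
// All names here are hypothetical; bytesPerUserEntry is an assumption, not a measurement.
final class RoleMappingEstimate {

    // Total number of user entries held across all role mappings.
    static long totalUserEntries(long roles, long usersPerRole) {
        return Math.multiplyExact(roles, usersPerRole);
    }

    // Estimated bytes, given an assumed average cost per user entry.
    static long estimatedBytes(long roles, long usersPerRole, long bytesPerUserEntry) {
        return Math.multiplyExact(totalUserEntries(roles, usersPerRole), bytesPerUserEntry);
    }

    public static void main(String[] args) {
        long entries = totalUserEntries(100_000L, 10_000L);
        long bytes = estimatedBytes(100_000L, 10_000L, 50L);
        System.out.println(entries + " user entries, about " + (bytes / 1_000_000_000L) + " GB");
    }
}
```

With 100K roles and 10K users each, that is one billion user entries, or roughly 50 GB under the 50-byte assumption. Even if the real per-entry cost is much lower, it is far more than a typical node has to spare.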

The point here is that caching all roles and role mappings could be overkill in situations where we are required to sync security roles from a source database. Not all of that data needs to remain in memory all the time. It is common to have thousands of roles per data source, and the users in an organization can number in the hundreds of thousands as well. I know that OpenSearch is memory intensive, so why waste memory unnecessarily? I think there is room for improvement here.
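To make the idea concrete, here is a minimal sketch of a bounded cache that loads entries on demand and evicts the least recently used ones. It is not how the plugin works today. BoundedRoleCache and its loader function are hypothetical names, and the loader merely stands in for whatever lookup would be run against the security index.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not part of the OpenSearch Security plugin.
// Keeps at most maxEntries values in memory and loads the rest on demand.
final class BoundedRoleCache<K, V> {
    private final Function<K, V> loader; // assumed stand-in for a security index lookup
    private final Map<K, V> cache;

    BoundedRoleCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the bound is exceeded
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value, or loads and caches it on a miss.
    synchronized V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    synchronized int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        BoundedRoleCache<String, String> roles =
            new BoundedRoleCache<>(2, name -> "permissions-of-" + name);
        roles.get("admin");
        roles.get("reader");
        roles.get("writer"); // exceeds the bound, so "admin" is evicted
        System.out.println("entries in memory: " + roles.size());
    }
}
```

The trade-off is the one raised earlier in the thread: a miss costs a round trip to the index, so the bound would need tuning so that frequently evaluated roles stay resident.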

Regarding performance when pulling directly from the security index, it should be similar to a parent-join query, which returned in milliseconds the last time I checked.

I understand that the security index is encrypted by default, but does it have to be entirely encrypted (especially roles and role mappings)?
