Users cannot log into their custom tenant after restoring indexes #263

Hi everyone!
We use docker-compose to deploy our cluster. Product versions: amazon/opendistro-for-elasticsearch:1.13.1 and amazon/opendistro-for-elasticsearch-kibana:1.13.1.
We take a snapshot of the indexes on the working cluster (3 nodes) and transfer it to a similar new cluster (3 nodes). At the same time we back up the opendistro_security plugin’s configuration files:
# ./ -backup /mnt/path/ -icl -nhnv -cacert /usr/share/elasticsearch/config/root-ca.pem -cert /usr/share/elasticsearch/config/admin.pem -key /usr/share/elasticsearch/config/admin.key
We start the new cluster with opendistro_security.disabled: true, set up the snapshot repository, restore the indexes with:

  "ignore_unavailable": true,
  "include_global_state": true,
  "include_aliases": true
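
As a rough sketch, the restore options above could be sent to the Elasticsearch snapshot restore endpoint as follows. The URL, repository name and snapshot name are hypothetical placeholders that are not given in the post, and plain HTTP is assumed only because the security plugin is disabled at this stage.

```shell
#!/bin/sh
# Hypothetical placeholders (not from the original post); adjust to your setup.
ES_URL="${ES_URL:-http://localhost:9200}"
REPO="${REPO:-my_repo}"
SNAPSHOT="${SNAPSHOT:-snapshot_1}"

# Print the JSON body with the three restore options quoted in the post.
restore_body() {
  cat <<'EOF'
{
  "ignore_unavailable": true,
  "include_global_state": true,
  "include_aliases": true
}
EOF
}

# POST the body to the snapshot restore endpoint of the cluster.
restore_snapshot() {
  restore_body | curl -sS -X POST \
    -H 'Content-Type: application/json' \
    --data-binary @- \
    "$ES_URL/_snapshot/$REPO/$SNAPSHOT/_restore"
}
```

Nothing is sent until `restore_snapshot` is called, so the body can be inspected first with `restore_body`.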

then enable the opendistro_security plugin, restart the ODFE cluster (mounting all the certs from the old cluster and all yml files from the securityconfig directory into the containers), and start Kibana. Example for odfe-node01:

      - ./certs/new/root-ca.pem:/usr/share/elasticsearch/config/root-ca.pem
      - ./certs/new/odfe-node-prod-odfe01.key:/usr/share/elasticsearch/config/o4n01.key
      - ./certs/new/odfe-node-prod-odfe01.pem:/usr/share/elasticsearch/config/o4n01.pem
      - ./certs/new/odfe-node-prod-odfe01_http.key:/usr/share/elasticsearch/config/o4n01_http.key
      - ./certs/new/odfe-node-prod-odfe01_http.pem:/usr/share/elasticsearch/config/o4n01_http.pem
      - ./certs/new/admin.pem:/usr/share/elasticsearch/config/admin.pem
      - ./certs/new/admin.key:/usr/share/elasticsearch/config/admin.key
      - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - ./securityconfig:/usr/share/elasticsearch/plugins/opendistro_security/securityconfig/

      - ./certs/new/root-ca.pem:/usr/share/kibana/config/root-ca.pem
      - ./certs/new/prod-odfe01-kibana_http.key:/usr/share/kibana/config/o4n01-kibana_http.key
      - ./certs/new/prod-odfe01-kibana_http.pem:/usr/share/kibana/config/o4n01-kibana_http.pem
      - ./custom-kibana.yml:/usr/share/kibana/config/kibana.yml

After that I execute the command to apply the new settings:
# ./ -cd /usr/share/elasticsearch/plugins/opendistro_security/securityconfig/ -icl -nhnv -cert /usr/share/elasticsearch/config/admin.pem -cacert /usr/share/elasticsearch/config/root-ca.pem -key /usr/share/elasticsearch/config/admin.key
No errors, and the cluster has “green” status. I log in as admin and all the settings made on the old cluster are identical on the new cluster, but when I log in as a regular user in the Kibana UI, the user has no custom tenants to select.
At what step in restoring indexes from one cluster to another did I make a mistake?
Thanks for any help.

Do you mean that when you log in as admin in Kibana you see (and can select) all the tenants, but when you connect as any other user those tenants are not there?

Yes @Anthony, the admin user sees and can choose any tenant from the “Choose from custom” list, but other users can’t.
At the same time, all the settings and parameters in the Security tab are identical to those on the other cluster (from which everything was transferred).

@ioleg I have just followed your steps and I’m not able to reproduce the error (using the same setup and ODFE 1.13.1)

Can you retrieve all of the configs from the old cluster and the new cluster and compare the files (they should be identical)? In particular, check the roles and roles mappings, as the users are evidently not getting the correct permissions to access the tenants in question.
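
One possible way to do this comparison, assuming the retrieved files from each cluster have already been saved into two local directories (the directory names in the usage note are hypothetical), is a recursive diff:

```shell
#!/bin/sh
# Compare two directories of retrieved security config files.
# Returns 0 when they are identical and 1 when they differ.
compare_security_configs() {
  old_dir="$1"
  new_dir="$2"
  # diff -r walks both directories; -u prints unified diffs for changed files.
  if diff -ru "$old_dir" "$new_dir"; then
    echo "configs are identical"
  else
    echo "configs differ (see diff output above)"
    return 1
  fi
}
```

For example, `compare_security_configs ./backup-old ./backup-new` would print any differences in roles.yml, roles_mapping.yml and the other retrieved files.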
I did notice that when I retrieved the config from the first cluster, my tenants file was named security_tenants.yml (not tenants.yml), and I had to rename it manually (looks like a bug).
I assume you had to do the same, otherwise you would have seen errors while loading the retrieved config.
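
A small sketch of that manual rename, in case it helps. The function name is made up for illustration, and the directory passed in is wherever the retrieved config was saved:

```shell
#!/bin/sh
# Rename security_tenants.yml to tenants.yml inside the given directory,
# but only when tenants.yml does not already exist there.
fix_tenants_filename() {
  dir="$1"
  if [ -f "$dir/security_tenants.yml" ] && [ ! -f "$dir/tenants.yml" ]; then
    mv "$dir/security_tenants.yml" "$dir/tenants.yml"
    echo "renamed security_tenants.yml to tenants.yml"
  else
    echo "nothing to rename"
  fi
}
```

An existing tenants.yml is never overwritten; in that case the function just reports that there is nothing to rename.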

Can you also confirm whether your elasticsearch.yml file contains the line below, which prevents default initialisation of the security index:
opendistro_security.allow_default_init_securityindex: false

No, in my config file I have: opendistro_security.allow_default_init_securityindex: true
I use ansible when deploying a cluster, so the parameters are the same on the old and the new cluster.
Could this be the cause of the problem?

@ioleg have a look at the 2 retrieved versions, this should point you in the right direction.

I would imagine that some of the config gets overwritten with the default config, which would explain the difference in behaviour.

Regarding the allow_default_init_securityindex setting: if you are going to upload the configuration manually (and initialise the security index yourself), then that option should be set to false.
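
To check which value a given elasticsearch.yml actually carries, something along these lines could be used. It is only a sketch: the function name is hypothetical, and it assumes the setting is written on one line in the flat dotted form shown in this thread.

```shell
#!/bin/sh
# Print the value of opendistro_security.allow_default_init_securityindex
# from the given file, or "unset" when the key is absent or commented out.
default_init_setting() {
  value=$(sed -n \
    's/^[[:space:]]*opendistro_security\.allow_default_init_securityindex:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' \
    "$1" | tail -n 1)
  echo "${value:-unset}"
}
```

Run against the mounted file, for example `default_init_setting ./elasticsearch.yml`, it prints the configured value, which should be false when the security index is initialised manually.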

Also connect to one of the containers and verify that the configuration in /usr/share/elasticsearch/plugins/opendistro_security/securityconfig/ is indeed correct, before running the script.

@Anthony One more additional question. Should I copy the certificates from the old cluster and use them on the new one, or is it not necessary?
And should I switch “allow_default_init_securityindex” back to “true” after restore?

If you are not using the demo_install script (which would otherwise create the certificates for you), then yes, they should be copied into docker.

allow_default_init_securityindex is not needed as you are not looking for the index to be re-initialised again.

Perhaps I did not describe the situation clearly. What I mean is: do I need to back up the certificates from all nodes of the old cluster and then use them instead of new certificates when restoring data on a new cluster? What kind of recovered data requires certificates from the old cluster?

@ioleg I don’t believe there is any such data. As long as the configuration in elasticsearch.yml (node_dn and admin_dn) is correct, everything should work as expected.

Just to make sure: you are not replacing any new certificates, because there are none on the new cluster (you are not using the demo_install script). You are just uploading the old certs to this cluster and restoring from the snapshot. I hope my understanding is correct.

@Anthony If I have a new cluster with new certificates that I generate during deployment, could I have problems recovering data from the old cluster? Is information about the old certificates stored somewhere in the indexes or other cluster files?

@ioleg No, there shouldn’t be any issues related to certificates. You can try this yourself by running the docker-compose file (which creates demo certs), then stopping it and, on the next start, loading the new certificates via volumes (along with an updated elasticsearch.yml file containing the new node_dn). It should work as expected.
