ElasticSearch Master Nodes Unable to Join Cluster - OpenDistro Security Plugin

zengyan-amazon · November 9, 2020, 11:00pm

@everbeck32 managing indices and shards costs master node resources, and adding or removing a data node involves shard movement and recovery, which also consumes master node resources. Maybe something happened mid-Sunday, e.g. an expensive query or a resource-intensive job, that caused the problem.

The first thing we need to do is stabilize the cluster. Reducing or stopping traffic to your cluster may help.

In addition, if you still have access to the cluster state and stats APIs, you may want to check the pending tasks and each node’s CPU and JVM metrics to make sure no node is overloaded.
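A minimal Python sketch of that check, assuming the standard `_nodes/stats/jvm,os` and `_cluster/pending_tasks` endpoints are reachable. The helper names, thresholds, and the plain unauthenticated GET are illustrative assumptions; a cluster running the security plugin would also need credentials and TLS settings.

```python
import json
import urllib.request


def overloaded_nodes(node_stats, heap_limit=85, cpu_limit=90):
    """Return (name, heap %, cpu %) for nodes at or above either threshold.

    node_stats is the parsed JSON from GET /_nodes/stats/jvm,os.
    The default thresholds are arbitrary examples, not recommendations.
    """
    flagged = []
    for node in node_stats.get("nodes", {}).values():
        heap = node.get("jvm", {}).get("mem", {}).get("heap_used_percent", 0)
        cpu = node.get("os", {}).get("cpu", {}).get("percent", 0)
        if heap >= heap_limit or cpu >= cpu_limit:
            flagged.append((node.get("name", "unknown"), heap, cpu))
    return sorted(flagged)


def slow_pending_tasks(pending, min_wait_ms=30_000):
    """Return the source of each pending task queued for at least min_wait_ms.

    pending is the parsed JSON from GET /_cluster/pending_tasks.
    """
    return [
        task["source"]
        for task in pending.get("tasks", [])
        if task.get("time_in_queue_millis", 0) >= min_wait_ms
    ]


def fetch_json(base_url, path):
    """Plain GET without authentication (assumption: add auth/TLS as needed)."""
    with urllib.request.urlopen(base_url + path) as resp:
        return json.load(resp)
```

Against a live cluster this could be used as `overloaded_nodes(fetch_json(base_url, "/_nodes/stats/jvm,os"))` and `slow_pending_tasks(fetch_json(base_url, "/_cluster/pending_tasks"))`, where `base_url` is your own cluster address.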

everbeck32 · November 9, 2020, 11:02pm

We turned off Logstash ~9 hours ago, so I don’t think there’s much else adding to the traffic.

I can check, but I believe our Cluster State and Stats APIs will be blocked by the OpenDistro Security Plugin issue (that happened with a basic GET earlier today).

@zengyan-amazon I’ve hit my maximum replies for the day, so I’m editing this comment.
We ended up restoring our indexes from snapshot and lost our tenants. Is it possible to restore our .kibana tenant indexes from snapshot? There is very little documentation on this, so we’ll take any recommendation you’ve got.

zengyan-amazon · November 9, 2020, 11:17pm

@everbeck32 if there is no traffic to the cluster and it is still unstable, consider restarting all nodes in the cluster. You may want to wait for some time so that the majority of the nodes can join the cluster; the .opendistro_security index can then be recovered, and the security plugin on each node can be initialized.

If that doesn’t solve the issue, I would suggest rebuilding the cluster and restoring from snapshot.
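One way to wait for a majority of nodes after a restart is the cluster health API's `wait_for_nodes` parameter. The sketch below only builds the request URL; the helper names and the 60s timeout are illustrative assumptions, and the call may still be rejected while the security plugin is uninitialized.

```python
from urllib.parse import urlencode


def majority(node_count):
    """Smallest number of nodes that is more than half of node_count."""
    return node_count // 2 + 1


def health_wait_url(base_url, expected_nodes, timeout="60s"):
    """Build a _cluster/health URL that blocks until a majority has joined.

    expected_nodes is the total number of nodes you expect in the cluster.
    The request returns early once the node count is reached, or after
    `timeout` with "timed_out": true in the response body.
    """
    query = urlencode({
        "wait_for_nodes": f"ge({majority(expected_nodes)})",
        "timeout": timeout,
    })
    return f"{base_url}/_cluster/health?{query}"
```

For a six-node cluster, `health_wait_url(base_url, 6)` asks the API to wait until at least four nodes have joined.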

bgrabau · November 9, 2020, 11:27pm

We tried this just before you suggested it, but we have a process that turns them back on, so we did a full system restart. We have to get another team to turn that restart feature off… lol

hardik-k-shah · November 10, 2020, 1:08am

I have gone through all messages here.

If I understand correctly, your cluster was running with dedicated master nodes, and the master nodes suddenly started going down. And now you have lost all three master nodes.

Once the master nodes are down, the cluster becomes completely inaccessible, as there is no master node to perform management tasks.

Once you get your master nodes up and running, it will not complain about the missing .opendistro_security index (this is true for any other index as well).

zengyan-amazon · November 10, 2020, 6:51pm

Tenant indices are just ordinary ES indices; you can restore them from snapshot just like other indices. Please note that your tenant entities are stored in the .opendistro_security index, so you will need to restore that index as well in order to see your tenants.
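A hedged sketch of what such a restore request could look like, using the standard snapshot restore endpoint. The repository and snapshot names and the `.kibana*` pattern are assumptions to be replaced with your own values; an existing open index with the same name has to be closed or deleted before it can be restored, and depending on the security plugin settings, restoring .opendistro_security may require the admin certificate.

```python
import json


def build_restore_request(repository, snapshot, index_patterns):
    """Return (path, JSON body) for POST /_snapshot/<repo>/<snapshot>/_restore.

    index_patterns lists the indices to restore, e.g. the tenant indices
    plus the security index that holds the tenant definitions.
    """
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = {
        "indices": ",".join(index_patterns),
        # Restore only the listed indices, not cluster-wide settings.
        "include_global_state": False,
    }
    return path, json.dumps(body)


# Hypothetical names: substitute your own repository and snapshot.
path, body = build_restore_request(
    "my_repo", "snapshot_1", [".kibana*", ".opendistro_security"]
)
```

The resulting `path` and `body` can be sent with any HTTP client as a POST with `Content-Type: application/json`.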
