I took a backup of the opendistro config before upgrading anything.
Then I upgraded Elasticsearch via yum, made sure Elasticsearch wasn't running, and finally ran the securityadmin.sh migrate command as described in the Open Distro upgrade doc above. No matter what arguments I give it, it fails with the same error:
WARNING: JAVA_HOME not set, will use /bin/java
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/elasticsearch/action/Action
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:650)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:632)
Caused by: java.lang.ClassNotFoundException: org.elasticsearch.action.Action
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 7 more
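Two separate things are going on in that output. The NoClassDefFoundError suggests the old 0.10-era securityadmin.sh is running against the upgraded Elasticsearch 7 jars, where that class no longer exists. The JAVA_HOME warning is fixable on its own; a minimal sketch, assuming the default RPM layout where Elasticsearch 7.x bundles its own JDK:

```shell
# Assumption: default RPM install, where Elasticsearch 7.x ships a
# bundled JDK under /usr/share/elasticsearch/jdk. Pointing JAVA_HOME
# there stops securityadmin.sh from falling back to /bin/java.
export JAVA_HOME=/usr/share/elasticsearch/jdk
echo "$JAVA_HOME"
```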
None of the Elasticsearch plugins you have installed (including the opendistro_security plugin) will work after the Elasticsearch upgrade until you uninstall and reinstall them.
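A hedged sketch of that uninstall/reinstall round-trip, shown as a command generator. The plugin names below are examples; list yours first with `elasticsearch-plugin list`. Open Distro plugins are typically reinstalled from a versioned zip rather than by name, so the install source is a placeholder here:

```shell
# Print the remove/reinstall commands for each plugin after the
# Elasticsearch upgrade. Plugin names here are examples only; for
# Open Distro plugins the install argument is usually a
# file:// or https:// URL to a version-matched zip.
ES_PLUGIN=/usr/share/elasticsearch/bin/elasticsearch-plugin
for p in opendistro_security discovery-ec2; do
  echo "$ES_PLUGIN remove $p"
  echo "$ES_PLUGIN install $p"
done
```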
You need to have Elasticsearch running before the Open Distro securityadmin.sh migrate operation can work.
But Elasticsearch won't start until you remove every mention of xpack from your elasticsearch.yml and also remove some settings from the jvm.options file.
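A quick way to find the offending settings is to grep for them. A minimal sketch, run here against an inline sample config; on a real node, point it at /etc/elasticsearch/elasticsearch.yml instead:

```shell
# List every xpack.* setting that must go before the upgraded node
# will start. The sample config below is illustrative only.
cat > /tmp/elasticsearch.yml.sample <<'EOF'
cluster.name: demo
xpack.security.enabled: false
xpack.security.authc.realms.native1.type: native
network.host: 0.0.0.0
EOF
grep -n '^xpack' /tmp/elasticsearch.yml.sample
```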
After all that, running securityadmin.sh in migrate mode still errors out with this message:
Open Distro Security Admin v7
Will connect to ip-10-249-198-144.ec2.internal:9300 ... done
Connected as C=US,ST=Wisconsin,L=Milwaukee,OU=EMS,O=Northwestern Mutual,CN=ELKDADDY.nm.nmfco.com
ERR: Your cluster consists of different node versions. It is not allowed to run securityadmin against a mixed cluster.
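You can see the mixed-version state securityadmin is complaining about before running it. A hedged sketch with sample `_cat/nodes` output inlined (the versions shown are illustrative); on a live cluster, feed it from `curl -s 'https://localhost:9200/_cat/nodes?h=name,version'`:

```shell
# Detect a mixed-version cluster from _cat/nodes-style output.
# Sample data only; node names and versions are placeholders.
cat > /tmp/nodes.txt <<'EOF'
node-1 7.10.2
node-2 7.10.2
node-3 6.8.1
EOF
versions=$(awk '{print $2}' /tmp/nodes.txt | sort -u | wc -l | tr -d ' ')
if [ "$versions" -gt 1 ]; then
  echo "mixed cluster: securityadmin will refuse to run"
else
  echo "uniform cluster"
fi
```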
So, my next question is: is there a way to do this Elasticsearch + opendistro_security upgrade in a rolling fashion, or can it only be done with a full outage?
I’ve asked if rolling upgrades are supported, but received no information to suggest they are. In fact, I’m not sure an upgrade is possible via any means (mass/batch or rolling).
This evidence suggests rolling upgrades are not possible:
securityadmin.sh produces the error "ERR: Your cluster consists of different node versions. It is not allowed to run securityadmin against a mixed cluster." That error refers to the Elasticsearch version, so it implies you must first upgrade the entire Elasticsearch cluster to v7.10.2. That's a mass update, not a rolling update.
Furthermore, after mass-upgrading the Elasticsearch nodes in the cluster to v7.10.2, none of them can start successfully because:
1. The Elasticsearch plugins only work with the previous version of Elasticsearch. They can't be upgraded, only uninstalled and then (re)installed.
2. The discovery-ec2 plugin config needs to be changed: v1.x has breaking config rules.
3. The xpack.security.authc config that worked with Open Distro v0.10 breaks under v1.x, so it has to be removed.
4. After all of the above, the Elasticsearch service will start, but the cluster state stays red, and the logs suggest you need to run securityadmin.sh to initialize Open Distro.
5. Open Distro initialization fails, suggesting you need to migrate your Open Distro security index from legacy v6 to v7.
6. The securityadmin.sh migrate command fails, saying you need to initialize Open Distro.
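For reference, the migrate invocation looks like the sketch below; the backup directory and certificate paths are placeholders, and per the Open Distro docs -icl ignores the cluster name while -nhnv skips hostname verification. This cannot be run outside a live cluster, so it is shown only as an invocation shape:

```shell
./securityadmin.sh -migrate /path/to/v6-securityconfig-backup \
  -cacert /etc/elasticsearch/root-ca.pem \
  -cert /etc/elasticsearch/admin.pem \
  -key /etc/elasticsearch/admin-key.pem \
  -icl -nhnv
```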
Rolling upgrades are supported (there may be bugs, but I would expect them to work). During a rolling upgrade, changes to the security configuration are not allowed; once all nodes in the cluster are upgraded to the target version, the security config can either be migrated from the old version to the new one or restored from a backup.
Since a rolling upgrade was not possible, due to the items listed above and in particular the catch-22 of items #5 and #6, I abandoned the upgrade attempt and instead created a new cluster with ES 7 and Open Distro 1.10, then migrated the data from the old cluster to the new one.
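One way to do that cluster-to-cluster data migration is Elasticsearch's reindex-from-remote API. A sketch with placeholder hosts, index name, and credentials; it also assumes the old cluster's address has been allowed via reindex.remote.whitelist in the new cluster's elasticsearch.yml. Not runnable outside a live pair of clusters:

```shell
# Placeholder hosts, index, and credentials. Requires
# reindex.remote.whitelist: "old-cluster:9200" on the new cluster.
curl -k -u admin:admin -X POST "https://new-cluster:9200/_reindex" \
  -H 'Content-Type: application/json' -d '{
    "source": {
      "remote": { "host": "https://old-cluster:9200" },
      "index": "my-index"
    },
    "dest": { "index": "my-index" }
  }'
```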