Catch-22: downgrade to reindex fails to start ES

We were running 7.9.1, but after a machine power outage ES did not start, due to
an index created at version 5.6. The error message recommends downgrading to 6.x and reindexing.

So I replaced it with dpkg -i elasticsearch-oss-6.8.13.deb, and
now I get an error message at startup:
failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/

    [2020-11-04T10:41:59,031][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: ElasticsearchException[ failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/]]; nested: IOException[failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/]]; nested: XContentParseException[[-1:36] [node_meta_data] unknown field [node_version], parser not found];

Should I also downgrade some or all of the support packages to do the necessary reindexing?

    opendistro-alerting opendistro-anomaly-detection opendistro-index-management opendistro-job-scheduler opendistro-knn
  opendistro-performance-analyzer opendistro-security opendistro-sql opendistroforelasticsearch

It turns out that after trying a couple of other versions of elasticsearch-oss, the 7.9.1 version eventually accepted the index and came up. Case dismissed!

After a reboot, the problem is back. There has been no change in software for the last weeks. So how can ES get into such a state that, after a restart, it suddenly doesn't accept indices that have been up and running? I noticed there is one version field for when the index was created and one for the current version. Could it be that the startup code makes wrong guesses?

We have both newer and older indices. How can we get our data back ?

Can you show the message where it recommends downgrading?

I am wondering if perhaps the message is incorrect and the problem really has nothing to do with downgrading. Perhaps you just have some indices that got unrecoverably corrupted during the power failure.

java.lang.IllegalStateException: The index [[uninett6/9HwC3NFKT7m0Ut7SK6M7Qw]] was created with version [5.6.10] but the minimum compatible version is [6.0.0-beta1]. It should be re-indexed in Elasticsearch 6.x before upgrading to 7.9.1.
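For reference, that `[5.6.10]` comes from the numeric `index.version.created` setting stored with the index (visible via `GET <index>/_settings` when ES is up). If I recall ES's encoding correctly, the id packs the version as `major*1000000 + minor*10000 + patch*100 + build` (release builds end in 99), so 5.6.10 is stored as `5061099`. A small sketch to decode it (`decode_es_version` is a hypothetical helper, not an ES tool):

```shell
# Decode ES's numeric index.version.created id into major.minor.patch.
# Encoding assumed: major*1000000 + minor*10000 + patch*100 + build.
decode_es_version() {
  id=$1
  echo "$((id / 1000000)).$((id / 10000 % 100)).$((id / 100 % 100))"
}

decode_es_version 5061099   # → 5.6.10, the version the failing index was created with
```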

Did you ever actually run version 5.6.10, like it claims?

Yes, that may have been the case; we started at about version 4 a few years ago.

(I cooperate with okvittem on this sometimes “grumpy” ES cluster.)

The ES cluster in question is a single-node cluster. It seems to have a folder in its data path for both a node 0 and a node 1. Folders and files for all operational indices are located under node 0. Some other legacy/zombie indices exist under node 1. Occasionally, after a reboot and restart, ES discovers these legacy indices and starts complaining about too-old index versions. Removing the node 1 files made ES start. However, ES then creates a new node 1 folder hierarchy (with no indices).

… so it may seem that the question is: why does our system insist on having a node 1 data folder in addition to the operational node 0?

We now seem to have figured out why our ES installation sometimes insists on adding and/or starting as node 1 even though it is a single-node system (and should start node 0).

It turned out we had node.max_local_storage_nodes=3 set in elasticsearch.yml. As a result, when ES found node 0 locked (e.g. after a “rough” restart of some sort), it assumed some other ES process was running and hence booted up as node 1 instead.

Setting node.max_local_storage_nodes=1 forced ES to fail at startup if node 0 resources were locked.
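For the record, the one-line fix above goes in elasticsearch.yml (the setting exists in 6.x/7.x; as far as I know it was deprecated in 7.x and removed in 8.0):

```yaml
# elasticsearch.yml -- single-node setup
# With the old value (3), a stale lock on nodes/0 made ES silently
# start as nodes/1 instead of failing. Forcing 1 makes the lock
# problem visible at startup instead.
node.max_local_storage_nodes: 1
```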

If there is no legitimate reason for node 0 being locked, removing all *.lock files in all sub-folders of the nodes/0 folder in ES’s data path enables ES to start node 0 again.
(Ref: Elasticsearch: Failed to obtain node locks - #12 by rahulnama - Discuss the Elastic Stack)
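A sketch of the lock cleanup, runnable as-is against a throwaway demo directory so nothing real is touched (on the actual machine DATA_PATH would be ES's path.data; the thread's was /dynga/es-iou2/elasticsearch, and the index UUID below is the one from the error message):

```shell
# Demo: find and clear stale *.lock files under nodes/0.
# DATA_PATH is a temporary demo directory, NOT a real ES data path.
DATA_PATH=$(mktemp -d)

# Recreate a miniature nodes/0 layout with some stale lock files.
mkdir -p "$DATA_PATH/nodes/0/indices/9HwC3NFKT7m0Ut7SK6M7Qw/0/index"
touch "$DATA_PATH/nodes/0/node.lock" \
      "$DATA_PATH/nodes/0/indices/9HwC3NFKT7m0Ut7SK6M7Qw/0/index/write.lock"

# 1) Dry run first: list every lock file under nodes/0.
find "$DATA_PATH/nodes/0" -name '*.lock' -print

# 2) Only when you are sure no ES process is alive, delete them.
find "$DATA_PATH/nodes/0" -name '*.lock' -delete
```

Do the `-print` pass first and sanity-check the list before running the `-delete` pass; and per the note below, treat this as a last resort.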

NOTE: This type of “hacking” inside ES’s data path should be done with care, or rather not at all (according to ES-developers).