Catch 22 : downgrade to reindex : fails to start ES

okvittem · November 4, 2020, 10:53am

Hi,
We were running 7.9.1, but after a machine power outage ES did not start due to
an index created with version 5.6. The error message recommends downgrading to 6.x and reindexing.

So I replaced it with dpkg -i elasticsearch-oss-6.8.13.deb, and
now I get an error message at startup:
failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/node-116.st

    [2020-11-04T10:41:59,031][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [iou2.uninett.no] uncaught exception in thread [main]
    org.elasticsearch.bootstrap.StartupException: ElasticsearchException[java.io.IOException: failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/node-116.st]]; nested: IOException[failed to read [id:116, file:/dynga/es-iou2/elasticsearch/nodes/0/_state/node-116.st]]; nested: XContentParseException[[-1:36] [node_meta_data] unknown field [node_version], parser not found];

Should I also downgrade some or all of the support packages to do the necessary reindexing?

    opendistro-alerting opendistro-anomaly-detection opendistro-index-management opendistro-job-scheduler opendistro-knn
    opendistro-performance-analyzer opendistro-security opendistro-sql opendistroforelasticsearch

okvittem · November 5, 2020, 7:58am

It turns out that after trying a couple of other versions of elasticsearch-oss, version 7.9.1 eventually accepted the index and came up. Case dismissed!

okvittem · January 19, 2021, 1:04pm

After a reboot, the problem is back. There has been no change in software over the last weeks. So how can ES get into a state where, after a restart, it suddenly doesn't accept indices that have been up and running? I noticed there is one field for the version an index was created on and one for the current version. Could it be that the startup code makes wrong guesses?

We have both newer and older indices. How can we get our data back?

shawnz · January 19, 2021, 2:49pm

Can you show the message where it recommends downgrading?

I am wondering if perhaps the message is incorrect and the problem really has nothing to do with downgrading. Perhaps you just have some indexes that got unrecoverably corrupted during the power failure.

okvittem · January 19, 2021, 3:10pm

java.lang.IllegalStateException: The index [[uninett6/9HwC3NFKT7m0Ut7SK6M7Qw]] was created with version [5.6.10] but the minimum compatible version is [6.0.0-beta1]. It should be re-indexed in Elasticsearch 6.x before upgrading to 7.9.1.

shawnz · January 19, 2021, 3:23pm

Did you ever actually run version 5.6.10 like it claims?

okvittem · January 19, 2021, 3:41pm

Yes, that may have been the case; we started at about 4 a few years ago.

ottojwittner · January 22, 2021, 11:43am

(I work with okvittem on this sometimes “grumpy” ES cluster.)

The ES cluster in question is a single-node cluster. It seems to have a folder in its data path for both a node 0 and a node 1. Folders and files for all operational indices are located under node 0. Some other legacy/zombie indices exist under node 1. Occasionally, after a reboot and restart, ES seems to discover these legacy indices and starts complaining about index versions that are too old. Removing the node 1 files made ES start. However, ES then creates a new node 1 folder hierarchy (with no indices).
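To see which node folders exist and where the index folders actually live, something like the following read-only sketch may help. It assumes the data path is laid out as nodes/<ordinal>/indices/<index-folder>, which matches the nodes/0 path in the log above but should be checked against your own installation. The function name summarize_node_folders is made up for this illustration.

```python
from pathlib import Path


def summarize_node_folders(data_path):
    """Map each node ordinal under <data_path>/nodes to its index folder names.

    Read-only: nothing inside the data path is modified.
    Assumes the layout nodes/<ordinal>/indices/<index-folder>.
    """
    nodes_dir = Path(data_path) / "nodes"
    summary = {}
    if not nodes_dir.is_dir():
        return summary
    for node_dir in sorted(nodes_dir.iterdir()):
        # Node folders are named by ordinal: 0, 1, 2, ...
        if not (node_dir.is_dir() and node_dir.name.isdigit()):
            continue
        indices_dir = node_dir / "indices"
        if indices_dir.is_dir():
            names = sorted(p.name for p in indices_dir.iterdir() if p.is_dir())
        else:
            names = []
        summary[int(node_dir.name)] = names
    return summary
```

Calling summarize_node_folders("/dynga/es-iou2/elasticsearch") would return a dict keyed by node ordinal. A node 1 entry with a non-empty list would point at the legacy indices described above.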

… so it may seem that the question is: Why does our system insist on having a node 1 data folder in addition to the operational node 0?

ottojwittner · March 17, 2021, 1:22pm

We now seem to have realized why our ES installation sometimes insists on adding and/or starting as node 1 even though it is a single-node system (and should start as node 0).

It turned out we had node.max_local_storage_nodes=3 set in elasticsearch.yml. As a result, when ES found node 0 locked (e.g. after a “rough” restart of some sort), it assumed some other ES process was running and hence booted up as node 1 instead.

Setting node.max_local_storage_nodes=1 forced ES to fail at startup if node 0 resources were locked.
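A quick way to check for this setting is sketched below. It is a plain line-based scan, not a real YAML parser, and it assumes the setting is written as a flat key in the form "node.max_local_storage_nodes: 3" (nested YAML forms are not handled). Both function names are hypothetical.

```python
import re

SETTING = "node.max_local_storage_nodes"


def read_max_local_storage_nodes(yml_text):
    """Return the configured value as an int, or None if the flat key is absent.

    Assumption: the key appears on one line as '<key>: <number>'.
    Commented-out lines (starting with '#') are ignored.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(SETTING) + r"\s*:\s*(\d+)\s*(?:#.*)?$"
    )
    value = None
    for line in yml_text.splitlines():
        match = pattern.match(line)
        if match:
            value = int(match.group(1))  # last occurrence wins in this sketch
    return value


def may_start_extra_node(yml_text):
    """True if the config explicitly allows more than one local storage node."""
    value = read_max_local_storage_nodes(yml_text)
    return value is not None and value > 1
```

For example, may_start_extra_node(open("/etc/elasticsearch/elasticsearch.yml").read()) would flag a config like ours. The config file path here is an assumption and depends on how ES was installed.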

If there is no relevant reason for node 0 being locked, removing all *.lock files in all sub-folders of the nodes/0 folder in ES’s data path enables ES to start node 0 again.
(Ref: “Elasticsearch: Failed to obtain node locks”, post #12 by rahulnama, Discuss the Elastic Stack.)

NOTE: This type of “hacking” inside ES’s data path should be done with care, or rather not at all (according to ES-developers).
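Given that warning, it may be safer to first only list the lock files and confirm that no ES process is running before touching anything. The sketch below does just that and deletes nothing. The function name find_lock_files is made up, and the nodes/<ordinal> layout is assumed from the paths earlier in this thread.

```python
from pathlib import Path


def find_lock_files(data_path, ordinal=0):
    """List *.lock files below <data_path>/nodes/<ordinal>.

    Read-only: files are only reported, never removed.
    """
    node_dir = Path(data_path) / "nodes" / str(ordinal)
    if not node_dir.is_dir():
        return []
    # rglob walks all sub-folders of the node folder recursively.
    return sorted(p for p in node_dir.rglob("*.lock") if p.is_file())
```

Printing the result of find_lock_files("/dynga/es-iou2/elasticsearch") shows what a manual cleanup would affect, so the decision to remove anything stays with the operator.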
