Curious if anyone has successfully used Luke to explore the Lucene index within their OD deployment? We need to dive in at that level to troubleshoot a particularly pesky issue. Is one of the Luke binaries hiding within our OD setup by chance? Or, if anyone has gone down this rabbit hole already, any tips or tricks you'd be willing to share?
If you’re self-hosting your OpenSearch cluster, you can find the directory holding the Lucene files for each shard under a path like $OPENSEARCH_ROOT/data/nodes/0/indices/sQ3gze7NQe2IsN1yOyyAlw/0/index. You can copy that directory somewhere else and hit it with Luke.
The catch is that (depending on the version) the index name gets mapped to an internal, gibberish-looking UUID on disk (like sQ3gze7NQe2IsN1yOyyAlw). The good news is that you can resolve the index name to that UUID using the /_cat/indices API:
```
health status index                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   security-auditlog-2023.02.24 h1dNpm6iSdCLOSij_Utf2g   1   1          2            0     25.3kb         25.3kb
yellow open   security-auditlog-2022.11.28 sQ3gze7NQe2IsN1yOyyAlw   1   1          3            0     55.9kb         55.9kb
```
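In case it helps, here's a self-contained sketch of resolving an index name to its UUID by parsing that output. Against a live cluster you'd pipe `curl -s 'localhost:9200/_cat/indices?v'` into the same awk filter; here the sample output above stands in for the curl call, so the host and index name are just placeholders.

```shell
INDEX_NAME="security-auditlog-2022.11.28"

# Match the index column (3rd) and print the uuid column (4th).
UUID=$(awk -v idx="$INDEX_NAME" '$3 == idx { print $4 }' <<'EOF'
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open security-auditlog-2023.02.24 h1dNpm6iSdCLOSij_Utf2g 1 1 2 0 25.3kb 25.3kb
yellow open security-auditlog-2022.11.28 sQ3gze7NQe2IsN1yOyyAlw 1 1 3 0 55.9kb 55.9kb
EOF
)
echo "$UUID"   # sQ3gze7NQe2IsN1yOyyAlw
```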
You can figure out which node holds which shards using the /_cat/shards API.
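Same idea for shards, a sketch with made-up node names and IPs standing in for `curl -s 'localhost:9200/_cat/shards'` (the columns are index, shard, prirep, state, docs, store, ip, node); the filter picks out the node holding each primary shard:

```shell
INDEX_NAME="security-auditlog-2022.11.28"

# $3 == "p" keeps primaries only; replicas (or unassigned shards) are skipped.
MAPPING=$(awk -v idx="$INDEX_NAME" '$1 == idx && $3 == "p" { print "shard " $2 " -> " $8 }' <<'EOF'
security-auditlog-2022.11.28 0 p STARTED 3 55.9kb 172.18.0.2 node-1
security-auditlog-2022.11.28 0 r UNASSIGNED
security-auditlog-2023.02.24 0 p STARTED 2 25.3kb 172.18.0.3 node-2
EOF
)
echo "$MAPPING"   # shard 0 -> node-1
```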
So, the basic steps are:

1. Find the UUID for your index using /_cat/indices.
2. Use /_cat/shards to figure out which nodes hold the relevant shards.
3. For each relevant shard, log onto the appropriate node and find the Lucene directory under $OPENSEARCH_ROOT/data/nodes/<node_num>/indices/<UUID>/<shard_num>/index.
4. Copy the Lucene directory somewhere else.
5. (Probably) delete the write.lock file from the copy. (Never delete it from the directory managed by OpenSearch!)
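The copy-and-unlock part of those steps looks roughly like the following. This uses a throwaway temp directory with stand-in files instead of a real data node, so all paths and file names are illustrative; the one thing that matters is that write.lock is deleted only from the copy, never from the live directory.

```shell
# Fake data-node layout standing in for a real shard directory.
WORK=$(mktemp -d)
SHARD_DIR="$WORK/data/nodes/0/indices/sQ3gze7NQe2IsN1yOyyAlw/0/index"
mkdir -p "$SHARD_DIR"
touch "$SHARD_DIR/segments_2" "$SHARD_DIR/write.lock"   # stand-ins for real Lucene files

# Copy the shard's Lucene directory somewhere else...
COPY="$WORK/luke-copy"
cp -r "$SHARD_DIR" "$COPY"

# ...and remove the lock file from the COPY only, so Luke can open it.
rm "$COPY/write.lock"

ls "$COPY"        # copy is unlocked
ls "$SHARD_DIR"   # original still has its write.lock
```

From there you can point Luke at the copied directory; recent Lucene binary releases bundle Luke with a launch script, so you don't need a separate download.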