Convert_index_to_remote issue with datastreams

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): OpenSearch 3.5

Describe the issue:

I have an ISM policy that creates a searchable snapshot. Flow: index fills up with data → rollover → snapshot action → convert_index_to_remote action → delete index

The flow above leaves me with remote_* indexes, which I want to manage. My plan was to create a separate ISM policy that picks up the remote_ indexes and deletes them after some time, plus an SM policy that deletes old snapshots from S3.

This way I believe I can manage the whole cycle: I remove data from the cluster and store it in S3, and after some time the ‘linked’ indexes are deleted automatically and S3 is cleaned up.

My problem is that my restored indexes are not picked up by the ISM policy. I believe the cause is this: the ISM policy takes a snapshot of an index that is part of a data stream, and when the restore takes place, only the index is restored, not the data stream (even though the index has a mapping that suggests it is part of a data stream!). Is it possible that the plugin does not attach the ISM policy because it cannot find the corresponding data stream?

If I take a manual snapshot providing the data stream name and then restore it, the data stream is also restored and the restored index has the ISM policy attached.

I completely understand that the ISM policy should snapshot/restore only one index. But what can I do in my scenario? The only option I can think of is some kind of script/cronjob that looks for remote* indexes and attaches the policy to them, which seems like an ugly workaround.

I’ve also tried creating an index template for the remote* pattern to attach the ISM policy via settings:

PUT _index_template/remote_add_ism 
{
  "index_patterns": ["remote*"],
  "priority": 600,
  "template": {
    "settings": {
      "index.opendistro.index_state_management.policy_id": "remote-index",
      "index.plugins.index_state_management.policy_id": "remote-index"
    }
  }
}

But it did not work…

I’d really appreciate some feedback. Is it possible to accomplish what I need? Or is there some flaw in my way of thinking?

Or is there another way to clean up after a searchable snapshot? :slight_smile:

Thanks in advance for any reply!

Configuration:

Relevant Logs or Screenshots:

@zuuz94 The remote index has index.hidden: true, inherited from the original data stream backing index. The ISM coordinator explicitly skips hidden indices that are not part of a data stream, see code block here

I would recommend raising an issue for this here, as the restore request would probably need to set index.hidden: false for ISM to see this index.

Until then, the only workaround seems to be to manually attach the policy after the restore.
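
For reference, a manual attach can be done with the ISM add-policy API. This is a minimal sketch: the policy name remote-index is taken from the template attempt above, and <remote-index-name> is a placeholder for the restored index. An explicit index name is used on the assumption that a wildcard such as remote_* may not expand to hidden indexes.

```json
POST _plugins/_ism/add/<remote-index-name>
{
  "policy_id": "remote-index"
}

GET _plugins/_ism/explain/<remote-index-name>
```

The explain call afterwards should show whether the index is now managed and by which policy.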

Hope this helps.


I would also mention issue 1426, which will change the name and deal with the indexName.startsWith(".") part, see code snippet here.

Cool! Thank you very much for the response @Anthony !

Really appreciate that you pushed me in the right direction :slight_smile:

If the issue here is the hidden setting, then I believe this PR should solve my problem. If I can ignore this setting during the restore, the ISM policy should be applied.

Thank you very much! I’ll validate it by triggering a restore manually, but it looks very promising :slight_smile:

[update] I’ve tested a manual restore with ignore_index_settings, and it worked. ISM picked up my new remote index. Thanks!
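
For anyone repeating this test, a manual restore along those lines might look like the request below. It is a sketch, not the exact request used in the thread: the repository name my-s3-repo and the <snapshot-name> and <backing-index-name> values are placeholders, the rename settings are only one way to produce a remote_ prefix, and it assumes index.hidden is the setting being ignored.

```json
POST _snapshot/my-s3-repo/<snapshot-name>/_restore
{
  "indices": "<backing-index-name>",
  "storage_type": "remote_snapshot",
  "rename_pattern": "(.+)",
  "rename_replacement": "remote_$1",
  "ignore_index_settings": ["index.hidden"],
  "include_global_state": false
}
```

With index.hidden left out of the restored settings, the new index is no longer hidden, so the ISM coordinator should no longer skip it.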
