Hello @Leeroy, thanks a lot for the write-up and your testing. Here is more context on my current case.
We have been encountering a persistent issue with missing visualizations in OpenSearch Dashboards. Originally, the problematic dashboards and visualizations were created using version 2.3.0. After upgrading to 2.11, we began experiencing this issue. As a workaround, we reverted the .kibana alias back to .kibana_1, which had been migrated to .kibana_2 during the 2.11 upgrade. We also deleted the .kibana_2 index at that time.
Hoping the issue would be resolved, we upgraded further to 2.15. However, the problem persisted. We applied the same mitigation strategy again — reverting the alias to .kibana_1 and deleting .kibana_2. Unfortunately, this has become a recurring pain point. After every pod restart, the issue reappears and requires manual intervention.
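For reference, the alias part of the rollback we apply after each restart can be sketched as below. This is only a minimal illustration in Python that builds the request body for the `_aliases` API. The helper name `build_alias_rollback` is made up for this post, and sending the request (and deleting .kibana_2 afterwards) is left to whatever HTTP client you use.

```python
import json


def build_alias_rollback(alias: str, from_index: str, to_index: str) -> dict:
    """Build an _aliases request body that moves `alias` in a single call.

    Both actions are sent together, so the alias is never left pointing
    at no index between the remove and the add.
    """
    return {
        "actions": [
            {"remove": {"index": from_index, "alias": alias}},
            {"add": {"index": to_index, "alias": alias}},
        ]
    }


# Index and alias names are the ones from this thread.
body = build_alias_rollback(".kibana", ".kibana_2", ".kibana_1")

# POST this body to <cluster>/_aliases; it is only printed here.
print(json.dumps(body, indent=2))
```

Sending both actions in one request keeps the swap atomic, which matters because Dashboards reads through the .kibana alias the whole time.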
To reproduce the issue in a development environment, I exported the mappings and index objects from .kibana_1, as well as from the affected tenant index, of the main (prod) cluster and imported them into a dev cluster. I then attempted an upgrade in the dev setup but, as you observed, we were unable to reproduce the missing visualizations.
Interestingly, we have another production cluster in a different region acting as a backup for the main cluster. This backup cluster is also running OpenSearch 2.15, with the global alias correctly pointing to .kibana_2, and it does not experience the missing visualization issue. Furthermore, the .kibana_2 index mappings in both the main and backup clusters are identical (as verified via GET .kibana_2/_mapping).
I also reviewed the verbose migration logs for the main (prod) cluster. They confirm that all visualizations were successfully migrated from .kibana_1 to .kibana_2 during the pod restart. However, we continue to manually revert the alias to .kibana_1 after each restart because of the missing visualizations issue. This cycle repeats with every pod restart, requiring the same manual mitigation.
One additional observation: in the main cluster, the tenant index (.kibana_1_572569278_abctenant_1, originally created in 2.3) does not migrate after a pod restart. In contrast, the backup cluster has the corresponding tenant index already migrated to .kibana_572569278_abctenant_2, which was created on 2.15. Note the difference in index naming: the main cluster uses .kibana_1_... while the backup uses .kibana_....
Analysis Summary:
I compared the impacted dashboard's panelsJSON field between .kibana_1 and .kibana_2.
Observations:
- .kibana_1 Panels:
  - "version": "2.3.0-oracle.24"
  - dashboard.migrationVersion: 7.9.0
  - visualization.migrationVersion: 7.10.0
- .kibana_2 Panels:
  - "version": "2.15.0-oracle.19"
  - dashboard.migrationVersion: 7.9.0
  - visualization.migrationVersion: 7.10.0
Both indices (.kibana_1 and .kibana_2) show identical top-level migrationVersion values for the dashboard and visualizations. However, the individual panel versions differ.
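The comparison above can be repeated for any dashboard with a small script along these lines. It is a rough sketch that assumes the usual saved-object layout (a `dashboard.panelsJSON` string holding a JSON array of panels, plus a top-level `migrationVersion` object). The two sample documents are trimmed-down, hypothetical stand-ins that only carry the version values quoted above, not real exports.

```python
import json
from collections import Counter


def summarize_dashboard(source: dict) -> dict:
    """Return the top-level migrationVersion and a count of per-panel versions."""
    # panelsJSON is stored as a string, so it has to be parsed separately.
    panels = json.loads(source["dashboard"]["panelsJSON"])
    versions = Counter(panel.get("version", "<none>") for panel in panels)
    return {
        "migrationVersion": source.get("migrationVersion", {}),
        "panel_versions": dict(versions),
    }


# Hypothetical _source documents for the same dashboard in each index.
doc_kibana_1 = {
    "type": "dashboard",
    "dashboard": {
        "panelsJSON": json.dumps(
            [{"version": "2.3.0-oracle.24", "panelRefName": "panel_0"}]
        )
    },
    "migrationVersion": {"dashboard": "7.9.0"},
}
doc_kibana_2 = {
    "type": "dashboard",
    "dashboard": {
        "panelsJSON": json.dumps(
            [{"version": "2.15.0-oracle.19", "panelRefName": "panel_0"}]
        )
    },
    "migrationVersion": {"dashboard": "7.9.0"},
}

for name, doc in ((".kibana_1", doc_kibana_1), (".kibana_2", doc_kibana_2)):
    print(name, summarize_dashboard(doc))
```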
Hypothesis:
Although the top-level dashboard.migrationVersion is set to 7.9.0, it appears that OpenSearch Dashboards 2.15 determines how to hydrate the saved object based on the version of the individual panels in panelsJSON.
- When the system detects panel versions ≥ 2.15.0, it likely triggers the newer (7.10+) hydration logic.
- However, since the dashboard-level metadata remains at an older version (7.9.0), the object may lack the normalized fields expected by the newer hydration logic, leading to partial or failed rendering (i.e., “Could not locate that visualization” errors).
Why reverting the alias to .kibana_1 works:
- .kibana_1 contains dashboards and panels created with 2.3.0.
- With both the dashboard and panel versions aligned to the pre-7.10 era, hydration logic defaults to the legacy safe path, handling references and structure in a backward-compatible way.
Possible Regression or Behavior Change:
- In version 2.3.0, OpenSearch Dashboards correctly resolved saved object references (e.g., visualizations) entirely from the tenant index.
- In version 2.15.0, it seems to expect referenced objects to also exist in the global index, and fails silently if they don’t, displaying generic missing visualization errors.
This could indicate either:
- A regression in reference resolution logic, or
- An intentional behavioral change that hasn’t been clearly documented.
- Or I am missing some important piece in the middle.
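One way I could test the reference-resolution idea is to check, per index, whether every object a dashboard references actually exists there. The sketch below assumes saved-object documents use `<type>:<id>` as their `_id` and carry a `references` array. The function name, the sample dashboard, and the ID sets are all hypothetical placeholders, not data from our clusters.

```python
def missing_references(dashboard_source: dict, existing_ids: set) -> list:
    """List referenced '<type>:<id>' documents that are absent from existing_ids."""
    missing = []
    for ref in dashboard_source.get("references", []):
        doc_id = f"{ref['type']}:{ref['id']}"
        if doc_id not in existing_ids:
            missing.append(doc_id)
    return missing


# Hypothetical dashboard with two panel references.
dashboard = {
    "references": [
        {"name": "panel_0", "type": "visualization", "id": "abc"},
        {"name": "panel_1", "type": "visualization", "id": "def"},
    ]
}

# Hypothetical sets of document _ids collected from each index.
tenant_ids = {"visualization:abc", "visualization:def"}
global_ids = {"visualization:abc"}

print(missing_references(dashboard, tenant_ids))  # []
print(missing_references(dashboard, global_ids))  # ['visualization:def']
```

If the tenant index resolves everything while the global index does not, that would at least be consistent with the behavior change described above. It would not prove it.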
Request for Guidance:
Could you please confirm:
- Whether this understanding aligns with known changes in object hydration or saved object migration logic between 2.3 and 2.15?
- If this behavior is expected, is there a recommended way to ensure reliable migration from older .kibana indices to .kibana_2 without breaking visualization references?
This issue is currently blocking us, as we must apply a manual alias rollback after every pod restart. Any insights or recommended next steps would be greatly appreciated. I can share additional logs, index exports, or config details if helpful.