Hello @Leeroy, thanks a lot for the write-up and your testing. Here is more context on my current case.
We have been encountering a persistent issue with missing visualizations in OpenSearch Dashboards. The affected dashboards and visualizations were originally created on version 2.3.0, and the issue began after we upgraded to 2.11. As a workaround, we reverted the .kibana alias back to .kibana_1, which had been migrated to .kibana_2 during the 2.11 upgrade. We also deleted the .kibana_2 index at that time.
Hoping the issue would be resolved, we upgraded further to 2.15. However, the problem persisted. We applied the same mitigation again: reverting the alias to .kibana_1 and deleting .kibana_2. Unfortunately, this has become a recurring pain point. After every pod restart, the issue reappears and requires manual intervention.
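For reference, the alias rollback amounts to roughly the following request body. This is a minimal sketch, assuming the alias is moved with the standard _aliases API (POST /_aliases with remove and add actions); the helper name build_alias_rollback is hypothetical, and deleting .kibana_2 is a separate step that is not shown.

```python
import json


def build_alias_rollback(alias: str, from_index: str, to_index: str) -> dict:
    """Build a body for POST /_aliases that moves `alias` in one request.

    Hypothetical helper: it only builds the payload and sends nothing.
    """
    return {
        "actions": [
            # Detach the alias from the newly migrated index ...
            {"remove": {"index": from_index, "alias": alias}},
            # ... and point it back at the pre-migration index.
            {"add": {"index": to_index, "alias": alias}},
        ]
    }


body = build_alias_rollback(".kibana", ".kibana_2", ".kibana_1")
print(json.dumps(body, indent=2))
```

Both actions go in one request, so the alias should not be left pointing at nothing in between.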
To reproduce the issue in a development environment, I exported the mappings and index objects from .kibana_1, as well as from the affected tenant index, of the main (PROD) cluster to a dev cluster. I attempted an upgrade in the dev setup, but as you observed, we were unable to reproduce the missing visualizations.
Interestingly, we have another production cluster in a different region acting as a backup for the main cluster. This backup cluster is also running OpenSearch 2.15, with the global alias correctly pointing to .kibana_2, and it does not experience the missing visualization issue. Furthermore, the .kibana_2 index mappings in the main and backup clusters are identical (as verified via GET .kibana_2/_mapping).
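In case it helps anyone repeating the mapping check, here is a minimal sketch of a recursive comparison of two mapping bodies. It assumes both GET .kibana_2/_mapping responses have already been loaded as Python dicts; the function name diff_mappings and the sample mappings are made up for illustration and are not our real mappings.

```python
def diff_mappings(a, b, path=""):
    """Return the dotted paths at which two nested mapping dicts differ."""
    diffs = []
    if isinstance(a, dict) and isinstance(b, dict):
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                # Key exists on only one side.
                diffs.append(child)
            else:
                diffs.extend(diff_mappings(a[key], b[key], child))
    elif a != b:
        # Leaf values (or mismatched types) differ.
        diffs.append(path)
    return diffs


# Illustrative data only.
main_mapping = {"properties": {"dashboard": {"properties": {"panelsJSON": {"type": "text"}}}}}
backup_mapping = {"properties": {"dashboard": {"properties": {"panelsJSON": {"type": "keyword"}}}}}

print(diff_mappings(main_mapping, main_mapping))
print(diff_mappings(main_mapping, backup_mapping))
```

An empty list means the two mapping bodies are structurally identical.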
I also reviewed the verbose migration logs for the main (PROD) cluster. They confirm that all visualizations were successfully migrated from .kibana_1 to .kibana_2 during the pod restart. However, we continue to manually revert the alias to .kibana_1 after each restart because of the missing visualizations issue. This cycle repeats with every pod restart, requiring the same manual mitigation.
One additional observation: in the main cluster, the tenant index (.kibana_1_572569278_abctenant_1, originally created in 2.3) does not migrate after a pod restart. In contrast, the backup cluster has the corresponding tenant index already migrated to .kibana_572569278_abctenant_2. It was created on 2.15. Note the difference in index naming: the main cluster uses the prefix .kibana_1_... while the backup uses the prefix .kibana_... for the same tenant.
Analysis Summary:
I compared the impacted dashboard’s panelsJSON field between .kibana_1 and .kibana_2.
Observations:
- .kibana_1 Panels:
  - "version": "2.3.0-oracle.24"
  - dashboard.migrationVersion: 7.9.0
  - visualization.migrationVersion: 7.10.0
- .kibana_2 Panels:
  - "version": "2.15.0-oracle.19"
  - dashboard.migrationVersion: 7.9.0
  - visualization.migrationVersion: 7.10.0
Both indices (.kibana_1 and .kibana_2) show identical top-level migrationVersion values for the dashboard and visualizations. However, the individual panel versions differ.
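The comparison can be sketched as follows. This assumes the dashboard document's _source has already been fetched from each index and follows the usual saved-object layout (a dashboard.panelsJSON string plus a top-level migrationVersion object). The function name summarize_dashboard is hypothetical, and the two sample documents are hand-written to mirror the values listed above; they are not real exports.

```python
import json


def summarize_dashboard(source: dict) -> dict:
    """Extract the top-level migrationVersion and the distinct panel versions."""
    # panelsJSON is stored as a JSON-encoded string, so decode it first.
    panels = json.loads(source["dashboard"]["panelsJSON"])
    return {
        "migrationVersion": source.get("migrationVersion", {}),
        "panel_versions": sorted({p.get("version", "unknown") for p in panels}),
    }


# Hand-written samples for illustration only.
doc_kibana_1 = {
    "type": "dashboard",
    "dashboard": {"panelsJSON": json.dumps([{"version": "2.3.0-oracle.24", "panelIndex": "1"}])},
    "migrationVersion": {"dashboard": "7.9.0"},
}
doc_kibana_2 = {
    "type": "dashboard",
    "dashboard": {"panelsJSON": json.dumps([{"version": "2.15.0-oracle.19", "panelIndex": "1"}])},
    "migrationVersion": {"dashboard": "7.9.0"},
}

before = summarize_dashboard(doc_kibana_1)
after = summarize_dashboard(doc_kibana_2)
print(before)
print(after)
```

With these samples the migrationVersion entries match while the panel versions differ, which is the pattern described above.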
Hypothesis:
Although the top-level dashboard.migrationVersion is set to 7.9.0, it appears that OpenSearch Dashboards 2.15 determines how to hydrate the saved object based on the version of the individual panels in panelsJSON.
- When the system detects panel versions ≥ 2.15.0, it likely triggers the newer (7.10+) hydration logic.
- However, since the dashboard-level metadata remains at an older version (7.9.0), the object may lack the normalized fields expected by the newer hydration logic, leading to partial or failed rendering (i.e., “Could not locate that visualization” errors).
Why reverting the alias to .kibana_1 works:
- .kibana_1 contains dashboards and panels created with 2.3.0.
- With both the dashboard and panel versions aligned to the pre-7.10 era, hydration logic defaults to the legacy safe path, handling references and structure in a backward-compatible way.
Possible Regression or Behavior Change:
- In version 2.3.0, OpenSearch Dashboards correctly resolved saved object references (e.g., visualizations) entirely from the tenant index.
- In version 2.15.0, it seems to expect referenced objects to also exist in the global index, and fails silently if they don’t, displaying only generic missing visualization errors.
This could indicate one of the following:
- A regression in reference resolution logic, or
- An intentional behavioral change that hasn’t been clearly documented, or
- I am missing some important piece in the middle.
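One way to test the reference-resolution theory offline is to check whether every object a dashboard references is present in a given index export. This is a minimal sketch, assuming each saved object carries a references array of {name, type, id} entries and that raw document IDs in the index take the form type:id. The function name find_missing_references and all sample IDs are hypothetical.

```python
def find_missing_references(dashboard_source: dict, known_ids: set) -> list:
    """Return the 'type:id' document IDs a dashboard references but the index lacks."""
    missing = []
    for ref in dashboard_source.get("references", []):
        doc_id = f'{ref["type"]}:{ref["id"]}'
        if doc_id not in known_ids:
            missing.append(doc_id)
    return missing


# Hypothetical sample data for illustration only.
dashboard = {
    "type": "dashboard",
    "references": [
        {"name": "panel_0", "type": "visualization", "id": "viz-a"},
        {"name": "panel_1", "type": "visualization", "id": "viz-b"},
    ],
}
tenant_index_ids = {"dashboard:dash-1", "visualization:viz-a", "visualization:viz-b"}
global_index_ids = {"visualization:viz-a"}

print(find_missing_references(dashboard, tenant_index_ids))
print(find_missing_references(dashboard, global_index_ids))
```

Running this against exports of both the tenant index and the global index would show whether the referenced visualizations exist only in the tenant index.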
Request for Guidance:
Could you please confirm:
- Whether this understanding aligns with known changes in object hydration or saved object migration logic between 2.3 and 2.15?
- If this behavior is expected, is there a recommended way to ensure reliable migration from older .kibana indices to .kibana_2 without breaking visualization references?
This issue is currently blocking us, as we must apply a manual alias rollback after every pod restart. Any insights or recommended next steps would be greatly appreciated. I can share additional logs, index exports, or config details if helpful.