Problem Summary:
I am migrating data from Elasticsearch (v7.17.9) to OpenSearch (v2.x). The migration is mostly successful; however, not all data types are correctly preserved in OpenSearch. This results in mapping conflicts and incomplete field representations, especially for fields with dynamic mapping in Elasticsearch.
Environment:
- Source: Elasticsearch v7.17.9
- Target: OpenSearch v2.x
- Migration method: Python script using the `elasticsearch` and `opensearch-py` clients with the scroll API and bulk indexing
- Index involved: `nscdata`
Observed Issues:
- Some fields come in as `text` in OpenSearch even though they were `keyword` or `integer` in Elasticsearch.
- Nested fields or arrays are flattened or interpreted incorrectly.
- Mappings are not preserved despite trying to copy settings and mappings using:

```python
es.indices.get_mapping(index='nscdata')
es.indices.get_settings(index='nscdata')
```
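One detail that may matter here: both calls return a response keyed by index name, and the settings include values the cluster generates itself, which cannot be supplied when creating an index. The following is a minimal sketch of unwrapping those responses into a creation body. The helper name `build_create_body` and the list of read-only keys are assumptions for illustration, and the list may not be exhaustive for every cluster.

```python
# Hypothetical helper: turn the get_mapping / get_settings responses into a
# body suitable for index creation on the target cluster.

# Assumed set of settings generated by the cluster and rejected on creation.
READ_ONLY_SETTINGS = {"uuid", "creation_date", "provided_name", "version"}


def build_create_body(mapping_resp, settings_resp, index):
    """Unwrap the per-index responses and drop read-only settings."""
    mappings = mapping_resp[index]["mappings"]
    index_settings = settings_resp[index]["settings"]["index"]
    cleaned = {
        key: value
        for key, value in index_settings.items()
        if key not in READ_ONLY_SETTINGS
    }
    return {"settings": {"index": cleaned}, "mappings": mappings}


# Sample responses shaped like those returned by the source cluster.
mapping_resp = {
    "nscdata": {"mappings": {"properties": {"device_id": {"type": "keyword"}}}}
}
settings_resp = {
    "nscdata": {
        "settings": {
            "index": {
                "number_of_shards": "1",
                "uuid": "abc",
                "creation_date": "0",
                "provided_name": "nscdata",
                "version": {"created": "7170999"},
            }
        }
    }
}

body = build_create_body(mapping_resp, settings_resp, "nscdata")
```

The resulting `body` could then be passed to `indices.create(index="nscdata", body=body)` on the `opensearch-py` client before any documents are indexed.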
Steps Taken:
- Extracted source mappings using `GET nscdata/_mapping`
- Created a new index in OpenSearch using these mappings
- Used the scroll API to extract documents and bulk indexed them into OpenSearch
- Verified data with `GET nscdata/_mapping` on both ends
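The order of these steps matters: if the target index does not exist when the first bulk request arrives, it is created automatically with dynamic mapping, and an existing field's type cannot be changed afterwards. The sketch below shows one way to guard that order and to shape scroll hits into bulk actions. The function names `ensure_index` and `to_bulk_actions` are hypothetical, and `client` is assumed to be an `opensearch-py` client (or anything exposing the same `indices.exists` / `indices.create` calls).

```python
def ensure_index(client, index, body):
    """Create the index with explicit mappings unless it already exists.

    Returns True if the index was created, False if it was already there.
    An index that already exists keeps whatever mapping it has, so a False
    result is a cue to inspect (or delete and recreate) it before loading.
    """
    if client.indices.exists(index=index):
        return False
    client.indices.create(index=index, body=body)
    return True


def to_bulk_actions(hits, target_index):
    """Convert search/scroll hits into actions for a bulk helper."""
    for hit in hits:
        yield {
            "_op_type": "index",
            "_index": target_index,
            "_id": hit["_id"],
            "_source": hit["_source"],
        }


# Example with a hit shaped like one returned by the scroll API.
hits = [{"_id": "1", "_source": {"device_id": "abc"}}]
actions = list(to_bulk_actions(hits, "nscdata"))
```

In a real run, `hits` would typically come from `elasticsearch.helpers.scan` on the source client, and the generated actions would be passed to `opensearchpy.helpers.bulk` on the target client.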
Sample Mapping Difference:
Elasticsearch (source):

```json
"device_id": {
  "type": "keyword"
}
```
OpenSearch (target):

```json
"device_id": {
  "type": "text"
}
```
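To find every field affected rather than comparing mappings by eye, the two `properties` sections can be flattened and diffed. This is a rough sketch with hypothetical helper names (`field_types`, `mapping_diff`); it compares only the top-level `type` of each field and ignores multi-fields and other mapping parameters. Note that default dynamic mapping maps strings to `text` with a `keyword` sub-field, so a target full of such fields may suggest the explicit mapping was not in effect when documents were indexed.

```python
def field_types(properties, prefix=""):
    """Flatten a mapping's "properties" into {dotted.path: type}."""
    types = {}
    for name, spec in properties.items():
        path = prefix + name
        # Object fields carry no explicit "type" in the mapping.
        types[path] = spec.get("type", "object")
        # Recurse into object/nested sub-fields, if any.
        types.update(field_types(spec.get("properties", {}), path + "."))
    return types


def mapping_diff(source_props, target_props):
    """Return {path: (source_type, target_type)} for fields that differ."""
    src = field_types(source_props)
    dst = field_types(target_props)
    return {
        path: (src.get(path), dst.get(path))
        for path in sorted(set(src) | set(dst))
        if src.get(path) != dst.get(path)
    }


# The sample difference from above.
source = {"device_id": {"type": "keyword"}}
target = {"device_id": {"type": "text"}}
diff = mapping_diff(source, target)
```

A field missing on one side shows up with `None` in the corresponding position of the tuple.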
Question:
- How can I fully preserve the original Elasticsearch mappings, especially field types, when migrating data to OpenSearch?
- Is there a recommended tool or plugin for migration that maintains full mapping integrity?
- Is this a known issue when using Python scripts and `opensearch-py` for migration?