Hi everyone, I trust you all are well!
I’m trying to understand how OpenSearch calculates the total storage used per index. Specifically, I want to clarify which of the following describes what is counted toward the storage cost:
- Option 1: Total document size on the cluster is based only on the field size.
- Option 2: Total storage includes both _source and field data.
For reference, I’m using the following query:
GET sample_indexing_test/_search
{
  "size": 10,
  "stored_fields": ["_size"],
  "query": {
    "match_all": {}
  },
  "fields": ["*"],
  "_source": true
}
And here’s a sample response:
{
  "_index": "sample_indexing_test",
  "_id": "WV91o5gBUI7NisIKDqg5",
  "_score": 1,
  "_size": 170,
  "_source": {
    "title": "OpenSearch for Beginners",
    "category": "Books",
    "description": "A complete guide to OpenSearch basics.",
    "price": 29.99,
    "sku": "BK-OS-001"
  },
  "fields": {
    "price": [29.99],
    "description": ["A complete guide to OpenSearch basics."],
    "title": ["OpenSearch for Beginners"],
    "sku": ["BK-OS-001"],
    "category": ["Books"]
  }
}
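One sanity check I could run locally is to compare the reported _size against the UTF-8 byte length of the _source JSON under a few serializations. This is only a rough sketch: the helper name `candidate_source_sizes` is made up for illustration, and I am assuming the whitespace of the originally indexed request body may differ from any of these variants, so an exact match is not guaranteed.

```python
import json

# _source copied from the sample hit above
SAMPLE_SOURCE = {
    "title": "OpenSearch for Beginners",
    "category": "Books",
    "description": "A complete guide to OpenSearch basics.",
    "price": 29.99,
    "sku": "BK-OS-001",
}

REPORTED_SIZE = 170  # the _size value returned for this hit


def utf8_len(text: str) -> int:
    """Length of text in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def candidate_source_sizes(source: dict) -> dict:
    """Hypothetical helper: byte sizes of a few JSON serializations of source."""
    return {
        # no whitespace between tokens
        "compact": utf8_len(json.dumps(source, separators=(",", ":"))),
        # Python's default separators: ", " and ": "
        "default": utf8_len(json.dumps(source)),
        # pretty-printed with a two-space indent
        "indent_2": utf8_len(json.dumps(source, indent=2)),
    }


if __name__ == "__main__":
    for name, size in candidate_source_sizes(SAMPLE_SOURCE).items():
        print(f"{name}: {size} bytes (reported _size: {REPORTED_SIZE})")
```

If one of these lands near 170 while the combined size of _source plus the fields values does not, that would at least hint at which of the two options _size corresponds to.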
Given this, I’d like to understand whether the _size value reflects just the _source, just the fields, or both. And ultimately, how does this relate to the actual storage footprint of the index?
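For the on-disk side of the comparison, my assumption is that the index stats API (GET sample_indexing_test/_stats/store) is the place to look, since its response reports store.size_in_bytes for primaries and for all shard copies. Below is a minimal sketch for pulling those two numbers out of such a response. The function name and the byte counts in `EXAMPLE_STATS` are placeholders I made up, not values from my cluster.

```python
from typing import Tuple


def store_bytes(stats: dict, index: str) -> Tuple[int, int]:
    """Return (primary_bytes, total_bytes) for one index from a _stats/store response."""
    entry = stats["indices"][index]
    primary = entry["primaries"]["store"]["size_in_bytes"]
    total = entry["total"]["store"]["size_in_bytes"]  # primaries plus replicas
    return primary, total


# Placeholder response fragment; the numbers are invented for illustration.
EXAMPLE_STATS = {
    "indices": {
        "sample_indexing_test": {
            "primaries": {"store": {"size_in_bytes": 5000}},
            "total": {"store": {"size_in_bytes": 10000}},
        }
    }
}

if __name__ == "__main__":
    primary, total = store_bytes(EXAMPLE_STATS, "sample_indexing_test")
    print(f"primaries: {primary} bytes, total incl. replicas: {total} bytes")
```

Dividing the primary store size by the document count would give an average on-disk cost per document to set against the per-document _size values.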
Any insights or references to documentation would be greatly appreciated.
Thanks in advance!