Term aggregation of nested fields

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser): 2.19.4

Describe the issue:

Hi,

I’m attempting to perform a terms aggregation on a nested field. The following query returns a result, but the output is a nested structure with buckets inside buckets.

GET /test_nested_field/_search?size=0
{
  "aggs": {
    "book": {
      "nested": {
        "path": "book"
      },
      "aggs": {
        "title": {
          "terms": {
            "field": "book.title.keyword"
          },
          "aggs": {
            "version": {
              "terms": {
                "field": "book.version.keyword"
              }
            }
          }
        }
      }
    }
  }
}

Is there a way to write the aggregation query so that the results keep the nested field contents grouped together? Ideally, the results would look something like:

{
  "book": {
    "buckets": [
      {
        "key": {"title": "a", "version": "v1.0"},
        "doc_count": 1
      },
      {
        "key": {"title": "b", "version": "v1.0"},
        "doc_count": 1
      },
      {
        "key": {"title": "b", "version": "v2.0"},
        "doc_count": 2
      }
    ]
  }
}

instead of:

{
  "book": {
    "title": {
      "buckets": [
        {
          "key": "a",
          "version": {
            "buckets": [
              {
                "key": "v1.0",
                "doc_count": 1
              }
            ]
          }
        },
        {
          "key": "b",
          "version": {
            "buckets": [
              {
                "key": "v1.0",
                "doc_count": 1
              },
              {
                "key": "v2.0",
                "doc_count": 2
              }
            ]
          }
        }
      ]
    }
  }
}

This is the mapping and some sample documents:

PUT /test_nested_field
{
  "mappings": {
    "properties": {
      "book": {
        "type": "nested"
      }
    }
  }
}

POST /test_nested_field/_doc
{
  "project": "One",
  "book": {
    "title": "a",
    "version": "v1.0"
  }
}

POST /test_nested_field/_doc
{
  "project": "Two",
  "book": {
    "title": "b",
    "version": "v1.0"
  }
}

POST /test_nested_field/_doc
{
  "project": "Three",
  "book": {
    "title": "b",
    "version": "v2.0"
  }
}

POST /test_nested_field/_doc
{
  "project": "Four",
  "book": {
    "title": "b",
    "version": "v2.0"
  }
}

Thank you.

@daniel13 Thank you for the question. Do you mean you are looking for something like this?

GET test_nested_field/_search?size=0
{
  "aggs": {
    "book": {
      "nested": { "path": "book" },
      "aggs": {
        "title_version": {
          "multi_terms": {
            "terms": [
              { "field": "book.title.keyword" },
              { "field": "book.version.keyword" }
            ]
          }
        }
      }
    }
  }
}

The returned value would look like this:

{
  "took": 57,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "book": {
      "doc_count": 4,
      "title_version": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": [
              "b",
              "v2.0"
            ],
            "key_as_string": "b|v2.0",
            "doc_count": 2
          },
          {
            "key": [
              "a",
              "v1.0"
            ],
            "key_as_string": "a|v1.0",
            "doc_count": 1
          },
          {
            "key": [
              "b",
              "v1.0"
            ],
            "key_as_string": "b|v1.0",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
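
Note that multi_terms returns each key as an array (["b", "v2.0"]) rather than as an object with named fields. If the exact shape from the question is needed, one option is to reshape the buckets on the client side. The following is only a minimal sketch in Python: the function name flatten_multi_terms and its parameters are hypothetical, and it assumes the response has already been parsed into a dict (for example, the return value of a client search call). The sample dict is a trimmed copy of the response above.

```python
def flatten_multi_terms(response, nested_name="book",
                        agg_name="title_version",
                        fields=("title", "version")):
    """Turn multi_terms array keys into {"field": value} objects.

    `fields` must list the field labels in the same order as the
    "terms" array of the multi_terms aggregation.
    """
    buckets = response["aggregations"][nested_name][agg_name]["buckets"]
    return {
        nested_name: {
            "buckets": [
                {
                    # Pair each label with the key value at the same position.
                    "key": dict(zip(fields, bucket["key"])),
                    "doc_count": bucket["doc_count"],
                }
                for bucket in buckets
            ]
        }
    }


# Trimmed copy of the aggregation response shown above.
response = {
    "aggregations": {
        "book": {
            "doc_count": 4,
            "title_version": {
                "buckets": [
                    {"key": ["b", "v2.0"], "key_as_string": "b|v2.0", "doc_count": 2},
                    {"key": ["a", "v1.0"], "key_as_string": "a|v1.0", "doc_count": 1},
                    {"key": ["b", "v1.0"], "key_as_string": "b|v1.0", "doc_count": 1},
                ]
            },
        }
    }
}

reshaped = flatten_multi_terms(response)
for bucket in reshaped["book"]["buckets"]:
    print(bucket)
```

The bucket order is kept as returned by the aggregation (highest doc_count first), so sort the list on the client if a different order is required.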

Hi @Anthony ,

Most interesting. I think this will do the trick.

Thank you!
