Aggregation Query in Elasticsearch for Hierarchical Data (Senior Manager -> Manager -> Team Lead -> Developer)

Hello everyone,

I am working with Elasticsearch and trying to create an aggregation query for hierarchical employee data. The data represents a company structure with Senior Managers, Managers, Team Leads, and Developers. I am using nested fields for the employee data (employees), and I need to perform aggregations to list employees in the following hierarchy:

  1. Senior Manager (role: "Senior Manager")
  2. Under each Senior Manager, list all Managers.
  3. For each Manager, list the Team Leads under them.
  4. For each Team Lead, list the Developers under them.

Mapping:

Here’s the mapping for the employee data:

{
“properties”: {
“Company”: {
“properties”: {
“id”: {
“type”: “keyword”
},
“name”: {
“type”: “text”
},
“domain”: {
“type”: “text”
},
“employees”: {
“type”: “nested”,
“properties”: {
“id”: {
“type”: “keyword”
},
“first_name”: {
“type”: “text”
},
“role”: {
“type”: “keyword”
},
“dept”: {
“type”: “keyword”
},
“parent_id”: {
“type”: “keyword”
}
}
}
}
}
}
}

Data Example:

Here’s a sample document for a company:

{
“Company”: {
“id”: “company1”,
“name”: “Company A”,
“domain”: “company-a.com”,
“employees”: [
{ “id”: 1, “first_name”: “SM1”, “role”: “Senior Manager”, “dept”: “Dept 1”, “parent_id”: null },
{ “id”: 2, “first_name”: “M1”, “role”: “Manager”, “dept”: “Dept 1”, “parent_id”: 1 },
{ “id”: 3, “first_name”: “TL1”, “role”: “Team lead”, “dept”: “Dept 2”, “parent_id”: 2 },
{ “id”: 4, “first_name”: “D1”, “role”: “Developer”, “dept”: “Dept 3”, “parent_id”: 3 },
{ “id”: 5, “first_name”: “SM2”, “role”: “Senior Manager”, “dept”: “Dept 4”, “parent_id”: null },
{ “id”: 6, “first_name”: “M2”, “role”: “Manager”, “dept”: “Dept 1”, “parent_id”: 5 },
{ “id”: 7, “first_name”: “TL2”, “role”: “Team lead”, “dept”: “Dept 1”, “parent_id”: 6 },
{ “id”: 8, “first_name”: “D2”, “role”: “Developer”, “dept”: “Dept 2”, “parent_id”: 7 }
]
}
}

The Problem:

I am trying to generate an aggregation query that will return the hierarchical structure as follows:

  • Senior Manager (e.g., SM1)
    • Manager (e.g., M1)
      • Team Lead (e.g., TL1)
        • Developer (e.g., D1)
    • Manager (e.g., M2)
      • Team Lead (e.g., TL2)
        • Developer (e.g., D2)

However, I’m encountering errors like:

  • Text fields are not optimized for operations: I need the names of the employees under each role (Senior Manager, Manager, Team Lead, Developer), but I’m facing issues with field type mismatches and aggregation not working correctly.
  • I have tried using role.keyword and first_name.keyword but still can’t get the correct aggregation.

What I’ve Tried:

  • Using the keyword type for aggregating role and first_name.
  • Attempting nested aggregation to handle the hierarchical data correctly.
  • Using fielddata=true for text fields, but it consumes too much memory.

My Question:

How can I successfully aggregate this nested data in Elasticsearch to get the hierarchical structure under each Senior Manager? Specifically, I need to list the Managers, Team Leads, and Developers in the correct hierarchy with names under each role.

Expected Output:

Here is the expected structure I am trying to get:

{
“aggregations”: {
“employees_nested”: {
“managers”: {
“buckets”: [
{
“key”: “1”, // Senior Manager 1
“doc_count”: 1,
“manager_ids”: {
“buckets”: [
{
“key”: “1”, // Manager 1
“doc_count”: 3,
“team_leads_nested”: {
“buckets”: [
{
“key”: “Manager”, // Team Lead 1
“doc_count”: 3,
“developers”: {
“buckets”: [
{
“key”: “D1”, // Developer 1
“doc_count”: 1
},
{
“key”: “D2”, // Developer 2
“doc_count”: 1
}
]
}
}
]
}
}
]
}
}
]
}
}
}
}

Any Suggestions?

If anyone has experience with aggregating hierarchical, nested data like this, I’d appreciate any guidance or suggestions on how to resolve the errors I’m facing.