Unwind nested documents

Hello,

Couldn’t find a solution for MongoDB-like “$unwrap” functionality for the nested documents, can anybody help?

I have two documents with “nested” prices field (a document is a product with multiple nested vendor prices):

{
	"_id": "1",
	"name": "Product 1",
	"prices": [
        {
		  "vendor_id": 1,
		  "cost": 10,
		},
        {
		  "vendor_id": 2,
		  "cost": 20,
		}
    ],
},
{
	"_id": "2",
	"name": "Product 2",
	"prices": [
        {
		  "vendor_id": 1,
		  "cost": 100,
		},
        {
		  "vendor_id": 2,
		  "cost": 200,
		}
    ],
}

The search results query should return four documents, with each vendor price “unwinded” (injected into main product document):

{
	"_id": "1XXXX",
	"name": "Product 1",
	"vendor_id": 1,
	"cost": 10,
},
{
	"_id": "1XXXX",
	"name": "Product 1",
	"vendor_id": 2,
	"cost": 20,
},
{
	"_id": "1XXXX",
	"name": "Product 2",
	"vendor_id": 1,
	"cost": 100,
},
{
	"_id": "1XXXX",
	"name": "Product 2",
	"vendor_id": 2,
	"cost": 200,
}

The only idea for now, is to index each product-price pair as a separate document, but it creates extra data because product-specific fields like ‘name’ are duplicated in each product-price document.

1 Like

+1
I have the same question.

Retrieve inner hits - OpenSearch Documentation is it what you are looking for?

Unfortunately, not what’s required.
The example still returns only one top-level hit with two nested hits.
I expect to see two top-level hits with nested hit.
So, each nested hit should create a top-level hit with top-level document data + the data of of single nested hit.

I think, the only way is to create a plugin with response processor. But even it’s possible paging and other might be problematic.

another idea - use aggregation: if you use
agg : { nested : {top_hits:{…}}} it should return nested docs, then you may just repeat top level doc fields in nested and it should solve the puzzle.

1 Like

sounds like a nice idea, thanks!
another problem is that I still should see the top-level document, even if there are no nested hits.

SQL equivalent is:

SELECT products.*, prices.*
FROM products
LEFT JOIN prices ON prices.product_id = products.id

Sounds like you need to search for top-levels having certain nested ones. Thus, you need to search for them with nested query.

Thanks for the great idea.
Here is sample query string which I verified.

PUT my-index-000004
{
  "mappings": {
    "properties": {
      "name": {...},
      "region": {...},
      "users": {
        "type": "nested",
        "properties": {
          "address": {...},
          "email": {...},
          "id": {...},
          "name": {...}
        }
      }
    }
  }
}

POST my-index-000004/_search
{
  "size": 0,
  "aggs": {
    "aggs_nested": {
      "nested": {
        "path": "users"
      },
      "aggs": {
        "aggs_query": {
          "filter": {
            "match": {
              "users.name": {
                "query": "Steven"
              }
            }
          },
          "aggs": {
            "top_result": {
              "top_hits": {
                "size": 20,
                "from": 0,
                "sort": [
                  {
                    "users.name.keyword": {
                      "order": "asc"
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}

Here is the query string simulating left join in SQL.
Hope this helps!

{
  "size": 0,
  "aggs": {
    "aggs_filter": {
      "filter": {
        "bool": {
          "should": [
            {
              "nested": {
                "path": "users",
                "query": {
                  "term": {
                    "users.name.keyword": {
                      "value": "Steven",
                      "boost": 1.0
                    }
                  }
                }
              }
            },
            {
              "bool": {
                "must_not": [
                  {
                    "nested": {
                      "path": "users",
                      "query": {
                        "exists": {
                          "field": "users"
                        }
                      }
                    }
                  }
                ]
              }
            }
          ],
          "adjust_pure_negative": true,
          "boost": 1.0
        }
      },
      "aggs": {
        "parent": {
          "top_hits": {
            "from": 0,
            "size": 5
          }
        },
        "aggs_nested_child": {
          "nested": {
            "path": "users"
          },
          "aggs": {
            "child": {
              "top_hits": {
                "from": 0,
                "size": 5
              }
            }
          }
        }
      }
    }
  }
}