Index patterns for Querying nested collection

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
2.6.0

Describe the issue:
I’m querying nested collections as described in the documentation:

I figured out that the query only works, when the array is of “nested” type. Now I would like to use index pattern, e.g

SELECT e.name AS employeeName,
       p.name AS projectName
FROM employees_nested* AS e,
       e.projects AS p
WHERE p.name LIKE '%security%'

but I get an error saying such indexes do not exist. There are multiple indexes fulfilling that pattern however.

Using simple query works:
SELECT * FROM employees_nested*

How do I define index pattern for nested collection queries?

Hey @PeterS

Have tyou tried to use the mappings tab in your Index that was created then use JSON Editor?

Hi!

Wildcards in index names and JOINs are not completely supported. This works in simple queries only. Try this:

SELECT e.name AS employeeName,
       e.projects.name AS projectName
FROM `employees_nested_exact_index_name_here` AS e
WHERE e.projects.name LIKE '%security%'

Nested field support is in development right now.

Regards,
Yury

1 Like

I use mapping to define nested type for project fields. What do you mean?

I would like to use * notation in sql queries for index names.

Thank you for confirming it.

Exact name works as expected. Also that example from the documentation uses exact index name.

However, our indexes have date suffix. E.g. employees_nested_2023_04_13 hence the wildcard * need.

I see.
I did a test to confirm that wildcard in index name works there. The main obstacle in your query is a self join defined by

employees_nested* AS e,
       e.projects AS p

Can you try this query?

SELECT e.name AS employeeName,
       e.projects.name AS projectName
FROM employees_nested* AS e
WHERE e.projects.name LIKE '%security%'

You query passes but does not return any data:

{
  "schema": [
    {
      "name": "name",
      "alias": "employeeName",
      "type": "text"
    },
    {
      "name": "projects.name",
      "alias": "projectName",
      "type": "text"
    }
  ],
  "datarows": [],
  "total": 0,
  "size": 0,
  "status": 200
}

It seem that self joining is vital for accessing array data.

Hi @PeterS,

Do you know if your data has arrays of objects within a record? For example,
array of objects: "projects": [{"name":"project1"}, {"name":"project2"}]
and
array of values: "projects": {"name":["project1","project2"]}

@yuryf-bq’s query in Index patterns for Querying nested collection - #6 by yuryf-bq seems to work with 2.6. However, there is a bug regarding arrays in objects that is critical. You need to ensure that you do not query fields where values may be arrays while waiting for nested to be fully implemented in V2.

Could you provide mappings and/or sample data for more context?

Yes, we index documents that has arrays of objects, even arrays in arrays. E.g.

{
  field_0A: ...,
  field_0B: ...,
  level1Array: [{
     field_1A: ...,
     field_1B: ...,
     ...,
     level2Array: [{
        field_2A: ...,
        field_2B: ...,
        ...
     }]
  ]
}

All arrays are of nested type to avoid array fields flattering.

I need to un-nest array fields in queries to be able to search by array object’s fields:

SELECT  ... FROM 
  parent* AS p,
  p.level1Array AS l1
WHERE    l1.field_1A = 'foo'

I’ll wait for nested to be fully implemented in V2.

1 Like

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.