Document Groups and Document Level Security

Hey,

I’m developing a solution that stores geographic data in OpenSearch and retrieves it from there. Data entries look like the following:

{
 "cityid": "a unique id of the city",
 "city": "the name of a city",
 "countyid": "the county of the city",
 "payload": {
   "some": "payload"
 }
}

The payload is sensitive data, which requires the following restrictions to be implemented:

  1. an analyst must be granted access to the data of a city individually by an administrator
  2. if an analyst gets access to the data of a city, he must get access to the corresponding countyid and all other entries with the same countyid
  3. there will be more than one analyst accessing the data of a specific city
  4. there will be incremental updates to the data set, possibly adding unseen cities in existing counties, or even new counties

The data set is already quite large (millions of entries, thousands of cities and hundreds of counties), with a potentially large number of analysts (1000+).

Using document level security looks like the way to go. The problem is that I currently do not see a way to enforce the rules without implementing some maintenance tasks, e.g., executed via cron or after new data has been ingested.

Do you have any ideas for using document level security (or another permission scheme) to achieve the outlined view restrictions on the data, ideally without any additional piece of software involved?

Thank you!

What do you mean by that? DLS is implemented in the roles.

Let me clarify: one idea is to generate a terms query containing all the countyids the analyst should have access to, put it into a role as the DLS query, and assign that role to the analyst. The extra step comes from the process: the administrator assigns permissions by cityid, so the countyid has to be derived from the cityid.

In other words: from my current understanding, the step from the city to the county, when only the city is known, cannot be done in a DLS query, and therefore requires some other software to do that, right?
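
To make the idea above concrete, here is a minimal sketch of the kind of maintenance step I have in mind. It assumes the city-to-county mapping is already available as a dictionary; the function names, the sample ids and the index name are made up for illustration. The role body follows the security plugin's role format as I understand it, where the DLS query is passed as a JSON string.

```python
import json


def county_ids_for(city_ids, city_to_county):
    """Return the sorted, de-duplicated countyids for the granted cityids."""
    return sorted({city_to_county[city_id] for city_id in city_ids})


def build_role(index_pattern, county_ids):
    """Build a role body whose DLS query is a terms filter on countyid."""
    dls_query = {"terms": {"countyid": list(county_ids)}}
    return {
        "index_permissions": [
            {
                "index_patterns": [index_pattern],
                # The DLS query is stored as a JSON string, not as an object.
                "dls": json.dumps(dls_query),
                "allowed_actions": ["read"],
            }
        ]
    }


# Hypothetical sample data: two cities share a county.
city_to_county = {
    "city-1": "county-a",
    "city-2": "county-a",
    "city-3": "county-b",
}
granted_cities = ["city-1", "city-3"]  # what the administrator assigned

role = build_role("some-index", county_ids_for(granted_cities, city_to_county))
print(json.dumps(role, indent=2))
```

Because the role only lists countyids, a new city in an already granted county becomes visible without touching the role. A grant for a city in a new county still needs this step to run again, which is exactly the maintenance task I would like to avoid.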

@mschmidt Have you tried term-level lookup in DLS query?

I have tried term-level lookup without success. My understanding is that it only works if I know the document ID. In my case the cityid is not the document ID, or am I missing something in the correct usage of term-level lookup?

Here is an example from DevTools:

{
  "query": {
    "terms": {
        "countyid" : {
            "index" : "some-index",
            "cityid" : "1",
            "path" : "countyid"
        }
    }
  }
}

The exception is

… [terms_lookup] unknown field [cityid]

or, if I omit the ID entirely instead of replacing it with cityid, I get

Required [id]
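
Both errors suggest that a terms lookup only accepts a document ID (`id`) next to `index` and `path`, not an arbitrary field such as cityid. A possible workaround, sketched below, would be a separate lookup index in which each document's ID is the cityid and the document holds the countyid. This is only an assumption on my side: the index name `city-lookup` and the helper functions are hypothetical, the lookup index would have to be filled during ingest, and I have not confirmed that such a query behaves as intended inside a DLS role.

```python
import json


def terms_lookup(field, index, doc_id, path):
    """Build a terms lookup clause; the looked-up document is addressed by its ID."""
    return {"terms": {field: {"index": index, "id": doc_id, "path": path}}}


def dls_for_cities(city_ids, lookup_index="city-lookup"):
    """Combine one lookup per granted city; a document matches if any lookup matches.

    Assumes a hypothetical lookup index whose document IDs are the cityids
    and whose documents contain a countyid field.
    """
    clauses = [
        terms_lookup("countyid", lookup_index, city_id, "countyid")
        for city_id in city_ids
    ]
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


query = dls_for_cities(["1", "7"])
print(json.dumps(query, indent=2))
```

Compared with the failing request above, the only structural change is that the lookup is keyed by `id`, which is why the cityid would have to double as the document ID of the lookup index.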