Does Field Level "fully searchable encryption" make sense? This is for breach-proofing the index

Hi everyone,
A few of us at Titaniam have built field-level security (by encrypting both the source document and the index) while still allowing full-featured search. Storage and ingest time increase slightly; search performance is comparable.

If I understand this correctly, you want to allow searching certain fields but prevent retrieving them in the source.

I implemented this for a “content” field by creating two indices containing the same documents. One index (the read index) contains the content field and the metadata, while the second (the browse index) contains the metadata but not the content field.

Then I used document-level security to control which documents are returned to each user based on his/her access permissions.

This approach gives faster search but increases the required storage.
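
For anyone who wants to picture the two-index layout described above, here is a minimal sketch in Python. It only shows how one source document would be split into the two copies; the function name `split_for_indices`, the sample field names, and the sample document are made up for illustration, and the actual indexing calls and the document-level security rules are left out.

```python
from copy import deepcopy


def split_for_indices(doc, secured_field="content"):
    """Return (read_doc, browse_doc) for a two-index layout.

    read_doc   -> full copy, including the secured field (read index)
    browse_doc -> metadata only, secured field dropped (browse index)
    """
    read_doc = deepcopy(doc)
    browse_doc = {
        key: deepcopy(value)
        for key, value in doc.items()
        if key != secured_field
    }
    return read_doc, browse_doc


# Hypothetical sample document.
doc = {"title": "Q3 report", "owner": "hasan", "content": "confidential figures"}
read_doc, browse_doc = split_for_indices(doc)
```

Every document is stored twice, which is where the extra storage mentioned above comes from.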

Does it make sense? Requirement-wise, it does for my customers. Implementation-wise, I was hoping that ODFE (or OpenSearch) would provide combined document- and field-level security, in addition to searchable secured fields (searchable but not retrievable). Having this feature would eliminate the need for the additional index and the storage overhead that comes with it.

The intent is to breach-proof an index containing sensitive data. Even if an index is breached, how can we ensure that nothing is compromised? In the example above, the data remains fully searchable. Both the index and the source document contain only encrypted text for the sensitive fields (not all fields). A downstream system can translate the search results to clear text, but that can be done at the point of consumption (for example, in the mail server when we need the email addresses to send email).
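
To make the idea of "searchable without clear text in the index" concrete, here is a small sketch of one well-known building block: a keyed-hash "blind index" using HMAC-SHA256. This is not Titaniam's method (the thread does not describe it), and it only supports exact-match lookups, not full-featured search. The key, the field name `email_token`, and the sample records are all assumptions for illustration. A keyed hash is also one-way, so a real system would store a separately encrypted copy of the value for the downstream system to decrypt.

```python
import hashlib
import hmac

# Hypothetical demo key; a real deployment would fetch this from a key manager.
INDEX_KEY = b"demo-key-not-for-production"


def blind_token(value, key=INDEX_KEY):
    """Return a deterministic keyed hash of the normalized value."""
    normalized = value.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# What the index would hold: tokens only, no clear-text email addresses.
stored = [
    {"id": 1, "email_token": blind_token("alice@example.com")},
    {"id": 2, "email_token": blind_token("bob@example.com")},
]


def search(records, email):
    """Exact-match search: tokenize the query, then compare tokens."""
    wanted = blind_token(email)
    return [
        record["id"]
        for record in records
        if hmac.compare_digest(record["email_token"], wanted)
    ]
```

With this sketch, `search(stored, " Alice@Example.com")` finds record 1 because the query is normalized and tokenized with the same key, while someone who steals `stored` without the key sees only hex digests.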

Making a copy of the index makes the copy susceptible to breach.

In that case our requirements are different, with a bit of overlap.

Anyway, the ODFE docs suggest a way to implement node-wide encryption at rest; see "Encryption at Rest" in the Open Distro documentation.

Titaniam is extending the encryption to cover data-in-use. This gives always-on protection regardless of who is logged in or accessing the index. I saw data-at-rest mentioned and wanted to call out the difference: this covers data-at-rest but also includes the harder use case of protecting data-in-use.

Hi Hasan,
Quite frequently, sensitive data gets into logs. Per NIST, even public IP addresses are considered PII. With GDPR and other similar regs (there are 100+ now :-), it is a whole lot simpler to just tell your auditors, "Yeah, we encrypt all that stuff." We want to enable a “secured by default” posture, and make it dead simple to get there without breaking the wallet.

One of my past employers had logs that were being routed to Elastic, and the logs contained customer phone numbers. Bizarre but true. And every engineer in the company had access to these phone numbers. They used neither Open Distro nor X-Pack, so it was all in the open. Yeah, they would do well to switch to OpenSearch soon given the license change.

Check out the info video… if you have time. Thanks for engaging.