Keyword field unable to be used for aggregation

Versions:
v 1.0.0
Amazon OpenSearch Service
Managed Cluster

Describe the issue:
I currently have a search that fires every five minutes via an AWS lambda that makes a request to the _search endpoint for my index. This search has been working successfully for four months, but suddenly today it has started giving me the following error:

Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [webaclId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.

This error only happens when I run the search via my lambda. When I run the same search using the dev tools in the dashboard it runs just fine.

As best as I can tell, I’ve configured webaclId as a keyword field, not a text field, as per my mapping shared below, so I can’t figure out the issue. Has anyone else experienced this or found a solution?

The search is as follows:

{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "wildcard": {
                "rateBasedRuleList.limitKey": "*"
              }
            },
            {
              "match_phrase": {
                "nonTerminatingMatchingRules.action": "COUNT"
              }
            },
            {
              "match_phrase": {
                "terminatingRuleId": "Default_Action"
              }
            },
            {
              "range": {
                "timestamp": {
                  "gte": "now-1h"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
    "webaclId_count": {
      "terms": {
        "field": "webaclId"
      },
      "aggs": {
        "single_webaclId_count": {
          "value_count": {
            "field": "webaclId"
          }
        },
        "limit": {
          "bucket_selector": {
            "buckets_path": {
              "count": "single_webaclId_count"
            },
            "script": "params.count > 100"
          }
        }
      }
    }
  }
}

The relevant section of the mapping template is as follows:

{
  "myindex" : {
    "mappings" : {
      "properties" : {
        "nonTerminatingMatchingRules" : {
          "properties" : {
            "action" : {
              "type" : "keyword"
            },
            "ruleId" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "ruleMatchDetails" : {
              "properties" : {
                "conditionType" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "location" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "matchedData" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "sensitivityLevel" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            }
          }
        },
        "rateBasedRuleList" : {
          "properties" : {
            "limitKey" : {
              "type" : "keyword"
            },
            "limitValue" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "maxRateAllowed" : {
              "type" : "long"
            },
            "rateBasedRuleId" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "rateBasedRuleName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        },
        "terminatingRuleId" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "timestamp" : {
          "type" : "date"
        },
        "webaclId" : {
          "type" : "keyword"
        }
      }
    }
  }
}

Full error:

{
    "error": {
        "root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [webaclId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
            }
        ],
        "type": "search_phase_execution_exception",
        "reason": "",
        "phase": "fetch",
        "grouped": true,
        "failed_shards": [
            {
                "shard": 0,
                "index": "redacted",
                "node": "redacted",
                "reason": {
                    "type": "illegal_argument_exception",
                    "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [webaclId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
                }
            }
        ],
        "caused_by": {
            "type": "null_pointer_exception",
            "reason": null
        }
    },
    "status": 400
}

Maybe not all of the indices you search against have the same mapping, you can check whether the field webaclId in all of the indices are keyword type or not.

Thank you for the good suggestion.

I went to have a look at the index pattern and found this. Based on “Searchable” and “Aggregatable” being green, I assume this is correctly a keyword across the whole index pattern that I am searching.

If it were different in some of the indexes I was searching I imagine I would see a conflict for this field.

Further screenshot of the settings when I go to edit the field

You can see the mapping in Dev Tools as well, something like:

GET /index-pattern*/_mapping

You should see a response with all the indices matching your pattern. Search for your field name there and see which one is a text field, because my assumption is that your lambda function hits one of these fields.

If there’s nothing in there, I’d also look for the same field name in other indices (e.g. by getting all the mappings via GET /_mapping

You can see the mapping in Dev Tools as well, something like:
GET /index-pattern*/_mapping
You should see a response with all the indices matching your pattern. Search for your field name there and see which one is a text field, because my assumption is that your lambda function hits one of these fields.

Using this method I found some instances of the following on some really old indexes:

"webaclId" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
}

I assume the solution then would be to reindex these older indexes to match the mapping below that I’m expecting/need for my query?

"webaclId" : {
  "type" : "keyword"
}

That’s correct (sorry for the late reply).

Is it possible to perform a bulk reindex/mapping update for a number of old indexes?

Yes, I think you can hit an index pattern as a source of your reindex API call, and then the destination can be an alias. If an alias rotates automatically (because it’s managed by ISM), then you should be good.