Cancel after time interval clarification

Currently deployed: OpenSearch v2.3

Configuration clarification: To make sure I understand our global search configuration for a given cluster, I'd like to confirm that the following statements are correct:

  • The search.cancel_after_time_interval setting (which defaults to 300s) terminates a query at the shard level if the query has not completed within that interval, given that search.low_level_cancellation keeps its default of true.

  • The search.default_search_timeout setting configures, at the shard level, the maximum time to wait for a query to complete before a 408 request timeout is returned to the coordinating node, regardless of the search.cancel_after_time_interval setting. As such, if search.default_search_timeout is set to a positive value (instead of the default of -1), it should be less than search.cancel_after_time_interval.

  • The search.keep_alive_interval is the interval at which the coordinating node will send a TCP keep alive to a shard/data node while the search.default_keep_alive and search.max_keep_alive are configured at the shard level/data node level.

  • The search.max_keep_alive global value overrides any keep alive setting that may be included in the query.

I appreciate any clarifications you can provide. The documentation is somewhat helpful, but I haven’t found a lot of straightforward detail specific to the settings above. I’m making some assumptions, and I have run into trouble making assumptions in the past. :joy:

Thank you.

Max

Hey @maxfriz

Where in the docs did you see this, and is it possible to link it? They all look like a search/connection time interval, meaning “if nothing was found, stop.”


The reference to this specific parameter is found in the API documentation and the API lightly references the cluster setting. This can be found on this page:
OpenSearch Search API

We have an operational OpenSearch cluster that returns the details of these settings with this command:

GET /_cluster/settings?include_defaults=true
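For anyone who wants to pull just the search settings out of that response, here is a minimal Python sketch. It assumes the request is made with flat_settings=true so keys come back as dotted names, and that the cluster is reachable without authentication at a placeholder URL (http://localhost:9200 is an assumption, not something from this thread). The helper names are hypothetical. It resolves each key in the usual order of transient, then persistent, then defaults.

```python
import json
from urllib.request import urlopen

# Settings discussed in this thread.
SEARCH_KEYS = (
    "search.cancel_after_time_interval",
    "search.default_search_timeout",
    "search.low_level_cancellation",
    "search.keep_alive_interval",
    "search.default_keep_alive",
    "search.max_keep_alive",
)


def effective_search_settings(response):
    """Return {key: (value, scope)} from a flat_settings=true response body.

    Transient values win over persistent ones, which win over defaults.
    Keys that are absent from every scope are left out of the result.
    """
    effective = {}
    for key in SEARCH_KEYS:
        for scope in ("transient", "persistent", "defaults"):
            value = response.get(scope, {}).get(key)
            if value is not None:
                effective[key] = (value, scope)
                break
    return effective


def fetch_cluster_settings(base_url="http://localhost:9200"):
    """Fetch cluster settings; base_url is a placeholder for your cluster."""
    url = f"{base_url}/_cluster/settings?include_defaults=true&flat_settings=true"
    with urlopen(url) as resp:
        return json.load(resp)
```

Calling effective_search_settings(fetch_cluster_settings()) against a live cluster would show which scope each value comes from, which helps when a default and an override disagree.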

While I won’t provide links to Elasticsearch documentation in this forum, these settings can be found in Elasticsearch documentation as well.

Hey @maxfriz

I had to do a little research and try to find some documentation; it’s been a while.

So, these look like shard-level settings that apply to search queries.
For example, if large queries take too much time, you may need to increase these settings:

search.cancel_after_time_interval && search.default_search_timeout

I found this and they show a good explanation.

As for

search.keep_alive_interval && search.max_keep_alive

I think these two configure the TCP_KEEPINTVL option for nodes, which determines the time in seconds between sending TCP keepalive probes. The default is -1, which means the system default is used.

That is all I know about those, perhaps someone else will jump in.


Hi @Gsmitt, there is good documentation that provides more detailed explanations:

  • search.cancel_after_time_interval (Dynamic, time unit): A cluster-level setting that sets the default timeout for all search requests at the coordinating node level. After the specified time has been reached, the request is stopped and all associated tasks are canceled. Default is -1 (no timeout).

  • search.default_search_timeout (Dynamic, time unit): A cluster-level setting that specifies the maximum amount of time that a search request can run before the request is canceled at the shard-level. If the timeout interval is specified in the search request, that interval takes precedence over the configured setting. Default is -1.
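As a rough illustration of how the two settings could be applied together, the Python sketch below builds a body for PUT /_cluster/settings and a search body with a per-request timeout. The function names are hypothetical, and the check that the shard-level timeout stays below the cancellation interval reflects the assumption made earlier in this thread rather than a rule enforced by OpenSearch. Only the d, h, m, s, and ms units are handled here.

```python
import re

# Milliseconds per supported time unit (a subset of OpenSearch time units).
_UNIT_MS = {"d": 86_400_000, "h": 3_600_000, "m": 60_000, "s": 1_000, "ms": 1}


def to_millis(value):
    """Convert a value such as '30s' to milliseconds; '-1' (disabled) gives None."""
    text = value.strip()
    if text == "-1":
        return None
    match = re.fullmatch(r"(\d+)(ms|d|h|m|s)", text)
    if not match:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _UNIT_MS[match.group(2)]


def build_settings_body(shard_timeout, cancel_after):
    """Build a persistent cluster-settings body for both timeouts.

    Assumption from the thread: when both are enabled, the shard-level
    timeout should be shorter than the coordinating-node cancellation.
    """
    shard_ms = to_millis(shard_timeout)
    cancel_ms = to_millis(cancel_after)
    if shard_ms is not None and cancel_ms is not None and shard_ms >= cancel_ms:
        raise ValueError("default_search_timeout should be below cancel_after_time_interval")
    return {
        "persistent": {
            "search.default_search_timeout": shard_timeout,
            "search.cancel_after_time_interval": cancel_after,
        }
    }


def build_search_body(query, timeout):
    """Build a search body whose 'timeout' takes precedence over the cluster default."""
    return {"timeout": timeout, "query": query}
```

For example, build_settings_body("30s", "300s") produces a body that could be sent with PUT /_cluster/settings, while build_search_body({"match_all": {}}, "5s") overrides the shard-level timeout for a single request.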

Source:
