Help create a detector

Hi.
I have an OpenSearch cluster with two nodes, one hot and one warm. An attempt to create a detector fails with a “timeout” error. GET _cluster/allocation/explain returns the following response. The system index keeps getting recreated endlessly.

{
  "index": ".opensearch-sap-test-detectors-queries-optimized-d4aaffb3-926e-4cfe-98ec-9d3b237eef8e-000001",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "REPLICA_ADDED",
    "at": "2026-05-21T08:55:27.219Z",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "u9omgUBHQ-e2VPJMy9ICAA",
      "node_name": "opensearch-node-date",
      "transport_address": "172.18.0.3:9300",
      "node_attributes": {
        "temp": "warm",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "replica_after_primary_active",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        },
        {
          "decider": "throttling",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        }
      ]
    },
    {
      "node_id": "ySQQb38KTOOMBVxPbew6Bw",
      "node_name": "opensearch-node",
      "transport_address": "172.18.0.2:9300",
      "node_attributes": {
        "temp": "hot",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "replica_after_primary_active",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        },
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "a copy of this shard is already allocated to this node [[.opensearch-sap-test-detectors-queries-optimized-d4aaffb3-926e-4cfe-98ec-9d3b237eef8e-000001][0], node[ySQQb38KTOOMBVxPbew6Bw], [P], recovery_source[new shard recovery], s[INITIALIZING], a[id=DPPHFBbPSD-v-uzBSmYl9w], unassigned_info[[reason=INDEX_CREATED], at[2026-05-21T08:55:27.218Z], delayed=false, allocation_status[no_attempt]], expected_shard_size[208]]"
        },
        {
          "decider": "throttling",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        }
      ]
    }
  ]
}
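
To make a response like the one above easier to read, the blocking deciders can be grouped per node. This is only a minimal sketch: `summarize_explain` is a hypothetical helper, and `sample` is a trimmed copy of the response above with the long `same_shard` explanation shortened.

```python
def summarize_explain(explain: dict) -> dict:
    """Map each node name to the (decider, explanation) pairs that answered NO."""
    summary = {}
    for node in explain.get("node_allocation_decisions", []):
        blockers = [
            (d["decider"], d["explanation"])
            for d in node.get("deciders", [])
            if d.get("decision") == "NO"
        ]
        summary[node["node_name"]] = blockers
    return summary


# Trimmed copy of the explain response above (same_shard explanation shortened).
NOT_ACTIVE = "primary shard for this replica is not yet active"
sample = {
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "opensearch-node-date",
            "node_decision": "no",
            "deciders": [
                {"decider": "replica_after_primary_active", "decision": "NO", "explanation": NOT_ACTIVE},
                {"decider": "throttling", "decision": "NO", "explanation": NOT_ACTIVE},
            ],
        },
        {
            "node_name": "opensearch-node",
            "node_decision": "no",
            "deciders": [
                {"decider": "replica_after_primary_active", "decision": "NO", "explanation": NOT_ACTIVE},
                {"decider": "same_shard", "decision": "NO",
                 "explanation": "a copy of this shard is already allocated to this node"},
                {"decider": "throttling", "decision": "NO", "explanation": NOT_ACTIVE},
            ],
        },
    ],
}

summary = summarize_explain(sample)
for name, blockers in summary.items():
    print(name, [decider for decider, _ in blockers])
```

Every blocker on the warm node points at the primary not being active yet, so the replica itself is not the root problem at this stage.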

Help me please.
OpenSearch version: 3.6.0

@Kin0sh thank you for the question. Can you provide some additional details?

The following values from both opensearch.yml files (hot and warm):

  • cluster.routing.allocation.awareness.*
  • cluster.routing.allocation.require.* or include.*
  • indices.recovery.* settings
  • node.attr.* entries

Also, please provide the responses to the following requests (replace the index name with the correct one):

POST /_cluster/allocation/explain
{ "index": ".opensearch-sap-test-detectors-queries-optimized-d4aaffb3-926e-4cfe-98ec-9d3b237eef8e-000001", "shard": 0, "primary": true }


GET /.opensearch-sap-test-detectors-queries-optimized-d4aaffb3-926e-4cfe-98ec-9d3b237eef8e-000001/_settings?flat_settings=true

GET /_cluster/settings?pretty&include_defaults=false

GET /_cat/nodeattrs?v&h=node,attr,value

I hope I missed nothing.
opensearch.yml

cluster.name: docker-cluster

# Bind to all interfaces because we don't know what IP address Docker will assign to us.
network.host: 0.0.0.0

# # minimum_master_nodes need to be explicitly set when bound on a public IP
# # set to 1 to allow single node clusters
# discovery.zen.minimum_master_nodes: 1

# Setting network.host to a non-loopback address enables the annoying bootstrap checks. "Single-node" mode disables them again.
# discovery.type: single-node

opensearch.experimental.feature.telemetry.enabled: true
telemetry.feature.metrics.enabled: true
telemetry.feature.tracer.enabled: true
telemetry.tracer.enabled: true

######## Start OpenSearch Security Demo Configuration ########
# WARNING: revise all the lines below before you go into production
plugins.security.ssl.transport.pemcert_filepath: esnode.pem
plugins.security.ssl.transport.pemkey_filepath: esnode-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: root-ca.pem
plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: esnode.pem
plugins.security.ssl.http.pemkey_filepath: esnode-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: root-ca.pem
plugins.security.allow_unsafe_democertificates: true
plugins.security.allow_default_init_securityindex: true
plugins.security.authcz.admin_dn: ['CN=kirk,OU=client,O=client,L=test,C=de']
plugins.security.audit.type: internal_opensearch
plugins.security.enable_snapshot_restore_privilege: true
plugins.security.check_snapshot_restore_write_privileges: true
plugins.security.restapi.roles_enabled: [all_access, security_rest_api_access]
plugins.security.system_indices.enabled: true
plugins.security.system_indices.indices: [.plugins-ml-agent, .plugins-ml-config, .plugins-ml-connector,
  .plugins-ml-controller, .plugins-ml-model-group, .plugins-ml-model, .plugins-ml-task,
  .plugins-ml-conversation-meta, .plugins-ml-conversation-interactions, .plugins-ml-memory-meta,
  .plugins-ml-memory-message, .plugins-ml-stop-words, .opendistro-alerting-config,
  .opendistro-alerting-alert*, .opendistro-anomaly-results*, .opendistro-anomaly-detector*,
  .opendistro-anomaly-checkpoints, .opendistro-anomaly-detection-state, .opendistro-reports-*,
  .opensearch-notifications-*, .opensearch-notebooks, .opensearch-observability, .ql-datasources,
  .opendistro-asynchronous-search-response*, .replication-metadata-store, .opensearch-knn-models,
  .geospatial-ip2geo-data*, .plugins-flow-framework-config, .plugins-flow-framework-templates,
  .plugins-flow-framework-state, .plugins-search-relevance-experiment, .plugins-search-relevance-judgment-cache]
node.max_local_storage_nodes: 3

docker-compose.yml

services:
  opensearch-node:
    image: opensearchproject/opensearch:3.6.0
    container_name: opensearch-node
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node
      - discovery.seed_hosts=opensearch-node,opensearch-node-date
      - cluster.initial_cluster_manager_nodes=opensearch-node
      - node.roles=cluster_manager,data,ingest,remote_cluster_client
      - node.attr.temp=hot
      - bootstrap.memory_lock=true
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
#      - "OPENSEARCH_JAVA_OPTS=-Xms6g -Xmx6g -Dopensearch.experimental.feature.telemetry.enabled=true -Dtelemetry.feature.tracer.enabled=true -Dtelemetry.tracer.enabled=true"
      - "OPENSEARCH_JAVA_OPTS=-Xms6g -Xmx6g"
      - OPENSEARCH_HOME=/usr/share/opensearch
      - OPENSEARCH_PATH_CONF=/usr/share/opensearch/config
#      - HTTP_PROXY=${HTTP_PROXY}
#      - HTTPS_PROXY=${HTTPS_PROXY}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - opensearch-data:/usr/share/opensearch/data
      - ./opensearch.yml:/usr/share/opensearch/config/opensearch.yml
      - ./root-ca.pem:/usr/share/opensearch/config/root-ca.pem
      - ./esnode.pem:/usr/share/opensearch/config/esnode.pem
      - ./esnode-key.pem:/usr/share/opensearch/config/esnode-key.pem
      - /etc/localtime:/etc/localtime
      - ./test/config2.yml:/usr/share/opensearch/config/opensearch-security/config.yml
      - ./test/roles_mapping.yml:/usr/share/opensearch/config/opensearch-security/roles_mapping.yml
      - ./test/roles.yml:/usr/share/opensearch/config/opensearch-security/roles.yml
      - ./test/internal_users.yml:/usr/share/opensearch/config/opensearch-security/internal_users.yml
    ports:
      - 9200:9200
      - 9600:9600
    networks:
      - opensearch-net

  opensearch-node-date:
    image: opensearchproject/opensearch:3.6.0
    container_name: opensearch-node-date
    environment:
      - cluster.name=opensearch-cluster
      - node.name=opensearch-node-date
      - discovery.seed_hosts=opensearch-node,opensearch-node-date
      - cluster.initial_cluster_manager_nodes=opensearch-node
      - node.attr.temp=warm
      - node.roles=data,remote_cluster_client
      - bootstrap.memory_lock=true
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
#      - "OPENSEARCH_JAVA_OPTS=-Xms6g -Xmx6g -Dopensearch.experimental.feature.telemetry.enabled=true -Dtelemetry.feature.tracer.enabled=true -Dtelemetry.tracer.enabled=true"
      - "OPENSEARCH_JAVA_OPTS=-Xms6g -Xmx6g"
      - OPENSEARCH_HOME=/usr/share/opensearch
      - OPENSEARCH_PATH_CONF=/usr/share/opensearch/config
#      - HTTP_PROXY=${HTTP_PROXY}
#      - HTTPS_PROXY=${HTTPS_PROXY}
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536 # maximum number of open files for the OpenSearch user, set to at least 65536 on modern systems
        hard: 65536
    volumes:
      - /mnt/docker_volume/opensearch_opensearch-node-date:/usr/share/opensearch/data
      - ./opensearch.yml:/usr/share/opensearch/config/opensearch.yml
      - ./root-ca.pem:/usr/share/opensearch/config/root-ca.pem
      - ./esnode.pem:/usr/share/opensearch/config/esnode.pem
      - ./esnode-key.pem:/usr/share/opensearch/config/esnode-key.pem
      - /etc/localtime:/etc/localtime
      - ./test/config2.yml:/usr/share/opensearch/config/opensearch-security/config.yml
      - ./test/roles_mapping.yml:/usr/share/opensearch/config/opensearch-security/roles_mapping.yml
      - ./test/roles.yml:/usr/share/opensearch/config/opensearch-security/roles.yml
      - ./test/internal_users.yml:/usr/share/opensearch/config/opensearch-security/internal_users.yml
    ports:
      - 9201:9200
    networks:
      - opensearch-net

  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:3.6.0
    container_name: opensearch-dashboards
    depends_on:
      - opensearch-node
      - opensearch-node-date
    volumes:
      - /etc/localtime:/etc/localtime
    ports:
      - 5601:5601
    expose:
      - "5601"
    environment:
      OPENSEARCH_HOSTS: '["https://opensearch-node:9200","https://opensearch-node-date:9200"]'
    networks:
      - opensearch-net
volumes:
  opensearch-data:
#  opensearch-node-date:
  logstash:

networks:
  opensearch-net:

POST /_cluster/allocation/explain
{ "index": ".opensearch-sap-test-detectors-queries-optimized-d657c548-e220-48ce-9f09-c9e0a07430d4-000001", "shard": 0, "primary": true }

{
  "index": ".opensearch-sap-test-detectors-queries-optimized-d657c548-e220-48ce-9f09-c9e0a07430d4-000001",
  "shard": 0,
  "primary": true,
  "current_state": "started",
  "current_node": {
    "id": "ySQQb38KTOOMBVxPbew6Bw",
    "name": "opensearch-node",
    "transport_address": "172.18.0.2:9300",
    "attributes": {
      "temp": "hot",
      "shard_indexing_pressure_enabled": "true"
    },
    "weight_ranking": 1
  },
  "can_remain_on_current_node": "yes",
  "can_rebalance_cluster": "no",
  "can_rebalance_cluster_decisions": [
    {
      "decider": "rebalance_only_when_active",
      "decision": "NO",
      "explanation": "rebalancing is not allowed until all replicas in the cluster are active"
    },
    {
      "decider": "cluster_rebalance",
      "decision": "NO",
      "explanation": "the cluster has unassigned shards and cluster setting [cluster.routing.allocation.allow_rebalance] is set to [indices_all_active]"
    }
  ],
  "can_rebalance_to_other_node": "no",
  "rebalance_explanation": "rebalancing is not allowed",
  "node_allocation_decisions": [
    {
      "node_id": "u9omgUBHQ-e2VPJMy9ICAA",
      "node_name": "opensearch-node-date",
      "transport_address": "172.18.0.3:9300",
      "node_attributes": {
        "temp": "warm",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "worse_balance",
      "weight_ranking": 2
    }
  ]
}

GET /.opensearch-sap-test-detectors-queries-optimized*/_settings?flat_settings=true

{
  ".opensearch-sap-test-detectors-queries-optimized-d657c548-e220-48ce-9f09-c9e0a07430d4-000001": {
    "settings": {
      "index.analysis.analyzer.rule_analyzer.char_filter": [
        "rule_ws_filter"
      ],
      "index.analysis.analyzer.rule_analyzer.tokenizer": "keyword",
      "index.analysis.char_filter.rule_ws_filter.pattern": "(_ws_)",
      "index.analysis.char_filter.rule_ws_filter.replacement": " ",
      "index.analysis.char_filter.rule_ws_filter.type": "pattern_replace",
      "index.auto_expand_replicas": "0-1",
      "index.creation_date": "1779674720465",
      "index.hidden": "true",
      "index.mapping.total_fields.limit": "1008",
      "index.number_of_replicas": "1",
      "index.number_of_shards": "1",
      "index.provided_name": ".opensearch-sap-test-detectors-queries-optimized-d657c548-e220-48ce-9f09-c9e0a07430d4-000001",
      "index.replication.type": "DOCUMENT",
      "index.uuid": "gXrcDEu8RBWdcGkaO2Wy1w",
      "index.version.created": "137277827"
    }
  },
  ".opensearch-sap-test-detectors-queries-optimized-07bd22ba-da09-40c9-9eef-d73c5ed7d44b-000001": {
    "settings": {
      "index.analysis.analyzer.rule_analyzer.char_filter": [
        "rule_ws_filter"
      ],
      "index.analysis.analyzer.rule_analyzer.tokenizer": "keyword",
      "index.analysis.char_filter.rule_ws_filter.pattern": "(_ws_)",
      "index.analysis.char_filter.rule_ws_filter.replacement": " ",
      "index.analysis.char_filter.rule_ws_filter.type": "pattern_replace",
      "index.auto_expand_replicas": "0-1",
      "index.creation_date": "1779674720524",
      "index.hidden": "true",
      "index.mapping.total_fields.limit": "1008",
      "index.number_of_replicas": "1",
      "index.number_of_shards": "1",
      "index.provided_name": ".opensearch-sap-test-detectors-queries-optimized-07bd22ba-da09-40c9-9eef-d73c5ed7d44b-000001",
      "index.replication.type": "DOCUMENT",
      "index.uuid": "H4uECzX2Q1Whim4gLYFNhQ",
      "index.version.created": "137277827"
    }
  }
}

GET /_cat/nodeattrs?v&h=node,attr,value

node                 attr                            value
opensearch-node      temp                            hot
opensearch-node      shard_indexing_pressure_enabled true
opensearch-node-date temp                            warm
opensearch-node-date shard_indexing_pressure_enabled true

GET /_cluster/settings?pretty&include_defaults=false

{
  "persistent": {
    "cluster": {
      "default_number_of_replicas": "0",
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "20gb",
              "flood_stage": "10gb",
              "high": "20gb"
            }
          }
        }
      }
    },
    "opendistro": {
      "index_state_management": {
        "history": {
          "number_of_replicas": "0"
        }
      }
    }
  },
  "transient": {}
}

@Kin0sh can you check how much available space you have using the following command:

curl -ku admin:'<password>' 'https://localhost:9200/_cat/allocation?v&h=node,shards,disk.used,disk.avail,disk.total,disk.percent'

The culprit might be:

"routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "20gb",
              "flood_stage": "10gb",
              "high": "20gb"
            }
          }
        }

In practice this means: “Refuse new shard allocations on any node with less than 20 GB free.”

@Anthony that doesn’t seem to be the issue. There are no issues with other shards. Free space on the server is maintained using ISM.

node                 shards disk.used disk.avail disk.total disk.percent
opensearch-node         225     125gb    167.1gb    292.2gb           42
opensearch-node-date    317     2.4tb    527.8gb      2.9tb           82

security-auditlog-2026.05.26 0 p STARTED 602 1.7mb 172.18.0.3 opensearch-node
security-auditlog-2026.05.26 0 r STARTED 602 1.7mb 172.18.0.2 opensearch-node-date
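
The disk.avail figures reported above can be compared with the 20gb watermark in a few lines. This is a rough sketch, assuming absolute (byte-valued) watermarks only and 1024-based units; `parse_size` and `below_watermark` are hypothetical helpers, not part of any OpenSearch client.

```python
# Size suffixes as printed by the _cat APIs, assumed to be 1024-based.
UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3, "tb": 1024**4}


def parse_size(text: str) -> float:
    """Convert a _cat-style size such as '167.1gb' to bytes."""
    text = text.strip().lower()
    # Check two-letter suffixes first, because every suffix ends in 'b'.
    for suffix in ("kb", "mb", "gb", "tb", "b"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNITS[suffix]
    raise ValueError(f"unrecognised size: {text!r}")


def below_watermark(disk_avail: str, watermark: str) -> bool:
    """True when free space is under an absolute watermark value."""
    return parse_size(disk_avail) < parse_size(watermark)


# disk.avail per node, copied from the _cat/allocation output above.
nodes = {"opensearch-node": "167.1gb", "opensearch-node-date": "527.8gb"}
low_watermark = "20gb"

blocked = [name for name, avail in nodes.items() if below_watermark(avail, low_watermark)]
print(blocked)
```

Neither node is below the watermark with these figures, which matches the observation that other shards allocate without problems.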

@Kin0sh Can you try the following steps:

1. Clean up orphaned indices

curl -ku admin:'<password>' -XDELETE \
  'https://localhost:9200/.opensearch-sap-*-detectors-queries-optimized-*'

2. Retry any stuck shard allocations

curl -ku admin:'<password>' -XPOST \
  'https://localhost:9200/_cluster/reroute?retry_failed=true'

3. Check how loaded your hot node is before retrying

curl -ku admin:'<password>' \
  'https://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.current,heap.max,ram.percent,cpu'

4. Retry creating the detector: if the hot node has headroom, the primary should initialize fast enough for the creation to complete successfully.

What might be happening is the primary shard for that index starts initializing on your hot node. Your hot node is already managing a large number of shards, so it’s slow to finish initializing. The Alerting plugin tries to write all the SIGMA rules into the index immediately, but the primary isn’t ready yet. After a timeout, the whole detector creation fails. The UUID index is left behind as orphaned waste and the next attempt generates a brand new UUID, leaving another one behind.
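
To see how many of these leftover indices have piled up before deleting them, the index names can be filtered by pattern. This is only a sketch: the regular expression is inferred from the index names posted in this thread, and `leftover_query_indices` and the `label` group name are hypothetical.

```python
import re

# Naming pattern inferred from the index names in this thread (an assumption):
# .opensearch-sap-<label>-detectors-queries-optimized-<uuid>-<6 digits>
QUERY_INDEX = re.compile(
    r"^\.opensearch-sap-(?P<label>.+)-detectors-queries-optimized-"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})-\d{6}$"
)


def leftover_query_indices(index_names):
    """Return (label, uuid) for every name that looks like a detector query index."""
    found = []
    for name in index_names:
        match = QUERY_INDEX.match(name)
        if match:
            found.append((match.group("label"), match.group("uuid")))
    return found


# Names taken from outputs earlier in the thread.
names = [
    ".opensearch-sap-test-detectors-queries-optimized-d657c548-e220-48ce-9f09-c9e0a07430d4-000001",
    ".opensearch-sap-test-detectors-queries-optimized-07bd22ba-da09-40c9-9eef-d73c5ed7d44b-000001",
    "security-auditlog-2026.05.26",
]

leftovers = leftover_query_indices(names)
print(len(leftovers), "query indices found")
```

More than one UUID for the same label is a hint that earlier attempts left indices behind.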

Also, regarding the hot/warm separation: unless this config was missed in the output, you have node.attr.temp=hot and node.attr.temp=warm set on your nodes, but these attributes do nothing for routing on their own. For OpenSearch to actually use them for shard placement you also need this in your cluster settings:

curl -ku admin:'<password>' -XPUT 'https://localhost:9200/_cluster/settings' \
  -H 'Content-Type: application/json' \
  -d '{
    "persistent": {
      "cluster.routing.allocation.awareness.attributes": "temp"
    }
  }'

Without this, OpenSearch places shards purely by balance, and your hot/warm separation is not enforced at all.

@Anthony I followed all the steps, and also deleted the index template that is created after the detector is created. It looks like the error has changed.
GET _cluster/allocation/explain

{
  "index": ".opensearch-sap-test-detectors-queries-optimized-fae695e2-69a2-47f1-9a64-08ceba9b346e-000001",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "REPLICA_ADDED",
    "at": "2026-05-29T02:23:22.241Z",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "yes",
  "allocate_explanation": "can allocate the shard",
  "target_node": {
    "id": "u9omgUBHQ-e2VPJMy9ICAA",
    "name": "opensearch-node-date",
    "transport_address": "172.18.0.2:9300",
    "attributes": {
      "temp": "warm",
      "shard_indexing_pressure_enabled": "true"
    }
  },
  "node_allocation_decisions": [
    {
      "node_id": "u9omgUBHQ-e2VPJMy9ICAA",
      "node_name": "opensearch-node-date",
      "transport_address": "172.18.0.2:9300",
      "node_attributes": {
        "temp": "warm",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "yes",
      "weight_ranking": 1
    },
    {
      "node_id": "ySQQb38KTOOMBVxPbew6Bw",
      "node_name": "opensearch-node",
      "transport_address": "172.18.0.3:9300",
      "node_attributes": {
        "temp": "hot",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "weight_ranking": 2,
      "deciders": [
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "a copy of this shard is already allocated to this node [[.opensearch-sap-test-detectors-queries-optimized-fae695e2-69a2-47f1-9a64-08ceba9b346e-000001][0], node[ySQQb38KTOOMBVxPbew6Bw], [P], s[STARTED], a[id=hVrBBJGNQOGnyfQ7Ib3bMg]]"
        },
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "there are too many copies of the shard allocated to nodes with attribute [temp], there are [2] total configured shard copies for this shard id and [2] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
        }
      ]
    }
  ]
}

I’m bad at Linux, but the RAM is fine. It’s sufficient.
free -h

               total        used        free      shared  buff/cache   available
Mem:           24Gi        18Gi       428Mi       196Ki       6.0Gi       5.9Gi
Swap:           8.0Gi       1.8Gi       6.2Gi

Here is the detection rule that I am trying to build a detector around.

id: h0y1Hp0BFsnEJOLbSRF7
logsource:
  product: test
title: Detect Logon 
description: Detect Logon
tags: []
falsepositives: []
level: informational
status: test
references: []
author: it's me
detection:
  condition: Selection_1 and not Selection_2
  Selection_1:
    event.code: 4624
    winlog.event_data.TargetUserName|contains: username
  Selection_2:
    winlog.event_data.IpAddress:
      - 192.168.1.2
      - 192.168.1.3
      - 192.168.1.4
      - 192.168.1.5

@Kin0sh The explain output looks correct: OpenSearch wants to allocate the replica to the warm node ("node_decision": "yes"). However, it is strange that it has not done so already.
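
The numbers in the awareness explanation can be reproduced with a short calculation. This is a sketch of the rule as I understand it (each attribute value may hold at most the ceiling of total copies divided by attribute values); the function names are hypothetical.

```python
def awareness_upper_bound(total_copies: int, attribute_values: int) -> int:
    """Ceiling of total_copies / attribute_values, using integer arithmetic."""
    return (total_copies + attribute_values - 1) // attribute_values


def can_host(copies_already_on_value: int, total_copies: int, attribute_values: int) -> bool:
    """Would one more copy on this attribute value stay within the bound?"""
    bound = awareness_upper_bound(total_copies, attribute_values)
    return copies_already_on_value + 1 <= bound


# One primary plus one replica, and two values of 'temp' (hot, warm).
bound = awareness_upper_bound(total_copies=2, attribute_values=2)

# The hot node already holds the primary; the warm node holds no copy yet.
hot_ok = can_host(copies_already_on_value=1, total_copies=2, attribute_values=2)
warm_ok = can_host(copies_already_on_value=0, total_copies=2, attribute_values=2)

print(bound, hot_ok, warm_ok)
```

With these inputs the bound is 1, so a second copy on "hot" would give a count of 2 and is refused, while the warm node remains eligible. That matches the [2] and [1] in the decider message.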

Can you run the following to see what cluster.routing.allocation.enable is set to?

curl -ku admin:'<password>' 'https://localhost:9200/_cluster/settings?pretty&include_defaults=true'

Also, does a reroute resolve the issue?

curl -ku admin:'<password>' -XPOST 'https://localhost:9200/_cluster/reroute'

@Anthony
"cluster.routing.allocation.enable": "all"
I created a detector, got a timeout error, and ran the _cluster/reroute command. The result seems to be the same.

GET _cluster/allocation/explain

{
  "index": ".opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "REPLICA_ADDED",
    "at": "2026-06-05T03:29:19.232Z",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "no",
  "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions": [
    {
      "node_id": "u9omgUBHQ-e2VPJMy9ICAA",
      "node_name": "opensearch-node-date",
      "transport_address": "172.18.0.3:9300",
      "node_attributes": {
        "temp": "warm",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "replica_after_primary_active",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        },
        {
          "decider": "throttling",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        }
      ]
    },
    {
      "node_id": "ySQQb38KTOOMBVxPbew6Bw",
      "node_name": "opensearch-node",
      "transport_address": "172.18.0.2:9300",
      "node_attributes": {
        "temp": "hot",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "deciders": [
        {
          "decider": "replica_after_primary_active",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        },
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "a copy of this shard is already allocated to this node [[.opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001][0], node[ySQQb38KTOOMBVxPbew6Bw], [P], recovery_source[new shard recovery], s[INITIALIZING], a[id=7hjYIk-yQW-yyjjfBDUr_w], unassigned_info[[reason=INDEX_CREATED], at[2026-06-05T03:29:19.231Z], delayed=false, allocation_status[no_attempt]]]"
        },
        {
          "decider": "throttling",
          "decision": "NO",
          "explanation": "primary shard for this replica is not yet active"
        },
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "there are too many copies of the shard allocated to nodes with attribute [temp], there are [2] total configured shard copies for this shard id and [2] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
        }
      ]
    }
  ]
}

GET _cat/shards?v

.opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001 0     p      INITIALIZING                   172.18.0.2 opensearch-node
.opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001 0     r      UNASSIGNED                                
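
Primaries that have not reached STARTED can be picked out of `_cat/shards` text output like the lines above. This is a minimal sketch that relies only on the first four whitespace-separated columns (index, shard, prirep, state); `stuck_primaries` is a hypothetical helper.

```python
def stuck_primaries(cat_shards_text: str):
    """Return (index, shard, state) for every primary that is not STARTED."""
    stuck = []
    for line in cat_shards_text.splitlines():
        parts = line.split()
        # Skip blank lines and the header row printed by ?v.
        if len(parts) < 4 or parts[0] == "index":
            continue
        index, shard, prirep, state = parts[:4]
        if prirep == "p" and state != "STARTED":
            stuck.append((index, int(shard), state))
    return stuck


# The two lines from the _cat/shards output above.
cat_output = """\
.opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001 0     p      INITIALIZING                   172.18.0.2 opensearch-node
.opensearch-sap-test-detectors-queries-optimized-0f6c6182-2325-479b-9cbf-1644e13178b3-000001 0     r      UNASSIGNED
"""

stuck = stuck_primaries(cat_output)
for index, shard, state in stuck:
    print(index, shard, state)
```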

@Kin0sh after testing locally I can confirm that the configuration is now correct: the primary is attempting to land on the hot node but fails to initialise. This can be the result of a busy cluster, with many shards (200+) residing on the hot node and potentially being written to.

Can you run the following API to see how many pending tasks you currently have?

GET /_cluster/pending_tasks

Also, the following API call can show whether anything was rejected as a result of a saturated cluster:

curl -ku admin:'admin' \
  'https://localhost:9200/_nodes/opensearch-node/stats/thread_pool?filter_path=nodes.*.thread_pool.generic,nodes.*.thread_pool.write&pretty'

Also check how much heap is currently being used/available:

curl -ku admin:admin 'https://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.current,heap.max'

I would recommend increasing the heap memory on the hot node, as it’s currently set to 6 GB (- "OPENSEARCH_JAVA_OPTS=-Xms6g -Xmx6g").

@Anthony After my last attempt at creating a detector, I rebooted the cluster. Now I have the following results.
GET /_cluster/pending_tasks

{
  "tasks": []
}

_nodes/opensearch-node/stats/thread_pool

{
  "nodes" : {
    "ySQQb38KTOOMBVxPbew6Bw" : {
      "thread_pool" : {
        "generic" : {
          "threads" : 39,
          "queue" : 0,
          "active" : 0,
          "rejected" : 0,
          "largest" : 39,
          "completed" : 458348
        },
        "write" : {
          "threads" : 6,
          "queue" : 0,
          "active" : 2,
          "rejected" : 0,
          "largest" : 6,
          "completed" : 925371
        }
      }
    }
  }
}
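
The thread pool response above can be checked for rejections or queued work with a small script. This is a sketch only: `saturated_pools` is a hypothetical helper, and `stats` is a copy of the response above.

```python
def saturated_pools(stats: dict):
    """Return (node_id, pool, rejected, queue) for pools with rejections or a backlog."""
    hits = []
    for node_id, node in stats.get("nodes", {}).items():
        for pool, values in node.get("thread_pool", {}).items():
            rejected = values.get("rejected", 0)
            queue = values.get("queue", 0)
            if rejected > 0 or queue > 0:
                hits.append((node_id, pool, rejected, queue))
    return hits


# Copy of the thread_pool stats response above.
stats = {
    "nodes": {
        "ySQQb38KTOOMBVxPbew6Bw": {
            "thread_pool": {
                "generic": {"threads": 39, "queue": 0, "active": 0,
                            "rejected": 0, "largest": 39, "completed": 458348},
                "write": {"threads": 6, "queue": 0, "active": 2,
                          "rejected": 0, "largest": 6, "completed": 925371},
            }
        }
    }
}

hits = saturated_pools(stats)
print(hits if hits else "no rejections or queued tasks")
```

Both pools show zero rejections and an empty queue here. These counters were read after the reboot, so they say nothing about the state during the failed attempts.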

_cat/nodes?v&h=name,heap.percent,heap.current,heap.max

name                 heap.percent heap.current heap.max
opensearch-node                56        3.3gb      6gb
opensearch-node-date           47        2.8gb      6gb

Increased the hot node heap to 8 GB

name                 heap.percent heap.current heap.max
opensearch-node                13        1.1gb      8gb
opensearch-node-date           34          2gb      6gb

GET _cluster/allocation/explain

{
  "index": ".opensearch-sap-test-detectors-queries-optimized-eef56c6e-1c0b-45a3-bfe6-141aef11d08b-000001",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "unassigned_info": {
    "reason": "REPLICA_ADDED",
    "at": "2026-06-05T09:30:41.418Z",
    "last_allocation_status": "no_attempt"
  },
  "can_allocate": "yes",
  "allocate_explanation": "can allocate the shard",
  "target_node": {
    "id": "u9omgUBHQ-e2VPJMy9ICAA",
    "name": "opensearch-node-date",
    "transport_address": "172.18.0.2:9300",
    "attributes": {
      "temp": "warm",
      "shard_indexing_pressure_enabled": "true"
    }
  },
  "node_allocation_decisions": [
    {
      "node_id": "u9omgUBHQ-e2VPJMy9ICAA",
      "node_name": "opensearch-node-date",
      "transport_address": "172.18.0.2:9300",
      "node_attributes": {
        "temp": "warm",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "yes",
      "weight_ranking": 1
    },
    {
      "node_id": "ySQQb38KTOOMBVxPbew6Bw",
      "node_name": "opensearch-node",
      "transport_address": "172.18.0.3:9300",
      "node_attributes": {
        "temp": "hot",
        "shard_indexing_pressure_enabled": "true"
      },
      "node_decision": "no",
      "weight_ranking": 2,
      "deciders": [
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "a copy of this shard is already allocated to this node [[.opensearch-sap-test-detectors-queries-optimized-eef56c6e-1c0b-45a3-bfe6-141aef11d08b-000001][0], node[ySQQb38KTOOMBVxPbew6Bw], [P], s[STARTED], a[id=aag8_2esRSKAmQqdspqLBw]]"
        },
        {
          "decider": "awareness",
          "decision": "NO",
          "explanation": "there are too many copies of the shard allocated to nodes with attribute [temp], there are [2] total configured shard copies for this shard id and [2] total attribute values, expected the allocated shard count per attribute [2] to be less than or equal to the upper bound of the required number of shards per attribute [1]"
        }
      ]
    }
  ]
}

What is the output now from:

GET _cat/shards?v

I deleted the old indexes and templates and created a new detector. Here are the states I was able to obtain:

.opensearch-sap-test-detectors-queries-optimized-66b138b6-f8a8-4a9b-b083-10dab9a5af5a-000001 0     p      INITIALIZING                   172.18.0.3 opensearch-node
.opensearch-sap-test-detectors-queries-optimized-66b138b6-f8a8-4a9b-b083-10dab9a5af5a-000001 0     r      UNASSIGNED                                

.opensearch-sap-test-detectors-queries-optimized-eef56c6e-1c0b-45a3-bfe6-141aef11d08b-000001 0     p      STARTED                        172.18.0.3 opensearch-node
.opensearch-sap-test-detectors-queries-optimized-eef56c6e-1c0b-45a3-bfe6-141aef11d08b-000001 0     r      INITIALIZING                   172.18.0.2 opensearch-node-date

@Kin0sh according to the second snippet, the primary started successfully. I would recommend checking the reason for the replica delay by looking at the current threads and memory usage, and also at the explain API output.

@Anthony I found the following error in the hot node logs.

opensearch-node  | [2026-06-08T11:25:17,921][ERROR][o.o.a.u.DocLevelMonitorQueries] [opensearch-node] unknown exception during PUT mapping on queryIndex: .opensearch-sap-test-detectors-queries-optimized-c6d99585-82b5-41dd-983e-effd777c7928-000001, retrying with deletion of query index
opensearch-node  | org.opensearch.index.mapper.MapperParsingException: normalizer [normalized_keyword] not found for field [computer_name_dc-security_PeNApZ4BT0DwyqDuwYZi]
opensearch-node  |      at org.opensearch.index.mapper.KeywordFieldMapper$Builder.buildFieldType(KeywordFieldMapper.java:235)
opensearch-node  |      at org.opensearch.index.mapper.KeywordFieldMapper$Builder.build(KeywordFieldMapper.java:257)
opensearch-node  |      at org.opensearch.index.mapper.KeywordFieldMapper$Builder.build(KeywordFieldMapper.java:131)
opensearch-node  |      at org.opensearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:246)
opensearch-node  |      at org.opensearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:196)
opensearch-node  |      at org.opensearch.index.mapper.ObjectMapper$Builder.build(ObjectMapper.java:246)
opensearch-node  |      at org.opensearch.index.mapper.RootObjectMapper$Builder.build(RootObjectMapper.java:116)
opensearch-node  |      at org.opensearch.index.mapper.DocumentMapper$Builder.<init>(DocumentMapper.java:95)
opensearch-node  |      at org.opensearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:143)
opensearch-node  |      at org.opensearch.index.mapper.DocumentMapperParser.parse(DocumentMapperParser.java:132)
opensearch-node  |      at org.opensearch.index.mapper.MapperService.parse(MapperService.java:602)
opensearch-node  |      at org.opensearch.cluster.metadata.MetadataMappingService$PutMappingExecutor.applyRequest(MetadataMappingService.java:281)
opensearch-node  |      at org.opensearch.cluster.metadata.MetadataMappingService$PutMappingExecutor.execute(MetadataMappingService.java:245)
opensearch-node  |      at org.opensearch.cluster.service.ClusterManagerService.executeTasks(ClusterManagerService.java:890)
opensearch-node  |      at org.opensearch.cluster.service.ClusterManagerService.calculateTaskOutputs(ClusterManagerService.java:441)
opensearch-node  |      at org.opensearch.cluster.service.ClusterManagerService.runTasks(ClusterManagerService.java:301)
opensearch-node  |      at org.opensearch.cluster.service.ClusterManagerService$Batcher.run(ClusterManagerService.java:214)
opensearch-node  |      at org.opensearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:206)
opensearch-node  |      at org.opensearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:264)
opensearch-node  |      at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:952)
opensearch-node  |      at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedOpenSearchThreadPoolExecutor.java:299)
opensearch-node  |      at org.opensearch.common.util.concurrent.PrioritizedOpenSearchThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedOpenSearchThreadPoolExecutor.java:262)
opensearch-node  |      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090)
opensearch-node  |      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
opensearch-node  |      at java.base/java.lang.Thread.run(Thread.java:1474)
opensearch-node  | [2026-06-08T11:25:17,923][INFO ][o.o.c.m.MetadataDeleteIndexService] [opensearch-node] [.opensearch-sap-test-detectors-queries-optimized-c6d99585-82b5-41dd-983e-effd777c7928-000001/_D_SsZg4Qryz3T1D9xCLKA] deleting index

When I create a detector, the automatic field mapping step works normally. The data stream has no computer_name field, only winlog.computer_name. "dc-security" is the data stream name.
Could I have made a mistake somewhere in the mapping?


{
  "index_templates": [
    {
      "name": "common",
      "index_template": {
        "index_patterns": [
          "userpc*",
          "servers*",
          "dc*",
          "other*",
          "ts*",
          "filialsrv",
          "userpc-application*",
          "userpc-application*",
          "userpc-application*",
          "dc-security*"
        ],
        "template": {
          "settings": {
            "index": {
              "routing": {
                "allocation": {
                  "require": {
                    "temp": "hot"
                  }
                }
              },
              "analysis": {
                "normalizer": {
                  "normalized_keyword": {
                    "filter": [
                      "uppercase"
                    ],
                    "type": "custom",
                    "char_filter": []
                  }
                }
              },
              "number_of_shards": "1",
              "number_of_replicas": "0"
            }
          },
          "mappings": {
            "properties": {
              "winlog": {
                "type": "object",
                "properties": {
                  "computer_name": {
                    "normalizer": "normalized_keyword",
                    "type": "keyword"
                  },
                  "event_data": {
                    "type": "object",
                    "properties": {
                      "OldValue": {
                        "type": "text"
                      },
                      "NewValue": {
                        "type": "text"
                      }
                    }
                  }
                }
              }
            }
          }
        },
        "composed_of": [],
        "priority": 0,
        "_meta": {
          "flow": "components"
        },
        "data_stream": {
          "timestamp_field": {
            "name": "@timestamp"
          }
        }
      }
    }
  ]
}
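
A possible reading of the log, offered as an assumption rather than a confirmed diagnosis: the detector's query index receives a copy of the source field mapping, including the reference to the normalized_keyword normalizer, but not the analysis settings that define it, so the PUT mapping fails. A minimal Python sketch for listing which fields in a template reference a normalizer follows. The helper name fields_with_normalizer is hypothetical, and the mappings excerpt is copied from the template above.

```python
def fields_with_normalizer(properties, prefix=""):
    """Yield (dotted_field_path, normalizer_name) for each field that sets a normalizer."""
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if "normalizer" in spec:
            yield path, spec["normalizer"]
        # Object fields nest their children under "properties"; recurse into them.
        if "properties" in spec:
            yield from fields_with_normalizer(spec["properties"], prefix=f"{path}.")


# Mappings excerpt copied from the "common" index template.
template_mappings = {
    "properties": {
        "winlog": {
            "type": "object",
            "properties": {
                "computer_name": {"normalizer": "normalized_keyword", "type": "keyword"},
                "event_data": {
                    "type": "object",
                    "properties": {
                        "OldValue": {"type": "text"},
                        "NewValue": {"type": "text"},
                    },
                },
            },
        }
    }
}

normalized = list(fields_with_normalizer(template_mappings["properties"]))
print(normalized)  # [('winlog.computer_name', 'normalized_keyword')]
```

Here the only field that depends on the custom normalizer is winlog.computer_name, which matches the field named in the MapperParsingException.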

@Anthony I think I fixed the problem. I created an alias for the winlog.computer_name field, and the detector was created! Thank you so much for your help.
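
For readers who hit the same error, here is a minimal sketch of what such a field alias mapping could look like. The exact alias name used in the thread is not stated, so computer_name is an assumption based on the field name in the error message, and the helper alias_mapping is hypothetical. The sketch only builds and prints the request body; it does not contact a cluster.

```python
import json


def alias_mapping(alias_name, target_path):
    """Build a put-mapping body that adds alias_name as a field alias for target_path."""
    return {"properties": {alias_name: {"type": "alias", "path": target_path}}}


# Assumed alias name: "computer_name" (taken from the error message, not confirmed).
body = alias_mapping("computer_name", "winlog.computer_name")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the put-mapping endpoint of the target index (`PUT <index>/_mapping`), and added to the index template as well so that new backing indices pick it up.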
