Roles created through securityadmin.sh get deleted when using the API to create Roles

We create multiple Roles, Tenants and RoleMappings with securityadmin.sh when deploying OpenSearch with an Ansible Playbook.
We have now tried to create another role with the REST API of the security plugin. However, this removed all the roles we had created with the securityadmin.sh script, leaving only the standard roles.
This does not happen when creating additional tenants through the API.
This does not seem like intended behaviour and is not mentioned anywhere in the documentation. On the contrary, the documentation essentially recommends doing what we are trying to do here.
This behaviour happens in the latest release of OpenSearch (2.3.0) as well as in the earlier version of Open Distro for Elasticsearch we are using (1.13.3).
Has anyone encountered this behaviour before? Is it expected, or could it be caused by a misconfiguration on our side?
We would like to be able to create roles with both securityadmin.sh and the API, instead of having to use one exclusively to manage roles.

@mgelszinnis I understand that you’ve used Ansible Playbook to configure all the roles and then decided to use securityadmin.sh to add an extra role, am I correct?

Thank you for your response!
We are using securityadmin.sh in the playbook. That is, we generate the roles.yaml in the playbook and then run securityadmin.sh from the same playbook.
After that, we added a role via the API, which resulted in the deletion of the roles we had configured in roles.yaml.
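
One way to confirm exactly which roles go missing is to snapshot the role list before and after the API call and compare the names. The sketch below assumes the parsed response of GET _plugins/_security/api/roles is a JSON object keyed by role name; the helper name and the sample role names are hypothetical.

```python
def missing_roles(before: dict, after: dict) -> list[str]:
    """Return role names present in `before` but absent from `after`, sorted."""
    return sorted(set(before) - set(after))


# Hypothetical snapshots, shaped like the parsed JSON of
# GET _plugins/_security/api/roles (an object keyed by role name).
before = {"all_access": {}, "custom_reader": {}, "custom_writer": {}}
after = {"all_access": {}, "testrole": {}}

print(missing_roles(before, after))
```

Taking the first snapshot right after securityadmin.sh finishes and the second right after the PUT would show whether the custom roles vanish at that exact step.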

@mgelszinnis Could you share the exact API call you’ve used to create a new role?

PUT _plugins/_security/api/roles/testrole
{
  "cluster_permissions": [
    "test_read_cluster"
  ],
  "index_permissions": [{
    "index_patterns": [
      "*"
    ],
    "dls": "",
    "fls": [],
    "masked_fields": [],
    "allowed_actions": [
      "read"
    ]
  }],
  "tenant_permissions": [{
    "tenant_patterns": [
      "test_tenant"
    ],
    "allowed_actions": [
      "kibana_all_read"
    ]
  }]
}

I have used this API call to create the role on our OpenSearch instance.
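
For anyone who wants to script the same call, here is a minimal sketch that builds the equivalent PUT request with the Python standard library. It only constructs the request and does not send it. The base URL is an assumption, the helper name is hypothetical, and authentication and TLS settings are left out because they depend on the deployment.

```python
import json
import urllib.request


def build_role_request(base_url: str, role_name: str, role: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request that creates or replaces a role."""
    return urllib.request.Request(
        url=f"{base_url}/_plugins/_security/api/roles/{role_name}",
        data=json.dumps(role).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Same role definition as in the call above.
role = {
    "cluster_permissions": ["test_read_cluster"],
    "index_permissions": [{
        "index_patterns": ["*"],
        "dls": "",
        "fls": [],
        "masked_fields": [],
        "allowed_actions": ["read"],
    }],
    "tenant_permissions": [{
        "tenant_patterns": ["test_tenant"],
        "allowed_actions": ["kibana_all_read"],
    }],
}

# "https://localhost:9200" is a placeholder, not a value from this thread.
req = build_role_request("https://localhost:9200", "testrole", role)
print(req.get_method(), req.full_url)
```

Sending it would additionally need credentials and an SSL context that trusts the cluster's CA, passed to urllib.request.urlopen.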

And this call on our older OpenDistro instance

PUT _opendistro/_security/api/roles/testrole
{
  "cluster_permissions": [
    "test_read_cluster"
  ],
  "index_permissions": [{
    "index_patterns": [
      ".kibana*",
      "f*-test*"
    ],
    "fls": [],
    "masked_fields": [],
    "allowed_actions": [
      "read"
    ]
  },
  {
    "index_patterns": [
      "*"
    ],
    "fls": [],
    "masked_fields": [],
    "allowed_actions": [
      "test_read_indices"
    ]
  }],
  "tenant_permissions": [{
    "tenant_patterns": [
      "test_tenant"
    ],
    "allowed_actions": [
      "kibana_all_read"
    ]
  }]
}

@mgelszinnis Just to confirm the steps.

  1. OpenSearch and OpenSearch Dashboards deployed with Ansible Playbook
  2. Roles created with securityadmin.sh during the Ansible Playbook deployment
  3. Additional role created with PUT _plugins/_security/api/roles/testrole
  4. All roles created with securityadmin.sh disappeared and only built-in are present.

I don’t have the OpenSearch ansible-playbook but I understand that securityadmin.sh is executed during the deployment with files located in the ansible/files folder.

Also, I know that you can re-run the ansible command and reapply the config to the existing deployment.

Could you answer the below questions?

  1. Do you also run the securityadmin.sh script outside of the Ansible deployment?
  2. If so, which files do you use with securityadmin.sh? Are they the ones from the ansible folder, or separate ones?
  3. Do you take a backup of the config before running a securityadmin.sh restore?
  4. Did you check the roles just before running the PUT API call?
  5. Did you apply the custom roles with the Ansible deployment, or did you use securityadmin.sh afterwards?

I’ve tested securityadmin.sh and the PUT API, and I successfully added a new role to the existing config in the cluster with the API. The new role disappeared only when I re-ran securityadmin.sh without first backing up the existing config.
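
As a rough illustration of the backup step mentioned here, the sketch below assembles a securityadmin.sh command line that downloads the live configuration before anything is re-applied. The flags used (-backup, -icl, -nhnv, -cacert, -cert, -key) are securityadmin.sh options; the helper name and all paths are placeholders, and the command is only printed, not executed.

```python
import shlex


def backup_args(script: str, backup_dir: str, ca_cert: str,
                admin_cert: str, admin_key: str) -> list[str]:
    """Build the argument list for a securityadmin.sh configuration backup."""
    return [
        script,
        "-backup", backup_dir,   # download the live config into this directory
        "-icl",                  # ignore the cluster name
        "-nhnv",                 # skip hostname verification
        "-cacert", ca_cert,
        "-cert", admin_cert,
        "-key", admin_key,
    ]


# All paths below are placeholders for illustration only.
args = backup_args("./securityadmin.sh", "security-backup",
                   "root-ca.pem", "kirk.pem", "kirk-key.pem")
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```

Re-applying the backed-up files instead of the original deployment files would keep any roles that were added through the API in the meantime.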

Thank you for your response again.
You got the steps right.
One correction: We don’t actually use the official ansible playbook from the OpenSearch project. We have our own scripts, which deploy everything in containers using podman.

  1. We only run securityadmin.sh during the deployment
  2. We apply the roles.yaml, roles_mapping.yaml, tenants.yaml, internal_users.yaml, action_groups.yaml and the config.yaml for the security configuration.
  3. We don’t do a backup and also don’t run a restore, we apply the files directly
  4. We did exactly that
  5. We apply the custom roles during the deployment

Yes, we are aware that rerunning securityadmin.sh after adding roles using the API deletes the new roles, as securityadmin.sh initializes the index used to store the configuration. We have the reverse: using the API deletes roles added with securityadmin.sh.

I have just tried creating a new role through the GUI, and the same thing happens: all roles created with securityadmin.sh get deleted.

@mgelszinnis How many nodes does your cluster contain?

Could you share the output of the below command?

curl --insecure -u admin:admin -XGET https://<opensearch_node_IP_or_FQDN>:9200/_cat/nodes

172.22.157.249 24 92 4 0.32 0.18 0.06 dimr cluster_manager,data,ingest,remote_cluster_client * hostname

I have replaced the hostname with “hostname” as I can’t share that with you

@mgelszinnis I’ve deployed a single-node cluster with ansible but I’ve used the official ansible playbook.
I also have OpenSearch running as docker containers.
In both cases, no matter if single or multi-node, I had no issues with disappearing roles and I couldn’t repro your scenario.

When the security plugin is initiated for the first time, the .opendistro_security index is created.
That index holds all the security objects including roles. The only ways to change the content of that index are the execution of securityadmin.sh or PUT/PATCH API calls.

However, all OpenSearch indices are kept as files in the OpenSearch filesystem. Maybe you have some kind of read-only mode set for OpenSearch’s data folder in podman.

Have you tried to restart the OpenSearch node after deployment? Does the node hold the new roles after restart?

The roles do persist after a restart. The volume we use is definitely read-write.
One thing I have noticed is that our .opendistro_security index seems to be empty when queried through search or through the SQL plugin. Is that also the case for you?
Another thing I have noticed: the GUI seems to be using a different API. It sends a POST to api/v1/configuration/tenants/ to create a new tenant, or to /api/v1/configuration/roles/ to create new roles. It also uses this API (api/v1/configuration/) for the rest of the security config items.
Even after using the GUI or the API I mentioned above (_opendistro/_security/api/roles/testrole), the .opendistro_security index remains empty.

@mgelszinnis That index is not empty. You can check the size by listing the indices.

To see the content of the .opendistro_security index, you must use the admin certificate for authentication.

For example:

curl --insecure --cert kirk.pem --key kirk-key.pem -XGET "https://localhost:9200/.opendistro_security/_search?pretty"

Have you tried deploying it as a service instead of with podman?
Maybe you should try to deploy OpenSearch without the Ansible playbook.

Could you share your opensearch.yml config?

Thank you for your continued patience, here is our opensearch.yaml

network.host: 0.0.0.0   
network.publish_host: "{{ host_ip.stdout }}"

plugins.security.allow_unsafe_democertificates: false
plugins.security.authcz.admin_dn:
  - '' # I have deleted this information for security reasons
#plugins.security.audit.type: internal_opensearch

plugins.security.restapi.roles_enabled:
  - all_access
  - security_rest_api_access

plugins.security.allow_default_init_securityindex: false

plugins.security.ssl.http.enabled: true
plugins.security.ssl.http.pemcert_filepath: /usr/share/opensearch/config/certs/client_cert.pem
plugins.security.ssl.http.pemkey_filepath: /usr/share/opensearch/config/certs/client_key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root_ca_cert.pem

plugins.security.ssl.transport.enforce_hostname_verification: false
plugins.security.ssl.transport.pemcert_filepath: /usr/share/opensearch/config/certs/node_cert.pem
plugins.security.ssl.transport.pemkey_filepath: /usr/share/opensearch/config/certs/node_key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root_ca_cert.pem
plugins.security.ssl.transport.resolve_hostname: false

search.max_buckets: 5000000
index.codec: best_compression

Also thank you for your notice regarding the security index, our index is indeed not empty.

@mgelszinnis According to your previous output, you have only one OpenSearch node running. However, this config is not set up for a single-node cluster.

How many nodes do you have running/configured in this deployment?

@mgelszinnis Does client_cert.pem contain the same CN/SAN as node_cert.pem?

The CN of the client_cert and the node_cert are the same. We are only using one node/server in this deployment.

@mgelszinnis

Have you tried deploying without the roles.yml file? Could you try that? Then run the securityadmin.sh script and try the PUT API against the roles.

When the roles are reset after running the PUT API, does the next PUT API call against the roles reset the roles or do your changes remain?

What if you run securityadmin.sh again after roles reset and then run the PUT API call? Does this reset roles too?

I know this is not part of your workflow, but I’m trying to narrow down the issue.

@mgelszinnis Thanks to your files I’ve reproduced your issue.

It appears that roles defined with static: true are removed as soon as the roles configuration is updated through an API call or the OpenSearch Dashboards GUI.
As far as I’m aware, that setting is reserved for the built-in roles only (e.g. all_access, readall, logstash).
I suggest either removing that line from each role or setting it to false.
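
If the roles file is generated by a script, the suggested fix could be applied automatically. The sketch below is only an illustration: it assumes roles.yml has already been parsed into a dictionary (the YAML loading itself is not shown), and the helper name and the custom_reader role are hypothetical.

```python
import copy


def strip_static_flag(roles: dict) -> dict:
    """Return a copy of the parsed roles mapping with every `static` key removed."""
    cleaned = copy.deepcopy(roles)
    for definition in cleaned.values():
        if isinstance(definition, dict):
            definition.pop("static", None)  # no-op when the key is absent
    return cleaned


# Hypothetical content of a parsed roles.yml.
roles = {
    "_meta": {"type": "roles", "config_version": 2},
    "custom_reader": {
        "static": True,
        "cluster_permissions": ["test_read_cluster"],
    },
}

cleaned = strip_static_flag(roles)
print(sorted(cleaned["custom_reader"]))
```

Working on a deep copy leaves the original mapping untouched, so the two versions can be compared before the cleaned one is written back and applied with securityadmin.sh.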

Thank you so very much! I will test this solution as soon as I am able. Do you still need the output of the API call you sent?