Rolling restart from security disabled to security enabled

Versions: OpenSearch 2.2.1 (tarball installation)

Describe the issue:
Is there an option to perform a rolling restart of OpenSearch to switch over from security disabled (plugins.security.disabled: true) to security enabled?

I see an undocumented option to enable dual mode, in which a dual mode enabled node seems to be able to talk to security disabled servers. The dual mode servers have the following configuration:

plugins.security.disabled: false
plugins.security.ssl_only: true
plugins.security_config.ssl_dual_mode_enabled: true

Once all the nodes are switched over to dual mode, the expectation was that the nodes could be rolling restarted into security enabled mode. Unfortunately, the first node that is restarted in full security enabled mode gets the following error:

2023-01-09T20:21:26,440 worker][T#1] [E] c.ssl.tra.SecuritySSLNettyTransport - Exception during establishing a SSL connection: io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 4553ffffffff
io.netty.handler.ssl.NotSslRecordException: not an SSL/TLS record: 4553ffffffff

The dual mode nodes are not able to communicate with the full security enabled nodes:

2023-01-09T22:25:43,061 worker][T#2] [E] c.ssl.tra.SecuritySSLNettyTransport - SSL dual mode is enabled but dual mode handshake and OpenSearch ping has failed during client connection setup, closing channel

I was reading the pull request "Adding support for SSL dual mode" by sachetalva (Pull Request #712, opensearch-project/security, on GitHub), and based on scenario 2, the dual mode transport client should be able to communicate with the SSL enabled transport server. Am I missing something?

Hi @ronniepg,

Once all the nodes are running on dual mode and the cluster is formed, the dual mode setting is no longer needed and can be switched off (since there are no non-ssl nodes).

Dual mode can then be switched off using the PUT _cluster/settings API, for example:

PUT _cluster/settings
{
    "persistent" : {
        "plugins.security_config.ssl_dual_mode_enabled": false
    }
}

After switching off “dual mode”, the nodes can be transitioned into full security mode by removing the plugins.security_config.ssl_dual_mode_enabled: true line from the opensearch.yml file on each node and restarting OpenSearch on each of them.
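
For anyone scripting this step, here is a minimal sketch of how the PUT _cluster/settings call above could be built in Python using only the standard library. The function name and the http://localhost:9200 address are assumptions for illustration, not anything from the security plugin; the sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_disable_dual_mode_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) the PUT _cluster/settings request that turns dual mode off."""
    body = {
        "persistent": {
            "plugins.security_config.ssl_dual_mode_enabled": False
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical address; replace with a reachable node in your cluster.
    req = build_disable_dual_mode_request("http://localhost:9200")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

If the HTTP layer on your nodes already uses TLS or requires authentication, the URL scheme and headers would need to be adjusted accordingly.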

Hi @sachetalva ,
Thanks for your reply.

I was able to disable dual mode using the PUT _cluster/settings API. I was also able to switch over the data nodes one by one (rolling restart) from the dual mode configs in opensearch.yml to full SSL mode with authc and authz enabled (users and roles all being internal). Doing the same for the master nodes works fine until the primary master node becomes a full SSL node (i.e. the error does not happen as long as the primary master node still has the dual mode configs). Once the primary master is a full SSL node, any master/data nodes that have not yet been switched over to the full SSL configs return the following error for various APIs, e.g.:

http://<master-node-with-dual-mode-configs>:port/_cat/indices?v&pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "security_exception",
        "reason" : "No user found for indices:monitor/settings/get"
      }
    ],
    "type" : "security_exception",
    "reason" : "No user found for indices:monitor/settings/get"
  },
  "status" : 500
}

Any pointers to avoid this issue?

The upgrade works fine if the primary master node is switched over to full SSL mode at the very end, i.e. after all other nodes have been switched over to full SSL.
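
In case it helps others, the ordering that worked here (data nodes first, then the other master nodes, the primary master last) can be sketched as a small helper. This is only an illustration under my own assumptions: the function name and node names are hypothetical, and the list of nodes and the currently elected master have to be supplied by you (for example from the _cat/nodes and _cat/master output).

```python
from typing import List, Tuple


def rolling_restart_order(nodes: List[Tuple[str, bool]], elected_master: str) -> List[str]:
    """Order nodes for the switch to full SSL mode.

    nodes: (node_name, is_master_eligible) pairs.
    elected_master: name of the currently elected (primary) master node.
    Returns data-only nodes first, then the other master-eligible nodes,
    and the elected master last.
    """
    names = [name for name, _ in nodes]
    if elected_master not in names:
        raise ValueError(f"elected master {elected_master!r} is not in the node list")

    data_only = [name for name, eligible in nodes if not eligible]
    other_masters = [
        name for name, eligible in nodes if eligible and name != elected_master
    ]
    return data_only + other_masters + [elected_master]


if __name__ == "__main__":
    # Hypothetical cluster layout for illustration only.
    cluster = [
        ("master-1", True),
        ("data-1", False),
        ("master-2", True),
        ("data-2", False),
    ]
    for step, node in enumerate(rolling_restart_order(cluster, "master-1"), start=1):
        print(step, node)
```

It is probably worth re-checking which node is the elected master before each restart, since an election in the middle of the procedure would change which node has to go last.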