Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):
Opensearch 2.13.0
Describe the issue:
I ran into [Bug] SSLHandshakeException: Insufficient buffer remaining for AEAD cipher fragment · Issue #3299 · opensearch-project/security · GitHub after certs were renewed for the opensearch cluster. Specifically, we start to see that error when a fluentbit client tries to write to opensearch. Among the workarounds I tried based on the issue, one was to restrict to TLS v1.2, which then results in a different error
[2024-05-09T04:57:52,821][WARN ][o.o.h.AbstractHttpServerTransport] [opensearch-cluster-master-0] caught exception while handling client http traffic, closing connection Netty4HttpChannel{localAddress=/xxxx:9200, remoteAddress=/xxxx:38362}
io.netty.handler.codec.DecoderException: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499) ~[netty-codec-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290) ~[netty-codec-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:689) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:652) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562) [netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) [netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.107.Final.jar:4.1.107.Final]
at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_expired
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:130) ~[?:?]
at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:117) ~[?:?]
at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:365) ~[?:?]
at java.base/sun.security.ssl.Alert$AlertConsumer.consume(Alert.java:287) ~[?:?]
at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:204) ~[?:?]
at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:172) ~[?:?]
at java.base/sun.security.ssl.SSLEngineImpl.decode(SSLEngineImpl.java:736) ~[?:?]
at java.base/sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:691) ~[?:?]
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:506) ~[?:?]
at java.base/sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:482) ~[?:?]
at java.base/javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:679) ~[?:?]
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:310) ~[netty-handler-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1445) ~[netty-handler-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1338) ~[netty-handler-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1387) ~[netty-handler-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529) ~[netty-codec-4.1.107.Final.jar:4.1.107.Final]
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468) ~[netty-codec-4.1.107.Final.jar:4.1.107.Final]
... 16 more
I added -Djavax.net.debug=all
which seemed to suggest that the Opensearch cluster is closing the connection during TLS handshake with fluentbit. I’m a bit confused why this would happen – for one, setting clientauth_mode: NONE
doesn’t help. For another, the fluentbit client’s certificates are definitely not expired. I suppose opensearch thinks its own certificates are expired? Anyway, this made me wonder if this has something to do with certificate hot reloading. I tried using
curl --cacert <ca.pem> --cert <admin.pem> --key <admin.key> -XPUT https://localhost:9200/_plugins/_security/api/ssl/transport/reloadcerts
and
curl --cacert <ca.pem> --cert <admin.pem> --key <admin.key> -XPUT https://localhost:9200/_plugins/_security/api/ssl/http/reloadcerts
to reload the certs. Both of those commands return 200 and the appropriate message, but the opensearch node logs display
Not sure if that means whether the reload worked correctly or not.
Any thoughts or suggestions appreciated.