Custom webhook from OpenSearch Dashboard to AlertManager

Versions: OpenSearch 2.8.0

Describe the issue:
I am attempting to send a custom notification from the Alerting plugin to AlertManager in the same Kubernetes cluster. AlertManager has its endpoint secured with HTTPS and a custom CA certificate.

When I send a test message to AlertManager, I see an error in OSDash:

The error is also logged:

main {"type":"error","@timestamp":"2023-07-03T12:52:46Z","tags":[],"pid":456,"level":"error","error":{"message":"Internal Server Error","name":"Error","stack":"Error: Internal Server Error\n    at HapiResponseAdapter.toError (/usr/share/opensearch-dashboards/src/core/server/http/router/response_adapter.js:143:19)\n    at HapiResponseAdapter.toHapiResponse (/usr/share/opensearch-dashboards/src/core/server/http/router/response_adapter.js:97:19)\n    at HapiResponseAdapter.handle (/usr/share/opensearch-dashboards/src/core/server/http/router/response_adapter.js:92:17)\n    at Router.handle (/usr/share/opensearch-dashboards/src/core/server/http/router/router.js:164:34)\n    at processTicksAndRejections (internal/process/task_queues.js:95:5)\n    at handler (/usr/share/opensearch-dashboards/src/core/server/http/router/router.js:124:50)\n    at exports.Manager.execute (/usr/share/opensearch-dashboards/node_modules/@hapi/hapi/lib/toolkit.js:60:28)\n    at Object.internals.handler (/usr/share/opensearch-dashboards/node_modules/@hapi/hapi/lib/handler.js:46:20)\n    at exports.execute (/usr/share/opensearch-dashboards/node_modules/@hapi/hapi/lib/handler.js:31:20)\n    at Request._lifecycle (/usr/share/opensearch-dashboards/node_modules/@hapi/hapi/lib/request.js:371:32)\n    at Request._execute (/usr/share/opensearch-dashboards/node_modules/@hapi/hapi/lib/request.js:281:9)"},"url":"http://opensearch.localcluster.internal/api/notifications/test_message/0gOdC4kBtwMlZllVSTKb","message":"Internal Server Error"}

The error persists even when I mount the CA certificate into the OSDash pod and load it via the opensearch.ssl.certificateAuthorities parameter.

BUT

At the same time, in one OpenSearch pod with the client role, I see the underlying error:

{"type": "logging", "timestamp": "2023-07-03T12:52:47,087Z", "level": "WARN", "component": "o.o.n.a.PluginBaseAction", "cluster.name": "oststclstr", "node.name": "opensearch-client-87b7645dc-v2pgk", "message": "notifications:OpenSearchStatusException:", "cluster.uuid": "UkQ3GL0lTWmJVlCDFNgRdg", "node.id": "fXO8tzs0S7G3nwZJI_iIFw" , 
"stacktrace": ["org.opensearch.OpenSearchStatusException: {\"event_status_list\": [{\"config_id\":\"0gOdC4kBtwMlZllVSTKb\",\"config_type\":\"webhook\",\"config_name\":\"AlertMananagerWebhook\",\"email_recipient_status\":[],\"delivery_status\":{\"status_code\":\"500\",\"status_text\":\"Failed to send webhook message PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target\"}}]}",
"at org.opensearch.notifications.send.SendMessageActionHelper.executeRequest(SendMessageActionHelper.kt:99) ~[?:?]",
"at org.opensearch.notifications.send.SendMessageActionHelper$executeRequest$1.invokeSuspend(SendMessageActionHelper.kt) ~[?:?]",
"at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-1.6.10.jar:1.6.10-release-923(1.6.10)]",
"at kotlinx.coroutines.internal.ScopeCoroutine.afterResume(Scopes.kt:32) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:113) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46) [kotlin-stdlib-1.6.10.jar:1.6.10-release-923(1.6.10)]",
"at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]"] }
{"type": "logging", "timestamp": "2023-07-03T12:52:47,089Z", "level": "ERROR", "component": "o.o.n.a.SendTestNotificationAction", "cluster.name": "oststclstr", "node.name": "opensearch-client-87b7645dc-v2pgk", "message": "notifications:SendTestNotificationAction-send Error:OpenSearchStatusException[{\"event_status_list\": [{\"config_id\":\"0gOdC4kBtwMlZllVSTKb\",\"config_type\":\"webhook\",\"config_name\":\"AlertMananagerWebhook\",\"email_recipient_status\":[],\"delivery_status\":{\"status_code\":\"500\",\"status_text\":\"Failed to send webhook message PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target\"}}]}]", "cluster.uuid": "UkQ3GL0lTWmJVlCDFNgRdg", "node.id": "fXO8tzs0S7G3nwZJI_iIFw"  }
{"type": "logging", "timestamp": "2023-07-03T12:52:47,090Z", "level": "WARN", "component": "r.suppressed", "cluster.name": "oststclstr", "node.name": "opensearch-client-87b7645dc-v2pgk", "message": "path: /_plugins/_notifications/feature/test/0gOdC4kBtwMlZllVSTKb, params: {config_id=0gOdC4kBtwMlZllVSTKb}", "cluster.uuid": "UkQ3GL0lTWmJVlCDFNgRdg", "node.id": "fXO8tzs0S7G3nwZJI_iIFw" , 
"stacktrace": ["org.opensearch.OpenSearchStatusException: {\"event_status_list\": [{\"config_id\":\"0gOdC4kBtwMlZllVSTKb\",\"config_type\":\"webhook\",\"config_name\":\"AlertMananagerWebhook\",\"email_recipient_status\":[],\"delivery_status\":{\"status_code\":\"500\",\"status_text\":\"Failed to send webhook message PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target\"}}]}",
"at org.opensearch.notifications.send.SendMessageActionHelper.executeRequest(SendMessageActionHelper.kt:99) ~[?:?]",
"at org.opensearch.notifications.send.SendMessageActionHelper$executeRequest$1.invokeSuspend(SendMessageActionHelper.kt) ~[?:?]",
"at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-1.6.10.jar:1.6.10-release-923(1.6.10)]",
"at kotlinx.coroutines.internal.ScopeCoroutine.afterResume(Scopes.kt:32) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:113) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46) [kotlin-stdlib-1.6.10.jar:1.6.10-release-923(1.6.10)]",
"at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]",
"at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665) [kotlinx-coroutines-core-jvm-1.4.3.jar:?]"] }

And here my confusion starts: which component communicates with AlertManager, OpenSearch Dashboards or the OpenSearch node? Which one must know about the custom CA certificate? It seems to be the OpenSearch node.

Unfortunately, I did not find any usable information in the documentation.

If I call the same webhook (https://alertmanager.prometheus.svc.cluster.local:9093/api/v1/alerts) via curl, everything works. Yes, I have the CA certificate available to curl. So the problem really is in the trust between OpenSearch and AlertManager.
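The curl comparison can be reproduced as a small handshake check, run once against the default trust store and once against the custom CA. This is only a minimal sketch: the function names `build_context` and `check_tls` are hypothetical, and Python's default trust store is not the JVM truststore that OpenSearch uses, so it illustrates the same kind of trust failure rather than the exact code path.

```python
import socket
import ssl
from typing import Optional, Tuple


def build_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return a verifying TLS context.

    With ca_file, only that CA bundle is trusted; without it, the
    system default trust store is used.
    """
    # create_default_context enables certificate and hostname verification.
    return ssl.create_default_context(cafile=ca_file)


def check_tls(host: str, port: int, ca_file: Optional[str] = None,
              timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a TLS handshake and report whether the server certificate is trusted."""
    context = build_context(ca_file)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"handshake ok ({tls.version()})"
    except ssl.SSLCertVerificationError as exc:
        # Rough Python counterpart of the JVM's "PKIX path building failed".
        return False, f"certificate not trusted: {exc.verify_message}"
    except OSError as exc:
        # Covers DNS errors, refused connections, timeouts, other TLS errors.
        return False, f"connection failed: {exc}"
```

Calling `check_tls("alertmanager.prometheus.svc.cluster.local", 9093)` and then the same call with `ca_file` set to the mounted CA file (the path is whatever was used for curl) should show whether the custom CA is the only missing piece.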

I would like to ask for your help with this problem. My goal is to have notifications from OSDash alerts shown in AlertManager.

Any suggestions or tips are appreciated as well.

Thank you!

@LHozzan I understand you’ve created a Monitor in OpenSearch Alerting with an AlertManager webhook as the destination.

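The OpenSearch node executes the Monitor, and OpenSearch itself also opens the connection to the target defined in the Monitor's notification channel. So it is the OpenSearch node, not OpenSearch Dashboards, that must trust AlertManager's certificate.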

I assume that OpenSearch doesn’t have AlertManager’s CA certificate in its truststore, which is what the PKIX path building error in the node log points to.
I’m not aware of any certificate settings for the Alerting plugin. However, please check this link.
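To confirm that the node, and not Dashboards, makes the outbound call, the test notification can be triggered directly on the OpenSearch REST API. The path below is taken from the `r.suppressed` log line in the question; the GET method, the basic-auth credentials, and the helper name `build_test_request` are assumptions for illustration, not a confirmed recipe.

```python
import base64
import urllib.request


def build_test_request(base_url: str, config_id: str,
                       username: str, password: str) -> urllib.request.Request:
    """Build a request for the Notifications test endpoint on an OpenSearch node."""
    # Path as seen in the node log: /_plugins/_notifications/feature/test/<config_id>
    url = f"{base_url.rstrip('/')}/_plugins/_notifications/feature/test/{config_id}"
    # Assumption: the cluster accepts HTTP basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

Passing the result to `urllib.request.urlopen` (with an `ssl.SSLContext` that trusts the cluster's own certificate) should reproduce the same 500 response and PKIX message without involving Dashboards at all, if the assumption about the node holds.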

@pablo
Thank you. I will try it.
Best regards.