Migration Assistant without requiring AWS

farman · May 5, 2026, 6:40pm

Versions (relevant - OpenSearch/Dashboard/Server OS/Browser):

Describe the issue: I have some questions about Migration Assistant, specifically about using it without any AWS services. I want to use Migration Assistant for an OpenSearch upgrade where 100 TB or more of data may be involved. My questions are:

1. Do we always need to build the Migration Assistant binaries from source, or are pre-built images available that can be downloaded and run?
2. Has Migration Assistant ever been run in production on Kubernetes in a non-AWS environment? Would running it on non-AWS Kubernetes require changes to the Migration Assistant code?
3. For RFS (Reindex-from-Snapshot), will previously taken snapshots work, or must new snapshots of all cluster indices be taken before starting the migration?
4. Is it necessary to store these snapshots in S3, or can they be stored on another type of storage, such as NFS, and then used for RFS?
5. While the RFS-based backfill is loading historical data into the target cluster, can the target cluster also serve live traffic at the same time? Or must live traffic always be captured in a Kafka queue while the backfill is underway and replayed later?

Configuration:

Relevant Logs or Screenshots:

pablo · May 7, 2026, 1:28pm

Hi @farman

According to the documentation, Migration Assistant is deployed with AWS CloudFormation. The CloudFormation template pulls the Migration Assistant EC2 image from an AWS repository.

As mentioned above, AWS CloudFormation uses an AWS image to deploy Migration Assistant as an EC2 instance. However, the OpenSearch Migrations roadmap lists a K8s/EKS implementation:

github.com/opensearch-project/opensearch-migrations

Migration Assistant on K8s/EKS

opened 02:42PM - 15 Jul 25 UTC

closed 10:46AM - 16 Apr 26 UTC

lewijacn

enhancement MAv3.0

### Is your feature request related to a problem?

Migration Assistant migrations… can be greatly improved, in terms of both speed and ease, by setting up the infrastructure in Kubernetes. Additionally, built-in mechanisms for scaling and orchestration in Kubernetes should pave a way for tasks such as https://github.com/opensearch-project/opensearch-migrations/issues/1072.

### What solution would you like?

For this feature, the focus should be on setting up required infrastructure in Kubernetes to enable effectively performing migrations in a subsequent feature. This feature should include documentation for setting up and deploying the Migration Assistant to a Kubernetes cluster. Additionally, the Migration Assistant deployment should setup any required resources, such as a bucket for storing snapshots or tuples, or required roles that the setup process or future migration pods will need. Core functionality such as enabling persistent logging for all pods that get created as well as sending metrics with OTEL to configured targets is also expected with this feature.

### What alternatives have you considered?

Some considerable time was spent trying to solve some of these problems in the existing Docker compose environment for local deployments and CDK package for AWS environment deployments, with limited success. Difficulties such as a lack of consistency between local and cloud deployments, and lack of built-in mechanisms for orchestration, scheduling, and scaling have put a large strain on this existing approach, all of which Kubernetes should be more targeted to resolve.

### Do you have any additional context?

The Jira Epic for this work can be found here: https://opensearch.atlassian.net/browse/MIGRATIONS-2402

According to the documentation, you can use existing snapshots. However, they must be stored in an S3 bucket.

The documentation only mentions S3 buckets. I couldn't find anything about other storage types in the OpenSearch Migrations roadmap.
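For context on the NFS part of the question: OpenSearch itself can write snapshots to a shared-filesystem repository (type `fs`), provided the mount is visible on every node and its path is listed under `path.repo` in `opensearch.yml`. The sketch below only builds the registration request for such a repository. The host, repository name, and mount path are placeholder assumptions, and it says nothing about whether Migration Assistant's documented workflow will read from that repository.

```python
import json
from urllib import request


def build_fs_repo_request(base_url: str, repo_name: str, location: str) -> request.Request:
    """Build (but do not send) a PUT /_snapshot/<repo> request for an fs repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return request.Request(
        url=f"{base_url.rstrip('/')}/_snapshot/{repo_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values: replace with the real cluster URL, repo name, and NFS mount path.
req = build_fs_repo_request("https://localhost:9200", "nfs_repo", "/mnt/snapshots")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending it would be request.urlopen(req), with whatever auth and TLS settings the cluster needs.
```

Building the request separately from sending it makes it easy to inspect the payload before touching a real cluster.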

You should consider opening a feature request in the Migration Assistant repo.

You may also find this proposal interesting.

github.com/opensearch-project/opensearch-migrations

Multi-Cloud Support for Migration Assistant

opened 08:47PM - 29 Apr 26 UTC

tonypiazza

untriaged

## Summary

This document proposes extending the OpenSearch Migration Assistant …to support cloud-agnostic deployments, enabling customers running OpenSearch on GCP, Azure, or bare-metal Kubernetes to use the same migration tooling currently available to AWS users. The changes build on existing abstractions in the codebase rather than requiring architectural changes.

## Motivation

The Migration Assistant is the most mature open-source tool for migrating Elasticsearch clusters to OpenSearch. However, its deployment and storage layers are currently tied to AWS services, which limits adoption among the significant portion of the OpenSearch community running on other cloud providers or on-premises infrastructure.

Aiven is an AWS partner that operates managed data infrastructure across AWS, GCP, Azure, and smaller regional cloud providers. Aiven is truly cloud-agnostic: our customers choose the provider that best fits their needs, and we deploy workloads wherever they need them. Our OpenSearch customers migrating from Elasticsearch need a reliable migration path regardless of where their target cluster runs. Today, customers running on non-AWS providers simply cannot use the Migration Assistant.

This proposal does not seek to diminish the AWS deployment path; it seeks to extend the same quality of migration tooling to the broader OpenSearch community.
Making the Migration Assistant cloud-agnostic would:

- Expand the tool's addressable user base significantly
- Align with OpenSearch's identity as a vendor-neutral, community-driven project
- Enable managed service providers to offer integrated migration tooling
- Reduce the barrier to OpenSearch adoption for non-AWS users

## Current State

The codebase already has meaningful abstraction points that make this work tractable:

### Existing Cloud-Agnostic Layers

- **Core pipeline** (`coreUtilities`, `transformation`, `RfsCommon`, `RfsPipeline`, `MetadataMigration`) has zero AWS dependencies
- [`SourceRepo`][SourceRepo] interface in `RfsCommon` abstracts snapshot storage with [`FileSystemRepo`][FileSystemRepo] and [`S3Repo`][S3Repo] implementations
- [`SnapshotCreator`][SnapshotCreator] abstract class supports filesystem and [`S3SnapshotCreator`][S3SnapshotCreator] variants
- [`RequestTransformer`][RequestTransformer] / [`IAuthTransformerFactory`][IAuthTransformerFactory] interfaces make authentication pluggable (SigV4, BasicAuth, NoAuth)
- [`BlobSource`][BlobSource] functional interface in `RfsCommon` further abstracts blob reading from any storage backend
- **Python console** uses a strategy/factory pattern ([`factories.py`][factories]) with ECS, Kubernetes, and Docker backends for orchestration ([`backfill_base.py`][backfill_base], [`replayer_base.py`][replayer_base])
- **Prometheus metrics** source already exists alongside CloudWatch ([`metrics_source.py`][metrics_source])
- **Kubernetes Secrets** support exists alongside AWS Secrets Manager ([`cluster.py`][cluster])
- **Helm charts** in [`deployment/k8s/`][helm] include `valuesForLocalK8s.yaml` and AWS-specific toggles

### Remaining AWS-Coupled Areas

1. [`S3Repo`][S3Repo] in `SnapshotReader`: hardcodes `S3AsyncClient` construction (but implements [`SourceRepo`][SourceRepo] interface)
2. [`S3TupleSink`][S3TupleSink] in `TrafficCapture/tupleSink`: no abstraction layer for tuple output destination
3. **`TrafficReplayer`** main class: inline `S3AsyncClient` creation for tuple output
4. **`captureKafkaOffloader`**: MSK IAM auth with no alternative
5. **Helm charts**: some AWS-specific defaults and assumptions remain in the non-toggled paths

## Proposal

### Phase 1: Cloud-Agnostic Object Storage

**Goal:** Enable snapshot reading and tuple storage on GCS and Azure Blob Storage.

#### 1a. Snapshot Storage Backends

Implement [`SourceRepo`][SourceRepo] for GCS and Azure Blob Storage in `SnapshotReader`:

- `GcsRepo implements SourceRepo`, using the Google Cloud Storage client library
- `AzureBlobRepo implements SourceRepo`, using the Azure Storage Blob SDK

Implement corresponding [`SnapshotCreator`][SnapshotCreator] subclasses:

- `GcsSnapshotCreator extends SnapshotCreator`
- `AzureBlobSnapshotCreator extends SnapshotCreator`

The existing [`SourceRepo`][SourceRepo] and [`SnapshotCreator`][SnapshotCreator] abstractions mean no changes to `RfsCommon`, `RfsPipeline`, or `DocumentsFromSnapshotMigration` are required.

#### 1b. Tuple Sink Abstraction

Introduce a `TupleSink` interface (or extend the existing sink pattern) with:

- [`S3TupleSink`][S3TupleSink] (existing, refactored behind interface)
- `GcsTupleSink`
- `AzureBlobTupleSink`
- `FileSystemTupleSink` (useful for local testing)

Update `TrafficReplayer` to accept the sink via dependency injection rather than constructing `S3AsyncClient` inline.

#### 1c. Python Console Snapshot Support

Add `GcsSnapshot` and `AzureBlobSnapshot` implementations alongside the existing `S3Snapshot` and `FileSystemSnapshot` in the console's [factory dispatch][factories] ([`snapshot.py`][snapshot]).

### Phase 2: Cloud-Agnostic Kafka Authentication

**Goal:** Remove the hard dependency on MSK IAM auth for Kafka connectivity.
- Make `captureKafkaOffloader` support SASL/SCRAM and mTLS as first-class authentication methods (these are standard Kafka auth mechanisms supported by all managed Kafka providers)
- Ensure the Helm charts can configure Kafka brokers without MSK-specific properties
- The console's `StandardKafka` and `ScramKafka` classes already handle this on the Python side ([`kafka.py`][kafka])

### Phase 3: Infrastructure Provisioning and Lifecycle

**Goal:** Provide full infrastructure automation (provisioning and deprovisioning) on GCP, Azure, and other providers, matching the turnkey experience that the AWS CDK path provides today.

The AWS implementation provisions the entire stack (VPC, ECS/Fargate, MSK, S3, ALB, IAM, security groups) and tears it all down when the migration is complete. Non-AWS users must have the same experience: a single command to stand up all migration infrastructure, and a single command to remove it.

The existing repo structure already supports this cleanly. Deployment methods are organized as siblings:

```
deployment/
├── cdk/                          # AWS CDK (TypeScript)
├── k8s/                          # Helm charts (Kubernetes)
├── migration-assistant-solution/ # AWS one-click CloudFormation
└── terraform/                    # NEW: multi-cloud provisioning
    ├── gcp/
    ├── azure/
    └── modules/                  # Shared modules (K8s config, Helm install)
```

Adding `deployment/terraform/` is purely additive; the CDK and CloudFormation paths remain untouched.

#### 3a. Infrastructure Provisioning Modules

Create infrastructure-as-code modules that provision the complete migration infrastructure per cloud provider:

- **GCP:** GKE cluster, VPC, Cloud Storage bucket, firewall rules, IAM service accounts, Kafka (self-managed on GKE or Confluent Cloud)
- **Azure:** AKS cluster, VNet, Blob Storage container, NSGs, managed identities, Kafka (self-managed on AKS or Event Hubs)
- **Generic/bare-metal:** Documentation and scripts for environments without a managed K8s offering

Each module should:

- Accept a minimal configuration (source endpoint, target endpoint, auth credentials, cloud region)
- Provision a K8s cluster with the correct storage classes, networking, and access to source/target clusters
- Install the Helm charts with the appropriate provider-specific values
- Expose a single teardown command that cleanly deprovisions all resources

#### 3b. Refactor Helm Charts for Multi-Cloud

The existing charts already use a base + overlay pattern with AWS-specific templates gated behind `aws.configureAwsEksResources` in a dedicated `templates/resources/aws/` directory. The refactoring formalizes and extends this pattern.
**Add cloud-specific resource directories and values overlays:**

```
templates/resources/
├── aws/          # Already exists, gated by aws.configureAwsEksResources
├── gcp/          # NEW, gated by gcp.configureGkeResources
├── azure/        # NEW, gated by azure.configureAksResources
└── objectStore/  # Refactored from s3/, with cloud-conditional templates

valuesEks.yaml    # Already exists
valuesGke.yaml    # NEW
valuesAks.yaml    # NEW
```

**Specific changes:**

- **Object storage:** Refactor `templates/resources/s3/` (which currently uses `aws s3` CLI directly for bucket creation and deletion) into a cloud-conditional `objectStore/` directory with templates for GCS and Azure Blob alongside the existing S3 templates
- **Storage classes:** Add GCP Persistent Disk (`pd-ssd`) and Azure Managed Disk (`premium-lrs`) StorageClass definitions in the new cloud-specific directories, following the existing `aws/gp3StorageClass.yaml` pattern
- **Node autoscaling:** Gate existing Karpenter NodePool/NodeClass templates and affinity rules so they only apply on EKS. Add equivalent configurations for GKE Node Auto-Provisioning and AKS Cluster Autoscaler
- **Argo Workflows artifact storage:** Currently configured for S3. Add GCS and Azure Blob as artifact repository options, selectable via the values overlay
- **Observability:** The base OTEL collector config already uses generic Prometheus/OTLP exporters. Ensure the GKE and AKS overlays configure appropriate exporters (e.g., Google Cloud Monitoring, Azure Monitor) rather than inheriting the CloudWatch EMF exporters from the EKS overlay
- **Certificate management:** The default path already uses a cloud-agnostic self-signed CA chain via cert-manager. AWS PCA integration is opt-in. No changes needed for non-AWS providers unless they want to integrate with Google CA Service or Azure Key Vault

### Phase 4: End-to-End Validation

**Goal:** Verify the complete migration workflow, including infrastructure provisioning and teardown, on non-AWS providers.
- Establish a CI test matrix covering GKE and AKS deployments
- Test the full lifecycle: provision infrastructure, run migration (metadata + backfill + live capture), deprovision
- Test with Aiven for OpenSearch as the target cluster
- Document provider-specific configuration and troubleshooting

## What This RFC Does NOT Propose

- **Rewriting the CDK deployment**: the AWS CDK path remains as-is for AWS-native users
- **Removing AWS support**: all changes are additive
- **New migration capabilities**: the scope is portability of existing features, not new functionality

## Aiven's Commitment

Aiven is prepared to:

- Contribute engineering resources to implement and test these changes
- Maintain the GCS and Azure storage backends going forward
- Provide CI infrastructure for multi-cloud testing
- Contribute documentation for non-AWS deployment scenarios

## Open Questions

1. **Module organization:** Should GCS/Azure implementations live in `SnapshotReader` alongside `S3Repo`, or in separate Gradle modules (e.g., `SnapshotReaderGcs`, `SnapshotReaderAzure`) to avoid adding cloud SDK dependencies to the default build?
2. **Helm chart structure:** Should provider-specific values files live in the main chart or as separate sub-charts?
3. **CI ownership:** How should multi-cloud CI be structured? Aiven can host the GCP/Azure test infrastructure, but the test definitions should live in the main repo.
4. **Release cadence:** Should multi-cloud support be gated behind a feature flag initially, or ship as generally available from the start?
5. **IaC tooling:** Terraform is the assumed default for infrastructure provisioning in Phase 3, but alternatives like OpenTofu, Pulumi, or Crossplane are also viable. What does the community prefer?
## References

- [OpenSearch Migration Assistant Documentation](https://docs.opensearch.org/3.0/migration-assistant/)
- [opensearch-project/opensearch-migrations](https://github.com/opensearch-project/opensearch-migrations)
- [Existing K8s Helm charts](https://github.com/opensearch-project/opensearch-migrations/tree/main/deployment/k8s)

[SourceRepo]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/RfsCommon/src/main/java/org/opensearch/migrations/bulkload/common/SourceRepo.java#L6
[BlobSource]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/RfsCommon/src/main/java/org/opensearch/migrations/bulkload/common/BlobSource.java#L10-L11
[S3Repo]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/SnapshotReader/src/main/java/org/opensearch/migrations/bulkload/common/S3Repo.java#L23
[FileSystemRepo]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/SnapshotReader/src/main/java/org/opensearch/migrations/bulkload/common/FileSystemRepo.java#L12
[SnapshotCreator]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/RFS/src/main/java/org/opensearch/migrations/bulkload/common/SnapshotCreator.java#L15
[S3SnapshotCreator]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/RFS/src/main/java/org/opensearch/migrations/bulkload/common/S3SnapshotCreator.java#L10
[RequestTransformer]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/RfsHttp/src/main/java/org/opensearch/migrations/bulkload/common/http/RequestTransformer.java#L9
[IAuthTransformerFactory]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/TrafficCapture/trafficReplayer/src/main/java/org/opensearch/migrations/transform/IAuthTransformerFactory.java#L7
[S3TupleSink]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/TrafficCapture/tupleSink/src/main/java/org/opensearch/migrations/replay/sink/S3TupleSink.java#L38
[factories]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/factories.py#L48
[backfill_base]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/backfill_base.py#L28
[replayer_base]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/replayer_base.py#L52
[metrics_source]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/metrics_source.py#L66
[cluster]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/cluster.py#L130
[snapshot]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/snapshot.py#L74
[kafka]: https://github.com/opensearch-project/opensearch-migrations/blob/cae2480e4f205a64fe09f45974a7ffb114dcdda7/migrationConsole/lib/console_link/console_link/models/kafka.py#L158
[helm]: https://github.com/opensearch-project/opensearch-migrations/tree/cae2480e4f205a64fe09f45974a7ffb114dcdda7/deployment/k8s
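To make the storage abstraction in the proposal above more concrete, here is a minimal Python sketch of the general pattern: callers depend on a small interface, and a factory picks the backend from configuration. It is an illustration only, not the project's code. The real `SourceRepo` is a Java interface with different methods, and `read_blob`, `InMemoryRepo`, `make_repo`, and the config keys are all hypothetical names chosen for this example.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Protocol


class SourceRepo(Protocol):
    """Hypothetical Python stand-in for the role the Java SourceRepo interface plays."""

    def read_blob(self, relative_path: str) -> bytes:
        ...


class FileSystemRepo:
    """Reads blobs from a local directory, for example an NFS mount."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def read_blob(self, relative_path: str) -> bytes:
        return (self.root / relative_path).read_bytes()


class InMemoryRepo:
    """Placeholder for an object-store backend; makes no cloud SDK calls."""

    def __init__(self, blobs: Dict[str, bytes]) -> None:
        self.blobs = dict(blobs)

    def read_blob(self, relative_path: str) -> bytes:
        return self.blobs[relative_path]


def make_repo(config: dict) -> SourceRepo:
    """Pick a backend from a config dict (the keys are assumptions for this sketch)."""
    kind = config["type"]
    if kind == "fs":
        return FileSystemRepo(config["root"])
    if kind == "memory":
        return InMemoryRepo(config["blobs"])
    raise ValueError(f"unsupported repo type: {kind}")


def total_bytes(repo: SourceRepo, paths: Iterable[str]) -> int:
    """Caller code depends only on read_blob, never on the concrete backend."""
    return sum(len(repo.read_blob(p)) for p in paths)


# The same caller works against a directory and against the in-memory placeholder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index-0").write_bytes(b"abc")
    fs_size = total_bytes(make_repo({"type": "fs", "root": tmp}), ["index-0"])

mem_size = total_bytes(make_repo({"type": "memory", "blobs": {"index-0": b"abc"}}), ["index-0"])
print(fs_size, mem_size)
```

Under this pattern, adding a new storage backend means adding one class and one factory branch, while code such as `total_bytes` stays unchanged. That is the property the proposal relies on when it says the core pipeline would not need changes.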

farman · May 9, 2026, 7:37am

Hi @pablo,

Thanks for your comments. As I see it, there is already code available for non-AWS K8s deployment on GitHub at the following location:

However, according to the documentation it is deployable on "Minikube" and "Kind". As I understand it, this is mostly useful for testing and development. Can it be enhanced for use in production environments?
The README at Home · opensearch-project/opensearch-migrations Wiki · GitHub says:
“Migration Assistant runs on Kubernetes and uses Argo Workflows for orchestration. The diagram below shows the architecture on AWS EKS, but Migration Assistant works equivalently on any Kubernetes distribution including GKE, AKS, OpenShift, and self-managed Kubernetes clusters.”

But looking at the new backlog item you sent (link below), which proposes multi-cloud support for Migration Assistant, it seems a lot of work still needs to be done to support Migration Assistant on non-AWS platforms (including on-premises).

github.com/opensearch-project/opensearch-migrations

Multi-Cloud Support for Migration Assistant


So I am still a little confused about the exact status of Migration Assistant support for non-AWS K8S, especially for on-prem deployments.

pablo · May 11, 2026, 6:51pm

@farman I followed this documentation.

git clone https://github.com/opensearch-project/opensearch-migrations

cd opensearch-migrations/deployment/k8s

The current deployment/k8s/charts/aggregates/migrationAssistantWithArgo/values.yml points at the Docker Hub repository, but all the images are published on the AWS public ECR repo. So I changed the images section to:

images:
  captureProxy:
    repository: public.ecr.aws/opensearchproject/opensearch-migrations-traffic-capture-proxy
    tag: latest
  trafficReplayer:
    repository: public.ecr.aws/opensearchproject/opensearch-migrations-traffic-replayer
    tag: latest
  reindexFromSnapshot:
    repository: public.ecr.aws/opensearchproject/opensearch-migrations-reindex-from-snapshot
    tag: latest
  migrationConsole:
    repository: public.ecr.aws/opensearchproject/opensearch-migrations-console
    tag: latest
  installer:
    repository: public.ecr.aws/opensearchproject/opensearch-migrations-console
    tag: latest
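
As a possible variation on editing values.yml in place, the same overrides could be kept in a separate file and passed to Helm with `-f`, which layers them on top of the chart defaults. The sketch below only writes that file. It assumes the `images:` keys shown above are the ones the chart reads; `write_image_overrides` and the file name `ma-images.yaml` are hypothetical names chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write the ECR image overrides from the post into a standalone
# values file, so the chart's own values.yml stays untouched.
# write_image_overrides is a hypothetical helper, not part of the repo.
write_image_overrides() {
  local out="$1"
  local registry="public.ecr.aws/opensearchproject"
  cat > "$out" <<EOF
images:
  captureProxy:
    repository: ${registry}/opensearch-migrations-traffic-capture-proxy
    tag: latest
  trafficReplayer:
    repository: ${registry}/opensearch-migrations-traffic-replayer
    tag: latest
  reindexFromSnapshot:
    repository: ${registry}/opensearch-migrations-reindex-from-snapshot
    tag: latest
  migrationConsole:
    repository: ${registry}/opensearch-migrations-console
    tag: latest
  installer:
    repository: ${registry}/opensearch-migrations-console
    tag: latest
EOF
}

# Usage (needs helm and cluster access, so it is left commented out):
#   write_image_overrides ma-images.yaml
#   helm install --create-namespace -n ma ma \
#     deployment/k8s/charts/aggregates/migrationAssistantWithArgo \
#     -f ma-images.yaml
```

Keeping the overrides in their own file should make it easier to pull upstream chart changes without merge conflicts, and pinning a specific tag instead of `latest` would make the deployment reproducible.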

Then I ran this command from the root of the cloned repository on my k8s cluster (1 master + 3 workers):

helm install --create-namespace -n ma ma deployment/k8s/charts/aggregates/migrationAssistantWithArgo

As a result, I got this:

argo-server-7995446fd-8rssr                                1/1     Running   0          3m22s
argo-server-7995446fd-d95th                                1/1     Running   0          3m22s
argo-workflow-controller-7c969bf5b6-67mkl                  1/1     Running   0          3m22s
argo-workflow-controller-7c969bf5b6-q9mfj                  1/1     Running   0          3m22s
cert-manager-579475b66-fs7jz                               1/1     Running   0          3m22s
cert-manager-cainjector-7ff59b747c-29ptc                   1/1     Running   0          3m22s
cert-manager-webhook-74d7d5f5fd-wjwxt                      1/1     Running   0          3m22s
fluent-bit-k8nw8                                           1/1     Running   0          3m24s
fluent-bit-rnk4j                                           1/1     Running   0          3m24s
fluent-bit-s8jff                                           1/1     Running   0          3m24s
jaeger-559d44f9d8-jvwfq                                    1/1     Running   0          3m24s
kube-prometheus-stack-grafana-65b597d47-wdhqx              3/3     Running   0          3m13s
kube-prometheus-stack-kube-state-metrics-78c6b78b6-wsqrm   1/1     Running   0          3m13s
kube-prometheus-stack-operator-9c8cf6547-kttv6             1/1     Running   0          3m13s
kube-prometheus-stack-prometheus-node-exporter-7nrtg       1/1     Running   0          3m13s
kube-prometheus-stack-prometheus-node-exporter-ljkzd       1/1     Running   0          3m13s
kube-prometheus-stack-prometheus-node-exporter-pz67t       1/1     Running   0          3m13s
kube-prometheus-stack-prometheus-node-exporter-zwbh9       1/1     Running   0          3m13s
localstack-5975984cd6-mljrx                                1/1     Running   0          3m24s
migration-console-0                                        1/1     Running   0          2m8s
otel-collector-2964f                                       1/1     Running   0          2m8s
otel-collector-9vsqt                                       1/1     Running   0          2m8s
otel-collector-c4rl8                                       1/1     Running   0          2m8s
otel-collector-pnxrj                                       1/1     Running   0          2m8s
prometheus-kube-prometheus-stack-prometheus-0              3/3     Running   0          3m9s
strimzi-cluster-operator-55f77779bf-cjwpn                  1/1     Running   0          3m23s
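
With this many pods, it can be handy to check the listing automatically rather than by eye. The following is a minimal sketch that reads lines in the `NAME READY STATUS RESTARTS AGE` layout shown above and prints any pod that is not `Running` with all containers ready. `not_ready_pods` is a hypothetical helper name, and the sketch treats any other status (including `Completed`) as not ready.

```shell
#!/usr/bin/env bash
# Sketch: list pods that are not Running or not fully ready.
# Input: lines in the "NAME READY STATUS RESTARTS AGE" layout on stdin.
# not_ready_pods is a hypothetical helper, not part of Migration Assistant.
not_ready_pods() {
  awk '{
    # READY looks like "1/1" or "2/3": ready containers / total containers
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage (needs kubectl and cluster access, so it is left commented out):
#   kubectl get pods -n ma --no-headers | not_ready_pods
```

An empty result means every pod in the namespace is Running and fully ready. `kubectl wait --for=condition=Ready pod --all -n ma --timeout=300s` is a built-in alternative when blocking until the pods come up is preferred.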

pablo@kube-master-1:~/opensearch-migrations$ kubectl exec -it migration-console-0 -n ma -- /bin/bash
Welcome to the Migration Assistant Console
(18:49:25) migration-console (~) -> console --version
Migration Assistant 3.2.0
(18:49:30) migration-console (~) ->

I haven’t tested any migration yet, but at least this gives a starting point.
