Cloud Migration & Cost Optimization · vendor-lock-in · portability · strategy

Cloud Exit Strategy: What Data Teams Should Plan For

By Infra IT Consulting · April 14, 2024 · 9 min read

Content on this site is AI-assisted and personally reviewed by Hazem.

Cloud exit strategy is the topic that most cloud migration plans leave out. That is understandable: when a team is investing months of engineering effort and organisational capital in migrating to AWS, the last thing it wants to think about is how it would leave. But the organisations that build thoughtful data architectures on cloud platforms are precisely the ones that give themselves the most flexibility, whether they ever need to exercise it or not.

A cloud exit plan is not a statement of intent to leave. It is a risk management exercise, a negotiating position, and an architectural discipline that produces better systems regardless of whether you ever migrate away from AWS. This post covers the specific considerations for data teams: where lock-in actually lives, what it costs to exit, and the architectural patterns that preserve optionality.

Where Lock-In Actually Lives for Data Teams

Not all AWS dependencies carry the same exit cost. The distinction that matters is between dependencies on AWS infrastructure (compute, networking, storage) and dependencies on AWS-proprietary services and data formats.

Low lock-in risk:

EC2 compute — applications that run on EC2 can run on any Linux server or in any cloud
S3 storage with open data formats (Parquet, ORC, Avro, CSV) — data in open formats on S3 is portable; the storage layer changes but the data is accessible from any compute platform
Amazon MSK (Managed Kafka) — Kafka is an open standard; MSK runs standard Apache Kafka APIs
Amazon EMR running Apache Spark — Spark is open source; EMR jobs written in standard PySpark can run on Databricks, GCP Dataproc, Azure HDInsight, or self-managed Spark

High lock-in risk:

Amazon Redshift — proprietary columnar warehouse with its own SQL dialect, distribution key model, and internal data format. Exiting Redshift requires exporting data (via UNLOAD to S3) and migrating to another warehouse platform.
AWS Glue Data Catalog — the Catalog’s Hive compatibility means table definitions are portable in principle, but Glue-specific Crawlers, Classifiers, and Glue ETL job scripts are AWS-specific.
AWS Glue ETL jobs (DynamicFrame API) — the GlueContext and DynamicFrame classes are proprietary; native PySpark code is portable but Glue-specific API code is not.
Amazon Kinesis Data Streams — Kinesis uses a proprietary API that does not match the Kafka API; applications must be rewritten to use Kafka or another streaming platform on exit.
Amazon Athena — Athena queries against S3 are portable (Presto/Trino compatible), but Athena-specific features (federated queries, Athena Workgroups) are proprietary.
AWS Lake Formation — fine-grained access control configured in Lake Formation is not portable; equivalent controls would need to be rebuilt in any replacement platform.

Understanding this map allows data teams to make deliberate choices about where to accept lock-in (because the AWS-managed service delivers sufficient value to justify the dependency) and where to use open-standard equivalents.
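One way to make this map actionable is to keep it as data and check a platform's service list against it. The following is a minimal sketch: the ratings are copied from the two lists above, while the dictionary layout, the `group_by_risk` helper, and the sample inventory are illustrative assumptions.

```python
from collections import defaultdict

# Ratings copied from the lists above; the structure itself is an assumption.
LOCK_IN_RISK = {
    "EC2": "low",
    "S3 (open formats)": "low",
    "MSK": "low",
    "EMR (Spark)": "low",
    "Redshift": "high",
    "Glue Data Catalog": "high",
    "Glue ETL (DynamicFrame)": "high",
    "Kinesis Data Streams": "high",
    "Athena": "high",
    "Lake Formation": "high",
}


def group_by_risk(services_in_use):
    """Group the services a platform uses by lock-in risk.

    Services missing from LOCK_IN_RISK are reported as "unclassified"
    so they get reviewed rather than silently ignored.
    """
    grouped = defaultdict(list)
    for service in services_in_use:
        grouped[LOCK_IN_RISK.get(service, "unclassified")].append(service)
    return dict(grouped)


# Hypothetical platform inventory.
in_use = ["S3 (open formats)", "Redshift", "Kinesis Data Streams", "SageMaker"]
for risk, services in group_by_risk(in_use).items():
    print(f"{risk}: {', '.join(services)}")
```

Reporting unknown services as "unclassified" is a deliberate choice here: a newly adopted service should prompt a rating decision instead of defaulting to low risk.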

The Real Cost of Exiting AWS for a Data Team

Exit cost analysis is rarely done rigorously because it is uncomfortable. Here is a realistic breakdown for a mid-size data platform:

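Data egress costs. AWS charges roughly $0.09 per GB at the first-tier list rate for data transferred out of AWS to the internet. Higher monthly volumes are billed at lower tiered rates, rates vary by region, and Direct Connect is cheaper again. At the flat $0.09 rate, a data lake with 100 TB of data costs about $9,000 USD in egress fees to move to another cloud or on-premises, and a 1 PB data lake costs about $90,000 in egress alone. Treat these figures as upper-bound estimates, and factor them into any exit scenario analysis.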
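The arithmetic behind those figures can be sketched in a few lines. This assumes the flat $0.09 per GB rate quoted above and decimal units (1 TB = 1,000 GB); the function name is illustrative, and a real bill would depend on tiered, region-specific pricing.

```python
# Flat-rate egress estimate. Both constants are assumptions for illustration:
# the rate is the figure quoted in this post, and 1 TB is treated as 1,000 GB.
EGRESS_USD_PER_GB = 0.09
GB_PER_TB = 1_000


def egress_cost_usd(terabytes, rate_per_gb=EGRESS_USD_PER_GB):
    """Return a flat-rate egress cost in USD for moving `terabytes` out."""
    return round(terabytes * GB_PER_TB * rate_per_gb, 2)


for label, size_tb in [("100 TB data lake", 100), ("1 PB data lake", 1_000)]:
    print(f"{label}: ${egress_cost_usd(size_tb):,.0f}")
```

Passing a different `rate_per_gb` lets you rerun the estimate for a negotiated or Direct Connect rate.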

Schema and query migration. Moving Redshift SQL to BigQuery, Snowflake, or another warehouse requires schema and query conversion. The effort is similar to a Teradata-to-Redshift migration: automated tools in the style of the AWS Schema Conversion Tool (SCT) handle a portion, but complex logic requires manual rewriting. Budget 1–2 engineer-days per complex view, stored procedure, or CTAS query.

ETL pipeline rewrite. AWS Glue jobs using DynamicFrame APIs need rewriting to native PySpark or to the target platform’s equivalent. AWS Step Functions orchestration needs rewriting to Apache Airflow, Prefect, or another orchestrator. Budget 2–5 engineer-days per complex pipeline.
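The per-object budgets above can be rolled up into a rough range. This is a sketch only: the day ranges come from the two paragraphs above, while the function name and the object counts are hypothetical.

```python
# Budget ranges in engineer-days, taken from the paragraphs above.
DAYS_PER_SQL_OBJECT = (1, 2)  # complex view, stored procedure, or CTAS query
DAYS_PER_PIPELINE = (2, 5)    # complex Glue or Step Functions pipeline


def rewrite_effort_days(sql_objects, pipelines):
    """Return (low, high) engineer-day estimates for the rewrite work."""
    low = sql_objects * DAYS_PER_SQL_OBJECT[0] + pipelines * DAYS_PER_PIPELINE[0]
    high = sql_objects * DAYS_PER_SQL_OBJECT[1] + pipelines * DAYS_PER_PIPELINE[1]
    return low, high


# Hypothetical platform: 40 complex SQL objects and 25 complex pipelines.
low, high = rewrite_effort_days(sql_objects=40, pipelines=25)
print(f"Estimated rewrite effort: {low}-{high} engineer-days")
```

The estimate covers rewriting only. Testing, parallel running, and cutover are not included and would need their own line items.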

Compliance recertification. If your AWS architecture underpins compliance certifications or attestations (PCI DSS, SOC 2, HIPAA), migrating to a new platform requires re-validating them against the new infrastructure. This can add 3–6 months to an exit timeline.

Total realistic exit timeline for a mature data platform: 12–24 months from decision to full migration, depending on platform complexity. This timeline is not a reason to avoid building on AWS — it is context for understanding that exit is a programme, not a weekend project, and that planning for it in advance reduces the timeline significantly.

Architectural Patterns That Preserve Portability

Preserving optionality does not require avoiding AWS-managed services. It requires deliberate architectural choices in a few critical areas.

Use open data formats on S3 as the canonical storage layer. This is the single most important portability decision. If your data lake stores data in Parquet or ORC on S3, the data itself is readable by any modern analytics engine: Spark on any platform, BigQuery (through BigQuery Omni, which can query data in S3), Snowflake external tables, DuckDB, and Presto/Trino. Your exit path for data begins with the data already being in a neutral format.

Separate orchestration from compute. Pipelines orchestrated by Apache Airflow (via Amazon MWAA) are more portable than pipelines orchestrated by AWS Step Functions, because the Airflow DAG definition is standard Python that runs on self-managed Airflow or on any managed Airflow provider. If you use Step Functions, document the workflow logic thoroughly so that a migration to Airflow is a translation exercise rather than a discovery exercise.
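To illustrate what "a translation exercise" can look like, the sketch below reads a Step Functions definition written in Amazon States Language and extracts the task order for a straight-line workflow. The state names, the placeholder ARNs, and the `linear_task_order` helper are all hypothetical, and branching states are deliberately left out.

```python
import json

# A hypothetical, linear state machine. State names and ARNs are placeholders.
STATE_MACHINE_JSON = """
{
  "StartAt": "ExtractOrders",
  "States": {
    "ExtractOrders": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:extract-orders",
      "Next": "TransformOrders"
    },
    "TransformOrders": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:transform-orders",
      "Next": "LoadWarehouse"
    },
    "LoadWarehouse": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:load-warehouse",
      "End": true
    }
  }
}
"""


def linear_task_order(definition):
    """Walk StartAt/Next links and return state names in execution order.

    Handles only straight-line workflows. Choice, Parallel, and Map states
    need richer handling and raise an error here.
    """
    states = definition["States"]
    order = []
    current = definition["StartAt"]
    while current is not None:
        state = states[current]
        if state["Type"] in {"Choice", "Parallel", "Map"}:
            raise ValueError(f"State {current!r} is not linear: {state['Type']}")
        if current in order:
            raise ValueError(f"Cycle detected at state {current!r}")
        order.append(current)
        current = None if state.get("End") else state.get("Next")
    return order


order = linear_task_order(json.loads(STATE_MACHINE_JSON))
# Each consecutive pair is one upstream-to-downstream dependency.
dependencies = list(zip(order, order[1:]))
print(order)
print(dependencies)
```

In Airflow, each pair would map onto an `upstream >> downstream` dependency between two tasks. Retry policies, error handlers, and branching logic still need manual translation, which is why documenting them up front matters.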

Write pipeline logic in standard PySpark, not Glue DynamicFrame APIs. Glue jobs can be written using the standard pyspark library with GlueContext used only for connectivity (reading from Glue Catalog, writing to JDBC targets). Pipeline transformation logic written in standard PySpark is portable to EMR, Databricks, or any Spark environment with minimal modification.

# Portable: standard PySpark transformation (runs on Glue, EMR, Databricks)
from pyspark.sql import functions as F

def transform_transactions(df):
    return (
        df
        .filter(F.col("status") == "COMPLETED")
        .withColumn("amount_cad", F.col("amount_usd") * F.col("exchange_rate"))
        .withColumn("processing_date", F.to_date(F.col("event_timestamp")))
        .dropDuplicates(["transaction_id"])
    )

# Less portable: Glue-specific DynamicFrame API
# from awsglue.transforms import Filter, ApplyMapping
# Use only for Glue Catalog connectivity, not core transformation logic

Document your Glue Data Catalog as DDL. Maintain Terraform or SQL DDL that recreates every table, view, and partition scheme in the Glue Catalog. If you ever need to migrate the catalog to a different metastore (Hive, Unity Catalog, BigQuery Datasets), the DDL-as-code approach means you have a portable definition of your schema.
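A DDL-as-code approach can be as simple as keeping table definitions in version control and rendering them to Hive-style DDL. The sketch below shows the idea under stated assumptions: the table definition, the `hive_ddl` helper, and the S3 location are hypothetical, and only Parquet tables are covered.

```python
# Hypothetical table definition. Database, table, column names, and the S3
# location are placeholders, not taken from any real catalog.
ORDERS_TABLE = {
    "database": "analytics",
    "name": "orders",
    "columns": [("order_id", "string"), ("amount_usd", "double")],
    "partitions": [("processing_date", "date")],
    "location": "s3://example-bucket/analytics/orders/",
}


def _column_list(columns):
    """Render (name, type) pairs as indented, comma-separated DDL lines."""
    return ",\n".join(f"  `{name}` {dtype}" for name, dtype in columns)


def hive_ddl(table):
    """Render a Hive-style CREATE EXTERNAL TABLE statement for a Parquet table."""
    lines = [
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table['database']}`.`{table['name']}` (",
        _column_list(table["columns"]),
        ")",
    ]
    if table.get("partitions"):
        lines += ["PARTITIONED BY (", _column_list(table["partitions"]), ")"]
    lines += ["STORED AS PARQUET", f"LOCATION '{table['location']}'"]
    return "\n".join(lines) + ";"


print(hive_ddl(ORDERS_TABLE))
```

Keeping the definition as plain data means the same source could later feed a renderer for a different metastore, with only the rendering function changing.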

Multi-Cloud and Hybrid as Partial Exit Strategies

Some organisations reduce AWS dependency without fully exiting by distributing workloads across providers. Common patterns:

AWS for data lake storage + Snowflake for warehousing. Snowflake can query data directly in S3 through external tables defined over an external stage. This pattern avoids Redshift lock-in while retaining S3 as the canonical storage layer. The trade-off is data movement cost and slower queries for large scans through Snowflake external tables, compared with data loaded natively into Snowflake.

AWS for compute + Cloudflare R2 or Backblaze B2 for archival storage. For cold data that is rarely accessed, storing in an S3-compatible object store without egress fees (Cloudflare R2 charges no egress) reduces long-term lock-in risk for archival tiers. Active pipeline data remains on S3; aged data moves to egress-free storage.

Apache Iceberg as a portable table format. Iceberg is an open table format that adds ACID transactions, schema evolution, and partition pruning to data lake storage on S3, GCS, or Azure ADLS. Iceberg tables can be queried by Athena, EMR Spark, Snowflake, BigQuery, and Trino — making the table format itself neutral. Migrating from Iceberg on S3 with Athena to Iceberg on S3 with BigQuery changes only the query engine, not the data or table format.

Practical Steps: Building Your Exit Plan

A useful cloud exit plan for a data team covers:

Data inventory with exit cost estimate. List all datasets by volume and classify by format (open vs. proprietary). Calculate egress cost for the top 20 data stores.
Service dependency map. List every AWS service used and classify each as low, medium, or high lock-in risk.
Migration effort estimate by layer. Storage, compute, orchestration, and access control — estimate the engineering effort for each layer independently.
Alternative platform mapping. For each high-lock-in service, identify the equivalent in at least one alternative provider.
Annual exit plan review. As the platform evolves, update the exit plan. Lock-in risk accumulates with each new proprietary service adoption.
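The first item, the data inventory, lends itself to a small script. The sketch below classifies each store as open or proprietary and ranks stores by estimated egress cost. The store names and sizes are hypothetical, and the cost uses the flat $0.09 per GB rate quoted earlier in this post with 1 TB treated as 1,000 GB.

```python
from dataclasses import dataclass

OPEN_FORMATS = {"parquet", "orc", "avro", "csv"}  # open formats named in this post
EGRESS_USD_PER_GB = 0.09  # flat-rate assumption; actual rates vary by region and volume


@dataclass
class DataStore:
    name: str
    size_tb: float
    data_format: str

    @property
    def is_open_format(self):
        return self.data_format.lower() in OPEN_FORMATS

    @property
    def egress_usd(self):
        return round(self.size_tb * 1_000 * EGRESS_USD_PER_GB, 2)


def top_by_egress(stores, limit=20):
    """Return up to `limit` stores, most expensive to move first."""
    return sorted(stores, key=lambda s: s.egress_usd, reverse=True)[:limit]


# Hypothetical inventory.
inventory = [
    DataStore("raw_exports", 3, "csv"),
    DataStore("clickstream_lake", 80, "parquet"),
    DataStore("redshift_warehouse", 12, "redshift"),
]

for store in top_by_egress(inventory):
    kind = "open" if store.is_open_format else "proprietary"
    print(f"{store.name}: {store.size_tb} TB, {kind}, ~${store.egress_usd:,.0f} egress")
```

Stores flagged as proprietary are the ones that also carry a conversion step, so they are a natural starting point for the migration effort estimate.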

For teams that have recently migrated to AWS and are now thinking about long-term architecture, combining exit planning with the Well-Architected Review is an efficient approach — the Review naturally surfaces areas of high lock-in. See Applying the AWS Well-Architected Framework to Data Workloads for how to conduct that review.

For teams at the start of a migration journey, building portability into the initial architecture is far cheaper than retrofitting it later. See our guide on On-Premises to AWS Migration for architecture decisions that preserve optionality from day one.

Conclusion

A cloud exit plan is not pessimism — it is engineering discipline. The organisations that think through exit scenarios tend to build better architectures: more modular, more standards-compliant, and more auditable. The data portability practices that make exit feasible — open formats, documented schemas, standard orchestration — are also the practices that make your AWS platform easier to operate, test, and evolve.

Infra IT Consulting works with data teams across Canada, the UK, and Africa on cloud architecture reviews that consider long-term flexibility alongside immediate delivery needs. If you would like to assess your current platform’s portability or build an exit plan as part of a governance programme, contact us.
