
Implementing Delta Lake on AWS: ACID Transactions for S3

By Infra IT Consulting · 9 min read

Object storage was never designed for transactional workloads. Amazon S3 provides eleven nines of durability and practically unlimited scale, but it offers no atomic multi-object operations, and until December 2020 it was only eventually consistent for overwrites, deletes, and listings. Strong read-after-write consistency removed the second problem but not the first: you still run into trouble when you run concurrent writers, perform partial updates, or recover from a failed job that wrote half its output before crashing. For years, data teams worked around these limitations by scheduling jobs carefully, using partition-level overwrites, and accepting that their data lake was never quite the reliable source of truth they needed it to be. Delta Lake changes that equation entirely.

Delta Lake is an open-source storage layer, originally developed by Databricks and now a Linux Foundation project, that brings ACID transactions, scalable metadata handling, and data versioning to object stores like S3. It sits between your compute engine (Spark running on EMR or Glue, for example) and your S3 buckets, maintaining a transaction log that records every change to the table. The result is a storage format that behaves like a database while remaining open, accessible, and cost-effective at data lake scale.

How Delta Lake Works on S3

A Delta table is simply a directory in S3 containing Parquet data files and a _delta_log subdirectory. The transaction log is a sequence of numbered JSON files, each representing a single committed transaction, plus periodic Parquet checkpoint files that summarize the log so readers do not have to replay every commit from the beginning. Every read or write operation consults the log first to determine which Parquet files constitute the current version of the table. This design means Delta tables are readable by any engine that understands the format: Spark, Trino, Athena (with connector support), and increasingly native AWS services.
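To make the log-first read path concrete, here is a minimal sketch in plain Python that replays a tiny hand-built log and returns the set of live data files. It assumes only the basic shape of the log (one JSON action per line, with `add` and `remove` entries keyed by `path`, in zero-padded 20-digit file names). Real commits carry more fields, and real readers also use checkpoints. The helper names `active_files` and `write_commit` are illustrative, not part of any Delta Lake API.

```python
import json
import tempfile
from pathlib import Path


def active_files(log_dir, as_of_version=None):
    """Replay JSON commits in version order and return the live data files."""
    live = set()
    # Zero-padded names sort lexicographically in version order.
    for commit in sorted(Path(log_dir).glob("*.json")):
        version = int(commit.stem)
        if as_of_version is not None and version > as_of_version:
            break
        # Each non-empty line of a commit file is one JSON action.
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return live


def write_commit(log_dir, version, actions):
    """Write one commit file (illustrative helper for building a fake log)."""
    path = Path(log_dir) / f"{version:020d}.json"
    path.write_text("\n".join(json.dumps(a) for a in actions))


# Version 0 adds two files; version 1 replaces one of them.
log_dir = Path(tempfile.mkdtemp()) / "_delta_log"
log_dir.mkdir()
write_commit(log_dir, 0, [{"add": {"path": "part-0000.parquet"}},
                          {"add": {"path": "part-0001.parquet"}}])
write_commit(log_dir, 1, [{"remove": {"path": "part-0001.parquet"}},
                          {"add": {"path": "part-0002.parquet"}}])

print(sorted(active_files(log_dir)))                    # current version
print(sorted(active_files(log_dir, as_of_version=0)))   # table as of version 0
```

Stopping the replay at an earlier version is all that reading an older snapshot amounts to in this model, which is why versioned reads come almost for free once the log exists.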

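When a writer commits a transaction, it atomically writes a new JSON log entry listing which files were added and which were removed. If two writers attempt to commit conflicting changes simultaneously, the Delta protocol uses optimistic concurrency control: the second writer detects the conflict during commit and either retries or raises an error, depending on the operation type. Reads are never blocked by writes, and writers never see partial results from concurrent writers. One caveat applies on S3: in open-source Delta Lake, the default S3 log store only guarantees safe concurrent writes from a single Spark driver, and writers running on separate clusters need the DynamoDB-backed log store (S3DynamoDBLogStore) to arbitrate commits.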
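The commit step can be sketched as a "create only if absent" write followed by a retry. The example below is an illustration on the local filesystem, where `O_EXCL` stands in for the mutual exclusion that the log store has to provide on S3. The function names are hypothetical, and the retry is deliberately simplified: a real writer re-reads the winning commits and checks for logical conflicts before retrying.

```python
import json
import os
import tempfile


def try_commit(log_dir, version, actions):
    """Create the commit file only if no other writer has claimed this version."""
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(json.dumps(a) for a in actions))
    return True


def latest_version(log_dir):
    """Highest committed version, or -1 for an empty log."""
    versions = [int(name.split(".")[0])
                for name in os.listdir(log_dir) if name.endswith(".json")]
    return max(versions, default=-1)


def commit_with_retry(log_dir, read_version, actions, max_attempts=5):
    """Append-style commit: on conflict, retry at the next free version."""
    version = read_version + 1
    for _ in range(max_attempts):
        if try_commit(log_dir, version, actions):
            return version
        # Simplification: blind appends never conflict logically, so the
        # writer just moves on to the next version number.
        version = latest_version(log_dir) + 1
    raise RuntimeError("commit failed after retries")


log_dir = tempfile.mkdtemp()
# Both writers read the empty table (version -1) and race for version 0.
v_a = commit_with_retry(log_dir, -1, [{"add": {"path": "a.parquet"}}])
v_b = commit_with_retry(log_dir, -1, [{"add": {"path": "b.parquet"}}])
print(v_a, v_b)  # 0 1
```

The second writer loses the race for version 0, notices, and lands at version 1. Neither commit is overwritten, which is the property that plain S3 writes could not give you.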

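Setting up Delta Lake on AWS with EMR requires adding the Delta Lake JARs to your Spark configuration. Here is a minimal spark-defaults.conf for EMR. JAR names vary by EMR release (releases that ship Delta Lake 3.x name the core JAR delta-spark.jar), so check the files under /usr/share/aws/delta/lib/ on your cluster: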

spark.jars=/usr/share/aws/delta/lib/delta-core.jar,/usr/share/aws/delta/lib/delta-storage.jar
spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension
spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog

With that configuration in place, creating and writing to a Delta table is straightforward:

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("delta-demo") \
    .getOrCreate()

# Write initial data as a Delta table
df = spark.read.parquet("s3://raw-data-bucket/orders/2024/")
df.write.format("delta").mode("overwrite").save("s3://delta-lake-bucket/orders/")

# Upsert (MERGE) new records into the existing table
new_data = spark.read.parquet("s3://raw-data-bucket/orders/updates/")
delta_table = DeltaTable.forPath(spark, "s3://delta-lake-bucket/orders/")

delta_table.alias("target").merge(
    new_data.alias("source"),
    "target.order_id = source.order_id"
).whenMatchedUpdateAll() \
 .whenNotMatchedInsertAll() \
 .execute()

The MERGE operation is one of Delta Lake’s most valuable features for data engineering. It replaces the cumbersome read-modify-write patterns required with plain Parquet, where a full partition overwrite was often the only safe way to apply updates.
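For readers who want the semantics spelled out, the following is a small in-memory model of what `whenMatchedUpdateAll` plus `whenNotMatchedInsertAll` does to the rows of a table. It is a sketch of the behavior only, not of how Delta Lake executes a MERGE, and `merge_upsert` is a made-up name. The duplicate-key check mirrors a real constraint: Delta Lake raises an error when several source rows match the same target row.

```python
def merge_upsert(target_rows, source_rows, key):
    """Model an upsert: update every matched row, insert every unmatched row."""
    source_keys = [row[key] for row in source_rows]
    if len(source_keys) != len(set(source_keys)):
        # Ambiguous: two source rows would update the same target row.
        raise ValueError("source contains duplicate merge keys")

    merged = {row[key]: dict(row) for row in target_rows}
    for row in source_rows:
        # Matched keys are replaced wholesale; new keys are appended.
        merged[row[key]] = dict(row)
    return list(merged.values())


target = [
    {"order_id": 1, "status": "new"},
    {"order_id": 2, "status": "new"},
]
source = [
    {"order_id": 2, "status": "shipped"},   # matches an existing order
    {"order_id": 3, "status": "new"},       # not in the target yet
]

result = merge_upsert(target, source, "order_id")
for row in result:
    print(row)
```

In practice this means the update batch should be deduplicated on the merge key before calling `merge`, typically by keeping only the latest record per key.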

ACID Guarantees and What They Mean in Practice

ACID stands for Atomicity, Consistency, Isolation, and Durability. In the context of Delta Lake on S3, these guarantees have concrete operational implications:

Atomicity means that either all files from a write operation are committed or none are. If your Glue job writes 500 Parquet files and crashes after writing 300, the transaction log will not contain a commit entry for that operation, and readers will never see the partial data. This eliminates the class of “zombie partition” bugs that plague teams using plain S3 writes.

Consistency means that schema enforcement prevents you from accidentally writing data that doesn’t match the table’s declared schema. Delta Lake rejects writes with incompatible column types, and it rejects writes that introduce columns the table does not have unless you explicitly enable schema evolution. Columns missing from the incoming data are filled with nulls, so declare NOT NULL constraints on fields that must always be present. This catches data quality issues at write time rather than at query time, hours or days later.

Isolation means that readers always see a consistent snapshot of the table, even while writers are actively committing changes. This enables concurrent reads and writes without coordination overhead — a significant operational simplification compared to managing read/write windows manually.

Durability leverages S3’s storage guarantees. Once a transaction is committed (the log entry is written to S3), it is as durable as S3 itself.
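The atomicity and isolation guarantees above both follow from one rule: readers resolve files through the log, never by listing storage. The toy model below illustrates that rule with in-memory sets. `DeltaTableModel` and its methods are invented for this sketch and are not part of the Delta Lake API.

```python
class DeltaTableModel:
    """Toy model: files physically in storage plus an ordered list of commits."""

    def __init__(self):
        self.files_on_disk = set()   # everything that was physically written
        self.commits = []            # each commit is (added, removed)

    def write_files(self, paths):
        # Step 1 of a write: data files land in storage but are not visible.
        self.files_on_disk.update(paths)

    def commit(self, added, removed=()):
        # Step 2: one log entry makes the whole change visible at once.
        self.commits.append((set(added), set(removed)))
        return len(self.commits) - 1

    def snapshot(self, version=None):
        # Readers replay the log up to a version; storage is never listed.
        last = len(self.commits) - 1 if version is None else version
        live = set()
        for added, removed in self.commits[: last + 1]:
            live = (live - removed) | added
        return live


table = DeltaTableModel()
table.write_files({"f1", "f2"})
v0 = table.commit({"f1", "f2"})

# A job writes 300 of its 500 files, then crashes before committing.
table.write_files({f"partial-{i}" for i in range(300)})
after_crash = table.snapshot()

# A reader pins version 0 while a later writer commits a rewrite.
table.write_files({"f3"})
v1 = table.commit({"f3"}, removed={"f1"})
pinned_view = table.snapshot(v0)
current_view = table.snapshot()

print(sorted(after_crash))    # ['f1', 'f2']
print(sorted(pinned_view))    # ['f1', 'f2']
print(sorted(current_view))   # ['f2', 'f3']
```

The 300 partial files still occupy storage, but no snapshot ever references them. They are orphans that a later cleanup can delete, not data that a reader can see.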

Time Travel and Audit Capabilities

Delta Lake records every change to a table in its transaction log, and earlier versions stay queryable through time travel syntax for as long as their log entries and data files are retained. This is practically useful for debugging (“what did this table look like before last night’s job ran?”), regulatory compliance (point-in-time data reproduction), and incremental processing (comparing the current table against a known earlier version).

# Read the table as it existed at a given timestamp (here, 2024-03-11)
historical_df = spark.read.format("delta") \
    .option("timestampAsOf", "2024-03-11") \
    .load("s3://delta-lake-bucket/orders/")

# Or reference a specific version number
version_df = spark.read.format("delta") \
    .option("versionAsOf", 42) \
    .load("s3://delta-lake-bucket/orders/")

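Neither the log nor the data files are kept forever. Log entries are retained for 30 days by default (delta.logRetentionDuration) and are cleaned up automatically when checkpoints are written. Data files that are no longer referenced by the current version are kept for at least the deleted-file retention period (delta.deletedFileRetentionDuration, default 7 days) and are physically deleted only when you run the VACUUM command, which reclaims S3 storage costs. Time travel to a version works only while both its log entries and its data files still exist, so set the two retention periods to match how far back you need to query.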
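As a rough sketch of how retention might be managed from PySpark: both retention windows are table properties, `history()` lists the commits that are still available, and `vacuum()` takes its threshold in hours. The table path is the one used earlier in this article. The retention values and the function names `retention_ddl` and `inspect_and_vacuum` are illustrative assumptions, and the Spark-dependent function is defined but not called here because it needs a session configured for Delta Lake.

```python
TABLE_PATH = "s3://delta-lake-bucket/orders/"


def retention_ddl(path, log_days=30, deleted_file_days=7):
    """Build the ALTER TABLE statement that sets both retention windows."""
    return (
        f"ALTER TABLE delta.`{path}` SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {log_days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days')"
    )


def inspect_and_vacuum(spark, path, retention_hours=168):
    """Show recent commits, then delete unreferenced files past retention."""
    # Imported lazily so this module loads without delta-spark installed.
    from delta.tables import DeltaTable

    table = DeltaTable.forPath(spark, path)
    # history() returns a DataFrame with one row per retained commit.
    table.history(10).select("version", "timestamp", "operation").show()
    # vacuum() takes the retention threshold in hours (168 hours = 7 days).
    table.vacuum(retention_hours)


# Example: keep 90 days of history for audit queries (illustrative values).
print(retention_ddl(TABLE_PATH, log_days=90, deleted_file_days=90))
```

With a live session, the statement would be run through `spark.sql(...)` before calling `inspect_and_vacuum(spark, TABLE_PATH, retention_hours=90 * 24)`. Longer windows extend how far back time travel reaches, at the price of keeping superseded Parquet files in S3 for that long.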

For teams building lakehouse architecture on AWS, Delta Lake’s time travel capability is a key differentiator over plain Parquet: within the retention window, it enables point-in-time recovery without maintaining separate backup snapshots.

Schema Evolution Without Downtime

One of the most painful operational realities of Parquet-based data lakes is schema evolution. Adding a column to a Parquet dataset typically requires a full rewrite or careful partition management. Delta Lake handles schema evolution gracefully with two modes:

Schema enforcement (the default) rejects writes that don’t match the existing schema, protecting against accidental breaking changes.

Schema evolution (enabled with .option("mergeSchema", "true")) automatically adds new columns from incoming data to the table schema without requiring a full rewrite. Existing rows will have null values for the new columns, which is typically the correct semantic for additive schema changes.

df_with_new_column.write.format("delta") \
    .mode("append") \
    .option("mergeSchema", "true") \
    .save("s3://delta-lake-bucket/orders/")

This capability is particularly valuable in raw ingestion layers where source schemas evolve frequently and you cannot afford downtime to perform table migrations.
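The difference between the two modes can be summarized in a few lines of plain Python. This is a simplified model, assuming schemas are flat name-to-type mappings and ignoring details such as nested fields and type widening. `resolve_schema` is a hypothetical helper, not a Delta Lake function.

```python
def resolve_schema(table_schema, incoming_schema, merge_schema=False):
    """Return the table schema after a write, or raise if the write is rejected."""
    for name, dtype in incoming_schema.items():
        # A type change on an existing column is rejected in both modes.
        if name in table_schema and table_schema[name] != dtype:
            raise ValueError(f"type mismatch for column {name!r}")

    new_columns = {name: dtype for name, dtype in incoming_schema.items()
                   if name not in table_schema}
    if new_columns and not merge_schema:
        # Enforcement: unknown columns are an error, not a silent change.
        raise ValueError(f"unexpected columns: {sorted(new_columns)}")

    # Evolution: new columns are appended; existing rows read them as null.
    return {**table_schema, **new_columns}


table_schema = {"order_id": "long", "status": "string"}
incoming = {"order_id": "long", "status": "string", "coupon_code": "string"}

try:
    resolve_schema(table_schema, incoming)
except ValueError as error:
    print("rejected:", error)

evolved = resolve_schema(table_schema, incoming, merge_schema=True)
print(evolved)
```

The design choice worth noting is that evolution is opt-in per write. A job that never sets `mergeSchema` cannot widen the table by accident, which keeps curated layers stable while ingestion layers stay flexible.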

Integrating Delta Lake with AWS Glue and Athena

AWS Glue supports Delta Lake natively in Glue version 3.0 and later. You can run Delta operations directly in Glue ETL jobs without managing JAR dependencies manually: set the --datalake-formats job parameter to delta and pass the Delta Spark extensions through the --conf job parameter. The AWS Glue Data Catalog can store Delta table metadata, making your tables discoverable by Athena and other catalog-aware services.
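As an illustration, the job parameters for a Delta-enabled Glue job might be assembled like this. The two Spark settings are the same ones used in the EMR configuration earlier. The function name is made up, and the chained `--conf` value follows the pattern AWS documents for passing several Spark settings through a single job parameter, so verify it against the Glue version you run.

```python
def delta_glue_arguments():
    """Default arguments that enable Delta Lake in a Glue 3.0+ Spark job."""
    spark_settings = [
        "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension",
        "spark.sql.catalog.spark_catalog="
        "org.apache.spark.sql.delta.catalog.DeltaCatalog",
    ]
    return {
        # Tells Glue to put its bundled Delta Lake libraries on the classpath.
        "--datalake-formats": "delta",
        # Glue accepts one --conf key, so further settings are chained inside it.
        "--conf": " --conf ".join(spark_settings),
    }


args = delta_glue_arguments()
for key, value in args.items():
    print(key, "=", value)
```

This dictionary is the kind of value you would supply as `DefaultArguments` when creating the job with boto3's Glue `create_job`, or as job parameters in the console.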

Athena reads Delta Lake tables natively on Athena engine version 3: register the table in the Glue Data Catalog with the table property table_type set to DELTA and query it like any other table. Athena’s Delta support is read-only, so writes, MERGE operations, and table maintenance still run in Spark on EMR or Glue. Query performance is comparable to native Parquet for most analytical workloads, though tables with many small files or very large transaction logs add metadata overhead. Running periodic OPTIMIZE operations on your Delta tables compacts small files into larger ones, which significantly improves query performance by reducing the number of files each query has to open.
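Registering an existing Delta table for Athena comes down to a short DDL statement. The helper below builds it as a string. The database and table names are placeholders, `athena_delta_ddl` is a hypothetical helper, and the statement shape (no column list, `table_type` set to `DELTA`) assumes Athena engine version 3, which reads the schema from the Delta log.

```python
def athena_delta_ddl(database, table, location):
    """Build the DDL that registers an existing Delta table in Athena."""
    return (
        f"CREATE EXTERNAL TABLE {database}.{table}\n"
        f"LOCATION '{location}'\n"
        # No column definitions: the schema comes from the Delta log.
        "TBLPROPERTIES ('table_type' = 'DELTA')"
    )


# 'analytics' and 'orders' are placeholder names for this example.
ddl = athena_delta_ddl("analytics", "orders", "s3://delta-lake-bucket/orders/")
print(ddl)
```

The statement can be run once from the Athena console or submitted through the Athena API. After that, new commits made by Spark become visible to Athena queries without re-registering the table.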

Delta Lake vs. Apache Iceberg on AWS

Both Delta Lake and Apache Iceberg solve the same fundamental problem — bringing transactional semantics to object storage — but they make different design choices. Apache Iceberg with AWS Glue has deeper native integration in the AWS ecosystem, with Glue, Athena, and EMR all supporting Iceberg natively as a first-class format. Delta Lake has a larger Spark-centric ecosystem and more mature MERGE performance characteristics at very large scales.

For teams already invested in Databricks or Spark-heavy workloads, Delta Lake is the natural choice. For teams building AWS-native pipelines that must be queried by Athena without a Spark cluster, Iceberg’s broader native AWS support is often the deciding factor. Both formats are open, both are production-ready, and the architectural principles are transferable between them.

Conclusion

Delta Lake transforms S3 from a raw file store into a reliable, transactional data platform without sacrificing the cost and scale advantages that make object storage attractive in the first place. ACID transactions eliminate entire categories of pipeline bugs, time travel provides operational recovery capabilities that previously required expensive backup strategies, and schema evolution removes one of the most painful friction points in managing evolving data assets.

For Canadian data teams looking to build a production-grade lakehouse on AWS, Delta Lake running on EMR or Glue is a proven, well-supported path with a growing ecosystem of tooling and community support.

If you’re evaluating Delta Lake for your AWS data platform or need help migrating an existing Parquet-based data lake to a transactional format, contact Infra IT Consulting for a technical assessment.
