This code example demonstrates how to perform batch DELETE and UPDATE operations in Amazon Aurora DSQL when working with datasets exceeding the 3,000-row transaction mutation limit. The example uses pgJDBC with the Aurora DSQL JDBC Connector for automatic IAM authentication.
Two patterns are provided:
- Sequential: A single-threaded loop that processes rows in configurable-size batches (default 1,000), committing each batch as a separate transaction.
- Parallel: Multiple worker threads each process a disjoint partition of the dataset concurrently using `hashtext()` partitioning, with each worker running its own batch loop.
Both patterns include OCC (Optimistic Concurrency Control) retry logic with exponential backoff.
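The retry logic follows the usual shape for OCC conflicts: retry only on SQLState `40001` (serialization failure), with exponentially growing, jittered delays. A minimal sketch of that shape (class and method names here are illustrative, not the example's actual code):

```java
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of OCC retry with exponential backoff.
public class OccRetry {
    @FunctionalInterface
    public interface SqlAction { void run() throws SQLException; }

    // PostgreSQL SQLState for a serialization failure (an OCC conflict).
    static final String SERIALIZATION_FAILURE = "40001";

    public static void withRetry(SqlAction action, int maxAttempts)
            throws SQLException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return;
            } catch (SQLException e) {
                if (!SERIALIZATION_FAILURE.equals(e.getSQLState()) || attempt >= maxAttempts) {
                    throw e; // non-retryable error, or retries exhausted
                }
                Thread.sleep(backoffMillis(attempt));
            }
        }
    }

    // Exponential backoff with jitter: ~100ms, ~200ms, ~400ms, ... capped at 5s base.
    static long backoffMillis(int attempt) {
        long base = Math.min(100L << (attempt - 1), 5_000L);
        return base + ThreadLocalRandom.current().nextLong(base / 2 + 1);
    }
}
```

Only the conflict SQLState is retried; any other `SQLException` propagates immediately, since retrying a non-conflict error would mask real failures.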
Aurora DSQL limits each transaction to 3,000 row mutations. To DELETE or UPDATE more than 3,000 rows, you must split the work into batches, each committed as a separate transaction.
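The sequential pattern reduces to a simple loop: delete up to `batchSize` rows, commit, and repeat until a batch comes back short. A sketch of that loop, with the per-batch delete injected so the loop itself is testable without a database (the `WHERE` predicate shown is illustrative, not the example's exact query):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.function.IntUnaryOperator;

// Sketch of the sequential pattern: delete at most batchSize rows per
// transaction and repeat until a short batch signals no rows remain.
public class BatchDelete {

    // Core loop. deleteOneBatch is expected to commit its own transaction
    // and return the number of rows it removed.
    public static long run(IntUnaryOperator deleteOneBatch, int batchSize) {
        long total = 0;
        int deleted;
        do {
            deleted = deleteOneBatch.applyAsInt(batchSize);
            total += deleted;
        } while (deleted == batchSize); // short batch => done
        return total;
    }

    // JDBC per-batch delete against the demo table (predicate illustrative;
    // autocommit is assumed to be off on conn).
    static int deleteBatch(Connection conn, int batchSize) throws SQLException {
        String sql = "DELETE FROM batch_test WHERE id IN "
                   + "(SELECT id FROM batch_test LIMIT ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, batchSize);
            int deleted = ps.executeUpdate();
            conn.commit(); // each batch is a separate transaction
            return deleted;
        }
    }
}
```

Keeping `batchSize` below 3,000 ensures every commit stays within the per-transaction mutation limit.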
The parallel pattern partitions rows across worker threads using `abs(hashtext(CAST(id AS text))) % num_workers = worker_id`, ensuring workers operate on disjoint sets of rows and avoid OCC conflicts with each other.
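The disjointness falls out of modular arithmetic: any deterministic hash maps each id to exactly one of `num_workers` buckets, so the worker predicates both cover all rows and never overlap. A sketch of the per-worker query plus a Java stand-in for the arithmetic (`hashtext()` is a Postgres server-side function, so `String.hashCode()` plays its role here purely for illustration):

```java
// Illustrative helper: builds the per-worker delete, where workerId is bound
// as the first parameter and the batch size as the second.
public class WorkerPartition {
    public static String partitionedDelete(int numWorkers) {
        return "DELETE FROM batch_test WHERE id IN ("
             + "SELECT id FROM batch_test "
             + "WHERE abs(hashtext(CAST(id AS text))) % " + numWorkers + " = ? "
             + "LIMIT ?)";
    }

    // Java analogue of the partitioning arithmetic: each key lands in exactly
    // one of numWorkers buckets, so worker sets are disjoint and exhaustive.
    public static int bucket(String id, int numWorkers) {
        return Math.floorMod(id.hashCode(), numWorkers);
    }
}
```

Each worker then runs the same batch-commit loop as the sequential pattern, but only over its own bucket.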
- Running this code might result in charges to your AWS account.
- Each batch is a separate transaction. A failure mid-way leaves the dataset partially modified. Design your operations to be idempotent where possible.
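One common way to get idempotence (an illustration, not the example's own code) is a self-excluding predicate: the `WHERE` clause skips rows that already carry the new value, so rerunning after a mid-way failure only touches the rows not yet processed. The `status` column here is hypothetical:

```java
// Illustrative idempotent batched UPDATE: the predicate excludes rows that
// already have the target value, so a rerun resumes where the failure left off.
// (The status column is hypothetical, not part of the demo schema.)
public class IdempotentUpdate {
    public static String sql(int batchSize) {
        return "UPDATE batch_test SET status = 'archived' WHERE id IN ("
             + "SELECT id FROM batch_test WHERE status <> 'archived' "
             + "LIMIT " + batchSize + ")";
    }
}
```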
- You must have an AWS account, and have your default credentials and AWS Region configured as described in the Globally configuring AWS SDKs and tools guide.
- Java Development Kit (JDK) 17 or later.
- Gradle (the wrapper is included in this project).
- You must have an Aurora DSQL cluster. For information about creating a cluster, see the Getting started with Aurora DSQL guide.
Before running the examples, create and populate the test table. Aurora DSQL uses IAM authentication,
so you need to generate a fresh auth token each time you connect with psql:
```shell
export CLUSTER_ENDPOINT="<your-cluster-endpoint>"
export CLUSTER_REGION="<your-region>"

# Generate a fresh auth token (expires in 3600 seconds)
export PGPASSWORD=$(aws dsql generate-db-connect-admin-auth-token \
  --hostname $CLUSTER_ENDPOINT \
  --region $CLUSTER_REGION \
  --expires-in 3600)

psql "host=$CLUSTER_ENDPOINT dbname=postgres user=admin sslmode=verify-full sslrootcert=system" \
  -f batch_test_setup.sql
```

Set environment variables for your cluster:
```shell
# e.g. "admin"
export CLUSTER_USER="admin"
# e.g. "foo0bar1baz2quux3quuux4.dsql.us-east-1.on.aws"
export CLUSTER_ENDPOINT="<your-cluster-endpoint>"
```

Build and run:
```shell
./gradlew run --args="--endpoint $CLUSTER_ENDPOINT --user $CLUSTER_USER"
```

| Option | Default | Description |
|---|---|---|
| `--endpoint` | (required) | Aurora DSQL cluster endpoint |
| `--user` | `admin` | Database user |
| `--batch-size` | `1000` | Rows per batch transaction (must be < 3000) |
| `--num-workers` | `4` | Number of parallel worker threads |
After running the demo, drop the test table to avoid unnecessary storage:
```shell
# Generate a fresh token if the previous one expired
export PGPASSWORD=$(aws dsql generate-db-connect-admin-auth-token \
  --hostname $CLUSTER_ENDPOINT \
  --region $CLUSTER_REGION \
  --expires-in 3600)

psql "host=$CLUSTER_ENDPOINT dbname=postgres user=admin sslmode=verify-full sslrootcert=system" \
  -c "DROP TABLE IF EXISTS batch_test;"
```

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. SPDX-License-Identifier: MIT-0