Aurora DSQL enforces a 3,000-row mutation limit per transaction to enable efficient coordination and scaling. When working with large DELETE or UPDATE operations, batching is essential. These code samples demonstrate how to handle this across Python, Java, and JavaScript.
What the samples cover:
Two patterns, each implemented in all three languages:
- Sequential batch processing — single-threaded loop that processes rows in batches of N (default 1,000), committing each batch as a separate transaction
- Parallel batch processing — multiple worker threads partition rows using `abs(hashtext(id::text)) % num_workers` and process their partitions concurrently, each running its own batch loop
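To make the partitioning concrete, here is a minimal Python sketch of how a worker's partition predicate could be built. The function name `partition_predicate` is illustrative, not the name used in the samples:

```python
def partition_predicate(worker_id: int, num_workers: int) -> str:
    """Build the WHERE clause fragment that assigns each row to exactly
    one worker, using PostgreSQL's hashtext() on the row's id."""
    if not 0 <= worker_id < num_workers:
        raise ValueError("worker_id must be in [0, num_workers)")
    # abs() guards against hashtext() returning negative values.
    return f"abs(hashtext(id::text)) % {num_workers} = {worker_id}"

# Every row hashes into exactly one of the num_workers partitions,
# so workers never contend for the same rows.
print(partition_predicate(2, 4))
# → abs(hashtext(id::text)) % 4 = 2
```

Because the predicate is deterministic per row, workers avoid overlapping row sets without any coordination beyond agreeing on `num_workers`.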
Both patterns include:
- OCC retry logic with exponential backoff (SQLSTATE 40001 handling)
- Connection pooling using the DSQL connectors (auto IAM token refresh)
- Subquery-based DELETE (`DELETE ... WHERE id IN (SELECT id ... LIMIT N)`) since PostgreSQL doesn't support `DELETE ... LIMIT`
- Subquery-based UPDATE with updated_at tracking to prevent reprocessing
- Table repopulation between DELETE and UPDATE demos
- CLI entry point with configurable batch size and worker count
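As a rough sketch of how the retry and batching pieces fit together in the Python (psycopg2) variant — helper names like `with_occ_retry` and `run_batch_delete` are mine, and connection handling is simplified compared with the pooled samples:

```python
import random
import time

OCC_SQLSTATE = "40001"  # serialization_failure, raised on OCC conflicts

def with_occ_retry(fn, max_retries=5, base_delay=0.05):
    """Run fn(), retrying with exponential backoff plus jitter whenever
    it raises an error whose SQLSTATE is 40001 (psycopg2 exposes this
    as the exception's .pgcode attribute)."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            if getattr(exc, "pgcode", None) != OCC_SQLSTATE or attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

def delete_batch(conn, table, batch_size=1000):
    """Delete one batch via a subquery (PostgreSQL has no DELETE ... LIMIT),
    committing it as its own transaction so each transaction stays well
    under DSQL's 3,000-row mutation limit. Returns rows deleted."""
    def attempt():
        with conn.cursor() as cur:
            cur.execute(
                f"DELETE FROM {table} WHERE id IN "
                f"(SELECT id FROM {table} LIMIT %s)",
                (batch_size,),
            )
            deleted = cur.rowcount
        conn.commit()
        return deleted
    return with_occ_retry(attempt)

def run_batch_delete(conn, table, batch_size=1000):
    """Loop until the table is empty; the last batch is smaller than
    batch_size (possibly zero rows)."""
    total = 0
    while True:
        deleted = delete_batch(conn, table, batch_size)
        total += deleted
        if deleted < batch_size:
            return total
```

Committing per batch is what keeps each transaction under the mutation limit; the retry wrapper covers the OCC conflicts that per-batch commits can surface under concurrency.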
Languages and drivers:
| Language | Driver | Connector |
|---|---|---|
| Python | psycopg2 | aurora-dsql-python-connector (`AuroraDSQLThreadedConnectionPool`) |
| Java | pgJDBC | AWS JDBC Wrapper (`AwsWrapperDataSource`) |
| JavaScript | node-postgres | @aws/aurora-dsql-node-postgres-connector |
Files per language:
- OCC retry module (reusable)
- Sequential batch DELETE
- Sequential batch UPDATE
- Parallel batch DELETE (hashtext partitioning)
- Parallel batch UPDATE (hashtext partitioning)
- Repopulate helper (restores test data between demos)
- Main entry point with CLI args
- README with prerequisites, setup, and usage instructions
Shared:
- SQL setup script (creates `batch_test` table, inserts 5,000 rows in batches of 1,000)
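The setup insert itself has to respect the mutation limit, so it is split into 1,000-row slices. A sketch of how those statements might be generated with `generate_series` (the column names `id` and `payload` here are assumptions, not necessarily the script's schema):

```python
def setup_insert_statements(total_rows=5000, batch_size=1000):
    """Yield INSERT statements that each add at most batch_size rows,
    keeping every statement under DSQL's 3,000-row mutation limit."""
    for start in range(1, total_rows + 1, batch_size):
        end = min(start + batch_size - 1, total_rows)
        yield (
            "INSERT INTO batch_test (id, payload) "
            f"SELECT n, 'row-' || n FROM generate_series({start}, {end}) AS n;"
        )

stmts = list(setup_insert_statements())
# 5 statements, each inserting 1,000 rows
```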
Proposed structure:
I have this organized as a batch-operations/ top-level directory following the existing repo conventions (language/driver/src/). Let me know if these should be integrated differently — for example, as additional files within the existing python/psycopg2/, java/pgjdbc/, and javascript/node-postgres/ directories.
batch-operations/
├── README.md
├── sql/batch_test_setup.sql
├── python/psycopg2/
│ ├── README.md
│ ├── requirements.txt
│ └── src/
├── javascript/node-postgres/
│ ├── README.md
│ ├── package.json
│ └── src/
└── java/pgjdbc/
├── README.md
├── build.gradle
└── src/main/java/...
The parallel pattern is inspired by the re:Invent DAT401 parallel worker approach.