
Add batch DELETE/UPDATE samples for datasets exceeding 3k row limit#698

Open
rmconstantin wants to merge 8 commits into aws-samples:main from rmconstantin:batch-operations

Conversation

@rmconstantin (Contributor)

Demonstrates sequential and parallel batch processing patterns for Aurora DSQL with OCC retry logic and recommended connection management. Includes Python (psycopg2), Java (pgJDBC), and Node.js (node-postgres) implementations.
Fixes #693.
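The OCC retry logic the description mentions can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `OCCConflictError` stands in for the driver's serialization-failure exception (SQLSTATE 40001 in psycopg2), and the `with_occ_retry` / `base_delay_ms` names are hypothetical.

```python
import random
import time


class OCCConflictError(Exception):
    """Stand-in for a serialization failure (SQLSTATE 40001) raised by the driver."""


def with_occ_retry(operation, max_retries=3, base_delay_ms=100):
    """Run `operation`, retrying OCC conflicts with exponential backoff and jitter.

    `operation` is a zero-argument callable that performs one transaction
    (begin, execute, commit) and returns its result.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except OCCConflictError:
            if attempt == max_retries:
                raise  # out of retries: surface the conflict to the caller
            # Exponential backoff with jitter: base * 2^attempt, scaled by 0.5-1.0.
            delay_ms = base_delay_ms * (2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay_ms / 1000)
```

In the real samples, the caught exception would be the driver's own serialization-failure type, and the operation would roll back before retrying.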

By submitting this pull request, I confirm that my contribution is made under
the terms of the MIT-0 license.

Thank you for your contribution!

Contributor:

Can you add the pycache path to gitignore?
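For reference, a typical ignore entry for Python bytecode caches looks like this (illustrative; the exact patterns depend on the repo's existing .gitignore conventions):

```
# Python bytecode caches
__pycache__/
*.py[cod]
```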

while (true) {
try (Connection conn = pool.getConnection()) {
conn.setAutoCommit(false);
String sql = "UPDATE " + table + " SET " + setClause + ", updated_at = NOW()"
Contributor:

How does this ensure progress over all of the source items?
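The concern here is that a bounded retry around a single batch can give up before every matching row is handled. One common answer, roughly the pattern the author adopts later in this thread, is an outer loop that keeps issuing batches until one reports zero affected rows. A minimal sketch, where `run_batch` is a hypothetical callable:

```python
def drain_in_batches(run_batch):
    """Repeatedly run a bounded batched DELETE/UPDATE until a batch
    affects zero rows, so every matching source row is eventually
    processed.

    `run_batch` executes one bounded batch (one transaction, with its
    own OCC retries) and returns the number of rows it affected.
    """
    total = 0
    while True:
        affected = run_batch()
        if affected == 0:
            return total  # nothing left matching: all rows processed
        total += affected
```

Because each batch commits independently, a crash partway through leaves the data in a consistent partially-processed state, and the loop can simply be restarted.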

* gradle run --args="--endpoint <cluster-endpoint> [--user admin]
* [--batch-size 1000] [--num-workers 4]"
*/
public class Main {
Contributor:

Could you add an integ test that runs these batch ops?

);

-- Create an asynchronous index on the category column.
-- Aurora DSQL requires CREATE INDEX ASYNC for tables with existing rows.
Contributor:

For all tables, maybe delete this comment

Comment thread on batch-operations/README.md (Outdated)
@@ -0,0 +1,52 @@
# Aurora DSQL Batch Operations
Contributor:

I think we might be better organizing these examples under the specific language/driver pairing instead of having it as a top level dir.

Can we also add integ tests for each example? There should be patterns for how to do that in each language

* @param connection a JDBC connection (autoCommit should be false)
* @param operation the database operation to execute
* @param maxRetries maximum retry attempts (default 3)
* @param baseDelay base delay in seconds for backoff (default 0.1)
Contributor:

Nit: can we make baseDelay milliseconds instead?

*/
public class Repopulate {

private static final String INSERT_SQL =
Contributor:

What's going on with the repopulate fn vs the batch setup script?

@rmconstantin (Author)

Updated the code to address all comments.

  • Batch operations are now in standalone directories under each language (java/batch_operations/, javascript/batch_operations/, python/batch_operations/).
  • baseDelay is now base_delay_ms.
  • Initial table+index setup script comments updated.
  • Got rid of the Repopulate fn.
  • Integ tests added for each language.
  • Added an outer retry to make sure all rows are processed (keep batching until done, and if OCC conflicts persist on a single batch, get a fresh connection and try again).

Ready for another look.

@rmconstantin rmconstantin requested a review from Benjscho March 18, 2026 17:31
Demonstrates sequential and parallel batch processing patterns for Aurora DSQL
with OCC retry logic and hashtext() partitioning. Includes Python (psycopg2),
Java (pgJDBC), and Node.js (node-postgres) implementations.
- Add SELECT COUNT(*) post-check after each batch loop to verify all
  matching rows were processed (sequential and parallel, all 3 languages)
- Update integration tests to seed data via psql -f batch_test_setup.sql
- Add connect_timeout to Python pool creation for IPv6 fallback
Contributor:

What's this jar file for? Should we be shipping it?

Contributor:

Should this and gradlew be checked in or gitignored?



Development

Successfully merging this pull request may close these issues.

Add batch DELETE/UPDATE code samples for large datasets

3 participants