Skip to content

feat(uploads): direct-to-storage presigned uploads (single + multipart)#74

Merged
zfarrell merged 6 commits into
mainfrom
feat/presigned-upload
Jun 26, 2026
Merged

feat(uploads): direct-to-storage presigned uploads (single + multipart)#74
zfarrell merged 6 commits into
mainfrom
feat/presigned-upload

Conversation

@zfarrell

Copy link
Copy Markdown
Contributor

Adds ergonomic direct-to-storage uploads: Client::upload_file opens an upload session, PUTs bytes straight to object storage via server-minted presigned URLs (single PUT for small files, bounded-concurrency multipart for large), then finalizes — zero S3/AWS deps and no legacy /v1/files path. Configurable max_concurrency (default 10) with a memory-budget-bounded in-flight cap, automatic part-size scaling, header-isolated storage PUTs, and integrity verified at finalize; validated end-to-end against prod for both single-PUT and multipart (148 tests).

Comment thread tests/presigned_uploads.rs Outdated
Comment on lines +1181 to +1192
fn respond(&self, _req: &Request) -> ResponseTemplate {
let now = self.active.fetch_add(1, Ordering::SeqCst) + 1;
self.peak.fetch_max(now, Ordering::SeqCst);
// We cannot decrement after the delay from here, so model "in-flight"
// as entries during the delay window: the delay keeps responses pending
// long enough that all admitted tasks overlap, and admission is bounded
// by the cap. Decrement immediately is fine because peak is a max.
self.active.fetch_sub(1, Ordering::SeqCst);
ResponseTemplate::new(200)
.insert_header("ETag", "\"c\"")
.set_delay(self.delay)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (not blocking) This concurrency tracker can't actually detect an unbounded run. active is incremented and decremented synchronously inside respond, before set_delay takes effect — so the increment never persists across the 50ms delay window. Two respond() calls would have to execute their three atomics in literally overlapping nanoseconds to record peak >= 2. In practice peak reads ~1 regardless of the real in-flight count, so the peak <= 2 assertion would pass even if the JoinSet bound were removed and all 6 parts ran at once. The test gives false confidence in the very property it's named for.

To genuinely measure overlap, keep active incremented for the duration of the delay (e.g. have the responder block/sleep itself while holding the count, or model in-flight with an async responder that decrements after the delay) rather than decrementing immediately.

Comment thread src/uploads.rs
Comment on lines +315 to +322
let create = models::CreateUploadRequest {
declared_size_bytes,
content_type: opts.content_type.clone().map(Some),
content_encoding: opts.content_encoding.clone().map(Some),
filename: filename.map(Some),
part_size: Some(Some(part_size_hint_i64)),
..models::CreateUploadRequest::new(declared_size_bytes)
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super nit: (not blocking) declared_size_bytes is set explicitly and supplied via ..CreateUploadRequest::new(declared_size_bytes), so the field is written twice. Harmless, but the struct-update base is doing nothing the explicit fields don't already cover except defaulting the rest — slightly clearer to either drop the explicit declared_size_bytes line or document why both are present.

claude[bot]
claude Bot previously approved these changes Jun 26, 2026

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the presigned-upload module, wiring, and test suite. Orchestration, storage-PUT header isolation, finalize-exactly-once, retryable part PUTs, memory-budgeted concurrency, and size-overflow handling all look correct and are well documented. Two non-blocking notes left inline. LGTM.

@claude claude Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both prior nits are resolved in the code:

  • The concurrency test now uses a raw-TCP storage server that genuinely holds in-flight PUTs across a hold duration and asserts peak reaches and is bounded by the cap — it measures real overlap rather than synchronously-decremented counters.
  • The duplicate declared_size_bytes write is gone; the value now comes solely from CreateUploadRequest::new(..).

The new module is well-structured and thoroughly tested: bounded-concurrency multipart with positioned per-part reads, retry-safe buffered part PUTs, single-shot finalize, strict storage-PUT header isolation, and complete error/edge-case coverage (overflow, malformed sessions, part-URL count mismatch, missing ETag, storage non-2xx). No blocking issues.

@zfarrell zfarrell merged commit f8e6bf9 into main Jun 26, 2026
5 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant