serve: serialize credential_process with per-profile flock #31

Merged
jgowdy-godaddy merged 1 commit into main from fix/serve-refresh-flock
Apr 17, 2026

@jgowdy-godaddy

Summary

Two AWS CLI invocations firing `credential_process = awsenc serve` in parallel on the same profile, with the cache in Refresh or Expired state, would each independently run the transparent-reauth chain (Okta-session decrypt → SAML → STS AssumeRoleWithSAML) or fall back to printing stale cached creds. Both writers then raced the cache update. Result: duplicate STS traffic, possible Okta rate-limit hit, and the losing writer's fresh credentials silently overwritten.

Take an exclusive advisory lock on `<cache>.lock` before reading the cache, hold it across all state branches (Fresh / Refresh / Expired), and release on drop. A second caller blocks at `lock_exclusive`, waits for the first to finish, then re-reads the now-refreshed cache and prints its credentials. The result is a single STS call, a single prompt chain, and a consistent `session_start`.

Uses `fs4` for cross-platform `flock` / `LockFileEx`. The lock file is a zero-byte sidecar that stays on disk; crash recovery is the OS's job (inode lock released on handle close).
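
For readers who want to see the shape of the pattern, here is a minimal sketch of a lock guard that blocks on a `<cache>.lock` sidecar and releases on drop. It is not the code in this PR: the PR uses `fs4`, while this sketch declares POSIX `flock(2)` directly so it has no dependencies, which makes it Unix-only. The names `lock_path_for`, `ProfileLock`, and `with_profile_lock` are hypothetical.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;
use std::path::{Path, PathBuf};

// POSIX flock(2), declared by hand so the sketch needs no crates.
// Assumption: the PR gets the same behavior (plus Windows) from fs4.
unsafe extern "C" {
    fn flock(fd: c_int, operation: c_int) -> c_int;
}
const LOCK_EX: c_int = 2; // exclusive lock (value on Linux and macOS)
const LOCK_UN: c_int = 8; // unlock

/// Hypothetical helper: the sidecar path `<cache>.lock`.
fn lock_path_for(cache: &Path) -> PathBuf {
    let mut s = cache.as_os_str().to_owned();
    s.push(".lock");
    PathBuf::from(s)
}

/// Holds the exclusive lock for as long as the value is alive.
struct ProfileLock {
    file: File,
}

impl ProfileLock {
    fn acquire(cache: &Path) -> io::Result<Self> {
        // The zero-byte sidecar is created once and left on disk.
        let file = OpenOptions::new()
            .create(true)
            .write(true)
            .truncate(false)
            .open(lock_path_for(cache))?;
        // Blocks until no other open handle holds the lock.
        if unsafe { flock(file.as_raw_fd(), LOCK_EX) } != 0 {
            return Err(io::Error::last_os_error());
        }
        Ok(ProfileLock { file })
    }
}

impl Drop for ProfileLock {
    fn drop(&mut self) {
        // Explicit unlock; closing the handle would release it as well.
        unsafe { flock(self.file.as_raw_fd(), LOCK_UN) };
    }
}

/// Runs `work` under the lock. `work` should read the cache itself, so a
/// caller that had to wait sees whatever the previous holder wrote.
fn with_profile_lock<T>(
    cache: &Path,
    work: impl FnOnce() -> io::Result<T>,
) -> io::Result<T> {
    let _guard = ProfileLock::acquire(cache)?;
    work()
}
```

A caller would wrap the whole read, classify, refresh, and print sequence in `with_profile_lock`, so the cache read happens only after the lock is held. If the process crashes, the OS closes the handle and the lock goes with it, which is why the sidecar file never needs cleanup.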

Test plan

  • `cargo test --workspace` — all 7 test groups pass (272 tests).
  • `cargo clippy --workspace --all-targets -- -D warnings` clean.
  • `cargo fmt --all -- --check` clean.
  • CI green on macOS / Linux / Windows.

Two AWS CLI invocations firing `credential_process = awsenc serve`
in parallel on the same profile, with the cache in Refresh or
Expired state, would each independently run the transparent-reauth
chain (Okta-session decrypt → SAML → STS AssumeRoleWithSAML) or
fall back to printing stale cached creds. Both writers then raced
the cache update. Duplicate STS traffic, possible Okta rate-limit
hit, and the losing writer's fresh creds silently overwritten.

Take an exclusive advisory lock on `<cache>.lock` before reading
the cache, hold it across all state branches (Fresh / Refresh /
Expired), and release on drop. A second caller blocks at
lock_exclusive, waits for the first to finish, then re-reads the
now-refreshed cache and prints its credentials — single STS call,
single prompt chain, consistent session_start.

Uses fs4 for cross-platform flock / LockFileEx. The lock file is
a zero-byte sidecar that stays on disk; crash recovery is the OS's
job (inode lock released on handle close).

@jgowdy-godaddy merged commit da9ff05 into main on Apr 17, 2026
3 checks passed
jgowdy-godaddy pushed a commit that referenced this pull request Apr 17, 2026
main's #31 landed a per-profile serve flock that overlaps my own flock
commit. Their implementation is nearly identical; I take theirs
(serve_lock_path_for_cache + matching ServeLock) and drop the duplicate
serve_lock_path + ServeLock definitions from my branch. Call sites use
their slightly more tolerant cache::cache_path().map().unwrap_or_else()
fallback.
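
The fallback mentioned above might look roughly like the sketch below. Only the name serve_lock_path_for_cache and the .map().unwrap_or_else() chain come from this commit message; the signatures, the stand-in cache_path, the ".cache" file name, and the temp-dir fallback location are all assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

// Hypothetical stand-in for cache::cache_path: returns None when no cache
// directory can be resolved. The real function may return a Result instead;
// the same map / unwrap_or_else chain works on both.
fn cache_path(cache_dir: Option<&Path>, profile: &str) -> Option<PathBuf> {
    cache_dir.map(|dir| dir.join(format!("{profile}.cache")))
}

// Assumed shape: append ".lock" to the cache file path.
fn serve_lock_path_for_cache(cache: &Path) -> PathBuf {
    let mut s = cache.as_os_str().to_owned();
    s.push(".lock");
    PathBuf::from(s)
}

// Tolerant call site: if the cache path cannot be resolved, fall back to a
// per-profile lock file elsewhere (here the temp dir, an assumption) rather
// than failing before the lock can be taken.
fn serve_lock_path(cache_dir: Option<&Path>, profile: &str) -> PathBuf {
    cache_path(cache_dir, profile)
        .map(|cache| serve_lock_path_for_cache(&cache))
        .unwrap_or_else(|| std::env::temp_dir().join(format!("awsenc-{profile}.lock")))
}
```

The point of the fallback is that serialization still happens per profile even when the normal cache location is unavailable.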

Our remaining unique changes on top of main:
- main.rs dispatch: validate profile arg before create_storage (fail-fast
  on serve/exec without --profile, cleaner error message, avoids
  popping Keychain prompts on malformed invocations)
- cache.rs: APL1 envelope wrap_for_encrypt / unwrap_after_decrypt +
  counter sidecar helpers
- auth.rs: header-first then wrap-each-secret ordering, counter
  persistence after write
- serve.rs: decrypt_aws_credentials_with_envelope used by the Fresh +
  Refresh-fallback paths; transparent-reauth re-wraps under the new
  header
- exec.rs: unwrap_after_decrypt in get_cached_credentials