fix(storage): ServerCredentialStore orphan-reason discriminator + metadata-first delete#541
Open
sroussey wants to merge 2 commits into
Open
fix(storage): ServerCredentialStore orphan-reason discriminator + metadata-first delete#541sroussey wants to merge 2 commits into
sroussey wants to merge 2 commits into
Conversation
…in ServerCredentialStore
- Extend CredentialMetadataRow with optional orphanReason discriminator
("vault-write-failed" | "metadata-commit-failed" | "vault-delete-failed")
and export OrphanReason type alias.
- Tag orphan markers in both put() failure branches with the appropriate
reason so operators can distinguish vault-write rollback failures from
commit-step failures.
- Rewrite delete() to remove metadata BEFORE the vault, so a vault delete
failure can't leave a row visible-but-secret-gone. If vault delete fails
after metadata is removed, a best-effort sticky orphan marker
(orphanReason: "vault-delete-failed") is written and a wrapped error is
thrown.
- Rewrite deleteAll() to use the same per-id path and throw AggregateError
when one or more ids fail, leaving orphan markers for the failing ids
while still completing the rest.
- Wrap self-eviction calls in get()/has() in try/catch to preserve the
prior swallow-on-eviction behaviour despite delete() now being throwy.
…lete
- Assert orphanReason on both put() orphan-marker branches
("vault-write-failed" and "metadata-commit-failed").
- Add delete() coverage: vault-failure leaves a sticky orphan marker with
orphanReason "vault-delete-failed" while the row stays invisible to
readers, error wraps the vault cause.
- Add delete() coverage: metadata-failure propagates without touching the
vault (vault.deleteSecret spy must not be called).
- Add deleteAll() coverage: one failing id yields an AggregateError but
other ids still complete and the failing id keeps an orphan marker.
- Add legacy compatibility test: an orphan row without orphanReason
(pre-discriminator persisted state) remains invisible to readers.
- Update the existing "delete removes both" test to assert metadata is
deleted BEFORE the vault via a spy call-order check.
Coverage Report
File CoverageNo changed files found. |
sroussey
added a commit
that referenced
this pull request
Jun 2, 2026
…es, orphan reasons)
Three classes of issue addressed:
1. Key-injection scope escape.
userId, projectId, and every credential key are now matched against a
strict SAFE_SEGMENT grammar (^[A-Za-z0-9_-]{1,128}$). Without this, a key
like "../../other-user/other-project/leaked" could be smuggled through
vaultId() and collide with another scope's vault id. Failures are surfaced
as TypeError without echoing the invalid value back.
2. Delete races and orphan-overwrite.
- delete() is refactored onto an internal deleteById() helper that
deletes metadata FIRST, then vault. A throw on metadata.delete leaves
the entry fully intact and readable; a throw on vault.deleteSecret
reinstates a pending orphan marker (with orphanReason
"vault-delete-failed") so readers see absence and operators can
reconcile, AND surfaces a wrapped error whose message distinguishes
"orphan marker persisted" from "orphan marker also failed to persist".
- deleteById skips rows whose metadata is pending: true. This closes
both the orphan-overwrite path (delete on an existing orphan marker
used to nuke the vault entry it was tracking) and the concurrent
put(new)+delete race (delete during the pending window of a new-entry
put would wipe the vault entry mid-write).
- get()/has() wrap expiry self-eviction in try/catch so a transient
KV/vault failure during eviction does not surface as a thrown read.
- deleteAll() collects per-row failures and throws AggregateError so a
single bad row no longer silently masks deletion of the rest.
3. Orphan reason discriminator.
CredentialMetadataRow gains an optional orphanReason field
("vault-write-failed" | "metadata-commit-failed" | "vault-delete-failed")
and a corresponding exported OrphanReason type. Both put() orphan
paths and the new delete() orphan path stamp the reason so operator
tooling can pick a remediation strategy without re-running the original
write. Legacy rows without the discriminator stay invisible to readers
(covered by an explicit test).
14 new tests added; existing delete-removes-both test upgraded to assert
metadata-first ordering via spy call-order; existing put orphan tests
upgraded to assert orphanReason.
Supersedes #541.
https://claude.ai/code/session_01Wws8oZpB5imjKL2e7DRXtc
3 tasks
sroussey
added a commit
that referenced
this pull request
Jun 3, 2026
…es, orphan reasons)
Three classes of issue addressed:
1. Key-injection scope escape.
userId, projectId, and every credential key are now matched against a
strict SAFE_SEGMENT grammar (^[A-Za-z0-9_-]{1,128}$). Without this, a key
like "../../other-user/other-project/leaked" could be smuggled through
vaultId() and collide with another scope's vault id. Failures are surfaced
as TypeError without echoing the invalid value back.
2. Delete races and orphan-overwrite.
- delete() is refactored onto an internal deleteById() helper that
deletes metadata FIRST, then vault. A throw on metadata.delete leaves
the entry fully intact and readable; a throw on vault.deleteSecret
reinstates a pending orphan marker (with orphanReason
"vault-delete-failed") so readers see absence and operators can
reconcile, AND surfaces a wrapped error whose message distinguishes
"orphan marker persisted" from "orphan marker also failed to persist".
- deleteById skips rows whose metadata is pending: true. This closes
both the orphan-overwrite path (delete on an existing orphan marker
used to nuke the vault entry it was tracking) and the concurrent
put(new)+delete race (delete during the pending window of a new-entry
put would wipe the vault entry mid-write).
- get()/has() wrap expiry self-eviction in try/catch so a transient
KV/vault failure during eviction does not surface as a thrown read.
- deleteAll() collects per-row failures and throws AggregateError so a
single bad row no longer silently masks deletion of the rest.
3. Orphan reason discriminator.
CredentialMetadataRow gains an optional orphanReason field
("vault-write-failed" | "metadata-commit-failed" | "vault-delete-failed")
and a corresponding exported OrphanReason type. Both put() orphan
paths and the new delete() orphan path stamp the reason so operator
tooling can pick a remediation strategy without re-running the original
write. Legacy rows without the discriminator stay invisible to readers
(covered by an explicit test).
14 new tests added; existing delete-removes-both test upgraded to assert
metadata-first ordering via spy call-order; existing put orphan tests
upgraded to assert orphanReason.
Supersedes #541.
https://claude.ai/code/session_01Wws8oZpB5imjKL2e7DRXtc
sroussey
added a commit
that referenced
this pull request
Jun 4, 2026
…ore (sec) (#544) * fix(storage): harden ServerCredentialStore (key injection, delete races, orphan reasons) Three classes of issue addressed: 1. Key-injection scope escape. userId, projectId, and every credential key are now matched against a strict SAFE_SEGMENT grammar (^[A-Za-z0-9_-]{1,128}$). Without this, a key like "../../other-user/other-project/leaked" could be smuggled through vaultId() and collide with another scope's vault id. Failures are surfaced as TypeError without echoing the invalid value back. 2. Delete races and orphan-overwrite. - delete() is refactored onto an internal deleteById() helper that deletes metadata FIRST, then vault. A throw on metadata.delete leaves the entry fully intact and readable; a throw on vault.deleteSecret reinstates a pending orphan marker (with orphanReason "vault-delete-failed") so readers see absence and operators can reconcile, AND surfaces a wrapped error whose message distinguishes "orphan marker persisted" from "orphan marker also failed to persist". - deleteById skips rows whose metadata is pending: true. This closes both the orphan-overwrite path (delete on an existing orphan marker used to nuke the vault entry it was tracking) and the concurrent put(new)+delete race (delete during the pending window of a new-entry put would wipe the vault entry mid-write). - get()/has() wrap expiry self-eviction in try/catch so a transient KV/vault failure during eviction does not surface as a thrown read. - deleteAll() collects per-row failures and throws AggregateError so a single bad row no longer silently masks deletion of the rest. 3. Orphan reason discriminator. CredentialMetadataRow gains an optional orphanReason field ("vault-write-failed" | "metadata-commit-failed" | "vault-delete-failed") and a corresponding exported OrphanReason type. Both put() orphan paths and the new delete() orphan path stamp the reason so operator tooling can pick a remediation strategy without re-running the original write. Legacy rows without the discriminator stay invisible to readers (covered by an explicit test). 14 new tests added; existing delete-removes-both test upgraded to assert metadata-first ordering via spy call-order; existing put orphan tests upgraded to assert orphanReason. Supersedes #541. https://claude.ai/code/session_01Wws8oZpB5imjKL2e7DRXtc * fix(storage): restore ICredentialStore contract on ServerCredentialStore (sec) Follow-up to #543. Two contract regressions discovered in code review: H1: deleteAll() silently skipped pending/orphan rows because it now delegates to deleteById (added in #543) which short-circuits when pending=true. Result: no public API cleared orphan markers; a "wipe scope" call silently leaked the very rows operators need to clean up. Adds forceDeleteById that bypasses the pending-skip; deleteAll uses it. deleteById (single-key) unchanged. H2: get/has/delete now threw TypeError for invalid keys because they all routed through vaultId(). ICredentialStore documents these as returning undefined/false on missing keys; throwing breaks substitutability with InMemoryCredentialStore / EncryptedKvCredentialStore. Adds isSafeKey() predicate; readers short-circuit; put() still throws via vaultId (only path that persists a colliding id). M1: SAFE_SEGMENT widened from [A-Za-z0-9_-] to [A-Za-z0-9._:-] so natural credential names like "openai.prod" and "scope:billing" are accepted. Still rejects path separators and whitespace. 5 new tests + 1 rewritten test. Existing tests pass. * docs(storage): clarify deleteById JSDoc + fix deleteByid typo (review) Copilot review feedback on #544: - deleteById's JSDoc claimed both delete() and deleteAll() share its code path, but deleteAll() deliberately routes through forceDeleteById() to bypass the pending-row short-circuit and reach sticky orphan markers. Updated to call out the asymmetry with a forward reference to forceDeleteById. - Typo: 'deleteByid' -> 'deleteById' so the @link symbol resolves. * refactor(storage): extract commitDelete to dedupe ServerCredentialStore delete paths `deleteById` and `forceDeleteById` had nearly-identical bodies after the pending-row check — the metadata-first ordering, orphan-marker rewrite, and wrapped-error throw were duplicated, so a fix in one could silently drift from the other. Lift the shared tail into `commitDelete(id, key, existing, op)` so both callers route through one source of truth and the only meaningful divergence (pending-row policy) stays at the call site. Also corrects the stale `vaultId` JSDoc that still listed the reader methods as callers — they were moved to `isSafeKey` in the same PR for the ICredentialStore-contract substitutability fix. https://claude.ai/code/session_011KMd9sERp2rguyekAi8a3u --------- Co-authored-by: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two high-priority correctness fixes for
ServerCredentialStoreinpackages/storage/src/credentials/.Fix 1 —
orphanReasondiscriminator on orphan markersCredentialMetadataRow.orphanedAtpreviously left operators guessing which failure branch produced a sticky orphan marker. This change adds an optionalorphanReasonfield with one of three string-literal values:"vault-write-failed"—put()new-entry rollback path: metadata row written,vault.setSecretthrew, and the follow-upmetadata.deletealso threw, so the pending row was rewritten as a sticky marker."metadata-commit-failed"—put()new-entry commit path:vault.setSecretsucceeded but thepending: true -> falseflip threw, so the vault holds bytes the metadata can no longer expose."vault-delete-failed"—delete()path (introduced by Fix 2 below):metadata.deletesucceeded butvault.deleteSecretthrew, so the vault may still hold bytes for a key no metadata row exposes.A
OrphanReasontype alias is exported (NonNullable<CredentialMetadataRow["orphanReason"]>) for downstream consumers (reconciliation tooling, dashboards). The JSDoc onorphanedAtnow points atorphanReasonfor branch semantics.Fix 2 —
delete()is metadata-first;deleteAll()aggregates per-id failuresPreviously,
delete()calledvault.deleteSecretfirst andmetadata.deletesecond. If the vault call threw, the row remained visible to readers but its secret was already gone — a torn state. The new ordering is:falseif absent.await this.metadata.delete(id)— failure bubbles, vault untouched.await this.vault.deleteSecret(id)— failure triggers a best-effort orphan marker (orphanReason: "vault-delete-failed") and throws a wrappedErrorwhosecauseis the vault error.delete()itself is now a thin wrapper around a privatedeleteById(id, key), anddeleteAll()iteratesscopedIds()calling the same path. Per-id errors are collected and surfaced as anAggregateErroronce every id has been attempted, so one bad row no longer aborts cleanup of the rest. The failing ids retain sticky orphan markers for operator reconciliation; succeeding ids are fully removed.get()andhas()self-eviction calls (await this.delete(key)on expiry) are now wrapped intry { ... } catch {}so the new throwydelete()cannot turn an expired-read miss into a thrown error for callers — preserving the prior swallow-on-eviction behaviour.Backward compatibility
orphanReasonis optional. Legacy persisted rows withorphanedAtbut noorphanReasoncontinue to be treated as orphan markers (they arepending: true, so they were already invisible to readers). A regression test covers this case.deleteAll()now throwsAggregateErroron per-id failure instead of throwing the first vault error and aborting the rest of the cleanup. Callers that previously onlyawait-eddeleteAll()continue to see a throw on failure; callers that depended on early-abort semantics will need to update.delete()now throws a wrappedError(withcause) whenvault.deleteSecretfails after metadata removal, instead of leaving a visible-but-empty row. The previous behaviour was a silent torn state; callers must now be prepared to handle this throw.Files modified
packages/storage/src/credentials/ServerCredentialStore.ts— orphanReason field + OrphanReason export, both put() branches tagged, new privatedeleteById(), rewrittendelete()/deleteAll(), swallow-wrapped self-eviction inget()/has().packages/storage/src/credentials/ServerCredentialStore.test.ts— extended two existing orphan-path tests to assertorphanReason, added four new tests (delete()vault-failure,delete()metadata-failure,deleteAll()AggregateError, legacy orphan compatibility), updated the existingdelete removes bothtest to verify metadata-before-vault ordering via spy call-order.Test plan
bun test packages/storage/src/credentials/ServerCredentialStore.test.tspasses locallytsc --noEmit) clean across the packageCredentialMetadataRowfor compile compatibility with the new optional fielddeleteAll()throwing the first vault error rather thanAggregateErrorGenerated by Claude Code