2 changes: 1 addition & 1 deletion Cargo.lock


38 changes: 26 additions & 12 deletions PROTOCOL.md
@@ -50,11 +50,10 @@ sequenceDiagram
Prover->>Prover: Select a set of Open Challenges to answer
Prover->>Prover: Fetch corresponding PreparedFiles
Prover->>Prover: PorSystem.prove(files, challenges) → Proof
Note over Prover: Proof object internally contains challenge_ids
Note over Prover: Proof object carries compressed SNARK + ledger_root + aggregated_tree_depth

Prover-->>Verifier: Broadcast Proof to the network
Verifier->>Verifier: On receiving Proof, extract challenge_ids
Verifier->>Verifier: Fetch full Challenge objects from local database by ID
Verifier->>Verifier: On receiving Proof, look up the claimed Challenge set out-of-band
Verifier->>Verifier: PorSystem.verify(proof, challenges) → bool
alt is valid
Verifier->>Verifier: Mark covered Open challenges as Resolved
@@ -66,13 +65,15 @@ sequenceDiagram
## Ledger & Aggregation

- A `FileLedger` binds the set of files via an aggregated Merkle tree built over root commitments `rc = H(TAG_RC, root, depth)`.
- Files are ordered canonically by `file_id` (lexicographic, e.g., `BTreeMap` order). Public ledger indices refer to this canonical ordering.
- Files are committed at explicit stable append-only ledger indices. The `files` map is key-sorted for lookup, but the aggregated tree is built by `ledger_index`, not lexicographic `file_id` order.
- Multi-file proofs pin the ledger root as the aggregated root; single-file proofs pin the file root.
- The verifier provides public ledger indices; the circuit trusts these (no range checks in circuit).
- The verifier re-derives public ledger indices from its own ledger state and rejects statements whose derived indices do not fit the claimed aggregated tree depth.
- **Stateless verification.** A verifier needs only two facts about "the ledger" — the stable slot + `rc` of each challenged file, and whether the proof's `ledger_root` is accepted — so it does not need a live `FileLedger`. `verify_stateless` takes those facts as plain data (a `StatelessLedger`: a `file_id → (slot, rc)` registry snapshot plus the set of valid roots), letting a host keep the file registry in contract storage instead of an in-memory accumulator. `aggregate_root`/`aggregate_root_from_files` recompute a root from `(rc, slot)` (resp. `(root, depth, slot)`) pairs, byte-identical to `FileLedger::root`, so the registry holder can derive the current valid root itself.
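
The difference between key-sorted lookup and slot-ordered tree construction can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Entry`, `leaves_in_slot_order`, and `index_fits_depth` are hypothetical names, and the convention that a tree of depth `d` has `2^d` leaf positions is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical registry entry: a stable append-only slot plus a root commitment.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub ledger_index: usize,
    pub rc: [u8; 32],
}

/// Returns the root commitments in ascending `ledger_index` order, which is the
/// order an aggregated tree built by slot would consume them. The map itself
/// iterates in lexicographic `file_id` order, so an explicit sort is needed.
pub fn leaves_in_slot_order(files: &BTreeMap<String, Entry>) -> Vec<[u8; 32]> {
    let mut entries: Vec<&Entry> = files.values().collect();
    entries.sort_by_key(|e| e.ledger_index);
    entries.iter().map(|e| e.rc).collect()
}

/// Checks that a derived index fits a tree of the claimed depth, assuming a
/// depth-`d` tree has `2^d` leaf positions (an assumption of this sketch).
pub fn index_fits_depth(index: usize, depth: usize) -> bool {
    match u32::try_from(depth).ok().and_then(|d| 1usize.checked_shl(d)) {
        Some(capacity) => index < capacity,
        // The shift would overflow, so every `usize` index fits.
        None => true,
    }
}
```

With `file_b` in slot 0 and `file_a` in slot 1, the map iterates `file_a` first, while `leaves_in_slot_order` yields the commitment of `file_b` first.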

## Determinism & Canonical Ordering

- Any set of files is treated deterministically by sorting by `file_id` whenever an order is required.
- Challenge sets are ordered canonically by `(file_id, challenge_id)` when a proof statement is built.
- Ledger reconstruction is deterministic only when explicit stable indices are preserved.
- The map from `file_id` → `PreparedFile` is derived internally from `Vec<PreparedFile>` to avoid user ordering mistakes.
- Seeds differ per file; domain separation prevents cross-file bias (multi-batch aggregation supported).
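
The canonical challenge ordering described above can be sketched as a plain sort on the `(file_id, challenge_id)` pair. `ChallengeRef` and `canonicalize` are hypothetical names used only for illustration; the real `Challenge` type carries more fields.

```rust
/// Hypothetical, pared-down view of a challenge: only the two ordering keys.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChallengeRef {
    pub file_id: String,
    pub challenge_id: [u8; 32],
}

/// Sorts challenges by `file_id`, breaking ties by `challenge_id`, so that any
/// permutation of the same set yields the same statement order.
pub fn canonicalize(mut challenges: Vec<ChallengeRef>) -> Vec<ChallengeRef> {
    challenges.sort_by(|a, b| {
        a.file_id
            .cmp(&b.file_id)
            .then_with(|| a.challenge_id.cmp(&b.challenge_id))
    });
    challenges
}
```

Because the sort key is total over distinct challenges, the prover and verifier arrive at the same order regardless of how each side received the set.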

@@ -82,7 +83,9 @@ sequenceDiagram
// Public commitment to a file
pub struct FileMetadata {
pub root: FieldElement,
pub file_id: String, // hex(SHA256(data))
pub object_id: String, // obj_<SHA256(data)>
pub file_id: String, // file_<SHA256(domain || len(data) || data || len(nonce) || nonce)>
pub nonce: Vec<u8>, // upload-specific nonce used in file_id derivation
pub padded_len: usize, // Total Merkle leaves (power of 2)
pub original_size: usize, // Original file size in bytes
pub filename: String,
@@ -110,14 +113,25 @@ pub struct Challenge {
// Deterministic identity for a Challenge.
pub struct ChallengeID([u8; 32]);

// Final succinct proof that includes its coverage set.
// Final succinct proof with constant-size metadata.
pub struct Proof {
// existing CompressedSNARK payload
pub challenge_ids: Vec<ChallengeID>, // Exact ordered set of challenges covered
pub ledger_root: FieldElement,
pub aggregated_tree_depth: usize,
}

// Derivation for ChallengeID (using stable, cryptographic fields only)
challenge_id = H(TAG_CHALLENGE_ID, encode(block_height) || encode(seed) || encode(file_id) || encode(root) || encode(depth) || encode(num_challenges) || [encode(prover_id)])
challenge_id =
H(TAG_CHALLENGE_ID
|| encode(block_height)
|| encode(seed)
|| len(file_id) || file_id
|| len(nonce) || nonce
|| encode(padded_len)
|| encode(original_size)
|| encode(root)
|| encode(num_challenges)
|| len(prover_id) || prover_id)
```

## Proof Serialization
@@ -129,11 +143,11 @@ let bytes = proof.to_bytes()?;
let parsed = Proof::from_bytes(&bytes)?;
```

The format is versioned, includes the `challenge_ids` vector, and rejects any trailing data.
The format is versioned, rejects trailing data, and no longer serializes dynamic `challenge_ids` or `ledger_indices` vectors.

### Encoding Notes

- Use a network-canonical encoding with explicit versioning and magic bytes.
- Fixed-width, little-endian encodings for integers (e.g., `block_height: u64`).
- Field elements are encoded in a canonical 32-byte form.
- `challenge_ids` are serialized as a length-prefixed vector of 32-byte IDs.
- Challenge coverage is supplied out-of-band to the verifier and rebound through verifier-constructed public inputs.
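
The encoding rules above (magic bytes, explicit version, fixed-width little-endian integers, 32-byte field elements, no trailing data) can be illustrated with a small round-trip sketch. The magic value, version number, field layout, and all names here are hypothetical; the actual `Proof` wire format is not specified in this document.

```rust
const MAGIC: [u8; 4] = *b"DEMO"; // hypothetical magic bytes
const VERSION: u16 = 1; // hypothetical format version
const ENCODED_LEN: usize = 4 + 2 + 8 + 32;

#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    WrongLength,
    BadMagic,
    UnsupportedVersion(u16),
}

/// Hypothetical fixed-size record: a block height and one 32-byte element.
#[derive(Debug, PartialEq, Eq)]
pub struct Header {
    pub block_height: u64,
    pub root: [u8; 32],
}

pub fn encode(h: &Header) -> Vec<u8> {
    let mut out = Vec::with_capacity(ENCODED_LEN);
    out.extend_from_slice(&MAGIC);
    out.extend_from_slice(&VERSION.to_le_bytes());
    out.extend_from_slice(&h.block_height.to_le_bytes());
    out.extend_from_slice(&h.root);
    out
}

pub fn decode(bytes: &[u8]) -> Result<Header, DecodeError> {
    // An exact length check rejects both truncated input and trailing data.
    if bytes.len() != ENCODED_LEN {
        return Err(DecodeError::WrongLength);
    }
    if bytes[0..4] != MAGIC {
        return Err(DecodeError::BadMagic);
    }
    let version = u16::from_le_bytes([bytes[4], bytes[5]]);
    if version != VERSION {
        return Err(DecodeError::UnsupportedVersion(version));
    }
    let mut height = [0u8; 8];
    height.copy_from_slice(&bytes[6..14]);
    let mut root = [0u8; 32];
    root.copy_from_slice(&bytes[14..46]);
    Ok(Header { block_height: u64::from_le_bytes(height), root })
}
```

Since every field has a fixed width, the record has constant size, which is what makes the exact-length check sufficient here.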
46 changes: 29 additions & 17 deletions SECURITY.md
@@ -14,15 +14,17 @@ Kontor-Crypto implements Proof-of-Retrievability using recursive SNARKs. Storage

### 3. Public Input Construction (`src/api/plan.rs`, `src/api/prove.rs`, `src/api/verify.rs`)

`Plan::make_plan()` and `build_z0_primary()` bind proofs to specific files and ledger state. Public I/O layout in `src/config.rs::PublicIOLayout` (lines 46-181). Verify prover/verifier construct identical inputs, aggregated root derived from verifier's ledger only, canonical ordering, no prover-controlled values.
`Plan::make_plan()` and `build_z0_primary()` bind proofs to specific files and ledger state. The public I/O layout is defined in `src/config.rs::PublicIOLayout` (lines 46-181). Verify that the prover and verifier construct identical inputs, that the aggregated root is derived from the verifier's ledger only, that challenge ordering is canonical, and that ledger indices are verifier-derived rather than prover-controlled.

### 4. Ledger Root Pinning (`src/api/plan.rs` lines 46-52)

Single-file proofs use file root from metadata; multi-file uses `ledger.tree.root()` from verifier's ledger. Prover cannot influence which root is used.

### 5. Proof Verification (`src/api/verify.rs`)

`verify()` performs validation before SNARK verification. Check error vs invalid proof distinction (`Ok(false)` vs `Err`), no panics on malformed input, challenge ID binding.
`verify()` performs validation before SNARK verification. Check that errors are distinguished from invalid proofs (`Err` vs `Ok(false)`), that malformed input cannot cause panics, and that the challenge set is bound via verifier-constructed public inputs.

`verify()` (live `FileLedger`) and `verify_stateless()` (a `StatelessLedger`: a `file_id → (slot, rc)` registry snapshot + valid-root set) share `verify_with()` over the `LedgerView` trait. The two resolve per-file indices through the same code path, so soundness does not depend on which is used. In stateless mode the registry is supplied by the caller (e.g. a contract reading its own state): a wrong slot/`rc` makes the SNARK statement fail to verify, and an attacker-chosen `ledger_root` is rejected unless it is in the caller-supplied valid-root set (multi-file). The caller is therefore responsible for keeping the registry and valid-root set consistent with the on-chain files; given a correct registry, the binding is identical to the ledger path.
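
The caller-supplied registry and the `Ok(false)` vs `Err` split can be sketched together. This is only an illustration under stated assumptions: `RegistrySnapshot`, `CheckError`, and `precheck` are hypothetical names, and which conditions the real `verify_stateless()` reports as `Err` versus `Ok(false)` is not stated in this document. The sketch treats unusable inputs as errors and an unaccepted multi-file root as an invalid proof.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the caller-supplied registry data:
/// `file_id -> (slot, rc)` plus the set of accepted ledger roots.
pub struct RegistrySnapshot {
    pub slots: BTreeMap<String, (usize, [u8; 32])>,
    pub valid_roots: BTreeSet<[u8; 32]>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CheckError {
    EmptyChallengeSet,
    UnknownFile(String),
}

/// Checks that would run before any SNARK verification in this sketch.
/// `Err` means the inputs could not be evaluated at all; `Ok(false)` means
/// they were well-formed but the claimed root is not accepted.
pub fn precheck(
    snapshot: &RegistrySnapshot,
    ledger_root: &[u8; 32],
    challenged_file_ids: &[&str],
) -> Result<bool, CheckError> {
    if challenged_file_ids.is_empty() {
        return Err(CheckError::EmptyChallengeSet);
    }
    for id in challenged_file_ids {
        if !snapshot.slots.contains_key(*id) {
            return Err(CheckError::UnknownFile((*id).to_string()));
        }
    }
    // Only multi-file statements are checked against the valid-root set here.
    let distinct: BTreeSet<&str> = challenged_file_ids.iter().copied().collect();
    if distinct.len() > 1 && !snapshot.valid_roots.contains(ledger_root) {
        return Ok(false);
    }
    Ok(true)
}
```

A host that keeps the registry in contract storage would populate the snapshot from its own state, which is why its consistency with the on-chain files is the caller's responsibility.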

### 6. Serialization (`src/api/types.rs`)

@@ -58,49 +60,59 @@ Verify version pinning, security advisories, no known vulnerabilities, component

# Input Validation

| Function | Inputs | DoS Risk | Handled |
|----------|--------|----------|---------|
| `prepare_file()` | data size, filename | Large files | ✅ |
| `Challenge::new()` | num_challenges, prover_id | Extreme values | ✅ |
| `prove()` | challenges.len(), num_challenges | Resource exhaustion | ✅ |
| `verify()` | proof bytes, challenges | Malformed data | ✅ |
| `reconstruct_file()` | sector count, metadata | Invalid combinations | ✅ |
| `FileLedger::load()` | file size, contents | Large/malformed files | ✅ |
| Function | Inputs | DoS Risk | Handled |
| -------------------- | -------------------------------- | --------------------- | ------- |
| `prepare_file()` | data size, filename | Large files | ✅ |
| `Challenge::new()` | num_challenges, prover_id | Extreme values | ✅ |
| `prove()` | challenges.len(), num_challenges | Resource exhaustion | ✅ |
| `verify()` | proof bytes, challenges | Malformed data | ✅ |
| `reconstruct_file()` | sector count, metadata | Invalid combinations | ✅ |
| `FileLedger::load()` | file size, contents | Large/malformed files | ✅ |

**Boundary conditions:** empty inputs, maximum sizes (file/challenge/ledger count), zero values (depth, size, count), type boundaries (usize::MAX, u64::MAX).

## Security Properties

### Soundness

Prover cannot generate valid proof without data. Check circuit constraints (`synth.rs`), Merkle path validation, state chaining (prevents step skipping), no freely-chosen witness values. Tests: `security_malicious_prover.rs` (9).

### Completeness

Honest prover always succeeds. Review error paths in `prove()`, witness generation for valid cases, no false rejections. Tests: all e2e.

### Binding
Proof bound to specific challenges and ledger. Public inputs include aggregated_root (from ledger), challenge IDs derived deterministically (includes prover_id), verified before SNARK check, state chain creates temporal binding. Tests: `security_replay_attack.rs`, `security_ledger_root_pinning.rs`.

Proof bound to specific challenges and ledger once the verifier supplies the intended `Challenge` set. Public inputs include aggregated_root (from ledger), verifier-derived ledger indices, deterministically derived challenge-set state (includes prover_id through `Challenge::id()`), and the recursive state chain. Tests: `security_replay_attack.rs`, `security_ledger_root_pinning.rs`.

### Determinism
Same inputs → same outputs. No randomness, canonical ordering (BTreeMap), fixed serialization. Tests: `regression.rs`, `api_consistency.rs`.

Same inputs → same outputs. No randomness, canonical challenge ordering, explicit stable ledger indices for reconstruction, fixed serialization. Tests: `regression.rs`, `api_consistency.rs`.

## Attack Vectors

### Malformed Proof Bytes (`Proof::from_bytes()`)

Parser vulnerabilities, buffer overflows. Mitigations: magic byte validation, length checks, trailing data rejection, version validation.

### Resource Exhaustion (`prove()`, parameter generation)

Memory exhaustion, DoS. Limits: MAX_NUM_CHALLENGES (10,000), PRACTICAL_MAX_FILES (1,024), parameter cache (50), no unbounded loops.
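
A minimal sketch of how such limits might be enforced up front, before any allocation or proving work. The constant values come from the text above; the function name, the error type, and the choice to treat the limits as inclusive upper bounds are assumptions.

```rust
pub const MAX_NUM_CHALLENGES: usize = 10_000;
pub const PRACTICAL_MAX_FILES: usize = 1_024;

#[derive(Debug, PartialEq, Eq)]
pub enum LimitError {
    ZeroChallenges,
    TooManyChallenges(usize),
    TooManyFiles(usize),
}

/// Rejects out-of-range request shapes before any expensive work starts.
/// Limits are treated as inclusive here (an assumption of this sketch).
pub fn check_limits(num_challenges: usize, num_files: usize) -> Result<(), LimitError> {
    if num_challenges == 0 {
        return Err(LimitError::ZeroChallenges);
    }
    if num_challenges > MAX_NUM_CHALLENGES {
        return Err(LimitError::TooManyChallenges(num_challenges));
    }
    if num_files > PRACTICAL_MAX_FILES {
        return Err(LimitError::TooManyFiles(num_files));
    }
    Ok(())
}
```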

### Integer Overflow (sector calculations, index arithmetic)

Checked arithmetic, documented type conversions, bounds enforced before casts.
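
As an illustration of the pattern, the following sketch computes a padded power-of-two leaf count using only checked operations and converts a wide index with a bounds-checked cast. The function names and the `bytes_per_leaf` parameter are hypothetical and do not reflect the crate's actual sector layout.

```rust
/// Number of Merkle leaves needed for `data_len` bytes, rounded up to a power
/// of two. Returns `None` instead of wrapping or panicking on overflow.
pub fn padded_leaf_count(data_len: usize, bytes_per_leaf: usize) -> Option<usize> {
    if bytes_per_leaf == 0 {
        return None;
    }
    // Ceiling division, with the addition checked for overflow.
    let leaves = data_len.checked_add(bytes_per_leaf - 1)? / bytes_per_leaf;
    // Empty data still occupies one leaf in this sketch.
    leaves.max(1).checked_next_power_of_two()
}

/// Converts a 64-bit index to `usize` only if it fits on this platform and is
/// below `len`, so the bound is enforced before the value is used as an index.
pub fn checked_index(raw: u64, len: usize) -> Option<usize> {
    let idx = usize::try_from(raw).ok()?;
    if idx < len {
        Some(idx)
    } else {
        None
    }
}
```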

### Depth/Metadata Tampering (challenge construction, verification)

Root commitment binds (root, depth): `rc = H(TAG_RC, root, depth)`. Public depth in circuit, ledger lookup uses rc, gating prevents depth=0 abuse.

### Ledger Substitution (multi-file verification)

Verifier derives aggregated root from its own ledger, not prover-supplied, cryptographically bound via public inputs.

### State Chain Manipulation (recursive proving)

State evolution one-way: `state_new = H(TAG_STATE_UPDATE, state_old, leaf)`. Circuit enforces state threading, no skipping/reordering.
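
The chaining rule `state_new = H(TAG_STATE_UPDATE, state_old, leaf)` can be sketched as a fold over the leaves. Note the loud assumption: the standard library's `DefaultHasher` is used purely as a non-cryptographic stand-in so the example is self-contained; it is not the hash the circuit uses, and the tag value is made up.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const TAG_STATE_UPDATE: u64 = 1; // hypothetical tag value

/// Stand-in for a domain-separated hash. NOT cryptographic: it only
/// illustrates the shape of the chaining.
pub fn h(tag: u64, a: u64, b: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (tag, a, b).hash(&mut hasher);
    hasher.finish()
}

/// Threads the state through every leaf in order:
/// `state_new = H(TAG_STATE_UPDATE, state_old, leaf)`.
pub fn fold_state(initial: u64, leaves: &[u64]) -> u64 {
    leaves
        .iter()
        .fold(initial, |state, &leaf| h(TAG_STATE_UPDATE, state, leaf))
}
```

Because each step consumes the previous state, dropping or reordering a leaf changes every subsequent state, so the final value differs from the honest chain unless the hash collides.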

## Test Coverage
@@ -148,11 +160,11 @@ Verifier generates parameters from challenge shape independently. If prover used

## Dependencies

| Dependency | Version | Role | Audit Status |
|------------|---------|------|--------------|
| `nova-snark` | 0.41.0 | Nova SNARK & Poseidon hash | Microsoft Research |
| `reed-solomon-erasure` | 6.0.0 | Erasure coding | ? |
| `ff` | 0.13 | Finite field arithmetic | ? |
| Dependency | Version | Role | Audit Status |
| ---------------------- | ------- | -------------------------- | ------------------ |
| `nova-snark` | 0.41.0 | Nova SNARK & Poseidon hash | Microsoft Research |
| `reed-solomon-erasure` | 6.0.0 | Erasure coding | ? |
| `ff` | 0.13 | Finite field arithmetic | ? |

# Pre-Audit Checklist

2 changes: 1 addition & 1 deletion crates/kontor-crypto/Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "kontor-crypto"
version = "0.2.0"
version = "0.3.0"
edition = "2021"
default-run = "kontor-crypto"
license = "MIT"
5 changes: 5 additions & 0 deletions crates/kontor-crypto/src/api/mod.rs
@@ -78,6 +78,11 @@ pub use types::{
Challenge, ChallengeID, FieldElement, FileMetadata, KeyPair, PorParams, PreparedFile, Proof,
};

// Stateless verification: verify a constant-size proof from bare registry data (the
// valid-root set + a `file_id -> (slot, rc)` snapshot) instead of a live `FileLedger`.
// For hosts that keep the file registry in contract storage.
pub use verify::{verify_stateless, LedgerView, StatelessLedger};

// Internal modules can access these for implementation
// Export for testing - these are implementation details
#[doc(hidden)]