Skip to content

docs: add draft threat model + SECURITY.md/AGENTS.md discoverability#3734

Open
potiuk wants to merge 7 commits into
apache:mainfrom
potiuk:asf-security/threat-model-2026-06-02
Open

docs: add draft threat model + SECURITY.md/AGENTS.md discoverability#3734
potiuk wants to merge 7 commits into
apache:mainfrom
potiuk:asf-security/threat-model-2026-06-02

Conversation

@potiuk

@potiuk potiuk commented Jun 2, 2026

Copy link
Copy Markdown
Member

What this is

A draft threat model for Apache Fory, proposed by the ASF Security team for the Fory PMC to review, correct, or reject. It is a starting point for discussion, not a finished document.

This PR:

  • adds THREAT_MODEL.md — the draft model, following the ASF Security threat-model rubric;
  • adds SECURITY.md — a short security policy that links the threat model;
  • appends a ## Security section to AGENTS.md, so the chain AGENTS.md → SECURITY.md → THREAT_MODEL.md is mechanically discoverable by automated security scanners.

How to read it

Every claim is provenance-tagged: (documented) (from Fory's own docs/repo), (inferred) (reasoned from the architecture, not yet confirmed), (maintainer) (confirmed by the PMC). This v0 is ~20 documented / ~26 inferred. The §14 Open questions section collects every inferred claim into waves for the PMC to confirm or correct — that is where review time is best spent. The highest-impact ones:

  • whether "under the default requireClassRegistration(true), only registered types are instantiated from untrusted bytes" is a committed property, and whether findings that require requireClassRegistration(false) are out-of-model / non-default (wave 1);
  • per-language memory safety on malformed input — in particular the C++ decoder (wave 2);
  • the cross-language (xlang) peer-trust assumption and the resource/DoS line beyond maxDepth (waves 2).

Nothing here is a requirement — the model is for the PMC to own. Comment inline, edit the branch, or reply on the email thread.

AI Usage Disclosure

  • Substantial AI assistance: yes.
  • What: THREAT_MODEL.md was drafted by the ASF Security team's threat-model tooling (Claude) from Apache Fory's public documentation and repository, following the Scovetta rubric. SECURITY.md and the AGENTS.md Security section are templated scaffolding.
  • Review model: the document is deliberately presented as a draft for line-by-line maintainer review — every claim carries a provenance tag, and all unverified (inferred) claims are surfaced as explicit open questions (§14) for the PMC to ratify, correct, or strike. It is not offered as finished or authoritative content.
  • Provenance / licensing: content is original to this engagement, carries the ASF license header, and is intended to comply with the ASF Generative Tooling Guidance.

Comment thread THREAT_MODEL.md Outdated
| --- | --- | --- | --- |
| Object-graph serialization (native, per language) | `fory.serialize` / `deserialize` | the core; instantiates registered types from bytes | **In** *(documented)* |
| Cross-language (xlang) serialization | xlang `serialize`/`deserialize` | type mapping across languages | **In** *(documented)* |
| Row format / zero-copy | row encoders | reads fields in place from a buffer | **In** *(documented)* |

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Row Format is out of scope of untrusted data, it must be used with trusted data. In https://github.com/apache/fory/blob/main/docs/security/deserialization.md, we clarify for usage for trusted data only.

Comment thread THREAT_MODEL.md Outdated

7. Confirm Fory disclaims payload integrity/authenticity/confidentiality (no MAC/sig/encryption) and is not a sandbox for registered types' own logic. → §9.
8. Any other recurring scanner/fuzzer false positives the PMC already knows about, to seed §11a (e.g. reflection/Unsafe usage, codegen)? → §11a.
9. **Meta:** Fory has no in-repo `SECURITY.md` and an `AGENTS.md` that is a developer/agent guide. This engagement adds `SECURITY.md` + `THREAT_MODEL.md` and wires `AGENTS.md → SECURITY.md → THREAT_MODEL.md`. Confirm the model should live in-repo (as proposed) vs. on the website, and who owns revisions. The existing config-guide "Security" section becomes a pointer to this model. → §1.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have https://github.com/apache/fory/blob/main/docs/security/deserialization.md now, could the THREAD_MODEL.md be moved into docs/security/threat-model.md ?

Comment thread THREAT_MODEL.md Outdated
## §3 Out of scope (explicit non-goals)

- **The integrity / authenticity / confidentiality of the serialized bytes.** Fory deserializes what it is given; it does not authenticate, MAC, or encrypt payloads. If bytes can be tampered with in transit/at rest, that is the application's problem to solve (sign/encrypt before handing to Fory). *(inferred)*
- **Anything when the caller disables class registration on an untrusted payload source.** `requireClassRegistration(false)` is a documented, deliberately-available footgun; using it against attacker-controlled bytes is out of the model's protection (see §5a/§9). *(documented — config: "Disabling may allow unknown classes to be deserialized, potentially causing security risks")*

@chaokunyang chaokunyang Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling class registration should take arbitrary class/gadget materialization out of the model, but not every security outcome. The current deserialization model still treats runtime safety, disproportionate allocation, cleanup/retained state, and explicit policy bypasses as security boundaries for untrusted bytes. A crash, OOM, retained-state bug, or native memory-safety issue should not become out-of-model only because the reproducer uses registration disabled.

Comment thread THREAT_MODEL.md Outdated
- **Registered-type-only instantiation (default).** With `requireClassRegistration(true)` (the default), deserialization instantiates only types the application registered, so attacker bytes cannot drive Fory to construct an arbitrary class. *Violation symptom:* an unregistered/unexpected type is instantiated from input under the default config. *Severity:* security-critical (this is the deserialization-RCE defense). *(documented that registration is required by default + that disabling causes risk; the unbypassability guarantee is the claim to confirm)*
- **Bounded recursion depth.** Deserialization beyond `maxDepth` (default 50) throws rather than recursing unbounded. *Violation symptom:* stack overflow / unbounded recursion from crafted nesting under the default. *Severity:* security-critical (DoS). *(documented — config table)*
- **Memory safety on malformed input — language-conditional.** In managed-runtime implementations, malformed bytes yield an exception, not memory corruption. For the **C++** implementation this is the load-bearing property to confirm (malformed-input fuzzing of the C++ decoder). *Violation symptom:* OOB read/write, crash. *Severity:* security-critical. *(inferred — confirm per language)*
- **Resource bounds beyond depth — UNSPECIFIED.** Whether a crafted payload can force large allocation / CPU blowup within the depth limit (e.g. huge declared collection sizes) is a bug or expected is open; the model needs a line (§14). *(inferred; maxDepth documented)*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is stale against the current security doc. docs/security/deserialization.md now draws the resource line: no disproportionate allocation before bytes are supplied or proven readable, no stream buffer growth to attacker-declared sizes before exact read/skip, and proportional checks before collection preallocation. I would replace this open question with a link to that model, otherwise future triage will treat an already-settled rule as undefined.

Comment thread SECURITY.md Outdated

What the project treats as in scope and out of scope, the security
properties it provides and disclaims, the adversary model, and how
findings are triaged are documented in [THREAT_MODEL.md](./THREAT_MODEL.md).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since main now has docs/security/, I would avoid making a new top-level threat model the only policy link from SECURITY.md. Root SECURITY.md is the right place for the private reporting channel and GitHub discoverability; the model entry point should be docs/security/index.md, with the detailed deserialization rules under docs/security/deserialization.md. That keeps scanners and humans on one security-doc hierarchy instead of splitting authority between root and docs/security.

@chaokunyang

chaokunyang commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

@potiuk I think we can remove duplciate/stale content with deserialization.md, then move THREAT_MODEL.md into docs/security//thread-model.md? Will it still work for security scan and discoverability?

@potiuk

potiuk commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Both changes are fine — go for it.

The scan agent doesn't care where the file physically lives; what it follows is the discoverability chain: AGENTS.md -> SECURITY.md -> the threat model. So moving the file under docs/security/ works as long as the pointer in SECURITY.md is updated to the new path. Since AGENTS.md already lists docs/security/deserialization.md as loadable guidance, adding the threat-model path there too is the natural home for it.

On the dedup: removing content that just restates deserialization.md is good hygiene — that doc can stay the authoritative source for the untrusted-deserialization boundary, and the threat model can reference it rather than duplicate it. The thing to keep in the threat model is the project-level framing the scan needs: what Fory considers in vs. out of scope, the trust boundaries and user roles, and the known false-positives — i.e. the parts deserialization.md doesn't already cover.

One tiny thing: the path you wrote is "thread-model.md" — I'm assuming that's a typo for "threat-model.md". Whatever filename you settle on is fine, just keep SECURITY.md pointing at it.

Happy to update the PR to match once you've decided on the final layout, or you can push directly — either works.

@chaokunyang chaokunyang force-pushed the asf-security/threat-model-2026-06-02 branch from a4b40e5 to 78ea160 Compare June 17, 2026 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants