diff --git a/docs/investor-faq/GRA-2535-procedural-memory-diligence.md b/docs/investor-faq/GRA-2535-procedural-memory-diligence.md
new file mode 100644
index 00000000..0f52c4eb
--- /dev/null
+++ b/docs/investor-faq/GRA-2535-procedural-memory-diligence.md
@@ -0,0 +1,188 @@
+# GRA-2535 — Investor FAQ for procedural-memory diligence objections
+
+Date: 2026-06-08
+Issue UUID: 4e1a1fb6-8099-4b75-9004-609cc45aa69b
+
+Purpose: prepare crisp, diligence-safe answers for likely YC and angel objections to Gradata's category framing. This artifact intentionally separates verified facts from hypotheses and does not claim external customers, revenue, or paid pilots.
+
+## Facts vs assumptions
+
+Verified facts used in this FAQ:
+
+- Gradata is positioned as procedural memory for AI agents: a system that helps agents remember and reuse corrected ways of working, not a CRM or sales-call wrapper.
+- The wedge is cross-agent identity/procedural memory across multiple coding-agent surfaces, including Claude Code, Codex, Gemini, Cursor, Hermes, and OpenCode.
+- The product direction favors lazy recall of relevant rules over blindly injecting large static instructions into every session.
+- The motivation includes context-bloat risk: more prompt context is not automatically better, and irrelevant context can degrade agent behavior.
+- Current product work spans SDK/CLI/local daemon/cloud/dashboard surfaces in Gradata repositories.
+- The founder is Oliver, solo, with Account Executive (AE) experience at Sprites AI and previously PayPal.
+
+Assumptions/hypotheses that require validation:
+
+- Developers will adopt a cross-agent procedural-memory layer because they switch between multiple agent tools and hate re-teaching the same preferences.
+- Procedural memory produces measurable improvement over plain context windows, MCP resources, and existing memory products in repeated coding/ops workflows.
+- A narrow developer wedge can expand into a durable memory/control plane for AI employees.
+- Open-source or low-friction SDK distribution can create enough usage data and workflow lock-in to defend against larger incumbents.
+
+## 1. Why now? Haven't people always wanted reusable instructions?
+
+Investor objection: "This sounds like dotfiles, CLAUDE.md, or prompt snippets. Why is now the moment?"
+
+Answer:
+
+- Verified: AI coding agents are now used as recurring workers across real repositories, not just one-off chat assistants. That creates repeated correction loops: the user tells the agent not only facts, but how to work.
+- Verified: The tool landscape is fragmented. The same developer may use Claude Code, Codex, Gemini, Cursor, Hermes, and OpenCode, each with separate context and memory conventions.
+- Hypothesis: As agent work becomes multi-session and multi-tool, the pain shifts from "how do I prompt this model once?" to "how do I stop re-teaching every agent the same operating procedure?"
+- Hypothesis: Procedural memory becomes valuable when a developer's corrections compound across sessions and tools instead of disappearing inside a single chat transcript.
+
+Proof needed next: Show a before/after dataset where repeated corrections become reusable rules and reduce the same class of user intervention across at least two agent tools.
+
+## 2. Why won't larger context windows solve this?
+
+Investor objection: "If models get 1M-token context windows, why do you need a memory product?"
+
+Answer:
+
+- Verified: Gradata's product direction favors lazy recall over eagerly injecting everything into the system prompt.
+- Verified: Large context capacity does not mean all context is relevant. Irrelevant instructions can conflict, distract, or bloat the working set.
+- Hypothesis: The valuable primitive is not storage capacity; it is selecting the right procedural rule at the right moment and verifying whether it changed behavior.
+- Hypothesis: A bigger window can make the problem worse if teams respond by dumping every old instruction into every session.
+
+Proof needed next: Run an evaluation comparing (a) no memory, (b) all rules injected, and (c) Gradata-style retrieved rules on the same recurring task set, measuring compliance and error rate.
+
+## 3. Why isn't MCP enough?
+
+Investor objection: "MCP already lets tools expose resources and context. Why is Gradata not just an MCP server?"
+
+Answer:
+
+- Verified: Gradata is about procedural memory: corrected ways of working and agent behavior over time.
+- Verified: MCP is a transport/integration pattern; it does not by itself decide which lessons should graduate, when to recall them, or whether the recalled procedure improved outcomes.
+- Hypothesis: Gradata can use MCP as one delivery surface, but the durable value is the memory lifecycle: capture correction, convert it into a rule, retrieve it when relevant, and measure whether it helped.
+- Hypothesis: MCP compatibility lowers integration friction, but Gradata should not be defined as "an MCP server" because that collapses the category into plumbing.
+
+Proof needed next: Ship and document an MCP delivery path while keeping the core evaluation focused on correction-to-rule-to-injection outcomes.
+
+## 4. Why not Mem0, Letta, or another memory incumbent?
+
+Investor objection: "Memory for agents is crowded. Why won't Mem0 or Letta just add this?"
+
+Answer:
+
+- Verified: Gradata's wedge is procedural memory for agent work, especially cross-agent operating preferences and corrections.
+- Verified: The intended surface is developer/agent workflows across agent CLIs and IDE-like tools, not generic user profile memory.
+- Hypothesis: Existing memory products are more likely to optimize for factual/user memory, long-running agents, or app-level memory APIs, while Gradata can specialize in behavioral corrections and workflow rules.
+- Hypothesis: The defensible wedge is not "we store memories"; it is "we turn corrections into executable operating procedure and prove repeated behavior changed."
+
+Proof needed next: Produce a head-to-head demo where a recurring coding-agent mistake is corrected once, recalled later, and avoided by Gradata while a baseline/incumbent setup misses or misapplies the rule.
+
+## 5. What is the data moat?
+
+Investor objection: "If this works, why isn't the data just portable JSON instructions anyone can copy?"
+
+Answer:
+
+- Verified: The candidate data asset is a corpus of corrections, rules, retrieval events, and outcome signals from real agent workflows.
+- Hypothesis: The moat compounds if Gradata learns which procedural memories are useful in which contexts, not merely what text the user wrote.
+- Hypothesis: Cross-agent evidence is stronger than single-tool logs because it captures whether a rule transfers across model/tool boundaries.
+- Hypothesis: The proprietary asset should become a feedback loop: corrections create candidate rules, usage filters weak rules, and outcome metrics improve recall/scoring.
+
+Proof needed next: Define and publish internal metrics for rule graduation, recall precision, and repeated-error reduction; then show trendlines from live dogfood before claiming a moat.
+
+## 6. Will developers retain, or is this a nice-to-have?
+
+Investor objection: "Developers may try this once, then forget it. What creates retention?"
+
+Answer:
+
+- Verified: The problem is repeated re-teaching across sessions and tools.
+- Hypothesis: Retention comes from pain recurrence: once a developer sees an agent remember a hard-earned correction, removing Gradata makes the workflow feel worse.
+- Hypothesis: The strongest retention loop is not a dashboard habit; it is background reuse in the agent path where the user notices fewer repeat mistakes.
+- Hypothesis: Teams may retain if rules become shared operational memory for a repo or engineering org.
+
+Proof needed next: Track cohort retention by number of successful rule recalls per week and compare retained vs churned users on repeat-correction frequency.
+
+## 7. What is the initial wedge?
+
+Investor objection: "Procedural memory for all agents is broad. What is the first narrow use case?"
+
+Answer:
+
+- Verified: The current wedge is developer agents and coding/ops workflows across Claude Code, Codex, Gemini, Cursor, Hermes, and OpenCode.
+- Verified: The founder's own environment extensively uses agentic coding and operations workflows.
+- Hypothesis: The first winning wedge is solo or small-team developers who use multiple coding agents and need repo-specific operating procedure to follow them.
+- Hypothesis: The first killer use cases are repeated code-review preferences, repo conventions, deployment guardrails, and debugging procedures.
+
+Proof needed next: Pick one wedge workflow, such as "repo-specific guardrails for AI coding agents," and show repeated successful recall in a clean external repo.
+
+## 8. How will distribution work?
+
+Investor objection: "Memory infrastructure is hard to distribute. What's the go-to-market path?"
+
+Answer:
+
+- Verified: Gradata has SDK/CLI/local/cloud/dashboard surfaces in active product repositories.
+- Hypothesis: Distribution should start where agent users already work: CLI install, editor/agent hooks, GitHub repos, and documentation that demonstrates immediate value.
+- Hypothesis: The most credible early channel is open-source developer adoption with concrete demos, not enterprise sales language.
+- Hypothesis: The founder's AE background can help with founder-led sales later, but early distribution should be product-led and proof-led.
+
+Proof needed next: Measure install-to-first-rule, first-rule-to-first-recall, and weekly active recall cohorts; publish a simple case study from dogfood or external alpha users.
+
+## 9. Is this secure enough for code and agent logs?
+
+Investor objection: "You're touching prompts, code context, and maybe sensitive company procedures. How do you avoid becoming a security risk?"
+
+Answer:
+
+- Verified: Gradata's positioning involves agent workflow memory and therefore may touch sensitive operational context.
+- Hypothesis: The trust path should emphasize local-first storage, explicit opt-in sync, redaction controls, and clear separation between local procedural rules and cloud analytics.
+- Hypothesis: Security posture must be stronger than a generic SaaS note store because agent memory can influence future actions.
+- Hypothesis: The product should treat rule injection as a privileged behavior path: auditable, inspectable, and reversible.
+
+Proof needed next: Publish a security model covering local storage, cloud sync boundaries, redaction, rule provenance, and uninstall/delete semantics before selling to teams.
+
+## 10. How will pricing work?
+
+Investor objection: "Who pays, and for what?"
+
+Answer:
+
+- Verified: Current surfaces include SDK/CLI/cloud/dashboard work; no paid customer claims are made here.
+- Hypothesis: The individual developer product should likely be free or low-friction to maximize usage and correction data.
+- Hypothesis: Paid value is more plausible at the team/org layer: shared repo memory, admin controls, audit history, cloud sync, security controls, and agent-performance analytics.
+- Hypothesis: A credible pricing ladder is open-source/local free, pro cloud sync for individuals, and team plans for shared procedural memory.
+
+Proof needed next: Validate willingness to pay with 5-10 target users after demonstrating one measurable repeated-error reduction, not before.
+
+## 11. How good is the proof quality today?
+
+Investor objection: "Is this real evidence or founder dogfood theater?"
+
+Answer:
+
+- Verified: Current known evidence is mainly product buildout and dogfood in agent-heavy workflows.
+- Verified: This FAQ does not claim external customers, revenue, or third-party production deployment.
+- Hypothesis: Dogfood is useful because the founder is operating in the target workflow, but it is not sufficient fundraising proof by itself.
+- Hypothesis: The next proof step is a small, falsifiable external alpha with logged correction/retrieval/outcome events.
+
+Proof needed next: Create a public case-study seed report with anonymized correction/rule evidence and at least one external alpha workflow once available.
+
+## 12. Why can this become a company instead of a feature?
+
+Investor objection: "This sounds like a feature every agent platform will add. Why is it a standalone company?"
+
+Answer:
+
+- Verified: The product is explicitly cross-agent, not tied to one model vendor or IDE.
+- Verified: Users already face tool fragmentation across multiple agent clients.
+- Hypothesis: A neutral procedural-memory layer becomes more valuable as the agent ecosystem fragments, because users do not want each vendor to own a separate version of their operating identity.
+- Hypothesis: The company-scale opportunity is the independent memory/control plane for AI employees: what they learned, why they learned it, where it applies, and whether it improved outcomes.
+
+Proof needed next: Demonstrate the same procedural memory improving behavior across at least two agent tools on the same repo, then turn that into a repeatable install/demo path.
+
+## Summary positioning
+
+The strongest diligence-safe answer is:
+
+Gradata is not betting that agents need another place to store facts. It is betting that as AI agents become recurring workers, users will need their corrections to compound into portable operating procedure. Context windows, MCP, and incumbent memory APIs help move information around; they do not by themselves solve the lifecycle of correction, rule graduation, relevant recall, and outcome proof across fragmented agent tools.
+
+The weakest current point is proof quality. Until external alpha data exists, the honest position is: strong founder-problem fit and differentiated wedge, with dogfood evidence, but still needing falsifiable external validation.