Skip to content

🎬 feat: Prime Manually-Invoked Skills via $ Popover#12709

Merged
danny-avila merged 24 commits intofeat/agent-skillsfrom
claude/phase3-manual-skill-invocation
Apr 19, 2026
Merged

🎬 feat: Prime Manually-Invoked Skills via $ Popover#12709
danny-avila merged 24 commits intofeat/agent-skillsfrom
claude/phase3-manual-skill-invocation

Conversation

@danny-avila
Copy link
Copy Markdown
Owner

Summary

I wired up the backend for Phase 3 of the Agent Skills rollout so that selecting a skill from the $ popover now deterministically primes SKILL.md into the turn before the LLM runs, instead of leaving the model to discover the skill through the catalog.

  • Added manualSkills?: string[] to TPayload and TSubmission, and propagated it through createPayload (excluded for Assistants endpoint, to match how ephemeralAgent is handled).
  • Wired useChatFunctions.ts to atomically drain pendingManualSkillsByConvoId on submit via a useRecoilCallback that reads and resets in one snapshot, so mid-flight picks aren't silently discarded. Skipped for regenerate / edit / continue so queued skills aren't drained by a history-only replay.
  • Extracted primeManualSkill and resolveManualSkills in packages/api/src/agents/skills.ts. The resolver applies ACL + active-state filtering (reusing resolveSkillActive's ownership-aware defaults), dedupes names preserving first occurrence, isolates per-name DB errors so one bad lookup doesn't drop the rest, and silently skips unresolvable entries with a warn log.
  • Refactored handleSkillToolCall in handlers.ts to reuse primeManualSkill, so model-invoked and user-invoked paths produce an identical InjectedMessage shape.
  • Threaded manualSkills through InitializeAgentParams and getSkillByName through InitializeAgentDbMethods, and wired all three initializeAgent call sites — services/Endpoints/agents/initialize.js, controllers/agents/responses.js, controllers/agents/openai.js. Resolution runs against the already-scoped accessibleSkillIds, so per-agent narrowing still applies to manual invocation.
  • Added manualSkillPrimes?: ResolvedManualSkill[] to InitializedAgent so the client layer can see what was resolved without a second DB trip.
  • Spliced the primes into initialMessages in client.js chatCompletion right before the user's current turn, using HumanMessage with additional_kwargs: { isMeta, source: 'skill', skillName }. Shifted indexTokenCountMap indices >= insertIdx by the number of primes so hydrateMissingIndexTokenCounts fills the fresh positions without corrupting token accounting for pre-existing messages.

Change Type

  • New feature (non-breaking change which adds functionality)

Testing

Verified with the two affected Jest suites in packages/api:

  • src/agents/__tests__/skills.test.ts — 49 passing, including 13 new for primeManualSkill (shape, empty-body passthrough) and resolveManualSkills (empty inputs, empty ACL, ownership default, silent skip on ACL miss, name dedup, shared skill behavior under defaultActiveOnShare, explicit skillStates override precedence, empty-body skip, per-name error isolation).
  • src/agents/__tests__/initialize.test.ts — 14 passing, including 5 new covering primes attached for resolved names, manualSkillPrimes undefined when no names supplied, empty array when every name unresolvable, skip when accessibleSkillIds is empty, and no-op when getSkillByName is absent from db methods.

Full packages/api agents suite shows only the same two pre-existing failures (handlers.spec.ts, summarization.e2e.test.ts) that exist on the base feat/agent-skills branch — confirmed by stashing my changes and re-running. Both stem from the @librechat/agents version mismatch in node_modules, not this change.

A manual end-to-end check against a live chat is worth doing on review: type $brand-guidelines show me the colors, submit, and confirm the assistant response demonstrates knowledge of the skill body (same effect as if the model had discovered the skill via the catalog, but guaranteed for the turn).

Test Configuration

  • Node 20.19.0, MongoDB via mongodb-memory-server in test, Jest per-workspace.
  • Branch based on the latest feat/agent-skills at 08d02c1dc (post-Phase-1 merges).

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes

@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24a97e83aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread api/server/controllers/agents/client.js Outdated
additional_kwargs: { isMeta: true, source: 'skill', skillName: p.name },
}),
);
initialMessages.splice(insertIdx, 0, ...primeMessages);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Scope manual skill primes to primary agent only

Injecting manualSkillPrimes into initialMessages here causes those skill bodies to be included in the single messages array later passed to runAgents(...), which executes both the primary agent and any agentConfigs (handoffs / added-convo agents). In multi-agent runs, this leaks per-turn manual skill context to secondary agents even though this feature is intended to apply only to the user’s primary submission path, and it can change downstream agent behavior unexpectedly.

Useful? React with 👍 / 👎.

Comment thread packages/api/src/agents/skills.ts Outdated
Comment on lines +325 to +326
const resolved = await Promise.all(
uniqueNames.map(async (name) => {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Bound concurrent manual skill lookups

This resolves every unique manualSkills entry with an unbounded Promise.all, and the request path only checks Array.isArray(...) before forwarding user input. A large manualSkills array can therefore trigger a large burst of concurrent getSkillByName database calls (plus warning logs) in a single request, which is a straightforward way to degrade agent latency and pressure DB connections; add a hard size limit and/or bounded concurrency.

Useful? React with 👍 / 👎.

@danny-avila danny-avila mentioned this pull request Apr 17, 2026
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbdd248fe0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +786 to +790
const primeResult = injectManualSkillPrimes({
initialMessages,
indexTokenCountMap,
manualSkillPrimes,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Exclude manual skill primes from memory extraction window

Injecting manualSkillPrimes into initialMessages here means those synthetic HumanMessages are passed to runMemory(messages) later in the same chatCompletion flow, so SKILL.md bodies are treated as user chat when memory extraction is enabled. In practice, selecting several skills (or a large skill body) can crowd out real conversation turns in the memory window and increase memory-model token/cost overhead, producing polluted memories unrelated to the user’s intent.

Useful? React with 👍 / 👎.

@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

@danny-avila danny-avila force-pushed the claude/phase3-manual-skill-invocation branch from 64c2fa5 to a5643bc Compare April 17, 2026 11:57
@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.
Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.
…imes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.
Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.
- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.
Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.
The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.
@danny-avila danny-avila force-pushed the claude/phase3-manual-skill-invocation branch from 9a30dc8 to a83fa9d Compare April 17, 2026 16:27
@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a83fa9d39b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +785 to +790
const manualSkillPrimes = this.options.agent?.manualSkillPrimes;
if (manualSkillPrimes && manualSkillPrimes.length > 0) {
const primeResult = injectManualSkillPrimes({
initialMessages,
indexTokenCountMap,
manualSkillPrimes,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict manual skill priming to primary-agent runs

This injects manualSkillPrimes into the shared initialMessages array before runAgents fans out to [primary, ...agentConfigs], so added/handoff agents receive the primary agent’s SKILL.md bodies too. In multi-agent or added-convo runs, that leaks/bleeds skill instructions across agent boundaries even when those agents were never scoped to or selected for the skill.

Useful? React with 👍 / 👎.

Comment on lines +930 to +935
const primedSkills = this.options.agent?.manualSkillPrimes;
if (primedSkills && primedSkills.length > 0) {
const primeParts = buildSkillPrimeContentParts(primedSkills, {
runId: this.responseMessageId ?? 'manual-skill',
});
this.contentParts.unshift(...primeParts);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid persisting manual primes as synthetic skill tool_calls

Prepending these synthetic skill tool_call parts causes manual selections to be stored in assistant history, and extractInvokedSkillsFromPayload later treats them as previously invoked skills. That makes a one-turn manual $ selection sticky across future turns (auto-re-primed without re-selection), which increases prompt bloat/cost and changes turn-scoped behavior.

Useful? React with 👍 / 👎.

Comment on lines 259 to 263
codeEnvAvailable: enabledCapabilities.has(AgentCapabilities.execute_code),
skillStates,
defaultActiveOnShare,
manualSkills,
},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply manual skill primes in OpenAI-compatible runs

This path now resolves manualSkills via initializeAgent, but the controller never injects primaryConfig.manualSkillPrimes into formattedMessages before the run starts. The /agents/openai flow therefore does the extra DB resolution work but still ignores manual priming behavior, creating endpoint-inconsistent results versus the main AgentClient path.

Useful? React with 👍 / 👎.

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.
…ffset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.
The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.
Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.
Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.
…Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.
… Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.
…s Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.
The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.
…Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.
Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow
…ent Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb2826c2ac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +166 to +167
: isRegenerate || isContinued || isEdited
? []
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve manual skill context on continue requests

When a user clicks Continue, continueGeneration calls ask(..., { isContinued: true, isRegenerate: true, isEdited: true }), and this branch forces manualSkills to [] unless an explicit override is provided. That means the follow-up request omits the top-level manualSkills payload, so SKILL.md primes are not re-injected for the continuation turn. Because those primes are synthetic (not persisted in history as chat messages), continuing a response that originally depended on manual $ skills can run with different context and produce inconsistent output.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb2826c2ac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +166 to +167
: isRegenerate || isContinued || isEdited
? []
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve manual skills when continuing a response

The new manual-skill resolution path forces manualSkills to [] whenever isContinued is true unless an explicit override is provided. The continue flow calls ask without overrideManualSkills, so continuing a response after a $-primed turn drops the original skill list from the payload and the backend no longer injects the same SKILL.md context. That makes Continue runs for the same user turn behave inconsistently with the original generation.

Useful? React with 👍 / 👎.

Comment on lines 224 to 225
setPendingManualSkills((prev) =>
prev.includes(mention.value) ? prev : [...prev, mention.value],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Cap queued manual skills before submission

This enqueue path allows an unbounded number of selected skills, but backend resolution (resolveManualSkills) truncates to MAX_MANUAL_SKILLS (10). Users can therefore select and see more than 10 skills in UI/history while only the first 10 are actually applied during inference, which silently drops intent and makes the pills misleading. The client should enforce the same cap (or provide overflow feedback) before storing/submitting the list.

Useful? React with 👍 / 👎.

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.
@danny-avila
Copy link
Copy Markdown
Owner Author

Resolved all seven findings from the comprehensive review in 0d2c94439:

MAJOR

  • Finding 1 (sticky re-priming): Verified behavior, documented as intentional in buildSkillPrimeContentParts. The synthetic skill tool_call's shape parity with model-invoked calls is deliberate — extractInvokedSkillsFromPayload treats them identically, which matches the UX (assistant skill card = persistent signal that the skill is active for the conversation). Called out explicitly so future maintainers don't read it as a bug. A hypothetical one-shot mode would need a distinct marker; noted in the comment.
  • Finding 2 (missing ContentParts tests): New ContentParts.test.tsx with 7 cases — assistant-only rendering, user-message suppression, undefined-content safety, pending-progress state, parallel + sequential branch integration. Child components mocked so the test exercises only the branching and prop wiring ContentParts owns.

MINOR

  • Finding 3 (localization): Added com_ui_skills_manual_invoked + com_ui_skills_queued keys. Reused existing com_ui_remove_skill_var for the remove-button. Both components now call useLocalize().
  • Finding 4 (skill-name length): MAX_SKILL_NAME_LENGTH = 200 constant + filter in extractManualSkills. Pathological strings can't reach the DB query planner.

NIT

  • Finding 5 (BaseClient comment): Rewrote to describe the filter as defense-in-depth on top of Mongoose schema validation.
  • Finding 6 (memo): ManualSkillPills now React.memo'd for consistency with peers.
  • Finding 7 (redundant guard): Collapsed the dual null-check into a single useMemo'd pendingSkills array; hasPendingSkills derives from its length. One source of truth.

All passing locally (where env permitted): ContentParts 7/7, ManualSkillPills 4/4. Full verification via CI.

@danny-avila
Copy link
Copy Markdown
Owner Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d2c94439a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +504 to +507
const rawManualSkills = this.options.req?.body?.manualSkills;
if (Array.isArray(rawManualSkills) && rawManualSkills.length > 0) {
const skills = rawManualSkills.filter((s) => typeof s === 'string' && s.length > 0);
if (skills.length > 0) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce manual-skill limits before persisting message metadata

This persistence path only filters manualSkills to non-empty strings, so a crafted request can still attach arbitrarily many and arbitrarily long skill names to userMessage.manualSkills. Unlike the runtime resolver path (which caps count/length), this can bloat stored message documents, inflate history payloads/UI rendering, and in extreme cases trigger Mongo document-size save failures for otherwise valid messages. Reuse the same bounded sanitizer when writing manualSkills to the message record.

Useful? React with 👍 / 👎.

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.
This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.
F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.
…ontent

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
@danny-avila danny-avila merged commit f13d95b into feat/agent-skills Apr 19, 2026
13 checks passed
@danny-avila danny-avila deleted the claude/phase3-manual-skill-invocation branch April 19, 2026 15:56
danny-avila added a commit that referenced this pull request Apr 21, 2026
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.

* 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input

Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.

* 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.

* 🔖 refactor: Single source for the skill-message source marker

Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.

* ♻️ refactor: Drop Multi-Agent Guard + Review Polish

- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.

* 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks

Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.

* 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills

The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.

* ⚡ feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.

* 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.

* 🎗️ feat: Replace \$skill-name Text with Pills on the User Message

The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.

* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips

Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.

* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding

Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.

* 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.

* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.

* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.

* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts

The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.

* 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.

* 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator

Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow

* ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.

* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.

* 🔧 fix: Update ParallelContent to Handle Optional Content Prop

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.

* 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer

This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.

* 🧹 polish: Address Audit Follow-Up (F1/F3/F6)

F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.

* ✨ feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
danny-avila added a commit that referenced this pull request Apr 22, 2026
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.

* 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input

Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.

* 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.

* 🔖 refactor: Single source for the skill-message source marker

Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.

* ♻️ refactor: Drop Multi-Agent Guard + Review Polish

- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.

* 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks

Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.

* 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills

The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.

* ⚡ feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.

* 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.

* 🎗️ feat: Replace \$skill-name Text with Pills on the User Message

The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.

* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips

Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.

* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding

Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.

* 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.

* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.

* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.

* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts

The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.

* 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.

* 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator

Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow

* ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.

* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.

* 🔧 fix: Update ParallelContent to Handle Optional Content Prop

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.

* 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer

This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.

* 🧹 polish: Address Audit Follow-Up (F1/F3/F6)

F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.

* ✨ feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
danny-avila added a commit that referenced this pull request Apr 22, 2026
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.

* 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input

Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.

* 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.

* 🔖 refactor: Single source for the skill-message source marker

Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.

* ♻️ refactor: Drop Multi-Agent Guard + Review Polish

- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.

* 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks

Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.

* 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills

The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.

* ⚡ feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.

* 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.

* 🎗️ feat: Replace \$skill-name Text with Pills on the User Message

The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.

* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips

Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.

* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding

Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.

* 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.

* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.

* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.

* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts

The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.

* 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.

* 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator

Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow

* ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.

* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.

* 🔧 fix: Update ParallelContent to Handle Optional Content Prop

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.

* 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer

This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.

* 🧹 polish: Address Audit Follow-Up (F1/F3/F6)

F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.

* ✨ feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
danny-avila added a commit that referenced this pull request Apr 25, 2026
* 🎬 feat: Prime Manually-Invoked Skills via $ Popover

Lands the backend for manual skill invocation, making the $ popover
deterministically prime SKILL.md before the LLM turn instead of leaving
the model to discover the skill via the catalog.

Flow: popover drains pendingManualSkillsByConvoId on submit, attaches
names to the ask payload, controllers forward to initializeAgent, and
initialize resolves each name to its body (ACL + active-state filtered,
reusing the same rules as catalog injection). AgentClient splices the
primes as meta HumanMessages before the user's current message.

- Extract primeManualSkill / resolveManualSkills in packages/api/src/agents/skills.ts
  and reuse primeManualSkill inside handleSkillToolCall for a single shape source.
- Thread manualSkills + getSkillByName through InitializeAgentParams / DbMethods
  and all three initializeAgent call sites (initialize.js, responses.js, openai.js).
- Splice HumanMessage primes in client.js chatCompletion after formatAgentMessages,
  shifting indexTokenCountMap so hydrate still fills fresh positions correctly.
- Carry isMeta / source / skillName in additional_kwargs for downstream filtering.

* 🛡️ fix: Scope manual skill primes to single-agent + cap resolver input

Two follow-ups to the Phase 3 priming path flagged in Codex review.

Multi-agent runs: skipping the splice when agentConfigs is non-empty.
`initialMessages` is shared across every agent in `createRun`, so splicing
a skill body there would bypass Phase 1's per-agent `scopeSkillIds`
contract — a handoff / added-convo agent with a different skill scope
would see content its configuration excludes. Warn + skip is the minimal
correct behavior; lifting this to per-agent initial state is a follow-up.

Input bounding: `resolveManualSkills` now truncates to `MAX_MANUAL_SKILLS`
(10) after dedup, with a warn listing the dropped tail. Controllers only
validate `Array.isArray(req.body.manualSkills)`, so a crafted payload
could otherwise fan out into an unbounded `Promise.all` of concurrent
`getSkillByName` DB lookups. Cap lives in the resolver so every caller
(including future `always-apply` in Phase 5) inherits it.

* 🧪 refactor: Testable Helpers + Payload Validation for Manual Skill Primes

Follow-ups from the comprehensive review. No behavior change for the
happy path — these are architectural and defensive improvements that
shrink the JS surface in /api, tighten the request-body contract, and
cover the delicate splice logic with proper unit tests.

- Extract `injectManualSkillPrimes` into packages/api/src/agents/skills.ts
  so the message-array splice and `indexTokenCountMap` shift are unit-
  testable in TS. client.js now calls the helper. Tests pin the `>=`
  vs `>` boundary condition — a regression here would silently corrupt
  token accounting for every message after the insertion point.
- Extract `extractManualSkills(body)` and use in all three controllers
  (initialize.js, responses.js, openai.js). Replaces copy-pasted
  `Array.isArray(...) ? ... : undefined` with a helper that also filters
  non-string / empty elements — closes a type-safety gap where a crafted
  payload like `{"manualSkills": [123, {"$gt":""}]}` would otherwise reach
  `getSkillByName` and waste DB round-trips.
- Rename `primeManualSkill` → `buildSkillPrimeMessage`. The helper serves
  three invocation modes (`$` popover, `always-apply`, model-invoked);
  the old name misled readers coming from `handleSkillToolCall`.
- Add `loadable.state === 'hasValue'` guard in `drainPendingManualSkills`
  — defensive, since the atom has a synchronous `[]` default, but the
  previous `.contents` cast would have been unsound under loading/error.
- Document why `resolveManualSkills` honors the active-state filter even
  for explicit `$` selections (Phase 2 popover filter + API-direct
  hardening).
- Remove stray `void Types;` in initialize.test.ts — `Types` is already
  consumed elsewhere in that test.

* 🔖 refactor: Single source for the skill-message source marker

Export `SKILL_MESSAGE_SOURCE = 'skill'` and use it in both construction
paths that stamp skill-primed messages — `buildSkillPrimeMessage` (for
the model-invoked tool path) and `injectManualSkillPrimes` (for the
user-invoked splice path). Downstream filtering and telemetry read this
marker, so the two paths must agree; keeping the literal in one place
removes the risk of them drifting when Phase 5's `always-apply` adds a
third caller.

* ♻️ refactor: Drop Multi-Agent Guard + Review Polish

- Remove the multi-agent skip in `AgentClient.chatCompletion`. Leaking
  primes to handoff / added-convo agents via shared `initialMessages` is
  the agents SDK's concern to scope; this layer should just inject and
  let the graph handle agent-scoped state. The guard was well-intended
  but produced a silent-drop UX where `$skill` in a multi-agent run did
  nothing.
- Bound the `[resolveManualSkills] Truncating ...` warn output to the
  first 5 dropped names plus a count suffix. A malicious payload of
  1000 names was previously spilling all ~990 names into the log line.
- Remove dead `?? []` from the `hasValue`-guarded loadable read in
  `drainPendingManualSkills` — the atom always yields a string[] when
  resolved, so the nullish fallback was unreachable.
- Reorder skills.ts imports to follow the style guide: value imports
  shortest-to-longest (`data-schemas` → `langchain/core/messages` →
  multi-line `@librechat/agents`), type imports longest-to-shortest.

* 🧠 fix: Strip Skill Primes from Memory Window + Unbreak CI Mocks

Two fixes after the last push.

CI unbreak: `responses.unit.spec.js` and `openai.spec.js` mock
`@librechat/api` and the mock didn't expose the new `extractManualSkills`
symbol, so every test in those files crashed before reaching the
`recordCollectedUsage` assertion. Added `extractManualSkills: jest.fn()`
returning `undefined` to both mocks; the controllers now no-op on
manualSkills as the tests expect.

Codex P2: `runMemory` passes `messages` straight through to the memory
processor, so after the splice in `injectManualSkillPrimes`, SKILL.md
bodies ride along as if they were real user chat. That pollutes memory
extraction with synthetic instruction content and crowds out real turns
from the window.

- Export `isSkillPrimeMessage(msg)` from `packages/api/src/agents/skills.ts`
  — a predicate keyed on the shared `SKILL_MESSAGE_SOURCE` marker.
- Filter `chatMessages = messages.filter(m => !isSkillPrimeMessage(m))`
  at the top of `runMemory` before the window-sizing logic. Keeps the
  primes visible to the LLM (they still ride in `initialMessages`) but
  invisible to the memory layer.
- 5 new tests for the predicate covering marker-present, plain messages,
  different source, non-object inputs, and array filter integration.

* 📜 feat: Show Skill-Loaded Cards for Manually-Invoked Skills

The $ popover was priming SKILL.md bodies into the turn but leaving no
visible trace on the assistant response — from the user's view it looked
like the `$name ` cosmetic text did nothing. Now each manually-invoked
skill renders the same "Skill X loaded" tool-call card that model-invoked
skills already produce via PR #12684's SkillCall renderer.

Approach: post-run prepend to `this.contentParts`. The aggregator owns
per-step indices during the run, so pre-seeding collides; waiting until
`await runAgents(...)` returns lets the graph settle before synthetic
parts slot in at the front.

- Export `buildSkillPrimeContentParts(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Returns completed tool_call parts
  (`progress: 1`, args JSON-encoded with `{skillName}`, output matching
  the model-invoked path's wording) that the existing `SkillCall.tsx`
  renderer draws identically.
- In `AgentClient.chatCompletion`, prepend the built parts to
  `this.contentParts` immediately after `await runAgents`. Persistence
  and the final-event reconcile come for free — `sendCompletion` already
  reads `this.contentParts` verbatim.
- Card ordering: skills appear first in the assistant message, reflecting
  that priming ran before the LLM's turn.

Live-during-streaming cards are a separate follow-up — the graph's
index-based aggregator makes that a bigger lift and this change delivers
the core UX win without fighting the stream ordering.

6 new unit tests covering part shape, args JSON contract, output text,
unique IDs, empty input, and startOffset ID differentiation.

* ⚡ feat: Emit Optimistic Skill Cards + Wire Primes in OpenAI/Responses

Two follow-ups from testing.

Optimistic card emit: the main chat path was only showing "Skill X
loaded" cards at final-reconcile time, so the user saw nothing happen
until the stream finished. Now emit synthetic ON_RUN_STEP +
ON_RUN_STEP_COMPLETED events right before `runAgents` starts — same
pattern the MCP OAuth flow uses in `ToolService` — so the cards appear
immediately. The graph's content at index 0 may overwrite them during
streaming, but the post-run `contentParts` prepend (unchanged) restores
them on final reconcile.

OpenAI + Responses parity: both controllers were resolving
`manualSkillPrimes` via `initializeAgent` but never injecting them into
`formattedMessages` before the run. Manual invocation silently did
nothing on `/v1/chat/completions` and the Responses API path. Now both
call `injectManualSkillPrimes` on the formatted messages so the model
sees SKILL.md bodies on every path. LibreChat-style card SSE events
don't apply to these OpenAI-shaped responses, so the live-emit is
chat-path-only.

- Export `buildSkillPrimeStepEvents(primes, { runId })` from
  `packages/api/src/agents/skills.ts`. Uses `Constants.USE_PRELIM_RESPONSE_MESSAGE_ID`
  by default so the frontend maps events to the in-flight preliminary
  response message, matching the OAuth emitter.
- In `AgentClient.chatCompletion`, emit via `sendEvent` (or
  `GenerationJobManager.emitChunk` in resumable mode) after
  `injectManualSkillPrimes` runs, before the LLM turn begins.
- Wire `injectManualSkillPrimes` into `openai.js` + `responses.js` after
  `formatAgentMessages`. Refactored the destructure to `let` on
  `indexTokenCountMap` so the injector's returned map is usable.
- 8 new unit tests covering the step-event builder: pair cardinality,
  default/custom runId, TOOL_CALLS shape + JSON args, progress:1 on
  completion, index ordering, stepId/toolCallId pairing, empty input.

* 🎯 fix: Route Skill Prime Events to the Real Response + Sparse-Array Offset

Two bugs in the optimistic-card emit from the last pass.

1. Wrong runId. The events used `USE_PRELIM_RESPONSE_MESSAGE_ID` (the
   MCP OAuth pattern), but OAuth emits DURING tool loading — before the
   real response messageId exists. By the time skill priming fires, the
   graph is about to emit with `this.responseMessageId`, so the PRELIM
   runId orphaned every card onto the client's placeholder response
   entry in `messageMap`, separate from the one the LLM's events were
   building. Net effect: cards never rendered mid-stream.

   Now passing `this.responseMessageId` — the same ID `createRun`
   receives — so synthetic and real steps land on the same `messageMap`
   entry.

2. Index 0 collision. With the runId fixed, card-at-0 would have hit
   `updateContent`'s type-mismatch guard when the LLM's text delta
   arrived at the same index, suppressing the whole text stream.

   New `SKILL_PRIME_INDEX_OFFSET` = 100 placed on both the live SSE
   emit and the server-side `contentParts` assignment. Sparse array
   during streaming renders as `[llm_text, ..., card]` (skip-holes via
   `Array#filter` / `Array#map`). `filterMalformedContentParts` from
   `sendCompletion` compacts to dense `[text, card]` before persistence,
   so streaming UI and saved message agree on order — no finalize
   reorder jank. Post-run switches from `contentParts.unshift` to
   `contentParts[OFFSET + i] = part` to mirror the live placement.

- Add `startIndex` option to `buildSkillPrimeStepEvents` with
  `SKILL_PRIME_INDEX_OFFSET` default. Export the constant from
  `@librechat/api` so `client.js` can reuse it for the post-run splice.
- Update the existing index-ordering test to the new default and add a
  new test for the explicit `startIndex` override.

* 🎗️ feat: Replace \$skill-name Text with Pills on the User Message

The `$skill-name ` cosmetic text the popover was inserting into the
textarea had two problems: it lingered in the user message forever (the
card is a more meaningful marker), and it implied that free-form text
invocation like \"\$foo help me\" should work — which it doesn't, and
supporting it would mean another parsing layer nobody asked for.

Dropped the textarea insertion. Visual confirmation after submit now
comes from a compact `ManualSkillPills` row on the user bubble that
self-extinguishes once the backend's live skill-card stream
(`buildSkillPrimeStepEvents` from the last commit) populates the sibling
assistant response. Multiple skills render as multiple pills — the atom
was already a string array, so multi-select works for free.

- `SkillsCommand.tsx`: select handler no longer writes to the textarea.
  Still drops the trigger `$` via `removeCharIfLast`, still pushes to
  `pendingManualSkillsByConvoId`, still flips `ephemeralAgent.skills`.
- `families.ts`: new `attachedSkillsByMessageId` atomFamily keyed by
  user messageId. `useChatFunctions.ask` writes the drained skill list
  here on every fresh submit (regenerate/continue/edit still skip).
- `ManualSkillPills.tsx` renders pills conditionally: hidden when the
  message isn't a user message, when no skills are attached, or when
  the sibling assistant response already carries a `skill` tool_call
  content part (the live card took over). Reads messages via React Query
  so we don't re-render on every message-state keystroke.
- `Container.tsx` mounts the pills above the user message text, parallel
  to the existing `Files` slot.
- Updated the SkillsCommand select-flow spec to assert the textarea is
  cleared of `$` instead of populated with `\$name `. 5 new tests for
  `ManualSkillPills` covering empty state, non-user message guard,
  multi-skill rendering, the skill-card hide condition, and the
  text-only-content-doesn't-hide case.

* 🎛️ feat: Manual Skills as Persisted Message Field + Compose-Time Chips

Three problems with the previous pass:
1. Cards rendered BELOW the LLM text on the assistant message (and
   stayed there on reload) because the sparse index-100 offset put them
   after the model's content. Now back to `unshift` — cards at the top,
   same as before the live-emit detour.
2. Pills on the user message disappeared the moment the live card
   arrived, so users barely saw them. The live-emit channel also added
   meaningful complexity and relied on a per-message Recoil atom that
   had no clean cleanup story.
3. No visual cue at all during new-chat compose — the `$name ` text was
   removed, the submitted-message pills weren't there yet, and the
   popover closes after selection. User had no way to see what they'd
   queued up before sending.

New architecture: `manualSkills` is a first-class field on `TMessage`,
persisted by the backend on the user message. `ManualSkillPills` reads
straight from `message.manualSkills` — no atom, no sibling-lookup — so
pills survive reload, show in history, and stay for the lifetime of the
message. Compose-time chips above the textarea read the existing
`pendingManualSkillsByConvoId` atom and let users × skills out before
submitting.

Backend reverts:
- `client.js`: dropped the `ON_RUN_STEP` live-emit loop, restored
  `this.contentParts.unshift(...primeParts)` so cards sit at the top of
  the persisted assistant response.
- `skills.ts`: removed `buildSkillPrimeStepEvents` and
  `SKILL_PRIME_INDEX_OFFSET` (both unused now). `GraphEvents`,
  `StepTypes`, and `Constants` imports went with them. Removed 8 tests.

Field persistence:
- `tMessageSchema` gains `manualSkills: z.array(z.string()).optional()`.
- Mongoose message schema gains `manualSkills: { type: [String] }` with
  matching `IMessage` TS field.
- `BaseClient.js` reads `req.body.manualSkills` on user-message save,
  filters to non-empty strings, pins onto `userMessage` before
  `saveMessageToDatabase`. Mirrors the existing `files` pattern right
  above it. Runtime resolution still reads top-level `req.body.manualSkills`
  — persistence and resolution are separate concerns.

Frontend:
- `useChatFunctions.ask` sets `currentMsg.manualSkills` directly; the
  drained atom value goes onto the message, not a separate atom.
  Removed the `attachSkillsToMessage` Recoil callback.
- `ManualSkillPills`: pure render of `message.manualSkills`. No more
  `useQueryClient`, no sibling scan, no atom read. Loses the
  auto-hide-when-card-arrives behavior — pills stay on the user
  bubble, cards live on the assistant bubble, both are informative.
- Dropped the `attachedSkillsByMessageId` atomFamily and its export.
- New `PendingManualSkillsChips` above the textarea reads the
  compose-time atom and renders chips with × to remove. Mounted in
  `ChatForm` right after `TextareaHeader`. Naturally hides on submit
  when the atom drains.

Tests: updated `ManualSkillPills` suite to the new field-based reads
(5 passing). New `PendingManualSkillsChips` suite covering empty state,
multi-chip render, single × removal, and full-clear (4 passing).
Backend suite trimmed to 89 (was 97) from the step-events test
removal — no regressions on the remaining helpers.

* 🧪 feat: Assistant-Side Skill-Loading Chips + Pill Padding

Two small UX fixes on top of the field-on-message architecture.

Pill padding: bumped the user-side `ManualSkillPills` from `py-0.5` to
`py-1` on each chip and added `py-0.5` to the wrapper so the row
breathes a little without feeling tall.

Mid-stream indicator: new `InvokingSkillsIndicator` mirrors the parent
user message's `manualSkills` onto the assistant bubble as transient
"Running X" chips while the real card is in flight. Renders above
`ContentParts` in `MessageParts`. Hides itself when the assistant's
own `content` grows a `skill` tool_call — the authoritative card from
`buildSkillPrimeContentParts.unshift` is showing, so the placeholder
steps aside. No SSE emit, no aggregator injection, no index
collision with the LLM's streaming content: just a render slot keyed
off the parent's field.

Why not stream the cards live: whichever content index we'd choose
either blocks the LLM's text stream (`updateContent` type-mismatch at
index 0) or lands below the response after sparse compaction (index
100+). Mirroring the parent field sidesteps the aggregator entirely
and gives the user an immediate "skill is loading" signal that
naturally gives way to the real card at finalize.

Covers the gap the user flagged: pills on the user message said "I
asked for these" but nothing on the assistant side said "we're
working on it" until the stream finished. 5 new tests for the
indicator: user-msg guard, missing parent-field guard, multi-chip
render, hides-on-card-landing, orphan-parent guard.

* 🔁 fix: Indicator Visibility + Carry Manual Skills Through Regenerate/Edit

Two bugs.

Indicator never rendered: `InvokingSkillsIndicator` looked up the parent
user message via `queryClient.getQueryData([QueryKeys.messages, convoId])`,
but on a new chat the React Query cache is keyed by `"new"` (the URL
`paramId`) until the server assigns a real conversation ID — while
`message.conversationId` on the assistant message is already the server
ID. Lookup missed, `skills.length === 0`, nothing rendered. Switched
to `useChatContext().getMessages()`, which reads from the same
`paramId` the rest of the UI uses, so new-chat and existing-chat cases
both resolve to the correct message list.

Regenerate / save-and-submit dropped manual skills: the compose-time
`pendingManualSkillsByConvoId` atom is drained on the first submit,
so replaying that turn later found an empty atom and sent `manualSkills: []`.
The pills were still on the user bubble, so from the user's point of
view the model was running primed — but the backend saw nothing and
produced an unprimed response.

- Added `overrideManualSkills?: string[]` to `TOptions`. Callers with a
  reference message pass its persisted `manualSkills`; `useChatFunctions.ask`
  uses the override verbatim when present, otherwise falls back to the
  existing drain-or-empty logic.
- `regenerate` in `useChatFunctions` passes `parentMessage.manualSkills`
  — the user message being regenerated has the field persisted by the
  backend, so the second turn primes the same skills as the first.
- `EditMessage.resubmitMessage` covers both edit branches:
  - User-message save-and-submit: forwards the edited message's own
    `manualSkills` so the new sibling turn primes identically.
  - Assistant-response edit: forwards the parent user message's
    `manualSkills` for the same reason.

Indicator test suite converted from `@tanstack/react-query` harness to
a jest-mocked `useChatContext().getMessages()`. 6 tests (was 5), added
a cache-miss case.

* 🧭 fix: Drive Mid-Stream Skill Chips from Submission Atom, Not Message Lookup

Message-ID-keyed lookups kept racing the stream: the user message flips
from its client-side intermediate UUID to the server-assigned ID mid-run,
conversation IDs flip from the URL `paramId="new"` to the real convo
ID on brand-new chats, and the React Query cache splits briefly between
the two. Previous attempts — direct `queryClient.getQueryData` and then
`useChatContext().getMessages()` — each missed a different window.

`TSubmission.manualSkills` is already populated at `ask()` time and the
submission atom (`store.submissionByIndex(index)`) is the single stable
anchor across the whole lifecycle: set once at submit, lives through
every SSE event, cleared when the stream ends. No ID lookups, no cache
timing.

- `InvokingSkillsIndicator` now reads `submissionByIndex(index)` via
  Recoil. Shows chips when:
    • the message is assistant-side,
    • a submission is in flight with non-empty `manualSkills`,
    • the assistant's `parentMessageId` matches
      `submission.userMessage.messageId` (so chips appear only on the
      bubble for the current turn, never on siblings),
    • the assistant's own content doesn't yet carry a `skill`
      tool_call (real card takes over from the server's post-run
      `contentParts.unshift`).
- Drops the `useChatContext().getMessages()` dependency and the
  `useQueryClient` dependency before that. No more lookups by
  conversationId or messageId.

Test suite now mocks `useChatContext` to supply `index: 0` and seeds
the `submissionByIndex(0)` atom via Recoil initializer. 6 cases cover
user-side, no-submission history, empty `manualSkills`, multi-chip
render, hides-on-card-landing, and wrong-turn guard.

* 🌱 fix: Seed Response manualSkills in createdHandler, Indicator Becomes Pure

The mid-stream indicator kept getting wired off state I don't own: first
`queryClient.getQueryData` (raced the new-chat paramId flip), then
`useChatContext().getMessages()` (same cache, same race), then
`useRecoilValue(submissionByIndex)` (pulled every message into the
submission subscription — re-renders all indicators on any submission
change, exactly the "limit hooks in rendering" concern).

Cleanest path is the one the user pointed at: the submission owns the
data, `useSSE` / `useEventHandlers` owns the save points, so seed the
field ONTO the response message at the save site and let the indicator
be a pure prop-read.

- `createdHandler` now writes `manualSkills` onto the initial response
  from `submission.manualSkills` at the moment the placeholder enters
  the messages array. The field rides through the normal mutation
  pipeline via spreads (`useStepHandler` response creation,
  `updateContent` result returns) — no special handling needed.
- `InvokingSkillsIndicator` drops the Recoil / context / queryClient
  reads. Pure function of `message`: if assistant, has `manualSkills`,
  and `content` hasn't grown a `skill` tool_call yet, render chips.
  Only `useLocalize` left, which was already unavoidable for the i18n
  string.
- Renders decouple: no single state change (`submissionByIndex` flip,
  React Query cache update) forces every indicator in the message list
  to re-render anymore. Only the message whose prop changed re-runs.

Finalize story unchanged: server's `responseMessage` doesn't carry the
frontend-only `manualSkills` field, so `finalHandler`'s replacement
drops it — but by then the real `skill` tool_call is in `content`
and the indicator's content-scan hides itself anyway.

Test suite back to pure prop mocks: 7 cases covering user-guard,
no-seed, multi-chip render, skill-card-hide, non-skill-tool-call-keeps,
text-only-keeps, and missing message.

* 🪞 fix: Render Skill Indicator Inside ContentParts, Adjacent to Parts

The indicator still wasn't showing because even though MessageParts
mounted it as a sibling of ContentParts, ContentParts is a `memo`'d
component that owns the only rendering path that refreshes in lockstep
with content deltas. Mounting above it put the indicator one layer
further out — reachable, but not exercised on the same render cycle
that processes the streaming `message` prop.

Moved the indicator into ContentParts itself, rendered at the top of
both the sequential and parallel branches. Reads the `message` prop
(newly threaded through as an optional prop alongside `content`), so:

- Same render cycle as Parts — updates from the SSE pipeline flow
  through the same pathway.
- Lives outside the `content.map`, so delta-driven content reshuffles
  never wipe it.
- Still a pure prop-read inside the indicator itself (no Recoil,
  queryClient, context hooks). The only dep is `useLocalize`.

Thread:
- `ContentPartsProps` gains `message?: TMessage`.
- `MessageParts` passes `message={message}` through, drops its own
  indicator mount + import.
- `ContentParts` renders `<InvokingSkillsIndicator message={message} />`
  in both the parallel-content and sequential-content branches, right
  under `MemoryArtifacts` and before the empty-cursor / parts map.

Companion data flow (unchanged): `createdHandler` seeds
`initialResponse.manualSkills` from `submission.manualSkills`; the
field rides through `useStepHandler` via spreads; indicator hides on
`skill` tool_call landing in `content`.

* 🔎 refactor: Narrow Skill Components to Scalar skills Prop, Kill Memo Churn

Passing the full `message` object into presentational components busts
`React.memo` shallow comparisons every time the message reference changes
for unrelated reasons. Swap to scalar `skills?: string[]` throughout:

- `InvokingSkillsIndicator`: props-only (`skills?: string[]`); visibility
  logic (user-vs-assistant, skill tool_call arrival) now lives in the
  caller so this stays pure presentational.
- `ManualSkillPills`: props-only (`skills?: string[]`).
- `ContentParts`: takes `manualSkills?: string[]` scalar, computes
  `showInvokingSkills` once per render from `manualSkills` + content scan
  for the `skill` tool_call, then mounts the indicator with `skills=`
  prop in both parallel and sequential branches.
- `MessageParts`: passes `manualSkills={message.manualSkills}` through
  to `ContentParts`.
- `Container`: passes `skills={message.manualSkills}` to `ManualSkillPills`.
- Tests updated to exercise the narrowed prop surface.

* 📜 feat: Mid-Stream Skill Cards via SkillCall, Drop Custom Indicator

Instead of a separate `InvokingSkillsIndicator` chip component, render
pending skill placeholders through the existing `SkillCall` renderer —
same component the backend's finalized prime part uses. The loading
visual (`progress < 1` + empty output → pulsing "Running X") and the
completed visual ("Ran X") now come from one source of truth.

`ContentParts` computes `pendingSkillNames` from `manualSkills` minus
any `skill` tool_call already in `content` (dedupe by `args.skillName`
since the synthetic's id differs from the real one). Those names
render through a separate slot ABOVE the Parts iteration — not
prepended to the content array, which would shift React keys on
every downstream streaming text / tool part and force unmount/remount
mid-stream.

When the real prime `tool_call` lands at finalize (backend unshifts to
content[0..]), `collectExistingSkillNames` picks it up, the pending
set empties, and the real part takes over rendering in the Parts
iteration. Layout is identical either way because primes are always
at the top of content.

- `InvokingSkillsIndicator.tsx` + test deleted (no longer referenced)
- `ContentParts.tsx` renders `<SkillCall .../>` directly for pending
  names, mirrors `Part.tsx`'s usage of the same component
- `createdHandler` doc comment updated to reflect the new flow

* ✂️ fix: Render Interim Skill Cards From manualSkills Only, Leave Content Untouched

Previous revision read `content` to de-dupe pending cards against real
`skill` tool_calls, so any optimistic skill part streamed from the
backend would race our placeholder off the screen mid-turn — exactly
the "getting overridden" symptom.

Now: interim `SkillCall` cards are driven purely by the response
message's `manualSkills` field. `content` is never inspected here,
so no backend delta can pull the cards down. The field is now seeded
directly onto the assistant placeholder in `useChatFunctions` (not
only in `createdHandler`) so the cards appear from the first render,
before the `created` SSE event round-trip.

Lifecycle:
- `useChatFunctions` puts `manualSkills` on the freshly-minted
  `initialResponse` — cards render the instant the placeholder lands.
- `createdHandler` keeps its own re-seed (idempotent; safe) so a
  regenerate / save-and-submit flow that hits that path still works.
- `useStepHandler` spread operations preserve the field through every
  content update.
- `finalHandler` replaces the message with the server-backed
  `responseMessage` (no `manualSkills`) — cards disappear, and the
  real `skill` tool_call part in `content` takes over.

ContentParts changes:
- Drop `collectExistingSkillNames` / `parseJsonField` dedupe path.
- `renderPendingSkills` reads only `manualSkills` + `isCreatedByUser`.
- Simpler control flow — one boolean (`hasPendingSkills`) gates the
  early return, one function renders.

* 🩹 fix: Codex Review Resolutions — Localization, Guards, Tests, Docs

Addresses seven findings from comprehensive code review:

Finding 1 (MAJOR) — Document sticky re-priming as intentional
- `buildSkillPrimeContentParts`: expanded doc comment explaining
  synthetic `skill` tool_calls persist and get re-primed on every
  subsequent turn via `extractInvokedSkillsFromPayload` (shape parity
  with model-invoked skills). This matches the UX: the assistant
  skill card is a visible, persistent signal that the skill is active
  for the conversation. Not a bug — called out explicitly so future
  maintainers don't mistake it for one.

Finding 2 (MAJOR) — Add ContentParts render tests
- New `ContentParts.test.tsx` with 7 cases covering the interim skill
  card logic: assistant-only rendering, user-message suppression,
  undefined-content safety, parallel+sequential branch integration,
  progress<1 (pending) state. Child components mocked so the test
  exercises only the branching and prop wiring ContentParts owns.

Finding 3 (MINOR) — Localize hardcoded aria-labels
- Added `com_ui_skills_manual_invoked` + `com_ui_skills_queued` keys.
- Reused existing `com_ui_remove_skill_var` for the remove-button
  aria-label.
- `PendingManualSkillsChips` and `ManualSkillPills` now call
  `useLocalize()`. Test mocks updated to the label-echo pattern.

Finding 4 (MINOR) — Max-length guard in `extractManualSkills`
- New `MAX_SKILL_NAME_LENGTH = 200` constant and filter. Blocks a
  crafted payload like `{ manualSkills: ['a'.repeat(100000)] }` from
  reaching `getSkillByName` / Mongo's query planner.

Finding 5 (NIT) — `BaseClient.js` comment contradicted itself
- Rewrote to call the filter what it is: defense-in-depth on top of
  Mongoose schema validation, not a redundant second layer.

Finding 6 (NIT) — `ManualSkillPills` now wrapped in `React.memo`
- Consistent with peer components (`PendingManualSkillsChips`,
  `ContentParts`). Rendered inside `Container`, which re-renders on
  every content update, so the memo is a real cycle savings.

Finding 7 (NIT) — Redundant guard in `ContentParts.renderPendingSkills`
- Collapsed the duplicate null-check by computing `pendingSkills` as
  a `useMemo`'d array (`[]` when not applicable), and mapping
  directly. `hasPendingSkills` now derives from the array length —
  one source of truth, no redundant gate inside the render function.

* 🔧 fix: Update ParallelContent to Handle Optional Content Prop

Modified the `ParallelContentRendererProps` to make the `content` prop optional, ensuring safer access within the component. Adjusted the calculation of `lastContentIdx` to handle cases where `content` may be undefined, preventing potential runtime errors. This change enhances the robustness of the component when dealing with varying message structures.

* 🎯 fix: Thread manualSkills Through ContentRender — The Real Renderer

This is why the interim skill cards never appeared across many rounds of
iteration: `ContentRender.tsx` (the memo'd renderer used by most paths,
including the agents endpoint) was calling `ContentParts` without the
`manualSkills` prop. Only `MessageParts.tsx` had it wired up — and
that's not the component that actually renders the assistant response
in production.

Two fixes:
1. Pass `manualSkills={msg.manualSkills}` to the `ContentParts` call.
2. Extend the `areContentRenderPropsEqual` memo comparator to include
   `manualSkills.length`, otherwise a message update that adds the
   field (seeded by `useChatFunctions` on the initialResponse) would
   be bailed out by the memo and never re-render.

Verified the two ContentParts call sites are now consistent; Container
usages for `ManualSkillPills` on the user side were already correct.

* 🧹 polish: Address Audit Follow-Up (F1/F3/F6)

F1 — Clarify sticky re-priming opt-out path.
  The previous comment said "regenerate without the pick" as one
  opt-out, but `useChatFunctions.regenerate` forwards the original
  picks via `overrideManualSkills`, so regeneration alone keeps the
  skill sticky. Updated to: edit the originating message to remove
  the pills and resubmit, or start a new conversation.

F3 — Add DOM-order assertions to the parallel + sequential tests.
  The two "alongside" tests verified both elements existed but
  didn't pin the ordering contract. Both now use
  `compareDocumentPosition` to assert the pending SkillCall
  precedes the real content, matching the backend semantic
  (`contentParts.unshift(...primeParts)` puts primes at the top).

F6 — Fix package import order in PendingManualSkillsChips.
  `recoil` (58 chars) was listed before `lucide-react` (45 chars)
  which violates the "shortest to longest after react" rule in
  AGENTS.md. Swapped order; no behavior change.

F2 / F4 / F5 from the audit were confirmed as non-issues
(React-safe empty map, cosmetic test-mock artifact, accepted
memo tradeoff) and require no change.

* ✨ feat: Dedicated PendingSkillCall + Running→Ran Transition on Real Content

UX polish on the interim skill card now that it's actually rendering:

1. New `PendingSkillCall` component (mirrors `SkillCall` visually but
   drops the expand affordance). `SkillCall`'s underlying `ProgressText`
   always renders a chevron + clickable button when any input is
   present, which on a card with empty output points at nothing —
   misleading cursor:pointer and a no-op toggle. The pending variant
   has only the icon + label, no button wrapper, no chevron.

2. "Running X" → "Ran X" transition when real content lands.
   `ContentParts` computes `hasRealContent` (any non-text part, or a
   text part with non-empty content — placeholder empty-text parts
   don't count) and passes `loaded={hasRealContent}` to
   `PendingSkillCall`. Matches what users see for model-invoked skills
   as they finish priming: pulsing shimmer → static icon.

3. Cleanup:
   - Dropped direct `SkillCall` import from `ContentParts` (replaced
     by `PendingSkillCall`). `SkillCall` is still used by `Part` for
     real `skill` tool_call content parts — no behavior change there.
   - Removed the now-redundant explicit `manualSkills` assignment
     in `createdHandler`. `useChatFunctions` seeds the field on
     `initialResponse` at construction, so the `...submission.initialResponse`
     spread already carries it through — the re-assignment was
     defensive belt-and-suspenders doing the same work twice. Comment
     rewritten to describe the actual lifecycle.

Tests updated to the new component (12/12 pass): two new cases pin
the loaded-state transition (unloaded when content has no real parts,
flips to loaded once a non-empty text part lands).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant