Commit f030edc
🤝 fix: Load Handoff Agents for Agents API (danny-avila#12740)
* 🤝 fix: load handoff sub-agents on OpenAI-compat endpoints (danny-avila#12726)
Extracts the BFS discovery + ACL-gated initialization of handoff sub-agents
into a shared `discoverConnectedAgents` helper in `@librechat/api` and
wires it into the OpenAI-compatible `/v1/chat/completions` and Open
Responses `/v1/responses` controllers. These endpoints previously only
passed the primary agent config to `createRun` while keeping
`primaryConfig.edges` intact, which forced `MultiAgentGraph` into
multi-agent mode without loading the referenced sub-agents and caused
StateGraph to throw "Found edge ending at unknown node <id>".
The discovery helper also filters orphaned edges (deleted sub-agents or
those the caller lacks VIEW permission on), so API users see the same
graceful fallback the chat UI already had.
* 🧪 fix: use ServerRequest in discovery spec helpers
CI `tsc --noEmit -p packages/api/tsconfig.json` caught that the test
helpers typed `req` as `express.Request`, which is not assignable to
`DiscoverConnectedAgentsParams.req` (typed as `ServerRequest` whose
`user` is `IUser`). Local jest passed because ts-jest is transpile-only,
but the CI typecheck uses the full compiler.
* 🪲 fix: drop orphan edges on both endpoints, not just `to`
Addresses the P1 codex finding on danny-avila#12740: `filterOrphanedEdges`
previously only removed edges whose `to` referenced a skipped agent.
Edges whose `from` was a skipped agent — the symmetric case in a
bidirectional graph like `A <-> B` where `B` is deleted or the user
lacks VIEW on it — leaked through to `createRun` and re-triggered
`Found edge ending at unknown node <id>` at StateGraph compile time.
The filter now drops an edge if either endpoint references a skipped
id, and the existing `to`-only test cases were updated to reflect the
stricter behavior. Adds a bidirectional-graph regression test in
`discovery.spec.ts`.
* 🔒 fix: enforce REMOTE_AGENT ACL on handoff sub-agents for API routes
Addresses the second P1 codex finding on danny-avila#12740: the OpenAI-compat
`/v1/chat/completions` and Open Responses `/v1/responses` routes gate
the primary agent on `REMOTE_AGENT` (via `createCheckRemoteAgentAccess`),
but `discoverConnectedAgents` was checking handoff sub-agents against
the looser in-app `AGENT` resource type. That allowed a remote caller
who could reach the orchestrator but had only in-app visibility on a
sub-agent to invoke it via the API — bypassing the remote-sharing
boundary.
Adds an optional `resourceType` param to `discoverConnectedAgents`
(defaulting to `AGENT` for the chat UI path) and passes
`ResourceType.REMOTE_AGENT` from both API controllers so every
discovered sub-agent clears the same sharing boundary enforced at
route entry.
* 🧯 fix: enforce allowedProviders for discovered sub-agents
Addresses the third P1 codex finding on danny-avila#12740: `discoverConnectedAgents`
forwarded the caller's `endpointOption` verbatim into `initializeAgent`,
but on the OpenAI-compat routes that option's `endpoint` is the primary
agent's provider (e.g. `openai`), not `agents`. `initializeAgent` only
enforces `allowedProviders` when `isAgentsEndpoint(endpointOption.endpoint)`
is true, so handoff sub-agents silently bypassed the provider allowlist
configured under `endpoints.agents.allowedProviders`.
Override `endpointOption.endpoint` to `EModelEndpoint.agents` for every
per-sub-agent init call. The primary agent still uses the caller's
endpointOption as before — this only affects the BFS-loaded handoff
targets. Regression test asserts the override.
* ✂️ fix: prune unreachable sub-agents after orphan-edge filtering
Addresses the fourth P1 codex finding on danny-avila#12740: BFS eagerly initializes
every sub-agent referenced in the primary's edge scan, but once
`filterOrphanedEdges` drops edges whose endpoints were skipped, some of
those sub-agents end up disconnected from the primary. In an `A -> B ->
C` graph (edges stored directly on A) where B is skipped (missing or
no VIEW), both edges are filtered, but C was already loaded and would
still be passed to `createRun` — which flips into multi-agent mode on
`agents.length > 1` and turns C into an unintended parallel start node.
After filtering edges, compute the set of agent ids reachable from the
primary through the surviving edge set and prune `agentConfigs` to that
set. Two regression tests added: one for the pruning case, one that
confirms agents connected via surviving edges are still kept.
* 🔁 fix: don't seed initialize.js agentConfigs from the pre-pruning callback
Addresses the fifth P1 codex finding on danny-avila#12740: `onAgentInitialized`
fires during BFS, BEFORE the helper prunes agents that become
disconnected once `filterOrphanedEdges` runs. Writing the sub-agent
straight into the outer `agentConfigs` there and then only additively
merging the pruned `discoveredConfigs` left stranded entries in the
outer map, and `AgentClient` would still hand them to `createRun` as
extra parallel start nodes (the exact failure mode the pass-4 prune
was meant to eliminate for the API controllers).
Drop the `agentConfigs.set` from the callback and replace the additive
merge with a direct copy from `discoveredConfigs`, which is now the
single authoritative source of what the run should see. The
per-agent tool context map is still populated during BFS — stale
entries there are harmless because they're only read by closure inside
`ON_TOOL_EXECUTE` and are unreachable once the agent is not in
`agentConfigs`.
* 🔬 fix: address audit findings on discovery helper
Resolves findings from a comprehensive external audit of danny-avila#12740.
**Finding 1 (CRITICAL) — stale edges survive the reachability prune.**
The pass-4 prune removed unreachable agents from `agentConfigs` but left
matching edges in the return value. In an `A -> B -> C -> D` graph (all
edges stored on A) where B is skipped, `filterOrphanedEdges` drops A->B
and B->C but keeps C->D (neither endpoint is skipped). The caller then
sees `agentConfigs` without C/D but `edges` still references them,
flipping `createRun` into multi-agent mode with mismatched agents/edges
— the exact crash this PR is supposed to fix. Now filter the edge list
to the reachable set in the same pass, so the returned shape is
self-consistent: every edge endpoint is either the primary id or a key
of `agentConfigs`. New regression test covers A->B->C->D with B skipped.
**Finding 2 (MAJOR) — unconditional `getModelsConfig` on every API
request.** The OpenAI-compat and Responses controllers called
`getModelsConfig(req)` and `discoverConnectedAgents` even when the
primary agent had no edges (the common single-agent API case). Gate
both behind `primaryConfig.edges?.length > 0` so single-agent runs
don't pay that cost.
**Finding 5 (MINOR) — silent mutation of caller's
`primaryConfig.userMCPAuthMap`.** The helper aliased that object and
then `Object.assign`'d sub-agent entries into it, changing the caller's
config in-place. Shallow-clone up front so the returned merged map is
the only destination.
**Finding 7 (NIT) — dead `?? []` coalescing.**
`filterOrphanedEdges` always returns a concrete array, so the
`discoveredEdges ?? []` fallback was never reached. Simplified the
`primaryConfig.edges = …` assignment.
Also adds a test that verifies `primaryConfig.userMCPAuthMap` is not
mutated in-place.
* 🧹 chore: address audit NITs on discovery helper
Addresses two NIT findings from the post-fix audit:
**F1** — the shallow clone on `primaryConfig.userMCPAuthMap` was only
applied on the primary side; the `else` branch (hit when the primary
had no MCP auth and the first sub-agent seeds the map) assigned the
sub-agent's `config.userMCPAuthMap` directly, so a later sub-agent's
`Object.assign` mutated the first one's map in place. Harmless in
practice (per-request ephemeral objects) but asymmetric. Clone in the
else branch too. Test added.
**F2** — `initialize.js` had a defensive `if (agentConfigs.size > 0 &&
!edges) edges = []` normalizer. Pre-existing dead code: the helper now
always returns a concrete array from `filteredEdges.filter(...)`.
Removed for clarity.
* 🕸 fix: require all sources reachable when traversing fan-in edges
Addresses the seventh P1 codex finding on danny-avila#12740: the reachability BFS
advanced through an edge as soon as any of its `from` endpoints matched
the current frontier node (`sources.includes(current)`), but the
subsequent edge filter required ALL sources to be reachable (`every`).
The two-semantics mismatch let a fan-in edge like `{from: ['A','B'],
to: 'C'}` mark C reachable purely via A even when B had no path from
the primary, then drop the edge itself at filter time. Result: C
survived in `agentConfigs` with no surviving edge connecting it to A,
so `createRun` flipped into multi-agent mode on `agents.length > 1`
and C ran as an unintended parallel root.
Replace the BFS with a fixed-point iteration keyed on the same
all-sources-reachable predicate used by the filter, so traversal and
filtering stay aligned and multi-source edges only fire once every
source is in the reachable set.
Two regression tests added:
- `{from: ['A','B'], to: 'C'}` with B having no incoming path — asserts
neither B nor C leak into the result.
- `A -> B`, `A -> C`, `['B','C'] -> D` — asserts the fan-in edge fires
and D becomes reachable once both B and C are.
* 🔀 fix: match SDK OR semantics for multi-source edge reachability
Reverts the all-sources-required reachability gate from 4982f1c and
replaces it with an any-source-reachable model, which matches how
`@librechat/agents`'s `MultiAgentGraph.createWorkflow` actually wires
multi-source edges at runtime (per-source `builder.addEdge(source,
destination)`). With the previous `every` gate, a legitimate handoff
edge `{ from: ['A', 'B'], to: 'C' }` where B had no incoming path was
pruned along with C, regressing OR-semantics routing that the SDK
would otherwise handle correctly.
New behavior:
1. Reachability: an edge advances when ANY of its `from` endpoints is
already reachable. Fixed-point iteration over `filteredEdges`.
2. Edge filter: keep an edge when it has at least one reachable source
AND all destinations are reachable (a missing destination would
still crash `StateGraph.compile` with `Found edge ending at unknown
node`).
3. Agent prune: keep agents that are reachable OR referenced on any
endpoint of a surviving edge. The second clause preserves co-sources
in multi-source edges (B in `{ from: ['A','B'], to: 'C' }` when
nothing else reaches B) so the SDK's per-source `addEdge` — and the
`validateEdgeAgents` safety-net I added to the SDK in danny-avila#111 — still
finds B as a node.
The pass-audit A->B->C->D regression test continues to pass: with B
skipped, `filterOrphanedEdges` drops both B-adjacent edges, reachability
never expands past A, C->D has no reachable source so it gets filtered,
and C/D are pruned because they're neither reachable nor referenced.
* ✂️ fix: strip skipped co-members from multi-source/multi-dest edges
Addresses codex pass-9 P2 on danny-avila#12740. `filterOrphanedEdges` previously
dropped an edge whenever any `from` id was skipped, which was correct
for scalar edges but over-aggressive for multi-source ones: the agents
SDK adds one `builder.addEdge(source, destination)` per source, so
`{ from: ['A','B'], to: 'C' }` with B skipped still has a valid
`A -> C` route that was being thrown away.
Now sanitize each endpoint:
- Scalar skipped → drop the whole edge (no route survives).
- Array with some skipped → strip the skipped ids, keep the edge with
the surviving members. If the array empties out, drop the edge.
Symmetric handling for `to` covers multi-destination fan-out when one
co-destination is skipped. Tests updated/added:
- `strips skipped co-sources from multi-source edges…`
- `strips skipped co-destinations from multi-destination edges`
- `drops multi-member edges only when every member on a side is skipped`
- Discovery-side: `preserves valid routes when one co-source of a
multi-source edge is skipped` asserts the end-to-end behavior —
skipped co-source B gets stripped from the edge, A->C routing
survives, and C remains in `agentConfigs`.
* 🔓 fix: respect SHARE-on-AGENT fallback for handoff ACL on API routes
Addresses codex pass-10 P1 on danny-avila#12740. The API controllers were handing
`discoverConnectedAgents` a raw `PermissionService.checkPermission` call
against `ResourceType.REMOTE_AGENT`, but the route-level middleware
(`createCheckRemoteAgentAccess`) authorizes the primary agent via
`getRemoteAgentPermissions`, which first consults the AGENT ACL and
treats owners with the SHARE bit as remotely authorized even without
an explicit REMOTE_AGENT grant. The mismatch meant a user could open
the primary via `/v1/chat/completions` or `/v1/responses`, but their
own owned handoff sub-agents were silently skipped — breaking
multi-agent handoffs for the common "owner runs their own multi-agent
orchestrator" case.
Both controllers now pass `discoverConnectedAgents` a `checkPermission`
wrapper that delegates to `getRemoteAgentPermissions` (with
`getEffectivePermissions` injected from `PermissionService`) and
compares the returned bitmask against the required permission via
`hasPermissions`. Sub-agents are now authorized by the exact same
rules the route middleware applies to the primary.
* 🌱 fix: preserve user-defined parallel-start branches
Addresses codex pass-11 P2 on danny-avila#12740. The post-filter reachability
prune seeded only from `primaryConfig.id`, which killed
`MultiAgentGraph`'s legitimate multi-start pattern — a user-defined
edge like `X -> Y` where X has no incoming path (X is an intentional
parallel starting node, run alongside the primary) was being dropped
because neither X nor Y was reachable from the primary.
Reconcile the tension with pass-4 ("prune accidental orphans when an
intermediate is skipped") by using pre-filter reachability as the
signal:
- An agent that WAS reachable from the primary via the original
(pre-filter) edges but loses that path when `filterOrphanedEdges`
runs is an accidental orphan (a skipped hop broke the chain) — prune.
- An agent that was NEVER reachable from the primary, even pre-filter,
is an intentional parallel start — seed it into post-filter
reachability so its component survives.
Surviving-edge endpoint references still keep an agent (co-sources in
multi-source edges). New test `preserves user-defined parallel-start
branches disconnected from the primary` covers the pass-11 scenario;
the existing `A->B->C->D, B skipped` regression test continues to
pass because C/D were pre-filter reachable through B and lose that
reachability after filtering.
* 🎯 fix: tighten parallel-start seed criterion to 'no pre-filter incoming edge'
Addresses codex pass-12 P1 on danny-avila#12740. The pass-11 seed heuristic — 'agent
is in `agentConfigs` but was not pre-filter reachable from the primary' —
was too permissive. A downstream agent like Y in `X -> Y` where X gets
skipped (missing / no VIEW) was never pre-filter reachable from the
primary either, so the old rule promoted Y to a parallel start node and
discovery returned `agents: [primary, Y]` with no connecting edge. The
SDK then ran Y as an unintended parallel root — exactly the orphan
behavior pass-4 wanted to prevent.
Tighter criterion: seed a post-filter reachability root only when the
agent had NO incoming edge in the pre-filter graph. That matches
`MultiAgentGraph.analyzeGraph`'s "no-incoming-edge" definition of a
start node applied to the user's original declared topology, so:
- `A -> B` plus a user-defined `X -> Y` parallel branch: X has no
incoming pre-filter → seeded → X and Y both survive.
- `A -> B` plus `X -> Y` with X skipped: Y had an incoming pre-filter
(`X -> Y`) → NOT seeded → Y is pruned as the orphan it is.
- `A -> B -> C` with B skipped: C had an incoming pre-filter (`B -> C`)
→ NOT seeded → C is pruned.
New test `does not promote a downstream orphan to a parallel start when
its only upstream is skipped` locks in the pass-12 scenario. The pass-11
`preserves user-defined parallel-start branches` test continues to hold.
* 📁 fix: don't enforce AGENT-only file ACL on REMOTE_AGENT API callers
Addresses codex pass-13 P1 on danny-avila#12740. When I refactored the API
controllers' DB-method bundle, I inadvertently started forwarding
`filterFilesByAgentAccess` into `initializeAgent`. That helper calls
`checkPermission` with `resourceType: ResourceType.AGENT`, but these
routes authorize callers through `REMOTE_AGENT` (via
`getRemoteAgentPermissions`). A user granted `REMOTE_AGENT_VIEWER` on
a shared agent but lacking direct `AGENT_VIEW` could invoke the agent
yet all its owner-attached context files would get silently filtered
out — breaking `file_search`/context retrieval for remote consumers.
Drop `filterFilesByAgentAccess` from the OpenAI-compat and Responses
controllers' `dbMethods` (and remove the now-unused import). The chat
UI's `initialize.js` keeps it since that path legitimately authorizes
at the AGENT level. No functional change inside the helper — passing
`undefined` simply tells `primeResources` to skip the per-file ACL
filter, restoring the pre-refactor API behavior.
* 🪓 fix: strip unreachable co-sources from surviving multi-source edges
Addresses codex pass-14 P1 on danny-avila#12740. The earlier pass-8 fix kept any
agent referenced as an endpoint of a surviving edge (via a
`referencedByEdge` fallback) to avoid the SDK's `validateEdgeAgents`
failing on missing nodes. But that fallback propped up unreachable
co-sources too: with `[A -> C, X -> B, [B,C] -> D]` and X skipped,
`X -> B` gets filtered, the `[B,C] -> D` fan-in survives because C is
reachable, and B stays in `agentConfigs` solely because the fan-in
still lists it. `MultiAgentGraph.analyzeGraph` then sees B with no
incoming edge and runs it as an unintended parallel root.
Sanitize surviving edges instead: for a kept edge whose `from` is an
array, filter out any co-source that isn't reachable. The SDK's
per-source `addEdge` fires independently, so dropping an unreachable
co-source doesn't invalidate the remaining routes — in the scenario
above `[B,C] -> D` becomes `[C] -> D`, every endpoint of every
surviving edge is now reachable, and the agent prune collapses to a
strict `reachable.has(agentId)` check. No more referenced-by-edge
fallback.
Regression test added: `strips unreachable co-sources from surviving
multi-source edges (no stray parallel root)` — asserts B is absent
from every surviving edge endpoint and the fan-in's `from` is just
`['C']`. All 22 prior discovery tests still pass unchanged.1 parent dda39a6 commit f030edc
11 files changed
Lines changed: 1931 additions & 201 deletions
File tree
- api/server
- controllers/agents
- __tests__
- services/Endpoints/agents
- packages/api/src/agents
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
| 49 | + | |
49 | 50 | | |
50 | 51 | | |
51 | 52 | | |
| 53 | + | |
52 | 54 | | |
53 | 55 | | |
54 | 56 | | |
| |||
72 | 74 | | |
73 | 75 | | |
74 | 76 | | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
75 | 95 | | |
76 | 96 | | |
77 | 97 | | |
| |||
91 | 111 | | |
92 | 112 | | |
93 | 113 | | |
| 114 | + | |
94 | 115 | | |
95 | 116 | | |
96 | 117 | | |
| |||
Lines changed: 21 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
43 | 43 | | |
44 | 44 | | |
45 | 45 | | |
| 46 | + | |
46 | 47 | | |
47 | 48 | | |
48 | 49 | | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
49 | 57 | | |
50 | 58 | | |
51 | 59 | | |
| |||
121 | 129 | | |
122 | 130 | | |
123 | 131 | | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
124 | 145 | | |
125 | 146 | | |
126 | 147 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
2 | 2 | | |
3 | 3 | | |
4 | | - | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
5 | 10 | | |
6 | 11 | | |
7 | 12 | | |
| |||
21 | 26 | | |
22 | 27 | | |
23 | 28 | | |
| 29 | + | |
| 30 | + | |
24 | 31 | | |
25 | 32 | | |
26 | 33 | | |
| |||
29 | 36 | | |
30 | 37 | | |
31 | 38 | | |
32 | | - | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
33 | 45 | | |
34 | 46 | | |
35 | 47 | | |
| |||
207 | 219 | | |
208 | 220 | | |
209 | 221 | | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
210 | 240 | | |
211 | 241 | | |
212 | 242 | | |
| |||
220 | 250 | | |
221 | 251 | | |
222 | 252 | | |
223 | | - | |
224 | | - | |
225 | | - | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
| 253 | + | |
234 | 254 | | |
235 | 255 | | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
| 303 | + | |
| 304 | + | |
| 305 | + | |
| 306 | + | |
| 307 | + | |
| 308 | + | |
| 309 | + | |
| 310 | + | |
| 311 | + | |
| 312 | + | |
| 313 | + | |
| 314 | + | |
| 315 | + | |
| 316 | + | |
| 317 | + | |
| 318 | + | |
| 319 | + | |
| 320 | + | |
| 321 | + | |
| 322 | + | |
| 323 | + | |
| 324 | + | |
| 325 | + | |
| 326 | + | |
| 327 | + | |
| 328 | + | |
| 329 | + | |
| 330 | + | |
| 331 | + | |
| 332 | + | |
| 333 | + | |
| 334 | + | |
| 335 | + | |
| 336 | + | |
| 337 | + | |
| 338 | + | |
236 | 339 | | |
237 | 340 | | |
238 | 341 | | |
| |||
270 | 373 | | |
271 | 374 | | |
272 | 375 | | |
273 | | - | |
| 376 | + | |
| 377 | + | |
274 | 378 | | |
275 | 379 | | |
276 | 380 | | |
277 | | - | |
278 | 381 | | |
| 382 | + | |
279 | 383 | | |
280 | | - | |
281 | | - | |
282 | | - | |
283 | | - | |
| 384 | + | |
| 385 | + | |
| 386 | + | |
| 387 | + | |
284 | 388 | | |
285 | 389 | | |
286 | 390 | | |
| |||
467 | 571 | | |
468 | 572 | | |
469 | 573 | | |
470 | | - | |
471 | | - | |
| 574 | + | |
| 575 | + | |
| 576 | + | |
| 577 | + | |
| 578 | + | |
472 | 579 | | |
473 | 580 | | |
474 | | - | |
| 581 | + | |
475 | 582 | | |
476 | 583 | | |
477 | 584 | | |
| |||
0 commit comments