fix(core): Trace sub-workflow executions within their parent workflow trace #28593
BCSHroyer wants to merge 3 commits into n8n-io:master
|
@CLAassistant recheck |
`getLifecycleHooksForSubExecutions` was the only lifecycle-hooks factory that did not call `ModulesHooksRegistry.addHooks`. As a result, every backend module that reacts to workflow lifecycle events through `@OnLifecycleEvent` (OpenTelemetry tracing, and any future module) was silently skipped whenever a sub-workflow ran via the Execute Workflow node. Also plumb `parentExecution` through `ExecutionLifecycleHooks` and the `WorkflowExecuteBeforeContext` so consumers can link a sub-execution to its parent. Made-with: Cursor
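The gap this commit describes can be illustrated with a minimal sketch. Everything below is a simplified stand-in written for illustration: `LifecycleHooks`, `ModuleHooksRegistry`, `hooksForSubExecution`, and the `executionId` field are hypothetical names, not n8n's actual classes or signatures. The point is only that a factory which skips the registry call produces hooks that no module handler ever sees.

```typescript
// Simplified stand-ins for illustration only; these are NOT n8n's real classes.
type ParentExecution = { executionId: string };
type BeforeContext = {
  executionId: string;
  parentExecution: ParentExecution | undefined; // set only for sub-executions
};
type BeforeHandler = (ctx: BeforeContext) => void;

class LifecycleHooks {
  readonly executionId: string;
  readonly parentExecution: ParentExecution | undefined;
  private readonly beforeHandlers: BeforeHandler[] = [];

  constructor(executionId: string, parentExecution?: ParentExecution) {
    this.executionId = executionId;
    this.parentExecution = parentExecution;
  }

  addBeforeHandler(handler: BeforeHandler): void {
    this.beforeHandlers.push(handler);
  }

  // Hands every registered handler a context that carries the parent, if any.
  runWorkflowExecuteBefore(): void {
    const ctx: BeforeContext = {
      executionId: this.executionId,
      parentExecution: this.parentExecution,
    };
    for (const handler of this.beforeHandlers) handler(ctx);
  }
}

class ModuleHooksRegistry {
  private readonly moduleHandlers: BeforeHandler[] = [];

  register(handler: BeforeHandler): void {
    this.moduleHandlers.push(handler);
  }

  // Copies every module-registered handler onto a hooks instance.
  addHooks(hooks: LifecycleHooks): void {
    for (const handler of this.moduleHandlers) hooks.addBeforeHandler(handler);
  }
}

// Hypothetical factory for sub-executions.
function hooksForSubExecution(
  registry: ModuleHooksRegistry,
  executionId: string,
  parentExecution: ParentExecution,
): LifecycleHooks {
  const hooks = new LifecycleHooks(executionId, parentExecution);
  registry.addHooks(hooks); // without this call, module handlers never fire
  return hooks;
}
```

In this model, removing the `registry.addHooks(hooks)` line reproduces the reported symptom: `runWorkflowExecuteBefore()` still runs, but no module handler is invoked for the sub-execution.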
When a workflow is invoked via the Execute Workflow node, its `workflow.execute` span is now parented to the calling execution's `workflow.execute` span, producing a single connected trace instead of two disconnected root traces. Falls back to a root span when no parent execution is supplied or when the parent span is not found in the registry (for example, if the parent ran in a different process). Made-with: Cursor
No issues found across 6 files
Architecture diagram
sequenceDiagram
participant Node as Execute Workflow Node
participant Factory as Hooks Factory
participant Registry as Modules Hooks Registry
participant Handler as OTEL WorkflowStartHandler
participant Spans as Span Registry
participant Tracer as OTEL Tracer
Note over Node, Tracer: Sub-workflow Execution Initialization
Node->>Factory: getLifecycleHooksForSubExecutions(parentExecution)
Factory->>Registry: NEW: addHooks(hooks)
Note right of Registry: Registers module-specific handlers<br/>(like OpenTelemetry) for the sub-execution
Registry-->>Factory:
Factory-->>Node: return hooks instance
Note over Node, Tracer: Sub-workflow Start Hook Phase
Node->>Factory: runHook('workflowExecuteBefore')
Factory->>Handler: handle(ctx)
Note right of Handler: CHANGED: ctx now includes parentExecution
Handler->>Spans: CHANGED: getWorkflow(ctx.parentExecution.id)
alt Parent Span Found
Spans-->>Handler: parentSpan
Handler->>Handler: Create parent context from span
else Parent Span Missing (e.g. Scaling Mode)
Spans-->>Handler: undefined
Handler->>Handler: Use active/root context
end
Handler->>Tracer: CHANGED: startSpan('workflow.execute', ..., parentCtx)
Note right of Tracer: Sub-workflow span is now nested<br/>under parent in the same trace
Tracer-->>Handler: childSpan
Handler->>Spans: addWorkflow(childExecutionId, childSpan)
Handler-->>Factory:
Factory-->>Node:
Note over Node, Tracer: Result: Unified Trace with correct hierarchy
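The start-hook phase in the diagram can be approximated with a toy model. This is a sketch under stated assumptions: `ToySpan`, `ToySpanRegistry`, `startSpan`, and `onWorkflowStart` are hypothetical stand-ins and do not use the real OpenTelemetry API or n8n's handler. With the actual OpenTelemetry JS API, the parent link would typically be expressed by building a context with `trace.setSpan(context.active(), parentSpan)` and passing it as the third argument to `tracer.startSpan`.

```typescript
// Toy span model for illustration; NOT the OpenTelemetry API.
interface ToySpan {
  name: string;
  traceId: string;
  spanId: string;
  parentSpanId: string | undefined;
}

let idCounter = 0;
const newId = (prefix: string): string => `${prefix}-${++idCounter}`;

// A child inherits its parent's traceId; a root span starts a new trace.
function startSpan(name: string, parent?: ToySpan): ToySpan {
  return {
    name,
    traceId: parent ? parent.traceId : newId('trace'),
    spanId: newId('span'),
    parentSpanId: parent ? parent.spanId : undefined,
  };
}

class ToySpanRegistry {
  private readonly workflows = new Map<string, ToySpan>();

  addWorkflow(executionId: string, span: ToySpan): void {
    this.workflows.set(executionId, span);
  }

  getWorkflow(executionId: string): ToySpan | undefined {
    return this.workflows.get(executionId);
  }
}

// Hypothetical handler: link to the parent span when it is known locally,
// otherwise fall back to a root span (for example, parent ran in another process).
function onWorkflowStart(
  registry: ToySpanRegistry,
  ctx: { executionId: string; parentExecution?: { executionId: string } },
): ToySpan {
  const parentSpan = ctx.parentExecution
    ? registry.getWorkflow(ctx.parentExecution.executionId)
    : undefined;
  const span = startSpan('workflow.execute', parentSpan);
  registry.addWorkflow(ctx.executionId, span);
  return span;
}
```

The lookup returning `undefined` is what produces the root-span fallback in the "Parent Span Missing" branch of the diagram; no error is raised in that case.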
|
Hey @geemanjs @alielkhateeb — since you both built out the original OTel tracing support (#27528, #27789, #27568), figured you'd be the best folks to sanity-check this follow-up. It's a small fix: No rush — flagging whenever you've got a moment. Thanks! |
|
Hi @BCSHroyer Thanks for raising this PR! I have been working on trace propagation (inbound - webhooks / outbound - http requests and across instances in queue mode) using That work might actually have solved this problem also (but I will validate tomorrow and report back here). |
|
Hi @geemanjs That's great. The inbound/outbound propagation was another PR I was looking into, but I wanted to wait until this one was resolved first. Thanks! |
|
Hey @BCSHroyer, Thank you for your contribution. We appreciate the time and effort you’ve taken to submit this pull request. Before we can proceed, please ensure the following: Regarding new nodes: If your node integrates with an AI service that you own or represent, please email nodes@n8n.io and we will be happy to discuss the best approach. About review timelines: Thank you again for contributing to n8n. |
|
Thanks @BCSHroyer I've confirmed this working in the work I did - which can be found here: We also added support for "wait/resume" nodes as well as inbound/outbound propagation! I will close this PR unless you have something more to add? Appreciate your patience and thanks for doing the work here! James |
|
That's great, I'll take a look at this today. Do you know what version of n8n it
will be in so I can test it after it's released?
Thanks!
Brian
|
|
Hi Brian! It will be in Tuesday's beta release - I believe this will be a But if you want to test it today there is a nightly build which is built from the I hope it delivers on what you are hoping for! James |
|
Hi @BCSHroyer just to let you know this was just released under 2.19.0. Have a great week and thanks for your help on this |
|
Great! Thank you for letting me know! |
Context: follow-up bugfix that completes the OTel tracing coverage introduced in #27528 and #27789 — without this, sub-workflow executions were invisible in the trace tree.
Summary
Sub-workflow executions (workflows invoked via the Execute Workflow node) were not being traced by the OpenTelemetry module. The parent workflow's trace showed only the parent's spans, and the sub-workflow's `workflow.execute` / `node.execute` spans never appeared. This affected any backend module that relies on `@OnLifecycleEvent` to observe the lifecycle of a sub-execution; OpenTelemetry was the most visible symptom.

Root cause. `getLifecycleHooksForSubExecutions` was the only lifecycle-hooks factory that did not call `Container.get(ModulesHooksRegistry).addHooks(hooks)`. The other three factories (`Main`, `ScalingMain`, `ScalingWorker`) all register module hooks, so sub-executions silently bypassed every module-registered `@OnLifecycleEvent` handler.

Secondary gap. `WorkflowExecuteBeforeContext` did not carry `parentExecution`, so even with module hooks registered, the OTEL handler had no way to link the sub-workflow's `workflow.execute` span to its parent's trace.

Changes
Commit 1: `fix(core): Register module lifecycle hooks for sub-workflow executions`
- Add `Container.get(ModulesHooksRegistry).addHooks(hooks)` to `getLifecycleHooksForSubExecutions`, matching the other three factories.
- Plumb `parentExecution` through `ExecutionLifecycleHooks` (new optional constructor parameter) and through `WorkflowExecuteBeforeContext` so modules can observe the parent of a sub-execution.
- `workflowExecuteBefore` handler now fires for sub-executions and sees `parentExecution` in the context.

Commit 2: `fix(core): Link sub-workflow traces to parent workflow execution span`
- `WorkflowStartHandler` uses `ctx.parentExecution` (when present) to look up the parent's `workflow.execute` span in `SpanRegistry` and start the child `workflow.execute` span with the parent as its OTEL context. Falls back to a root span when no parent is supplied or the parent span is not in the registry.
- Tests for `WorkflowStartHandler` covering attributes, root-span fallback, child-of-parent linking, and missing-parent fallback.

Resulting trace
Before: two disconnected root traces, one for the parent and one for the sub-workflow.
After: one trace with the sub-workflow's `workflow.execute` span nested under the parent's `workflow.execute` span, sharing a single `traceId`.

Known limitation
In scaling mode, when parent and child executions run on different worker processes, the parent's `workflow.execute` span isn't present in the child process's `SpanRegistry`. In that case the child is still emitted as a root span. That is no worse than today, but a follow-up could propagate OTEL trace context through the execution queue if cross-process linking is desired. Happy to file a separate issue for that if reviewers agree.

How to test
- Set `N8N_OTEL_ENABLED=true` and `N8N_OTEL_EXPORTER_OTLP_ENDPOINT=http://<collector>:4318` on an n8n instance.
- The sub-workflow's `workflow.execute` span should share the parent's `traceId`, and its `parentSpanId` should match the parent's `workflow.execute` span id.

Related Linear tickets, Github issues, and Community forum posts
Fixes #28592
https://linear.app/n8n/issue/GHC-7752
Review / Merge checklist
`Backport to Beta`, `Backport to Stable`, or `Backport to v1` (if the PR is an urgent fix that needs to be backported)