fix(core): Trace sub-workflow executions within their parent workflow trace #28593

Closed
BCSHroyer wants to merge 3 commits into n8n-io:master from BCSHroyer:fix/ghc-7752-otel-sub-workflow-tracing

BCSHroyer commented Apr 16, 2026

Context: follow-up bugfix that completes the OTel tracing coverage introduced in #27528 and #27789 — without this, sub-workflow executions were invisible in the trace tree.

Summary

Sub-workflow executions (workflows invoked via the Execute Workflow node) were not being traced by the OpenTelemetry module. The parent workflow's trace showed only the parent's spans, and the sub-workflow's workflow.execute / node.execute spans never appeared. This affected any backend module that relies on @OnLifecycleEvent to observe the lifecycle of a sub-execution; OpenTelemetry was the most visible symptom.

Root cause. getLifecycleHooksForSubExecutions was the only lifecycle-hooks factory that did not call Container.get(ModulesHooksRegistry).addHooks(hooks). The other three factories (Main, ScalingMain, ScalingWorker) all register module hooks, so sub-executions silently bypassed every module-registered @OnLifecycleEvent handler.

Secondary gap. WorkflowExecuteBeforeContext did not carry parentExecution, so even with module hooks registered, the OTEL handler had no way to link the sub-workflow's workflow.execute span to its parent's trace.
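
To make the two gaps concrete, here is a minimal TypeScript sketch of the mechanism. It is a toy model, not the n8n source: `ToyLifecycleHooks`, `ToyModulesHooksRegistry`, `hooksForSubExecution`, the `registerModuleHooks` flag, and the shape of `parentExecution` are all hypothetical stand-ins chosen for illustration.

```typescript
// Toy model only: every name below is a hypothetical stand-in, not n8n's real code.
type ParentExecutionRef = { executionId: string; workflowId: string };
type ToyBeforeContext = { executionId: string; parentExecution?: ParentExecutionRef };
type ToyBeforeHandler = (ctx: ToyBeforeContext) => void;

class ToyLifecycleHooks {
  private readonly handlers: ToyBeforeHandler[] = [];
  readonly executionId: string;
  readonly parentExecution?: ParentExecutionRef;

  constructor(executionId: string, parentExecution?: ParentExecutionRef) {
    this.executionId = executionId;
    this.parentExecution = parentExecution;
  }

  addHandler(handler: ToyBeforeHandler): void {
    this.handlers.push(handler);
  }

  // Runs every attached handler with a context that carries the parent, if any.
  runWorkflowExecuteBefore(): void {
    const ctx: ToyBeforeContext = {
      executionId: this.executionId,
      parentExecution: this.parentExecution,
    };
    for (const handler of this.handlers) handler(ctx);
  }
}

class ToyModulesHooksRegistry {
  private readonly moduleHandlers: ToyBeforeHandler[] = [];

  register(handler: ToyBeforeHandler): void {
    this.moduleHandlers.push(handler);
  }

  // Copies every module-registered handler onto a hooks instance.
  addHooks(hooks: ToyLifecycleHooks): void {
    for (const handler of this.moduleHandlers) hooks.addHandler(handler);
  }
}

// Stand-in for the sub-execution factory. The flag exists only to contrast
// the behaviour without and with the registry call.
function hooksForSubExecution(
  registry: ToyModulesHooksRegistry,
  executionId: string,
  parentExecution: ParentExecutionRef,
  registerModuleHooks: boolean,
): ToyLifecycleHooks {
  const hooks = new ToyLifecycleHooks(executionId, parentExecution);
  if (registerModuleHooks) registry.addHooks(hooks);
  return hooks;
}

const toyRegistry = new ToyModulesHooksRegistry();
const observedParents: Array<string | undefined> = [];
toyRegistry.register((ctx) => {
  observedParents.push(ctx.parentExecution?.executionId);
});

const toyParent: ParentExecutionRef = { executionId: 'parent-1', workflowId: 'wf-parent' };

// Without the registry call the module handler never runs.
hooksForSubExecution(toyRegistry, 'child-1', toyParent, false).runWorkflowExecuteBefore();
// With the registry call the handler runs and can read the parent execution.
hooksForSubExecution(toyRegistry, 'child-2', toyParent, true).runWorkflowExecuteBefore();
```

In this model only the second call reaches the module handler, which mirrors why sub-executions were invisible to every `@OnLifecycleEvent` consumer and not just to OpenTelemetry.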

Changes

Commit 1 — fix(core): Register module lifecycle hooks for sub-workflow executions

  • Add Container.get(ModulesHooksRegistry).addHooks(hooks) to getLifecycleHooksForSubExecutions, matching the other three factories.
  • Plumb parentExecution through ExecutionLifecycleHooks (new optional constructor parameter) and through WorkflowExecuteBeforeContext so modules can observe the parent of a sub-execution.
  • Unit test: a module-registered workflowExecuteBefore handler now fires for sub-executions and sees parentExecution in the context.

Commit 2 — fix(core): Link sub-workflow traces to parent workflow execution span

  • WorkflowStartHandler uses ctx.parentExecution (when present) to look up the parent's workflow.execute span in SpanRegistry and start the child workflow.execute span with the parent as its OTEL context. Falls back to a root span when no parent is supplied or the parent span is not in the registry.
  • Unit tests for WorkflowStartHandler covering attributes, root-span fallback, child-of-parent linking, and missing-parent fallback.
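
The linking and fallback behaviour can be sketched with in-memory stand-ins. This is an assumption-laden illustration rather than the handler itself: `ToySpan`, `ToySpanRegistry`, `startWorkflowSpan`, and the id generator are hypothetical, and a real implementation would go through the OpenTelemetry API instead of building span objects by hand.

```typescript
// Hypothetical in-memory model of span linking; not the OpenTelemetry API.
type ToySpan = { name: string; traceId: string; spanId: string; parentSpanId?: string };

class ToySpanRegistry {
  private readonly workflowSpans = new Map<string, ToySpan>();

  addWorkflow(executionId: string, span: ToySpan): void {
    this.workflowSpans.set(executionId, span);
  }

  getWorkflow(executionId: string): ToySpan | undefined {
    return this.workflowSpans.get(executionId);
  }
}

let toyIdCounter = 0;
const nextToyId = (prefix: string): string => `${prefix}-${++toyIdCounter}`;

// Starts a workflow.execute span. If the parent's span is in the registry the
// new span joins its trace; otherwise it becomes a root span in a new trace.
function startWorkflowSpan(
  registry: ToySpanRegistry,
  executionId: string,
  parentExecutionId?: string,
): ToySpan {
  const parentSpan =
    parentExecutionId !== undefined ? registry.getWorkflow(parentExecutionId) : undefined;

  const span: ToySpan = parentSpan
    ? {
        name: 'workflow.execute',
        traceId: parentSpan.traceId,
        spanId: nextToyId('span'),
        parentSpanId: parentSpan.spanId,
      }
    : { name: 'workflow.execute', traceId: nextToyId('trace'), spanId: nextToyId('span') };

  registry.addWorkflow(executionId, span);
  return span;
}

const toySpans = new ToySpanRegistry();
const parentSpan = startWorkflowSpan(toySpans, 'exec-parent');
// Parent span is registered, so the child shares its trace.
const linkedChild = startWorkflowSpan(toySpans, 'exec-child', 'exec-parent');
// Parent span is unknown here (for example, it lives in another process),
// so the child falls back to a root span.
const fallbackChild = startWorkflowSpan(toySpans, 'exec-other', 'exec-on-another-worker');
```

The missing-parent branch is deliberately silent: the child still gets traced, it just starts its own trace.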

Resulting trace

Before: two disconnected root traces — one for the parent, one for the sub-workflow.

After: one trace with the sub-workflow's workflow.execute span nested under the parent's workflow.execute span, sharing a single traceId:

workflow.execute  (Parent)
├── node.execute  Trigger
├── node.execute  Set
├── node.execute  Execute Workflow
└── workflow.execute  (Child)
    ├── node.execute  Set
    └── node.execute  Code

Known limitation

In scaling mode where parent and child executions run on different worker processes, the parent's workflow.execute span isn't present in the child process's SpanRegistry. In that case the child is still emitted as a root span — no worse than today, but a follow-up could propagate OTEL trace context through the execution queue if cross-process linking is desired. Happy to file a separate issue for that if reviewers agree.
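
For reference, one possible shape for such a follow-up is to serialize the parent's context as a W3C `traceparent` value and carry it alongside the queued job. The sketch below covers only the format and parse step for version `00` of that header. The `QueuedJobPayload` type and its `traceparent` field are hypothetical and are not part of n8n's queue payload in this PR.

```typescript
// Sketch of W3C Trace Context "traceparent" handling (version 00 only).
type PropagatedContext = { traceId: string; spanId: string; sampled: boolean };

// Hypothetical job payload: the traceparent field is an assumption for illustration.
type QueuedJobPayload = { executionId: string; traceparent?: string };

// Layout: version "00", 32 hex trace-id, 16 hex parent-id, 2 hex trace-flags.
function formatTraceparent(ctx: PropagatedContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

// Returns undefined for anything that is not a valid version-00 header, so the
// caller can fall back to starting a root span.
function parseTraceparent(header: string): PropagatedContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero ids are invalid per the specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const parentCtx: PropagatedContext = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  sampled: true,
};

// Producer side: attach the parent's context to the job.
const job: QueuedJobPayload = {
  executionId: 'exec-child',
  traceparent: formatTraceparent(parentCtx),
};

// Consumer side: recover the context on the worker, or get undefined.
const restoredCtx = job.traceparent ? parseTraceparent(job.traceparent) : undefined;
```

A worker that receives no header, or an invalid one, would keep today's behaviour and emit the child as a root span.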

How to test

  1. Set N8N_OTEL_ENABLED=true and N8N_OTEL_EXPORTER_OTLP_ENDPOINT=http://<collector>:4318 on an n8n instance.
  2. Create two workflows: a "Child" with any pair of nodes, and a "Parent" that ends with an Execute Workflow node calling Child.
  3. Execute Parent and inspect the trace in your OTLP backend. The Child's workflow.execute span should share the Parent's traceId and its parentSpanId should match the Parent's workflow.execute span id.
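
If you would rather check step 3 programmatically, a small helper along these lines could be run against spans exported from the backend. It is a sketch under assumptions: the `ExportedSpanRecord` shape is hypothetical (field names differ between backends), and the check only covers the two-workflow scenario described above.

```typescript
// Hypothetical, simplified span record; real backends use their own export formats.
type ExportedSpanRecord = { name: string; traceId: string; spanId: string; parentSpanId?: string };

// Returns a list of human-readable problems; an empty list means the child
// workflow.execute span is nested under the parent's workflow.execute span.
function findLinkingProblems(spans: ExportedSpanRecord[]): string[] {
  const problems: string[] = [];
  const workflowSpans = spans.filter((s) => s.name === 'workflow.execute');

  if (workflowSpans.length < 2) {
    problems.push(`expected at least 2 workflow.execute spans, found ${workflowSpans.length}`);
    return problems;
  }

  // Treat both a missing and an empty parentSpanId as "root".
  const roots = workflowSpans.filter((s) => !s.parentSpanId);
  if (roots.length !== 1) {
    problems.push(`expected exactly 1 root workflow.execute span, found ${roots.length}`);
    return problems;
  }

  const root = roots[0];
  for (const span of workflowSpans) {
    if (span === root) continue;
    if (span.traceId !== root.traceId) {
      problems.push(`span ${span.spanId} is in a different trace than the parent`);
    }
    if (span.parentSpanId !== root.spanId) {
      problems.push(`span ${span.spanId} is not a direct child of the parent workflow span`);
    }
  }
  return problems;
}

// Shape of the "before" state: two disconnected root traces.
const disconnectedSpans: ExportedSpanRecord[] = [
  { name: 'workflow.execute', traceId: 't1', spanId: 'a' },
  { name: 'workflow.execute', traceId: 't2', spanId: 'b' },
];

// Shape of the "after" state: one trace, child nested under parent.
const connectedSpans: ExportedSpanRecord[] = [
  { name: 'workflow.execute', traceId: 't1', spanId: 'a' },
  { name: 'node.execute', traceId: 't1', spanId: 'n1', parentSpanId: 'a' },
  { name: 'workflow.execute', traceId: 't1', spanId: 'b', parentSpanId: 'a' },
];

const problemsBefore = findLinkingProblems(disconnectedSpans);
const problemsAfter = findLinkingProblems(connectedSpans);
```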

Related Linear tickets, Github issues, and Community forum posts

Fixes #28592

https://linear.app/n8n/issue/GHC-7752

Review / Merge checklist

  • I have seen this code, I have run this code, and I take responsibility for this code.
  • PR title and summary are descriptive. (conventions)
  • Docs updated or follow-up ticket created.
  • Tests included.
  • PR Labeled with Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)

CLAassistant commented Apr 16, 2026

CLA assistant check
All committers have signed the CLA.

BCSHroyer force-pushed the fix/ghc-7752-otel-sub-workflow-tracing branch from 7afe865 to eaf4c0b on April 17, 2026 14:37
@BCSHroyer
Author

@CLAassistant recheck

`getLifecycleHooksForSubExecutions` was the only lifecycle-hooks factory
that did not call `ModulesHooksRegistry.addHooks`. As a result, every
backend module that reacts to workflow lifecycle events through
`@OnLifecycleEvent` (OpenTelemetry tracing, and any future module) was
silently skipped whenever a sub-workflow ran via the Execute Workflow
node.

Also plumb `parentExecution` through `ExecutionLifecycleHooks` and the
`WorkflowExecuteBeforeContext` so consumers can link a sub-execution to
its parent.

Made-with: Cursor

When a workflow is invoked via the Execute Workflow node, its
`workflow.execute` span is now parented to the calling execution's
`workflow.execute` span, producing a single connected trace instead of
two disconnected root traces. Falls back to a root span when no parent
execution is supplied or when the parent span is not found in the
registry (for example, if the parent ran in a different process).

Made-with: Cursor

BCSHroyer force-pushed the fix/ghc-7752-otel-sub-workflow-tracing branch from eaf4c0b to f704d73 on April 17, 2026 14:39
BCSHroyer marked this pull request as ready for review on April 17, 2026 14:41
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 6 files

Architecture diagram
sequenceDiagram
    participant Node as Execute Workflow Node
    participant Factory as Hooks Factory
    participant Registry as Modules Hooks Registry
    participant Handler as OTEL WorkflowStartHandler
    participant Spans as Span Registry
    participant Tracer as OTEL Tracer

    Note over Node, Tracer: Sub-workflow Execution Initialization

    Node->>Factory: getLifecycleHooksForSubExecutions(parentExecution)
    
    Factory->>Registry: NEW: addHooks(hooks)
    Note right of Registry: Registers module-specific handlers<br/>(like OpenTelemetry) for the sub-execution
    Registry-->>Factory: 
    Factory-->>Node: return hooks instance

    Note over Node, Tracer: Sub-workflow Start Hook Phase

    Node->>Factory: runHook('workflowExecuteBefore')
    Factory->>Handler: handle(ctx)
    Note right of Handler: CHANGED: ctx now includes parentExecution
    
    Handler->>Spans: CHANGED: getWorkflow(ctx.parentExecution.id)
    alt Parent Span Found
        Spans-->>Handler: parentSpan
        Handler->>Handler: Create parent context from span
    else Parent Span Missing (e.g. Scaling Mode)
        Spans-->>Handler: undefined
        Handler->>Handler: Use active/root context
    end

    Handler->>Tracer: CHANGED: startSpan('workflow.execute', ..., parentCtx)
    Note right of Tracer: Sub-workflow span is now nested<br/>under parent in the same trace
    Tracer-->>Handler: childSpan
    
    Handler->>Spans: addWorkflow(childExecutionId, childSpan)
    Handler-->>Factory: 
    Factory-->>Node: 

    Note over Node, Tracer: Result: Unified Trace with correct hierarchy

@BCSHroyer
Author

Hey @geemanjs @alielkhateeb — since you both built out the original OTel tracing support (#27528, #27789, #27568), figured you'd be the best folks to sanity-check this follow-up. It's a small fix: getLifecycleHooksForSubExecutions wasn't calling Container.get(ModulesHooksRegistry).addHooks(hooks), so sub-workflow executions never got the OTel lifecycle hooks attached and their spans ended up orphaned from the parent workflow trace. Added a regression test as well.

No rush — flagging whenever you've got a moment. Thanks!

@geemanjs
Contributor

Hi @BCSHroyer

Thanks for raising this PR!

Over the last few days I have been working on trace propagation (inbound via webhooks, outbound via HTTP requests, and across instances in queue mode) using traceparent / tracestate, which has led to some significant rework of the OTEL module.

That work might actually have solved this problem also (but I will validate tomorrow and report back here).

@BCSHroyer
Author

Hi @geemanjs, that's great. The inbound/outbound propagation was another PR I was looking into, but I wanted to wait until this one was resolved first.

Thanks!

n8n-assistant Bot added the community (Authored by a community member), core (Enhancement outside /nodes-base and /editor-ui), and in linear (DEPRECATED) labels Apr 21, 2026
@n8n-assistant
Contributor

n8n-assistant Bot commented Apr 21, 2026

Hey @BCSHroyer,

Thank you for your contribution. We appreciate the time and effort you’ve taken to submit this pull request.

Before we can proceed, please ensure the following:
• Tests are included for any new functionality, logic changes or bug fixes.
• The PR aligns with our contribution guidelines.

Regarding new nodes:
We no longer accept new nodes directly into the core codebase. Instead, we encourage contributors to follow our Community Node Submission Guide to publish nodes independently.

If your node integrates with an AI service that you own or represent, please email nodes@n8n.io and we will be happy to discuss the best approach.

About review timelines:
This PR has been added to our internal tracker as "GHC-7809". While we plan to review it, we are currently unable to provide an exact timeframe. Our goal is to begin reviews within a month, but this may change depending on team priorities. We will reach out when the review begins.

Thank you again for contributing to n8n.

@geemanjs
Contributor

Thanks @BCSHroyer

I've confirmed this works with the changes I made, which can be found in #28720.

We also added support for "wait/resume" nodes as well as inbound/outbound propagation!

I will close this PR unless you have something more to add? Appreciate your patience and thanks for doing the work here!

James

geemanjs closed this Apr 24, 2026
@BCSHroyer
Author

BCSHroyer commented Apr 24, 2026 via email

@geemanjs
Contributor

geemanjs commented Apr 27, 2026

Hi Brian!

It will be in Tuesday's beta release - I believe this will be a 2.19.x (unless anything major comes up)

But if you want to test it today there is a nightly build which is built from the master branch each day.
ghcr.io/n8n-io/n8n:nightly

I hope it delivers on what you are hoping for!

James

@geemanjs
Contributor

Hi @BCSHroyer, just to let you know that this was released in 2.19.0:
https://github.com/n8n-io/n8n/releases/tag/n8n%402.19.0

Have a great week and thanks for your help on this
James

@BCSHroyer
Author

Great! Thank you for letting me know!

BCSHroyer deleted the fix/ghc-7752-otel-sub-workflow-tracing branch on April 28, 2026 13:02
Labels

community (Authored by a community member), core (Enhancement outside /nodes-base and /editor-ui), in linear (DEPRECATED)

Development

Successfully merging this pull request may close these issues.

OpenTelemetry sub-workflow executions not traced (missing ModulesHooksRegistry.addHooks in getLifecycleHooksForSubExecutions)

3 participants