fix(core): Harden event log parse against malformed input#29004

Merged
afitzek merged 1 commit into master from afitzek/iam-565-investigation
Apr 27, 2026

@afitzek afitzek commented Apr 23, 2026

Summary

Hardens MessageEventBusLogWriter read-side so heap usage stays bounded when parsing event log files that contain malformed lines. Two defensive changes:

  • Aggregate malformed-line warnings. Previously every invalid line produced its own logger.error with the raw line interpolated into the message template. In tight-loop parses this allocates per-line memory faster than the runtime can reclaim. readLoggedMessagesFromFile now counts skipped lines in stack-local state and emits a single aggregated logger.warn at the end of each parse pass, with a 200-char truncated sample for ops triage.
  • Absolute ceiling on total lines processed per file. The existing maxMessagesPerParse guard (PR #28594, "fix(core): Guard event log parsing against unbounded memory growth") applies only in non-'all' modes and counts only successful parses, so it does not bound pathological files. A new N8N_EVENTBUS_LOGWRITER_MAXTOTALMESSAGESPERFILE (default 500_000) caps total lines processed in every mode, with skipped lines counted toward the total so 100%-malformed files also trip it.

Empirically validated with a micro-benchmark: 100k tight-loop logger.error calls with a 10 kB line interpolated allocate ~1 GB of heap; collapsing to a single aggregated warn drops the delta to 0 MB and the wall time from 2 s to 1 ms.
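
For readers who want the aggregate-warn idea in isolation, here is a minimal sketch. It is not the code from this PR: the function name, the logger interface, and the choice to keep the first malformed line as the sample are assumptions made for illustration. Only the 200-character truncation and the message wording come from the description above.

```typescript
// Hypothetical names throughout; this is not the n8n implementation.
interface WarnLogger {
  warn(message: string): void;
}

const SAMPLE_MAX_CHARS = 200; // truncation length taken from the PR description

function parseEventLogLines(
  lines: string[],
  fileName: string,
  logger: WarnLogger,
): unknown[] {
  const parsed: unknown[] = [];
  // Counters live on the stack, so each parse pass has its own state.
  let skipped = 0;
  let sample: string | undefined;

  for (const line of lines) {
    try {
      parsed.push(JSON.parse(line));
    } catch {
      skipped++;
      // Keep only the first bad line (an assumption), truncated,
      // instead of emitting one log entry per line.
      if (sample === undefined) sample = line.slice(0, SAMPLE_MAX_CHARS);
    }
  }

  if (skipped > 0) {
    logger.warn(
      `Event log parse skipped ${skipped} malformed line(s) in ${fileName}. ` +
        `Sample (truncated): ${sample}`,
    );
  }
  return parsed;
}

// Usage: one valid line, two malformed lines, a single warn at the end.
const warnings: string[] = [];
const messages = parseEventLogLines(
  ['{"id":1}', 'not-json-1', 'not-json-2'],
  'n8nEventLog.log',
  { warn: (m) => warnings.push(m) },
);
```

The point of the design is that the per-line cost of a malformed line drops to an integer increment; the only string retained is one bounded sample.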

How to test manually

Aggregate-warn path:

  1. Create a malformed event log file on a dev instance:
    for i in $(seq 1 10000); do echo "not-json-$i"; done > ~/.n8n/n8nEventLog.log
  2. Start n8n and observe startup logs.
  3. Before this change: ~10k error lines and visible memory growth during boot. After: a single warn of the form Event log parse skipped 10000 malformed line(s) in <file>. Sample (truncated): not-json-..., no per-line errors, flat memory.

Total-ceiling path:

  1. Export N8N_EVENTBUS_LOGWRITER_MAXTOTALMESSAGESPERFILE=100.
  2. Seed a file with more than 100 lines (seq 1 500 > ~/.n8n/n8nEventLog.log).
  3. Start n8n. Expect a single warn: Event log <file> exceeded 100 total lines during parse; aborting to prevent out-of-memory.

Healthy-file regression:

With no config changes and a healthy event log, behavior is byte-identical — no new warns, no dropped messages. Covered by the four pre-existing unit tests, which pass unmodified.

Key implementation decisions

  • processLoggedLine lets exceptions propagate instead of logging-and-swallowing. Aggregation lives in the caller (readLoggedMessagesFromFile) where stack-local counters are natural. The raw line is no longer baked into any log message — the truncated sample in the aggregate warn is sufficient for triage.
  • Counters are stack-local per parse pass, not instance state. Two concurrent parse passes each get their own counters; no process-global retention.
  • Skipped lines count toward maxTotalMessagesPerFile. Closes the 100%-malformed-file case the existing maxMessagesPerParse guard misses. Existing guard left intact to preserve the semantics of PR #28594 ("fix(core): Guard event log parsing against unbounded memory growth") for the orphaned-messages scenario.
  • Out of scope, flagged as follow-ups: the post-parse send() fanout amplification via MessageEventBus.initialize (re-emits every parsed message to all log-streaming destinations including Sentry NodeClient) and the unbounded ErrorReporter.seenErrors Set. This PR is partial-defence-in-depth against both (bounded parse → bounded fanout) but doesn't fix them at their source. Separate tickets to follow.
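
The first three decisions can be sketched together as follows. This is an illustration under stated assumptions, not the merged code: `processLine`, `readWithCeiling`, and the `ParseResult` shape are hypothetical, and the real reader streams a file rather than taking an array.

```typescript
// Hypothetical sketch; names and return shape are assumptions.
interface ParseResult {
  messages: unknown[];
  skipped: number;
  aborted: boolean;
}

// Throws on malformed input; the caller decides how to account for it.
function processLine(line: string): unknown {
  return JSON.parse(line);
}

function readWithCeiling(lines: string[], maxTotalPerFile: number): ParseResult {
  const messages: unknown[] = [];
  let skipped = 0; // stack-local: concurrent passes never share this counter

  for (const line of lines) {
    try {
      messages.push(processLine(line));
    } catch {
      skipped++;
    }
    // Parsed and skipped lines both count toward the cap,
    // so an all-malformed file still trips it.
    if (messages.length + skipped >= maxTotalPerFile) {
      return { messages, skipped, aborted: true };
    }
  }
  return { messages, skipped, aborted: false };
}

// Usage: 500 malformed lines with a cap of 100 stops after 100 lines.
const allBad: string[] = [];
for (let i = 0; i < 500; i++) allBad.push(`not-json-${i}`);
const result = readWithCeiling(allBad, 100);
```

Because the counters are plain locals, nothing needs resetting between passes and nothing is retained once the function returns.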

Related Links

Review / Merge checklist

  • I have seen this code, I have run this code, and I take responsibility for this code.
  • PR title and summary are descriptive. (conventions)
  • Docs updated or follow-up ticket created.
  • Tests included.
  • PR Labeled with Backport to Beta, Backport to Stable, or Backport to v1 (if the PR is an urgent fix that needs to be backported)

@afitzek afitzek marked this pull request as ready for review April 23, 2026 14:51
@afitzek afitzek requested review from a team, BGZStephen, cstuncsik, guillaumejacquart and phyllis-noester and removed request for a team April 23, 2026 14:51

codecov Bot commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 81.08108% with 7 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...e-event-bus-writer/message-event-bus-log-writer.ts | 80.55% | 7 Missing ⚠️ |

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/cli/src/eventbus/message-event-bus-writer/message-event-bus-log-writer.ts">

<violation number="1" location="packages/cli/src/eventbus/message-event-bus-writer/message-event-bus-log-writer.ts:262">
P1: The new max-total-per-file check undercounts processed lines, so the parser can still do unbounded work on files with many confirm lines.</violation>
</file>

// Absolute ceiling applied in every mode, counting skipped lines too
// so 100%-malformed files also trip it.
if (
results.loggedMessages.length - baselineCount + parseSkipped >=

@cubic-dev-ai cubic-dev-ai Bot Apr 23, 2026

P1: The new max-total-per-file check undercounts processed lines, so the parser can still do unbounded work on files with many confirm lines.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/cli/src/eventbus/message-event-bus-writer/message-event-bus-log-writer.ts, line 262:

<comment>The new max-total-per-file check undercounts processed lines, so the parser can still do unbounded work on files with many confirm lines.</comment>

<file context>
@@ -239,10 +254,29 @@ export class MessageEventBusLogWriter {
+					// Absolute ceiling applied in every mode, counting skipped lines too
+					// so 100%-malformed files also trip it.
+					if (
+						results.loggedMessages.length - baselineCount + parseSkipped >=
+						maxTotalMessagesPerFile
+					) {
</file context>

True, but this PR still leaves the code safer, so I'd keep it like that for now.

Thanks for confirming—understood, we’ll keep it as is for now.
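
The undercount raised in this thread can be illustrated with a toy model. Everything here is an assumption for illustration: the three line kinds and the premise that a confirm line is valid but adds nothing to `loggedMessages` are inferred from the review comment, not taken from the source file.

```typescript
// Hypothetical line model: the real file format is not shown in this thread.
type LineKind = 'message' | 'confirm' | 'malformed';

interface Counters {
  prStyle: number; // parsed messages + skipped lines, as in the reviewed condition
  linesRead: number; // every line, regardless of kind
}

function countLines(kinds: LineKind[]): Counters {
  let loggedMessages = 0;
  let parseSkipped = 0;
  let linesRead = 0;

  for (const kind of kinds) {
    linesRead++;
    if (kind === 'message') loggedMessages++;
    else if (kind === 'malformed') parseSkipped++;
    // 'confirm' lines are valid but add nothing to loggedMessages (assumption).
  }
  return { prStyle: loggedMessages + parseSkipped, linesRead };
}

// Usage: one message followed by 1000 confirm lines.
const kinds: LineKind[] = ['message'];
for (let i = 0; i < 1000; i++) kinds.push('confirm');
const counters = countLines(kinds);
// counters.prStyle is 1 while counters.linesRead is 1001
```

Under these assumptions a counter incremented once per line read would bound the work done, whereas the sum of parsed and skipped lines does not move on confirm lines.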

@n8n-assistant n8n-assistant Bot added the core (Enhancement outside /nodes-base and /editor-ui) and n8n team (Authored by the n8n team) labels Apr 23, 2026

github-actions Bot commented Apr 23, 2026

Performance Comparison

Comparing current, latest master, and 14-day baseline

Idle baseline with Instance AI module loaded

| Metric | Current | Latest Master | Baseline (avg) | vs Master | vs Baseline | Status |
| --- | --- | --- | --- | --- | --- | --- |
| instance-ai-rss-baseline | 352.70 MB | 352.70 MB | 367.22 MB (σ 21.72) | +0.0% | -4.0% | |
| instance-ai-heap-used-baseline | 191.66 MB | 191.66 MB | 187.42 MB (σ 1.77) | +0.0% | +2.3% | 🔴 |

Memory consumption baseline with starter plan resources

| Metric | Current | Latest Master | Baseline (avg) | vs Master | vs Baseline | Status |
| --- | --- | --- | --- | --- | --- | --- |
| memory-rss-baseline | 295.84 MB | 295.84 MB | 290.89 MB (σ 29.40) | +0.0% | +1.7% | |
| memory-heap-used-baseline | 118.45 MB | 118.45 MB | 115.30 MB (σ 1.79) | +0.0% | +2.7% | ⚠️ |

docker-stats

| Metric | Current | Latest Master | Baseline (avg) | vs Master | vs Baseline | Status |
| --- | --- | --- | --- | --- | --- | --- |
| docker-image-size-n8n | 1269.76 MB | 1280.00 MB | 1277.44 MB (σ 12.68) | -0.8% | -0.6% | |
| docker-image-size-runners | 387.00 MB | 388.00 MB | 390.69 MB (σ 8.62) | -0.3% | -0.9% | |

How to read this table
  • Current: This PR's value (or latest master if PR perf tests haven't run)
  • Latest Master: Most recent nightly master measurement
  • Baseline: Rolling 14-day average from master
  • vs Master: PR impact (current vs latest master)
  • vs Baseline: Drift from baseline (current vs rolling avg)
  • Status: ✅ within 1σ | ⚠️ 1-2σ | 🔴 >2σ regression
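
As a worked reading of the Status and vs Baseline columns, the sketch below recomputes them for the two flagged heap rows. The formulas are assumptions: the report does not state how drift is computed, whether the σ thresholds are inclusive, or whether only upward drift counts as a regression.

```typescript
type Status = 'within 1σ' | '1-2σ' | '>2σ regression';

// Assumed rule: deviation above the rolling mean, measured in units of σ.
function classify(current: number, baselineMean: number, sigma: number): Status {
  const z = (current - baselineMean) / sigma;
  if (z <= 1) return 'within 1σ';
  if (z <= 2) return '1-2σ';
  return '>2σ regression';
}

// Assumed rule: drift relative to the rolling mean, one decimal place.
function driftPercent(current: number, baselineMean: number): string {
  const pct = ((current - baselineMean) / baselineMean) * 100;
  return `${pct >= 0 ? '+' : ''}${pct.toFixed(1)}%`;
}

// instance-ai-heap-used-baseline: (191.66 - 187.42) / 1.77 ≈ 2.40σ
const heapIdle = classify(191.66, 187.42, 1.77);
// memory-heap-used-baseline: (118.45 - 115.30) / 1.79 ≈ 1.76σ
const heapStarter = classify(118.45, 115.3, 1.79);

const heapIdleDrift = driftPercent(191.66, 187.42);
const heapStarterDrift = driftPercent(118.45, 115.3);
```

Under these assumptions the two rows reproduce the 🔴 and ⚠️ markers and the +2.3% and +2.7% drift figures. Since Current equals Latest Master for every memory metric, the flagged drift is relative to the 14-day average rather than an effect of this PR.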


codecov Bot commented Apr 23, 2026

Bundle Report

Bundle size has no change ✅

@afitzek afitzek added this pull request to the merge queue Apr 27, 2026
Merged via the queue into master with commit b2b1370 Apr 27, 2026
110 of 113 checks passed
@afitzek afitzek deleted the afitzek/iam-565-investigation branch April 27, 2026 09:44
@n8n-assistant n8n-assistant Bot mentioned this pull request Apr 28, 2026

n8n-assistant Bot commented Apr 28, 2026

Got released with n8n@2.19.0

Labels

core (Enhancement outside /nodes-base and /editor-ui), n8n team (Authored by the n8n team), Released