Fix VRAM leak when monitors are off #3910

Open
phuongdpham wants to merge 1 commit into niri-wm:main from phuongdpham:fix/3295-vram-leak-screens-off

phuongdpham (Contributor) commented Apr 26, 2026

Per discussion in #3295: with monitors off, several paths kept driving the redraw loop and leaking memory.

This patch breaks the loop by:

  • Scheduling the same estimated-vblank throttle that Tty::render schedules on NoDamage, so a commit-driven queue_redraw doesn't busy-loop.
  • Skipping send_frame_callbacks while monitors_active == false - we never present, so inviting clients to commit just queues buffers that go nowhere.
  • Skipping the 1s fallback frame-callback timer while monitors are off - ticking smithay's throttle bookkeeping serves no purpose when nothing is being presented.
  • Clearing unfinished_animations_remain in the inactive-monitors branch so the estimated-vblank timer doesn't keep re-queuing redraws at refresh-rate cadence.

The throttle helpers moved from Tty to impl Niri so the inactive-monitors call site doesn't need a backend-type check.
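
The four changes listed above can be sketched with a small state-machine model. This is an illustrative sketch under assumptions, not niri's code: the enum is a simplified stand-in for the redraw state (the real one carries timer tokens and more variants), and Output, redraw_if_queued, on_estimated_vblank_timer and simulate_monitors_off are hypothetical names. It only shows why the throttle bounds redraw passes to roughly one per timer tick and why no frame callbacks go out while monitors are off.

```rust
/// Simplified, hypothetical stand-in for a per-output redraw state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RedrawState {
    Idle,
    Queued,
    WaitingForEstimatedVBlank,
    WaitingForEstimatedVBlankAndQueued,
}

/// Hypothetical per-output bookkeeping used only by this model.
struct Output {
    state: RedrawState,
    unfinished_animations_remain: bool,
    redraw_passes: u32,
    frame_callbacks: u32,
}

impl Output {
    fn new() -> Self {
        Self {
            state: RedrawState::Idle,
            // Start with an animation in flight to show that it gets cleared.
            unfinished_animations_remain: true,
            redraw_passes: 0,
            frame_callbacks: 0,
        }
    }

    /// A client commit (or anything else) requests a redraw.
    fn queue_redraw(&mut self) {
        self.state = match self.state {
            RedrawState::Idle => RedrawState::Queued,
            RedrawState::WaitingForEstimatedVBlank => {
                RedrawState::WaitingForEstimatedVBlankAndQueued
            }
            other => other,
        };
    }

    /// Event-loop step: service a queued redraw, if there is one.
    fn redraw_if_queued(&mut self, monitors_active: bool) {
        if self.state != RedrawState::Queued {
            return;
        }
        self.redraw_passes += 1;
        if !monitors_active {
            // Nothing is presented: drop pending animations, arm the
            // estimated-vblank throttle, and send no frame callbacks.
            self.unfinished_animations_remain = false;
            self.state = RedrawState::WaitingForEstimatedVBlank;
            return;
        }
        self.frame_callbacks += 1;
        self.state = RedrawState::Idle;
    }

    /// The throttle timer fired.
    fn on_estimated_vblank_timer(&mut self) {
        self.state = match self.state {
            RedrawState::WaitingForEstimatedVBlankAndQueued => RedrawState::Queued,
            RedrawState::WaitingForEstimatedVBlank if self.unfinished_animations_remain => {
                RedrawState::Queued
            }
            RedrawState::WaitingForEstimatedVBlank => RedrawState::Idle,
            other => other,
        };
    }
}

/// Runs `ticks` timer periods with `commits_per_tick` client commits each,
/// monitors off throughout. Returns (redraw passes, frame callbacks sent).
fn simulate_monitors_off(ticks: u32, commits_per_tick: u32) -> (u32, u32) {
    let mut out = Output::new();
    for _ in 0..ticks {
        for _ in 0..commits_per_tick {
            out.queue_redraw();
            out.redraw_if_queued(false);
        }
        out.on_estimated_vblank_timer();
        out.redraw_if_queued(false);
    }
    (out.redraw_passes, out.frame_callbacks)
}

fn main() {
    let (passes, callbacks) = simulate_monitors_off(3, 100);
    println!("redraw passes: {passes}, frame callbacks: {callbacks}");
}
```

In this model, 300 commits spread over three timer periods cause only four redraw passes and zero frame callbacks; without the throttle arm in redraw_if_queued, every commit would trigger its own pass.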

Credit to @sys-rq for empirical testing on i915 + Lenovo T490 that surfaced the three additional paths driving the redraw loop; the original patch only addressed the per-commit busy-loop.

Testing

  • cargo check / cargo clippy --all --all-targets / cargo +nightly fmt --all -- --check clean.
  • cargo test --all --exclude niri-visual-tests - 217 tests pass.
  • I'm on NVIDIA RTX 3060 Ti and can't reproduce locally. @sys-rq has reproduced + tested the partial patch; would appreciate a re-test of this updated version.

Refs #3295.

phuongdpham force-pushed the fix/3295-vram-leak-screens-off branch from c4ea5b0 to 67a53d8 on April 26, 2026 16:55
YaLTeR (Member) commented Apr 26, 2026

well this clearly looks kinda cursed with that check and pub(crate)

phuongdpham force-pushed the fix/3295-vram-leak-screens-off branch from 67a53d8 to 2678cc1 on April 26, 2026 17:15
phuongdpham (Contributor, Author) commented Apr 26, 2026

Refactored: both functions moved to impl Niri, no more pub(crate) or Backend::Tty check.

Heads up: with the throttle now backend-agnostic, monitors_active=false can leave Winit's redraw_state at WaitingForEstimatedVBlank{,AndQueued}, which Winit::render still marks unreachable!() (winit.rs:289-290). Not reachable in practice (no screencast on Winit), but flagging since the precondition shifted.
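
The shifted precondition can be illustrated with a small model. This is a sketch under assumptions, not niri's code: the enum is a simplified stand-in for the redraw state, and nested_backend_render is a hypothetical function that mirrors only the unreachable!() arms mentioned above. It shows that a backend which asserts on the estimated-vblank states would panic if a backend-agnostic throttle ever left it in one of them.

```rust
use std::panic;

/// Simplified, hypothetical stand-in for the redraw state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RedrawState {
    Idle,
    Queued,
    WaitingForEstimatedVBlank,
    WaitingForEstimatedVBlankAndQueued,
}

/// Hypothetical render step for a backend that never expected to see the
/// estimated-vblank states and therefore marks them unreachable.
fn nested_backend_render(state: RedrawState) -> RedrawState {
    match state {
        RedrawState::Idle | RedrawState::Queued => RedrawState::Idle,
        RedrawState::WaitingForEstimatedVBlank
        | RedrawState::WaitingForEstimatedVBlankAndQueued => {
            unreachable!("estimated-vblank state reached in a backend that never schedules it")
        }
    }
}

/// Returns true if rendering from `state` panics in this model.
fn render_panics(state: RedrawState) -> bool {
    panic::catch_unwind(move || nested_backend_render(state)).is_err()
}

fn main() {
    // Silence the default panic message so the output stays readable.
    panic::set_hook(Box::new(|_| {}));
    for state in [
        RedrawState::Idle,
        RedrawState::Queued,
        RedrawState::WaitingForEstimatedVBlank,
        RedrawState::WaitingForEstimatedVBlankAndQueued,
    ] {
        println!("{state:?}: panics = {}", render_panics(state));
    }
}
```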

YaLTeR (Member) commented Apr 26, 2026

I'm gonna have to take a good look at this and at surrounding code to verify that it's right. Sometime in the future

phuongdpham (Contributor, Author) commented

Thank you.

Sempyos added the area:output (Monitors, scaling, VRR, DPMS) and pr kind:fix (Issue fixes, code cleanups) labels Apr 27, 2026
bczhc commented Apr 27, 2026

This also fixes #3691. In that discussion the symptom is RAM climbing rather than VRAM, which is a slightly different symptom from #3295, but this PR fixes #3691 as well!

phuongdpham force-pushed the fix/3295-vram-leak-screens-off branch from 2678cc1 to 3cfd6b2 on April 27, 2026 15:51
Sempyos added the kind:leak (Memory and VRAM leaks) label Apr 27, 2026
sys-rq commented May 3, 2026

Tested on Lenovo T490/i915 + Arch + niri-git + #3910. Still observing the leak, leading to eventual overnight OOMs. First observed on Dec 12 (though I may not have updated the system for up to 2 weeks before that, so take it as an approximate date). I power off monitors frequently (Mod+Shift+P { power-off-monitors; }). Tried downgrading to the oldest locally cached version (25.08-2); it did not help. free -h reports a +2Gi increase in shared every 15 mins with monitors off.
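
For anyone who wants to track the same number without eyeballing free -h, here is a minimal sketch. It assumes a Linux system, where the shared column of free is taken from the Shmem field of /proc/meminfo; parse_shmem_kib and delta_mib are hypothetical helper names, not part of niri or any tool.

```rust
use std::fs;

/// Extracts the `Shmem:` value (in KiB) from the text of /proc/meminfo.
/// Lines such as `ShmemHugePages:` are skipped because the prefix
/// includes the colon.
fn parse_shmem_kib(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let rest = line.strip_prefix("Shmem:")?;
        rest.split_whitespace().next()?.parse().ok()
    })
}

/// Difference between two KiB samples, expressed in MiB.
fn delta_mib(before_kib: u64, after_kib: u64) -> i64 {
    (after_kib as i64 - before_kib as i64) / 1024
}

fn main() {
    match fs::read_to_string("/proc/meminfo") {
        Ok(text) => match parse_shmem_kib(&text) {
            Some(kib) => println!("Shmem: {} KiB", kib),
            None => println!("no Shmem line found"),
        },
        // Not on Linux, or /proc is unavailable.
        Err(err) => println!("could not read /proc/meminfo: {err}"),
    }
}
```

Taking one sample before powering off the monitors and another 15 minutes later, then passing both to delta_mib, gives the growth figure directly; a +2Gi increase corresponds to a delta of 2048 MiB.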

sys-rq commented May 4, 2026

What helped:

  • Preventing send_frame_callbacks() while monitors_active == false.
  • Disabling the fallback frame-callback timer while monitors are off.
  • Clearing unfinished_animations_remain in the monitors-off redraw branch so the estimated-vblank timer doesn't keep scheduling animation redraws.

phuongdpham force-pushed the fix/3295-vram-leak-screens-off branch from 3cfd6b2 to 9b10d10 on May 5, 2026 11:25
Stop driving the redraw loop while monitors are off: schedule the
estimated-vblank throttle, skip send_frame_callbacks and the fallback
timer, and clear unfinished_animations_remain.

Refs niri-wm#3295.
phuongdpham force-pushed the fix/3295-vram-leak-screens-off branch from 9b10d10 to 164c957 on May 5, 2026 11:43
phuongdpham (Contributor, Author) commented

@sys-rq Thanks for the testing and the diagnosis - all three paths you identified check out against the code, and they explain why my original patch didn't close the leak (it only stopped the per-commit busy-loop, not the timer-driven re-queuing).

Folded all three into this PR (force-pushed, see updated description):

  • Skip send_frame_callbacks while monitors_active == false
  • Skip the 1s fallback frame-callback timer while monitors are off
  • Clear unfinished_animations_remain in the inactive-monitors branch

If you have time, a re-test on the new revision would be much appreciated.

as3ii commented May 6, 2026

Per my own tests with an AMD iGPU, commit 3cfd6b2 fixed the VRAM leak related to monitors being powered off. In the next hours/days I'll test the latest commit.

phuongdpham (Contributor, Author) commented

> Per my own tests with an AMD iGPU, commit 3cfd6b2 fixed the VRAM leak related to monitors being powered off. In the next hours/days I'll test the latest commit.

Thank you @as3ii.

sys-rq commented May 6, 2026

Looks promising: tested for 1h with the monitor off and, just in case, with the screen locked but the monitor on. free -h shows no increase in shared.

phuongdpham (Contributor, Author) commented

Thanks for re-testing - good to have confirmation on the hardware where the leak was reproducible. Appreciate the legwork on the original diagnosis too.

Labels: area:output (Monitors, scaling, VRR, DPMS), kind:leak (Memory and VRAM leaks), pr kind:fix (Issue fixes, code cleanups)

6 participants