core: Defer BitmapData.draw into the target's pending GPU batch #23530

Open
jarca0123 wants to merge 2 commits into ruffle-rs:master from jarca0123:perf/bitmap-draw-defer

@jarca0123
Contributor

Assisted by Claude Opus 4.7.

Route draw() through append_gpu_commands instead of calling
render_offscreen synchronously at the end. Consecutive draw() calls
onto the same target then share one render_offscreen submission at
flush time.

Without this, scenes that cache a lot of content via bmd.draw() (some
games issue tens of thousands of draws per frame) hit the backend's
per-frame draw cap repeatedly, causing a fresh command-encoder
allocation per submit cycle.
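
The effect of deferring can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in rather than Ruffle's actual API: `MockBackend`, `Target`, `Command`, and the two `run_*` helpers are invented for illustration, and only the names `append_gpu_commands` and `render_offscreen` are borrowed from the description above. The sketch assumes a flush happens once after all draws are queued.

```rust
/// Hypothetical render command; the real command list is far richer.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    DrawShape,
}

/// Hypothetical backend that only counts what it is asked to do.
#[derive(Default)]
struct MockBackend {
    /// Number of offscreen submissions issued so far.
    submissions: usize,
    /// Total number of commands received across all submissions.
    commands_seen: usize,
}

impl MockBackend {
    fn render_offscreen(&mut self, commands: &[Command]) {
        self.submissions += 1;
        self.commands_seen += commands.len();
    }
}

/// Hypothetical draw target holding a pending batch.
#[derive(Default)]
struct Target {
    pending: Vec<Command>,
}

impl Target {
    /// Deferred path: queue the commands instead of submitting them.
    fn append_gpu_commands(&mut self, commands: Vec<Command>) {
        self.pending.extend(commands);
    }

    /// Submit everything queued so far as one offscreen render.
    fn flush(&mut self, backend: &mut MockBackend) {
        if self.pending.is_empty() {
            return;
        }
        let batch = std::mem::take(&mut self.pending);
        backend.render_offscreen(&batch);
    }
}

/// Eager path: every draw() call submits on its own.
/// Returns (submissions, commands_seen).
fn run_eager(draws: usize) -> (usize, usize) {
    let mut backend = MockBackend::default();
    for _ in 0..draws {
        backend.render_offscreen(&[Command::DrawShape]);
    }
    (backend.submissions, backend.commands_seen)
}

/// Deferred path: draws accumulate and share one submission at flush time.
/// Returns (submissions, commands_seen).
fn run_deferred(draws: usize) -> (usize, usize) {
    let mut backend = MockBackend::default();
    let mut target = Target::default();
    for _ in 0..draws {
        target.append_gpu_commands(vec![Command::DrawShape]);
    }
    target.flush(&mut backend);
    (backend.submissions, backend.commands_seen)
}
```

In this toy model both paths hand the same number of commands to the backend; only the number of submissions differs. The real change also has to flush before anything reads the target's pixels, which the sketch does not model.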

@danielhjacobs added the llm (The PR contains mostly LLM-generated code), T-perf (Type: Performance Improvements), and A-core (Area: Core player, where no other category fits) labels on Apr 24, 2026

Add infrastructure on BitmapData so callers can accumulate render
commands and flush them as a single render_offscreen call instead of
one submit per call. Sub-batches become separate render passes, letting
MSAA-overlap cases insert resolve boundaries while non-overlapping
draws merge into one pass.
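
One way to picture the sub-batch split is the sketch below. It is an illustration under stated assumptions, not the PR's implementation: it assumes "overlap" means that the destination rectangle of a new draw intersects a rectangle already in the current sub-batch, and the names `Rect` and `split_into_sub_batches` are hypothetical.

```rust
/// Hypothetical destination rectangle of a queued draw.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x: i32,
    y: i32,
    w: i32,
    h: i32,
}

impl Rect {
    /// True when the two rectangles share at least one pixel.
    fn overlaps(&self, other: &Rect) -> bool {
        self.x < other.x + other.w
            && other.x < self.x + self.w
            && self.y < other.y + other.h
            && other.y < self.y + self.h
    }
}

/// Group draws into sub-batches, in submission order.
/// A new sub-batch (a separate render pass in this model) starts whenever
/// the next draw overlaps something already in the current sub-batch.
fn split_into_sub_batches(draws: &[Rect]) -> Vec<Vec<Rect>> {
    let mut batches: Vec<Vec<Rect>> = Vec::new();
    for &rect in draws {
        let needs_new_batch = match batches.last() {
            None => true,
            Some(current) => current.iter().any(|queued| queued.overlaps(&rect)),
        };
        if needs_new_batch {
            batches.push(Vec::new());
        }
        // A batch always exists here because one is pushed on the first draw.
        batches.last_mut().unwrap().push(rect);
    }
    batches
}
```

With draws at (0, 0), (20, 0), and (5, 5), each 10x10, the first two share a sub-batch and the third starts a new one because it overlaps the first.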

Use the infrastructure to keep copyPixels on the GPU when both sides
are GPU-resident:

- Plain replace (no blend) from a GPU-resident source to a different
  target uses a raw texture-to-texture copy through the new
  RenderBackend::copy_pixels_to_texture method.
- A blend where the original src_rect covers the full source bitmap
  and the target is GPU-resident queues a render_bitmap command onto
  the target's pending batch. Contiguous copyPixels calls merge into
  one batch and submit as a single render_offscreen at flush time;
  StageQuality::Low keeps the blit out of MSAA.

The CPU path stays as the fallback for same-bitmap copies, backends
without offscreen support, partial-source blends, and any case the
fast paths reject.
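
The path selection described above can be summarized as a small decision function. This is a sketch under assumptions, not the PR's code: `CopyRequest`, `CopyPath`, `choose_path`, and `gpu_request` are hypothetical names, the real checks live inside the copyPixels implementation, and the sketch conservatively requires both sides to be GPU-resident for either fast path.

```rust
/// Hypothetical summary of the three copyPixels paths.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CopyPath {
    /// Raw texture-to-texture copy (plain replace).
    TextureCopy,
    /// render_bitmap command queued onto the target's pending batch.
    QueuedBlit,
    /// CPU fallback.
    Cpu,
}

/// Hypothetical description of one copyPixels call.
#[derive(Debug, Clone, Copy)]
struct CopyRequest {
    same_bitmap: bool,
    backend_supports_offscreen: bool,
    source_on_gpu: bool,
    target_on_gpu: bool,
    blend: bool,
    src_rect_covers_source: bool,
}

fn choose_path(req: &CopyRequest) -> CopyPath {
    // Same-bitmap copies and backends without offscreen support stay on the CPU.
    if req.same_bitmap || !req.backend_supports_offscreen {
        return CopyPath::Cpu;
    }
    // Assumption: both fast paths need both sides to be GPU-resident.
    if !(req.source_on_gpu && req.target_on_gpu) {
        return CopyPath::Cpu;
    }
    if !req.blend {
        // Plain replace to a different target.
        return CopyPath::TextureCopy;
    }
    if req.src_rect_covers_source {
        // Blend of the full source bitmap.
        return CopyPath::QueuedBlit;
    }
    // Partial-source blends fall back to the CPU.
    CopyPath::Cpu
}

/// Baseline request: plain replace between two distinct GPU-resident bitmaps.
fn gpu_request() -> CopyRequest {
    CopyRequest {
        same_bitmap: false,
        backend_supports_offscreen: true,
        source_on_gpu: true,
        target_on_gpu: true,
        blend: false,
        src_rect_covers_source: true,
    }
}
```

Checking the CPU fallbacks first keeps each fast path's condition short and makes the final fallback the default for any case not explicitly accepted.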

@jarca0123 force-pushed the perf/bitmap-draw-defer branch from 01c3f13 to a3ae8a4 on April 25, 2026, 07:35