
Conversation

@almarklein (Member) commented Nov 11, 2025

Ref #66

PRs that led up to this PR

First, we moved the context classes to rendercanvas, allowing rendercanvas to implement custom behavior.

Then we applied several changes, allowing wgpu and rendercanvas to interoperate for their async work, and to make this efficient and fast using threading:

And some PRs related to present-method selection:

What this PR contributes

General:

  • The bitmap present method is now asynchronous, which significantly improves performance, making it a viable method for cases where the 'screen' method is fragile (see the sketch after this list).
  • The scheduling logic has undergone major refactoring to support this.
  • Improved support for forced drawing.
  • On Qt with the bitmap method, the canvas does not draw when minimized.
  • Numpy was added as a dependency, to allow faster array handling for bitmap-present.
  • Lays the foundation for more sophisticated bitmap-present submethods, like jpeg, jpeg encoding on the GPU, mpeg, etc.
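
To illustrate the core idea, here is a minimal sketch (not the actual rendercanvas scheduling code): consuming the bitmap of frame N overlaps with rendering frame N+1, with at most one bitmap in flight. The functions render_frame() and present_bitmap() are hypothetical stand-ins for the real rendering and presenting steps.

import queue
import threading
import time

def render_frame(i):
    # Stand-in for GPU rendering plus downloading the bitmap for one frame.
    time.sleep(0.005)
    return f"bitmap {i}"

def present_bitmap(bitmap):
    # Stand-in for consuming the bitmap (blit to a widget, send over a network, ...).
    time.sleep(0.005)

def presenter(q):
    while True:
        bitmap = q.get()
        if bitmap is None:
            break
        present_bitmap(bitmap)

def render_loop(n_frames):
    q = queue.Queue(maxsize=1)  # at most one in-flight bitmap -> one frame of delay
    worker = threading.Thread(target=presenter, args=(q,), daemon=True)
    worker.start()
    for i in range(n_frames):
        q.put(render_frame(i))  # rendering frame i+1 overlaps with presenting frame i
    q.put(None)
    worker.join()

if __name__ == "__main__":
    n = 200
    t0 = time.perf_counter()
    render_loop(n)
    print(f"{n / (time.perf_counter() - t0):.0f} fps")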

Internal API changes:

  • BaseRenderCanvas._draw_frame_and_present() is removed.
  • BaseRenderCanvas._rc_request_draw() is split into _rc_request_draw() and _rc_request_paint() (see the sketch after this list).
  • BaseRenderCanvas._rc_request_draw() should call _time_to_draw(), directly or eventually, when the canvas is ready to receive another frame.
  • BaseRenderCanvas._rc_request_paint() should eventually call _time_to_paint(),
    inside the paint-event if applicable.
  • BaseRenderCanvas._set_visible() can be used by subclasses to disable drawing
    while the canvas is invisible (e.g. minimized).
  • BaseRenderCanvas._rc_force_draw() is renamed to _rc_force_paint().
  • Context._rc_present() becomes Context._rc_present(*, force_sync: bool = False).
    • The method may return an async result when force_sync is not set.
    • The new parameter disallows async behavior in cases like forced draws and the manual offscreen canvas.
  • Add Context._rc_set_present_params(**present_params).
    • This allows backends to influence the presentation details.
    • The plan is to build on this later for the bitmap present, e.g. encoding to jpeg.
    • This logic is made part of the context, because some more sophisticated methods may use extra GPU steps before downloading the result, and any encoding can be done on the mapped data, avoiding one data copy.
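
To show how the canvas-side hooks are intended to fit together, below is a hypothetical backend sketch, not the actual Qt or glfw backend. The import path, the schedule_on_event_loop() helper, and the _on_paint_event / _on_window_state_changed methods are assumptions, standing in for whatever the GUI toolkit provides (e.g. a zero-delay timer and its paint event).

from rendercanvas.base import BaseRenderCanvas  # assumed import path

def schedule_on_event_loop(callback):
    # Stand-in for e.g. scheduling via the toolkit's event loop; here we call directly.
    callback()

class SketchCanvas(BaseRenderCanvas):

    def _rc_request_draw(self):
        # The scheduler asks for a new frame; call _time_to_draw() (directly or
        # later via the event loop) when the canvas is ready to receive it.
        schedule_on_event_loop(self._time_to_draw)

    def _rc_request_paint(self):
        # Ask the toolkit for a paint event; _time_to_paint() is then called
        # from inside that paint event (emulated by _on_paint_event here).
        schedule_on_event_loop(self._on_paint_event)

    def _on_paint_event(self):
        # In a real backend this would be the toolkit's paint-event handler.
        self._time_to_paint()

    def _rc_force_paint(self):
        # Forced (synchronous) paint, used for forced draws.
        self._on_paint_event()

    def _on_window_state_changed(self, minimized):
        # Disable drawing while the canvas is not visible (e.g. minimized).
        self._set_visible(not minimized)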

Timings

All numbers are in FPS, on a full-screen window on a Retina display (physical size 5120x2774).
Cocoa is a WIP native backend for macOS that uses Metal to display a texture stored in RAM.
The Null backend has _rc_present_bitmap as a no-op, so the bitmap is downloaded to CPU and then discarded.

Cube example

  • Screen:
    - Glfw: 180-200
    - Qt: 120-190
  • Sync bitmap:
    - Qt: ~51
    - Cocoa: 65-70
    - Null: 70-80
  • Async bitmap:
    - Qt: ~70
    - Cocoa: 110
    - Null: 140

Heavy example

  • Screen:
    - Glfw: 48-51
    - Qt: 48-51
  • Sync bitmap:
    - Qt: 21-23
    - Cocoa: 25-27
    - Null: 27-30
  • Async bitmap:
    - Qt: 46-51
    - Cocoa: 46-51
    - Null: 46-51

Interpretation

  • By doing the presentation asynchronously, performance can be significantly increased.
  • For light visualizations, where presenting to screen yields 100+ FPS:
    • With a delay of one frame, the async bitmap result is faster than sync, but not as fast as screen.
    • TODO: what if we have larger delays?
  • For heavier visualizations, where presenting to screen yields about 50-60 FPS:
    • The sync bitmap present is about twice as slow.
    • The async present with a one-frame delay is nearly as fast as screen.

@hmaarrfk (Contributor)

wow!

@hmaarrfk (Contributor)

I would love to help benchmark on some unique system configurations I have:

I was having a hard time getting PySide6 to show fps numbers above 60 fps on my (Linux) machine.

I feel like I must set:

  1. An environment variable to select which GPU
  2. Is there an environment variable to select the "null" backend??
  3. Would it print the FPS on the terminal for me?

Happy to set up a development environment for this; I can readily test on:

  1. Linux + Intel integrated GPU
  2. Linux + Nvidia dedicated GPU
  3. Linux + AMD integrated GPU
  4. Linux Laptop + Intel Integrated GPU.

@almarklein (Member, Author) commented Nov 12, 2025

Create a canvas like this (e.g. taking our cube.py example):

from rendercanvas.auto import RenderCanvas  # or a specific backend, e.g. rendercanvas.qt

canvas = RenderCanvas(
    title="$backend - $fps fps",
    update_mode="fastest",
    vsync=False,
    present_method="bitmap",  # "bitmap" or "screen"
)

Then the fps is shown in the title bar.

There is no actual null backend. I just temporarily made _rc_present_bitmap (the method that consumes the bitmap) return immediately. You can do this with any backend I guess. The idea is that it measures how fast bitmap rendering could be if the consumption of the bitmap were infinitely fast :)
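
For illustration, a minimal sketch of that trick, assuming you subclass the Qt backend's canvas class; the exact class to subclass, and whether overriding _rc_present_bitmap like this is all that's needed, are assumptions here.

from rendercanvas.qt import RenderCanvas  # assumed: the Qt backend's canvas class

class NullPresentCanvas(RenderCanvas):
    """Canvas whose bitmap consumer is a no-op, to measure the upper bound."""

    def _rc_present_bitmap(self, **kwargs):
        # The bitmap is still rendered and downloaded to CPU, but then discarded.
        pass

# Use it like the RenderCanvas in the snippet above, with present_method="bitmap".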

My first benchmarks were on my Mac M1. I will add some tests with Intel and Nvidia GPUs later.

@almarklein (Member, Author)

This piece of work was brutal, but it's nearly done now.

Apart from higher fps, this also reduces the delay between processing events and drawing. Below are two movies with an artificially low fps (10 fps). The first shows the previous behaviour:

Screen.Recording.2026-01-23.at.16.42.41.mov

In the new situation, even though the fps is low, the delay is small, which is really important for a 'smooth' experience (maybe more so than high fps):

Screen.Recording.2026-01-23.at.16.45.46.mov

@hmaarrfk (Contributor)

does your example mean that you've somehow found a way to shed 1 frame without sacrificing usability?

@almarklein (Member, Author)

does your example mean that you've somehow found a way to shed 1 frame without sacrificing usability?

It's more a question of timing. In remote rendering, you're pushing frames (over the network), and you want a mechanism for the downstream system to throttle the fps. Jupyter-rfb already has such a feedback mechanism based on the number of in-flight frames.

The naive way; by the time you send the frame, it may already be 'old':

process events -> render frame -> request draw -> ready -> send 

In current main we do this, but the frame can still be 'outdated':

process events -> request draw -> ready -> render frame -> send

In this PR, we have request_draw in addition to request_paint. Previously these were the same, which meant we could not process events in between (because you don't want to do that in a paint event). Now that they're separated, we can:

request draw -> ready -> process events -> render frame -> send
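
A simplified, runnable sketch of that ordering follows (not the actual scheduler; all functions are illustrative stand-ins, and "ready" would in practice be driven by _rc_request_draw/_time_to_draw and downstream feedback such as jupyter_rfb's in-flight frame count).

import time

pending_events = []

def wait_until_ready():
    # Stand-in for: request draw -> ready (throttled by backend/downstream feedback).
    time.sleep(0.01)

def process_events():
    # Handle whatever events have accumulated; done as late as possible so that
    # the frame that goes out reflects the most recent input.
    handled, pending_events[:] = list(pending_events), []
    return handled

def render_frame(events):
    return f"frame rendered with {len(events)} fresh events"

def send(frame):
    # Stand-in for presenting/painting, e.g. sending the bitmap downstream.
    print(frame)

for _ in range(3):
    wait_until_ready()           # request draw -> ready
    events = process_events()    # process events
    send(render_frame(events))   # render frame -> send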

@hmaarrfk (Contributor)

I see, so placing process-events between ready and render-frame explains why it feels like you won "one frame".

almarklein marked this pull request as ready for review January 27, 2026 12:26
almarklein requested a review from Korijn January 27, 2026 12:26
@almarklein (Member, Author)

Ready; added notes to top post.

Co-authored-by: Jan <Vipitis@users.noreply.github.com>
@Korijn (Contributor) commented Jan 28, 2026

I don't have time to carefully review, go ahead :)
