Scene Items in KWin

If your background includes game development, the concept of a scene should sound familiar. A scene is a way to organize the contents of the screen using a tree, where parent nodes affect their child nodes. In a game, a scene would typically consist of elements such as lights, actors, terrain, etc.

KWin also has a scene. With this blog post, I want to provide a quick glimpse at the current scene design, and the plan how it can be improved for Wayland.

Current state

Since compositing functionality in KWin predates Wayland, the scene is relatively simple, it’s just a list of windows sorted in the stacking order. After all, on X11, a compositing window manager only needs to take window buffers and compose them into a single image.

With the introduction of Wayland support, we started hitting limitations of the current scene design. wl_surface is a quite universal thing. It can used to represent the contents of a window, or a cursor, or a drag-and-drop icon, etc.

Since the scene thinks of the screen in terms of windows, it needs to have custom code paths to cover all potential usages of the wl_surface interface. But doing that has its own problems. For example, if an application renders cursors using a graphics api such as OpenGL or Vulkan, KWin won’t be able to display such cursors because the code path that renders cursors doesn’t handle hardware accelerated client buffers.

Another limitation of the current scene is that it doesn’t allow tracking damage more efficiently per each wl_surface, which is needed to avoid repainting areas of the screen that haven’t changed and thus keep power usage low.

Introducing scene items

The root cause of our problems is that the scene thinks of the contents of the screen in terms of windows. What if we stop viewing a window as a single, indivisible object? What if we start viewing every window as something that’s made of several other items, e.g. a surface item with window contents, a server-side decoration item, and a nine-tile patch drop shadow item?

A WindowItem is composed of several other items – a ShadowItem, a DecorationItem, and a SurfaceItem

With such a design, the scene won’t be limited only to windows, for example we could start putting drag-and-drop icons in it. In addition to that, it will be possible to reuse the code that paints wl_surface objects or track damage per individual surface

Besides windows, the scene contains a drag-and-drop icon and a software cursor

Another advantage of the item-based design is that it will provide a convenient path towards migration to a scene/render graph, which is crucial for performing compositing on different threads or less painful transition to Vulkan.

Work done so far

At the end of March, an initial batch of changes to migrate to the item-based design was merged. We still have a lot of work ahead of us, but even with those initial changes, you will already see some improves in the Wayland session. For example, there should less visual artifacts in applications that utilize sub-surfaces, e.g. Firefox.

The end goal of the transition to the item-based design is to have a more flexible and extensible scene. So far, the plan is to continue doing refactorings and avoid rewriting the entire compositing machinery, if possible. You can find out more about the scene redesign progress by visiting https://invent.kde.org/plasma/kwin/-/issues/30.

Conclusion

In short, we still have some work to do to make rendering abstractions in KWin fit well all the cases that there are on Wayland. However, even with the work done so far, the results are very promising!

Compositing Scheduling in KWin: Past, Present, and Future

From time to time, we receive regular complaints about frame scheduling. In particular, compositing not being synchronized to vblanks, missed frames, repainting monitors with different refresh rates, etc. This blog post will (hopefully) explain why these issues are present and how we plan to fix them.

Past & Present

With the current scheduling algorithm, compositing /should/ start immediately right after a vblank. A vblank is the time between the vertical front porch and the vertical back porch, or simply put, it’s the time when the display starts scanning out the contents of the next frame.

One thing that’s worth point out is that buffers are not swapped after finishing a compositing cycle, they are swapped at the start of the next compositing cycle, in other words, at the next vblank

KWin assumes that glXSwapBuffers() and eglSwapBuffers() will always block until the next vblank. By delaying the buffer swap, we have more time to process input events, do some window manager things, etc. But, this assumption is outdated, nowadays, it’s rare to see a GLX or an EGL implementation where a buffer swap operation blocks when rendering double buffered.

In case the buffer swap operation doesn’t block, which is typically the case with Mesa drivers, glXSwapBuffers() or eglSwapBuffers() will be called at the end of a compositing cycle. There is a catch though. Compositing won’t be synchronized to vblanks.

Since compositing is not synchronized with vblanks anymore, you may notice that animations in some application don’t look butter smooth as they should. This issue can be easily verified using the black frame insertion test [1].

Another problem with our compositing scheduling algorithm is latency. Ideally, if you press a key, the corresponding symbol should show up on the screen as soon as possible. In practice, things are slightly different

With the current compositing timing, if you press a key on the keyboard, it may take up to two frame before the corresponding symbol shows up on the screen. Same thing with videos, the audio might be playing two frames ahead of what is on the screen.

Monitors With Different Refresh Rates

Things get trickier if you have several monitors and they have different refresh rates. On X11, compositing is throttled to the lowest common refresh rate, in other words if you have two monitors with a refresh rate of 60Hz and one with a refresh rate of 120Hz, compositing will be performed at a rate of 60Hz. There is probably nothing that we can do about it.

On Wayland, it’s a completely different situation. From the technical point of view, we don’t have anything that prevents compositing being performed separately per screen at different refresh rates. But due to historical reasons, compositing on Wayland is throttled similar to the X11 case.

Future

Our main goals are to unlock true per screen rendering on Wayland and reduce latency caused by compositing (both on X11 and Wayland). Some work [2] has already been started to fix compositing timing and if things go smoothly, you should be able to enjoy improved frame timings in KDE Plasma 5.21.

If we start compositing as close as possible to the next vblank, then applications, such as video players, will be able to get their contents on the screen in the shortest amount of time without inducing any screen tearing.

The main drawback of this approach is that the compositor has to know how much time exactly it will take to render the next frame. In other words, we need a reliable way to predict the future, easy, no problem!

The main idea behind the compositing timing rework is to introduce a new class, called RenderLoop, that notifies the compositor when it’s a good time to start painting the next frame. On X11, there is going to be only one RenderLoop. On Wayland, every output is going to have its own RenderLoop.

As it was mentioned previously, the compositor needs to predict how long it will take to render the next frame. We solve this inconvenient problem by making two guesses:

  • The first guess is based on a desired latency level that comes from a config. If the desired latency level is high, the predicted render time will be longer; on the other hand, if the desired latency level is low, the predicted render time will be shorter;
  • The second guess is based on the duration of previous compositing cycles.

The RenderLoop makes two guesses and the one with the longest render time is used for scheduling compositing for the next frame. By making two estimates rather than one, hopefully, animations will be more or less stable.

There is no “silver bullet” solution for the render time prediction problem, unfortunately. In the end, it all comes down to making a trade-off between latency and stability. The config option lets the user decide what matters the most. It’s worth noting that with the default latency level, the compositor will make a compromise between frame latency and animation stability that should be good enough for most of users.

The introduction of the RenderLoop helper is only half of the battle. At the moment, all compositing is done on the main thread and it can get crowded. For example, if you have several outputs with different refresh rates, some of them will have to wait until it’s their turn to get repainted. This may result in missed vblanks, and thus laggy frames. In order to address this issue, we need to put compositing on different threads. That way, monitors will be repainted independently of each other. There is no concrete milestone for compositing on different threads, but most likely, it’s going to be KDE Plasma 5.22.

Conclusion

Currently, compositing infrastructure in KWin is heavily influenced by the X11 requirements, e.g. there is only one compositing clock, compositing is throttled to the lowest refresh rate, etc. Besides that, incorrect assumptions were made about the behavior of glXSwapBuffers() and eglSwapBuffers(), unfortunately, which result in frame drops and other related issues. With the ongoing Wayland improvements, we hope to fix the aforementioned issues.

Links

[1] https://www.testufo.com/blackframes#count=2&bonusufo=1&equalizer=1&background=000000

[2] https://invent.kde.org/plasma/kwin/-/issues/22