2024 Web Engines Rendering in Linux
- GitHub issue: https://github.com/Igalia/webengineshackfest/issues/26
Alex García Castro (Igalia) introduces the session. We will have representatives of WebKit, Chromium, Gecko and Servo.
Martin Stransky (Red Hat) presents for Firefox. https://github.com/stransky/slides/tree/master/WebEngines2024/Dmabuf Some points:
- DMABuf provides access to GPU memory, but some operations are slow, especially reading. Firefox uses it to share buffers between processes.
- Creation: imported from VA-API, direct creation (tricky), or derived from an EGLImage/EGL framebuffer (the latter only works on NVIDIA; not working correctly on Intel/AMD).
- A buffer can be recycled as long as any fd to it remains open.
- Used in Firefox only for video decoding and playback, and for WebGL rendering.
- On VA-API playback: depending on the driver, playback can be zero-copy or not.
- On WebGL: buffers are created manually through libgbm (see slides) and imported as EGLImage backings through the EGL DMABuf import extension.
Questions:
- Alex: slow to write, compared to what?
- Two times slower than a regular copy in system memory (as it uploads to the GPU).
- The best approach is to paint with the GPU so that buffer contents always stay on the GPU. Rendering with the CPU and then uploading the buffer contents is what is slower.
- Alex: what do you use for buffer sharing between processes? How do you pass frames around?
- By passing the file descriptors and a few flags (modifiers, etc.), then re-importing them as EGLImages.
- Martin Stransky, about Firefox: we pass the display list to the GPU process; we don't use buffers for most things.
- Carlos Garcia asks: does the GPU process render directly to the screen, using a display surface?
- There is the WebRender compositor; I need to double-check, but my understanding is that it paints directly on a GPU surface. Not sure about displaying.
- We get display lists from the content process and painting happens in the parent one.
Martin Robinson (Igalia) talks about Servo (no slides)
- Servo is simpler; GPU access happens in the content process. The architecture predates browsers having multiple content processes.
- They do have a system (surfman) to broker surfaces around (IOSurface on Mac, EGL on Android, etc.).
- All the rasterization is done with WebRender, including both painting and composition.
- All the multimedia is done with GStreamer, so in theory it should be possible to use VA-API through it, as long as the target surfaces for decoded content are set up properly.
- Alex asks: you don't have a multi-process architecture?
- Yes, but instead of passing rendered content we pass WebRender display lists.
- One process handles CSS, others handle canvas and WebGL.
- There is no dedicated GPU process right now.
- WebRender is written against OpenGL so currently it does not support other, more modern APIs like Vulkan or Metal.
- The browser process does the UI? Yes.
- Alex: does WebRender create the display lists or process them?
- We create WebRender display lists, serialize them (it's fast), and ship that to the UI/browser process, where WebRender deserializes it. This serialization/deserialization is supported by WebRender.
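The serialize/ship/deserialize flow described above can be sketched in a few lines. This is a hedged toy: the item shapes are invented and JSON stands in for WebRender's compact binary format, but the shape of the pipeline (content process flattens commands to bytes, compositor process rebuilds them) is the same.

```python
import json

# A toy "display list": drawing commands produced in the content process.
# The item fields here are invented for illustration; WebRender's real
# display items and its binary serialization are more involved.
display_list = [
    {"op": "rect", "bounds": [0, 0, 800, 600], "color": "#ffffff"},
    {"op": "text", "origin": [20, 40], "glyphs": [72, 105]},
    {"op": "image", "bounds": [20, 60, 320, 240], "key": 7},
]

def serialize(items):
    """Content-process side: flatten the display list into bytes for IPC."""
    return json.dumps(items, separators=(",", ":")).encode()

def deserialize(payload):
    """Compositor-process side: rebuild the items for rendering."""
    return json.loads(payload.decode())

if __name__ == "__main__":
    wire = serialize(display_list)      # the bytes shipped over IPC
    assert deserialize(wire) == display_list
    print(len(wire), "bytes on the wire")
```

Shipping a command stream like this is typically far cheaper than shipping rasterized pixel buffers, which is the trade-off both Servo and Gecko lean on.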
- Alex: when it comes to CSS transforms, how does that work?
- WebRender can determine when it needs to create what we would consider a layer based on the contents of the display list rather than the CSS. Based on those contents it can decide to make a separate layer when, e.g., transforms are involved.
- Emilio (Mozilla): WebRender internally decides whether a stacking context needs a layer or not.
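The content-driven layerization decision described above can be illustrated with a toy pass: walk the display items in order and split out a separate layer whenever an item carries a transform, squashing everything else into a shared layer. The item fields and the rule are invented for illustration; real compositors weigh many more factors (animations, overlap, memory).

```python
def assign_layers(items):
    """Toy layerization: group display items into compositing layers,
    giving transformed items their own layer."""
    layers, current = [], []
    for item in items:
        if item.get("transform"):       # e.g. a CSS transform is present
            if current:
                layers.append(current)  # close the shared layer so far
            layers.append([item])       # transformed item gets its own layer
            current = []
        else:
            current.append(item)        # squash into the shared layer
    if current:
        layers.append(current)
    return layers

if __name__ == "__main__":
    items = [
        {"op": "rect"},
        {"op": "rect", "transform": "rotate(30deg)"},
        {"op": "text"},
    ]
    print(len(assign_layers(items)))  # 3: plain, transformed, plain
```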
- Alex: do you do the same in Mozilla?
- Emilio (Mozilla): the only time we need to pass a buffer to WebRender is for WebGL and video.
- Jose Dapena (Igalia): if you use Wayland, you have the additional composition done by the system compositor. Is that right?
- Emilio: we have some compositor integration that makes things a bit better, but I don't have all the details.
Emilio mentions in the hackfest chat that there is a #gfx:mozilla.org Matrix room where one can ask more about these topics.
Antonio Gomes (Igalia), talks about what Chromium does (no slides)
- We support the three major desktop OSes (Windows, Linux, Mac).
- The graphics stack for Linux also uses similar technologies.
- The Chromium compositor talks to the GPU process, creates the overlays, promotes things to hardware, etc. In ChromeOS, the GPU process talks back to the browser.
- In ChromeOS there is the internal Exo Wayland compositor. It mimics the architecture of the browser: it has its own GPU process and so on.
- They are trying to take shortcuts and talk directly.
- The complete process of getting web content on the screen is definitely non-trivial.
- Also, software rendering may be what most Linux users get: only a small subset of configurations supports hardware acceleration, and Chromium otherwise defaults to software. Enabling acceleration requires tinkering.
- Conversions take place to bridge integer-based calculations in Chromium with floats used in Wayland protocols.
- Lots of investment in debugging tools, the project does a good job in terms of tooling.
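The integer/float bridging mentioned above is concrete in Wayland: protocols carry coordinates as `wl_fixed_t`, a signed 24.8 fixed-point integer, so values are converted at the boundary. A minimal Python version of the conversions (the real libwayland helpers are C inlines, so treat this as a sketch of the arithmetic only):

```python
def wl_fixed_from_double(d: float) -> int:
    """Convert a double to 24.8 fixed point (Wayland's wl_fixed_t)."""
    return round(d * 256)

def wl_fixed_to_double(f: int) -> float:
    """Convert 24.8 fixed point back to a double."""
    return f / 256.0

if __name__ == "__main__":
    assert wl_fixed_from_double(1.5) == 384
    assert wl_fixed_to_double(384) == 1.5
    # Precision is 1/256 of a unit, so some fractions do not round-trip:
    assert wl_fixed_to_double(wl_fixed_from_double(0.1)) != 0.1
    print("ok")
```

The 1/256 granularity is one reason these conversions need care: repeated round-trips between a browser's float coordinates and the protocol's fixed-point values can accumulate rounding error.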
- Alex: do you use FFmpeg for media? Are you happy with it, or does it cause integration issues on Linux?
- Yes, Chromium uses FFmpeg.
- Chromium has its own wrapper/media stack, that at some point was also able to use GStreamer.
- On some embedded devices rendering media content through FFmpeg was problematic.
- Jose Dapena: Chrome has its own media stack, including tight integration with the compositor; it manages WebRTC, etc. It's a big part of the codebase. There's delegation to software (FFmpeg) for decoding, but also a number of abstractions for hardware acceleration (which doesn't go through FFmpeg). There are a number of codecs plugged into the media stack, so decoding doesn't always use FFmpeg.
- Alex: do you use display lists to share rendering information among processes, or do you pass buffers around?
- Dape: there's a mixture of both. With software rendering, rasterization happens in the render process and the tiles are what gets shared among processes; in other cases display lists are shared.
- Everything else uses display lists; even when using tiles, you generate display lists to pass to the compositor.
- The visualization process is a specific process?
- Yes, it's called the GPU process. In the traditional path it would be the process that issues GL commands. In more modern stacks we generate display lists and higher level abstractions.
- Do you have layers that are later composited?
- In the end, the layers idea is still there, but the decision about what's a layer is made at the very last stage. Sometimes you would end up painting a huge alpha channel for a background with a few strings of text; in the very last stage it may decide to paint only the fonts on top. It's able to reorder things, squash them, etc. at the last moment.
- Tiles take memory, but also bandwidth when uploaded to the GPU, as opposed to zero-copy.
- There are many reasons to make decisions at the very last moment.
- Do you use Skia for rendering?
- Servo: for CSS content we use WebRender, which supports CSS primitives; for canvas we use __ but are considering other canvas APIs or using the GPU directly. Text is done by WebRender; it uses system APIs for glyph rasterization.
- Firefox: Skia. On all platforms? Emilio: we use Skia for canvas but otherwise use WebRender.
- Servo uses Skia for color fonts but that's a corner case.
- Firefox uses Cairo for printing and PDF generation. Why? There was a project at some point to port it, but it's not finished; the main reason is that nobody has done the port yet.
- Chromium has abstractions because it is multiplatform. Does this cause issues for Linux support? How much effort goes into adapting?
- We briefly introduced Ozone in the Wayland talk. It wraps the windowing system (Wayland, X11). Previously the GPU process was simply a GPU process, but today it is much more: it has some logic, it can defer things to the last moment, and it makes decisions such as what's going to be a texture, an overlay, etc.
- In the case of ChromeOS, the most complicated part is that the client and the server side are both within the browser and communicate through it, which introduces performance penalties.
- Dape: the philosophy was, for a long time, to have a single architecture and implementation for compositing, one for rasterization, etc. The adaptations are smaller in the sense that there are small abstractions and a boundary for window adaptation and buffer adaptation; the common part is far bigger, which makes things easier to maintain (issues tend to be common). Ozone has been great at providing a boundary, and it could be the same for other platforms, but e.g. on Mac there are bigger chances to delegate to the system compositor, so Ozone is not used there. Skia takes a more prominent role; there is more delegation to Skia than in other browsers. The main problem is that the architecture is always moving.
- Why is the architecture changing so much?
- Vulkan is one of the big reasons, security too; the visualization process still follows the idea that it should be only an output process, so it does not talk back to the browser. Graphics drivers are considered unsafe, so this code is not allowed to be used in other parts of the browser.
- Dape: one thing I forgot that's interesting for all backends is overlay support. With this architecture it can be decided at the last moment what goes to an overlay, in case the hardware platform supports it. Overlays are also important for secure paths, hardware encoding... Hardware overlays? Yes, although it depends on the architecture; some platforms provide them on the GPU, others provide software equivalents with fast exchange. On the Raspberry Pi you can have layers, and blending among them is very fast.
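The last-moment overlay decision can be sketched as a small pass over the quads the compositor is about to draw: pick which ones are safe to hand to a hardware plane and composite the rest on the GPU. Everything here (the quad fields, the opacity/transform constraints, the video-first priority) is an invented simplification of what real overlay processors check.

```python
def promote_overlays(quads, max_planes):
    """Toy overlay promotion: promote up to `max_planes` eligible quads
    to hardware overlays, preferring video (the common real-world win)."""
    # Only untransformed, opaque quads are assumed safe for a plane here.
    candidates = [q for q in quads
                  if q.get("opaque") and not q.get("transform")]
    candidates.sort(key=lambda q: q["kind"] != "video")  # video first
    promoted = candidates[:max_planes]
    composited = [q for q in quads if q not in promoted]
    return promoted, composited

if __name__ == "__main__":
    quads = [
        {"kind": "ui", "opaque": True},
        {"kind": "video", "opaque": True},
        {"kind": "badge", "opaque": False},   # translucent: never promoted
    ]
    promoted, composited = promote_overlays(quads, max_planes=1)
    print([q["kind"] for q in promoted])
```

Deferring this choice to the last stage is what lets the compositor react to what is actually on screen each frame, rather than committing to layers up front.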
- Alex: in Chromium, do you support DRM as a backend?
- Dape: yes, it's possible to render to DRM/KMS; it's used for ChromeOS. As Chrome is the main compositor on that platform, it talks directly to the framebuffers.
- Georges (Igalia): with all the round trips and IPC mechanisms in Chrome, has anyone measured the latency, input delay, etc.?
- It's being measured all the time; there's a big fight between performance, security, and the conditions imposed by the process model. That's one of the reasons it's constantly changing: based on the measurements, they are trying new approaches all the time. That's a big pain for embedders and downstreams.
- Georges: what's the current status of first-frame delay?
- Maksim (Igalia): we were fighting latency in ChromeOS when implementing Lacros. It's acceptable, but depending on how lucky we get when sending the frame it can be from 0 to 2 frames, depending on many different factors, and we are working on this.
- Antonio: one of the goals of the move to Lacros is that users don't notice any difference in the transition. Measurement these days focuses on the difference between the original implementation and Lacros.
- Maksim: in some scenarios Lacros is better. And we have improved battery usage by 11% over the last two milestones.
- Dape: latency must be really good in operations like dragging a finger, and the solution for this is heuristics that estimate the position of the finger. The other extreme case is streaming games, where we need to send the inputs. There are tracing tools to measure the delays. A bigger penalty usually comes from the JS application rather than the IPC.
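The position-estimation heuristic mentioned for finger dragging can be sketched as simple linear extrapolation: use the last two input samples to estimate velocity and predict where the finger will be by the time the frame reaches the screen. This is a toy under that assumption; real input predictors (Chromium ships several) use more robust filters.

```python
def predict_position(samples, ahead_ms):
    """Linearly extrapolate the next position from timestamped samples.

    samples: list of (timestamp_ms, position_px); at least two needed.
    ahead_ms: how far into the future to predict (the pipeline latency).
    """
    (t0, x0), (t1, x1) = samples[-2], samples[-1]
    velocity = (x1 - x0) / (t1 - t0)   # pixels per millisecond
    return x1 + velocity * ahead_ms

if __name__ == "__main__":
    # A steady 0.5 px/ms drag, sampled every 8 ms.
    samples = [(0, 100.0), (8, 104.0), (16, 108.0)]
    print(predict_position(samples, ahead_ms=16))  # 116.0
```

The trade-off is the usual one for prediction: it hides latency while the motion is smooth, but overshoots when the finger suddenly stops or reverses.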
- Alex: UI integration?
- Martin: Firefox used GTK3 to render widgets but moved to native rendering. Can you render on a Wayland compositor with no GTK? GTK uses Cairo and paints to a surface; Firefox customizes its own Cairo. We paint the UI with GTK to a Cairo surface and just put it on the screen; before that, we moved to painting in the renderer. We don't use GTK3 with EGL. There are other parts: GTK3 is used for system integration (getting info about the system); we don't use it like a typical GNOME application.
Alex: so you actually use Wayland for rendering and use GTK3 for other things?
- The answer we remember (Alex) is that Firefox uses the GTK3 Wayland integration; it uses Wayland subsurfaces to allow rendering into the Wayland surface that GTK provides.
Nikolas asks if there are any larger changes incoming in other browsers. In WebKit we are doing quite traditional things: we render and share the buffer; we don't share display lists. We are currently looking into understanding how others do it and working towards a similar architecture with display lists, but is this the final design for other browsers?
- Emilio: we are doing a bunch of changes in Gecko, but the general architecture is not going to change. We deal with display lists in Gecko and we would like to offload more to WebRender, but the bigger picture is not changing.
- Martin: we don't have anything in the near future for Servo. We would like to have a GPU process, but that's a larger change.
- Antonio: do you have plans for WebKit?
- Niko: definitely. We want to introduce a visualization process, try to avoid painting in the browser process and sending things over, and separate GPU access. We are doing a Skia migration. We have an architecture that allocates many small buffers in the GPU, and that doesn't scale.
Alex: there is a session tomorrow by Carlos to explain the Skia integration. Our (WebKit) architecture is very much based on CPU rendering and we are finding limitations: allocating many buffers in the GPU does not scale, so we are very interested in seeing how rendering has evolved in other browsers. For us Linux is the main platform, so we are in a different position; we would like to have the best solution for Linux and the easiest to integrate. We are willing to increase complexity because we want to provide the best solution for the particular conditions of the platform.
We have display lists and we have prototypes to e.g. render the canvas contents from display lists.
Antonio: does the architecture change much between desktop and embedded? Does it complicate things? Alex: the APIs are the same, but you are going to use them differently depending on your goals. DMABuf is a good example: it's complex, but you have more flexibility to squeeze the hardware. In the case of WPE we want to have that complexity, and we want to be able to decide how to allocate all the different buffers (media, WebGL, tiles from rendering).
Dape links a presentation about Chromium rendering architecture: https://www.youtube.com/watch?v=K2QHdgAKP-s