Hi Erik,
On Mon, 6 Apr 2020 at 20:01, Erik Jensen <rkjnsn@google.com> wrote:
Screen scraping like that will have big problems trying to a) synchronize to display updates correctly (was the screen updated? did you get the old frame or the new one? and you have to poll rather than be notified), and b) synchronize framebuffer reads against writes (is the display server re-using the buffer while you are still reading it?). You also get to handle each KMS plane individually.
We're not too concerned with every frame being perfect, as long as there aren't frequent annoying artifacts and the user receives feedback to typing and mouse movement in a reasonable amount of time. (Think browsing the web, not playing a video game.) I'll play around with ffmpeg's kmsgrab and see what we might expect on that front. Obviously we'd have to handle at least the hardware cursor in addition to the primary plane. I'm not sure how common video overlays are these days; it seems most players render via GL now.
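For reference, the sort of kmsgrab invocation in question looks roughly like the following (a sketch based on ffmpeg's documented kmsgrab examples; the resolution, codec, and output path here are placeholders, and kmsgrab requires CAP_SYS_ADMIN and only captures one plane per input):

```shell
# Grab the primary plane via KMS and hand the dma-buf frames to VAAPI
# for scaling and hardware encoding, without a CPU round-trip.
sudo ffmpeg -f kmsgrab -framerate 30 -i - \
    -vf 'hwmap=derive_device=vaapi,scale_vaapi=w=1920:h=1080:format=nv12' \
    -c:v h264_vaapi capture.mp4
```

Note this still only sees the primary plane; the cursor plane (and any overlay planes) would need to be captured and composited separately.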
A lot, but not all. X11 makes that the only reasonable choice thanks to its compositing design, but Wayland makes it possible to handle video externally, and that is what is encouraged.
You have to adapt to whatever the display server does, and you have no way to negotiate a better configuration. The framebuffers could be tiled and/or compressed, and are quite likely in the kind of memory that is very slow for the CPU to read, at least directly.
Yeah, I see ffmpeg has some examples of feeding frames through VAAPI to handle situations where the buffer isn't CPU-mappable. Maybe EGL_EXT_image_dma_buf_import could also be useful here?
Don't forget modifiers!
The curtaining goes against the policy that the current DRM master is in full control of the display. It also means the kernel has to lie to the DRM master to make the display server unaware of the funny business, and I don't like that at all.
The hope was that this could be done without interfering with the DRM master at all. The DRM master could still control resolutions and displays, determine which CRTCs go to which outputs, et cetera. It's just that the content wouldn't actually be visible on the screen while curtaining was enabled, conceptually similar to the physical displays themselves being configured not to show anything (e.g., switched to a different input, or with brightness set to zero), which likewise wouldn't affect output and mode selection.
If this could be implemented in a relatively simple way (i.e., curtaining sets a flag that suppresses the actual scan out to the display, but everything else stays the same), it seems like it could be a worthwhile avenue to explore. On the other hand, if it requires adding a lot of complexity (e.g., maintaining a completely separate physical configuration for the graphics card and "shadow" configuration to report to the DRM master), I would certainly concur that it doesn't make sense to do. Which is closer to the truth is one of the things I was hoping to find out from this e-mail.
I think you just end up inventing too much fake hardware in the kernel. If you handle curtaining by requiring the screen to be on and showing a black buffer, you have to allocate and show that (not as trivial as you might hope), and then keep a whole set of shadow state. If you handle it by having the CRTC be off, you have to spin a fake vblank loop in a shadow CRTC. I don't think this is something we would really want to keep.
I believe it would be much better to cooperate with display servers than to try to bypass and fool them. Maybe look towards PipeWire, at least for the screen-capture API?
I agree that this could create a better experience for some use cases if supported by all components. Unfortunately, the variety of graphical login managers, display servers, and desktop environments with different amounts of resources and priorities means that coming up with a solution that works for all of them seems untenable. It would also preclude being able to use the console remotely.
[... separate sessions aren't viable ...]
Our hope is that interacting at the kernel level can avoid all of these issues, especially given that frame grabbing (albeit imperfect) and input injection are already supported by the kernel, with curtaining being the only thing that does not already have an existing interface.
Well, it solves the issue of needing to fix userspace, but it definitely leaves you with a worse experience.
Userspace has largely standardised on PipeWire for remote streaming, which also handles things like hardware encoding for you, if desired. This is used in the xdg-desktop-portal (as used by GNOME, Flatpak, Chromium, Firefox, others) in particular, and implemented by many desktop environments. I think continuing to push the userspace side-channel is a far more viable long-term path. I would suggest starting with a single target desktop environment to design exemplary use and semantics, and then pushing that out into other environments as you come to rely on them.
Cheers, Daniel