On Sat, Jan 9, 2021 at 12:11 AM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, Jan 8, 2021 at 11:13 AM Phillip Susi phill@thesusis.net wrote:
Could we pause this madness? Scrollback is still useful. I needed it today... it was too small, so command results I was looking for already scrolled away, but... life will be really painful with 0 scrollback.
You'll need it, too... as soon as you get oops and will want to see errors just prior to that oops.
If it means I get to maintain it... I'm not happy about it but that's better than no scrollback.
Amen! What self respecting admin installs a gui on servers? What do we have to do to get this back in? What was so buggy with this code that it needed to be removed? Why was it such a burden to just leave it be?
It really was buggy, with security implications. And we have no maintainers.
So the scroll-back code can't come back until we have a maintainer and a cleaner and simpler implementation.
And no, maintaining it really doesn't mean "just get it back to the old broken state".
So far I haven't actually seen any patches, which means that it's not coming back.
The good news? If you have an actual text VGA console, that should still work just fine.
Also on anything that is remotely modern (i.e. runs a drm kernel modesetting driver underneath the fbdev/fbcon stack) there's a pile more issues on top of just the scrollback/fbcon code being a mess. Specifically the locking is somewhere between yolo and outright deadlocks. This holds even more so if the use case here is "I want scrollback for an oops". There's rough sketches for how it could be solved, but it's all very tricky work.
Also, we need testcases for this, both in-kernel unit-test style stuff and uapi testcases. Especially the full interaction on a modern stack between /dev/fb/0, /dev/drm/card0, vt ioctls and the console is a pure nightmare.
Altogether this is a few years of full time hacking to get this back into shape, and until that's happening and clearly getting somewhere the only reasonable thing to do is to delete features in response to syzkaller crashes.
Also adding dri-devel since de facto that's the only place where display people hang out nowadays. -Daniel
Hi Daniel,
CC linux-fbdev
On Tue, Jan 12, 2021 at 5:00 PM Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Jan 9, 2021 at 12:11 AM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, Jan 8, 2021 at 11:13 AM Phillip Susi phill@thesusis.net wrote:
Could we pause this madness? Scrollback is still useful. I needed it today... it was too small, so command results I was looking for already scrolled away, but... life will be really painful with 0 scrollback.
You'll need it, too... as soon as you get oops and will want to see errors just prior to that oops.
If it means I get to maintain it... I'm not happy about it but that's better than no scrollback.
Amen! What self respecting admin installs a gui on servers? What do we have to do to get this back in? What was so buggy with this code that it needed to be removed? Why was it such a burden to just leave it be?
It really was buggy, with security implications. And we have no maintainers.
So the scroll-back code can't come back until we have a maintainer and a cleaner and simpler implementation.
And no, maintaining it really doesn't mean "just get it back to the old broken state".
So far I haven't actually seen any patches, which means that it's not coming back.
The good news? If you have an actual text VGA console, that should still work just fine.
IIRC, all of this was written for systems lacking VGA text consoles in the first place...
Also on anything that is remotely modern (i.e. runs a drm kernel modesetting driver underneath the fbdev/fbcon stack) there's a pile more issues on top of just the scrollback/fbcon code being a mess.
Would it help to remove DRM_FBDEV_EMULATION (instead)?
Specifically the locking is somewhere between yolo and outright deadlocks. This holds even more so if the use case here is "I want scrollback for an oops". There's rough sketches for how it could be solved, but it's all very tricky work.
When an oops happens, all bets are off. At that point, all information you can extract from the system is valuable, and additional locking issues are moot.
Also adding dri-devel since de facto that's the only place where display people hang out nowadays.
Please keep on CCing linux-fbdev, especially for patches removing fbdev features.
Thanks!
Gr{oetje,eeting}s,
Geert
On Thu, Jan 14, 2021 at 4:56 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Daniel,
CC linux-fbdev
On Tue, Jan 12, 2021 at 5:00 PM Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Jan 9, 2021 at 12:11 AM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, Jan 8, 2021 at 11:13 AM Phillip Susi phill@thesusis.net wrote:
Could we pause this madness? Scrollback is still useful. I needed it today... it was too small, so command results I was looking for already scrolled away, but... life will be really painful with 0 scrollback.
You'll need it, too... as soon as you get oops and will want to see errors just prior to that oops.
If it means I get to maintain it... I'm not happy about it but that's better than no scrollback.
Amen! What self respecting admin installs a gui on servers? What do we have to do to get this back in? What was so buggy with this code that it needed to be removed? Why was it such a burden to just leave it be?
It really was buggy, with security implications. And we have no maintainers.
So the scroll-back code can't come back until we have a maintainer and a cleaner and simpler implementation.
And no, maintaining it really doesn't mean "just get it back to the old broken state".
So far I haven't actually seen any patches, which means that it's not coming back.
The good news? If you have an actual text VGA console, that should still work just fine.
IIRC, all of this was written for systems lacking VGA text consoles in the first place...
Also on anything that is remotely modern (i.e. runs a drm kernel modesetting driver underneath the fbdev/fbcon stack) there's a pile more issues on top of just the scrollback/fbcon code being a mess.
Would it help to remove DRM_FBDEV_EMULATION (instead)?
It's a problem with the hardware. "Write some registers and done" isn't how display blocks work nowadays. So your proposal amounts to "no fbdev/fbcon for anything modern-ish".
Also I said "a pile more", most of the issues in fbcon/fbdev code apply for all drivers.
Specifically the locking is somewhere between yolo and outright deadlocks. This holds even more so if the use case here is "I want scrollback for an oops". There's rough sketches for how it could be solved, but it's all very tricky work.
When an oops happens, all bets are off. At that point, all information you can extract from the system is valuable, and additional locking issues are moot.
Except the first oops then scrolls away because it's getting buried under further fail. Your locking needs to be minimally good enough to not make the situation worse. -Daniel
Also adding dri-devel since de facto that's the only place where display people hang out nowadays.
Please keep on CCing linux-fbdev, especially for patches removing fbdev features.
Thanks!
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Hi Daniel,
On Thu, Jan 14, 2021 at 5:11 PM Daniel Vetter daniel@ffwll.ch wrote:
On Thu, Jan 14, 2021 at 4:56 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Jan 12, 2021 at 5:00 PM Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Jan 9, 2021 at 12:11 AM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, Jan 8, 2021 at 11:13 AM Phillip Susi phill@thesusis.net wrote:
Could we pause this madness? Scrollback is still useful. I needed it today... it was too small, so command results I was looking for already scrolled away, but... life will be really painful with 0 scrollback.
You'll need it, too... as soon as you get oops and will want to see errors just prior to that oops.
If it means I get to maintain it... I'm not happy about it but that's better than no scrollback.
Amen! What self respecting admin installs a gui on servers? What do we have to do to get this back in? What was so buggy with this code that it needed to be removed? Why was it such a burden to just leave it be?
It really was buggy, with security implications. And we have no maintainers.
So the scroll-back code can't come back until we have a maintainer and a cleaner and simpler implementation.
And no, maintaining it really doesn't mean "just get it back to the old broken state".
So far I haven't actually seen any patches, which means that it's not coming back.
The good news? If you have an actual text VGA console, that should still work just fine.
IIRC, all of this was written for systems lacking VGA text consoles in the first place...
Also on anything that is remotely modern (i.e. runs a drm kernel modesetting driver underneath the fbdev/fbcon stack) there's a pile more issues on top of just the scrollback/fbcon code being a mess.
Would it help to remove DRM_FBDEV_EMULATION (instead)?
It's a problem with the hardware. "Write some registers and done" isn't how display blocks work nowadays. So your proposal amounts to "no fbdev/fbcon for anything modern-ish".
With "modern-ish" actually meaning: "desktop/gaming/mobile-style 3D-accelerated wide-color display hardware". There's plenty of display hardware that doesn't fall into that class, and is served by fbdev (also out-of-tree due to the moratorium) because of that.
Also I said "a pile more", most of the issues in fbcon/fbdev code apply for all drivers.
Specifically the locking is somewhere between yolo and outright deadlocks. This holds even more so if the use case here is "I want scrollback for an oops". There's rough sketches for how it could be solved, but it's all very tricky work.
When an oops happens, all bets are off. At that point, all information you can extract from the system is valuable, and additional locking issues are moot.
Except the first oops then scrolls away because it's getting buried under further fail. Your locking needs to be minimally good enough to not make the situation worse.
When an oops happens, all bets are off...
Gr{oetje,eeting}s,
Geert
Hi
Am 15.01.21 um 09:06 schrieb Geert Uytterhoeven:
Hi Daniel,
On Thu, Jan 14, 2021 at 5:11 PM Daniel Vetter daniel@ffwll.ch wrote:
On Thu, Jan 14, 2021 at 4:56 PM Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Jan 12, 2021 at 5:00 PM Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Jan 9, 2021 at 12:11 AM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, Jan 8, 2021 at 11:13 AM Phillip Susi phill@thesusis.net wrote:
Could we pause this madness? Scrollback is still useful. I needed it today... it was too small, so command results I was looking for already scrolled away, but... life will be really painful with 0 scrollback.
You'll need it, too... as soon as you get oops and will want to see errors just prior to that oops.
If it means I get to maintain it... I'm not happy about it but that's better than no scrollback.
Amen! What self respecting admin installs a gui on servers? What do we have to do to get this back in? What was so buggy with this code that it needed to be removed? Why was it such a burden to just leave it be?
It really was buggy, with security implications. And we have no maintainers.
So the scroll-back code can't come back until we have a maintainer and a cleaner and simpler implementation.
And no, maintaining it really doesn't mean "just get it back to the old broken state".
So far I haven't actually seen any patches, which means that it's not coming back.
The good news? If you have an actual text VGA console, that should still work just fine.
IIRC, all of this was written for systems lacking VGA text consoles in the first place...
Also on anything that is remotely modern (i.e. runs a drm kernel modesetting driver underneath the fbdev/fbcon stack) there's a pile more issues on top of just the scrollback/fbcon code being a mess.
Would it help to remove DRM_FBDEV_EMULATION (instead)?
Of the fbdev code, DRM's fbdev emulation is the cleanest. We now even have test cases for the userspace I/O.
It's a problem with the hardware. "Write some registers and done" isn't how display blocks work nowadays. So your proposal amounts to "no fbdev/fbcon for anything modern-ish".
With "modern-ish" actually meaning: "desktop/gaming/mobile-style 3D-accelerated wide-color display hardware". There's plenty of display hardware that doesn't fall into that class, and is served by fbdev (also out-of-tree due to the moratorium) because of that.
Userspace has been moving away from fbdev. Writing an fbdev driver locks you into a legacy userspace. I also found that DRM drivers are smaller, because of all the DRM helper libraries. Using DRM + fbdev emulation is a win in almost any case. We did have some complaints about performance of the emulation. So that might be worth looking into.
Best regards Thomas
Also I said "a pile more", most of the issues in fbcon/fbdev code apply for all drivers.
Specifically the locking is somewhere between yolo and outright deadlocks. This holds even more so if the use case here is "I want scrollback for an oops". There's rough sketches for how it could be solved, but it's all very tricky work.
When an oops happens, all bets are off. At that point, all information you can extract from the system is valuable, and additional locking issues are moot.
Except the first oops then scrolls away because it's getting buried under further fail. Your locking needs to be minimally good enough to not make the situation worse.
When an oops happens, all bets are off...
Gr{oetje,eeting}s,
Geert
Geert Uytterhoeven writes:
Judging from some of the comments in the code, it looks like you were one of the original authors of fbcon? I haven't been able to find any of these syzbot crash reports, and am not sure how fuzzing syscalls would really affect this code (it's not really handling a bunch of ioctls or otherwise taking arguments from user space), but I am a bit confused as to why the softback was implemented the way that it was.
vgacon simply copies the main buffer to vram in ->set_origin() and then changes the pointers to operate out of the much larger vram while that virtual terminal is active. If I understand it correctly, it looks like fbcon instead opts to operate out of the main buffer but rescue lines as they are scrolled off and relocate them to the softback buffer. This seems to be rather more convoluted.
I'm thinking of re-implementing scrollback more like the way vgacon does it: allocate a big "vram" buffer and operate out of that. Obviously ->scroll() and ->scrolldelta() have to actually repaint the screen rather than simply change the pointer register, but that should be about the only difference.
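To make the proposal concrete, here is a rough userspace sketch of the "one big buffer plus a movable view" idea. It is not the vgacon or fbcon code: the `sb_*` names, the buffer size, and the use of a ring of rows in place of vgacon's VRAM window are all assumptions made for illustration, and the repaint is reduced to a counter where a real driver would redraw the framebuffer. In this sketch, negative `lines` values scroll back into history.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry; BUF_ROWS is an arbitrary choice for this sketch. */
enum { COLS = 80, ROWS = 25, BUF_ROWS = 200 };

struct sb_console {
    uint16_t cells[BUF_ROWS][COLS]; /* character + attribute, ring of rows */
    int top;           /* ring index of the live screen's first row */
    int history;       /* valid rows above the live screen, <= BUF_ROWS - ROWS */
    int back;          /* rows the view is scrolled back; 0 means live screen */
    unsigned repaints; /* stand-in for actually redrawing the framebuffer */
};

static void sb_init(struct sb_console *c)
{
    memset(c, 0, sizeof(*c));
}

/* Row of the live screen, where new output is written. */
static uint16_t *sb_live_row(struct sb_console *c, int y)
{
    assert(y >= 0 && y < ROWS);
    return c->cells[(c->top + y) % BUF_ROWS];
}

/* Row shown on screen line y, taking the scrollback offset into account. */
static uint16_t *sb_visible_row(struct sb_console *c, int y)
{
    assert(y >= 0 && y < ROWS);
    return c->cells[(c->top - c->back + y + BUF_ROWS) % BUF_ROWS];
}

/* Analogue of ->scroll(): the top line becomes history, a blank line appears. */
static void sb_scroll_up(struct sb_console *c)
{
    c->top = (c->top + 1) % BUF_ROWS;
    if (c->history < BUF_ROWS - ROWS)
        c->history++;
    /* The new bottom row reuses the oldest slot of the ring; clear it. */
    memset(sb_live_row(c, ROWS - 1), 0, COLS * sizeof(uint16_t));
    c->back = 0;   /* new output snaps the view back to the live screen */
    c->repaints++; /* no pointer register to change, so repaint instead */
}

/* Analogue of ->scrolldelta(): move the view, clamped to available history. */
static void sb_scrolldelta(struct sb_console *c, int lines)
{
    int back = c->back - lines;

    if (back < 0)
        back = 0;
    if (back > c->history)
        back = c->history;
    c->back = back;
    c->repaints++;
}
```

Writing a character into `sb_live_row(&c, 0)`, calling `sb_scroll_up(&c)` and then `sb_scrolldelta(&c, -1)` brings that character back to the top visible row. The only state beyond the text buffer is three integers, which is the simplification being argued for.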
I have also noticed that there was some code to use hardware panning of the video buffer rather than having to do a block bitblt to scroll the contents of the screen, but that it was disabled because virtually no video drivers actually implemented it? That seems like a shame, but if it is so, then there's no sense carrying the dead code so I think I'll clean that up now.
Now that I look at it again, everything is simply always redrawn now instead of even doing a simple bitblt. Daniel, you mentioned that almost nobody supports hardware acceleration, but even without any specific hardware support, surely even if bitblt() is implemented just as a memcpy(), it has to be faster than redrawing all of the characters doesn't it? Getting rid of the panning if it isn't generally supported I can see, but I don't understand killing bitblt even if most devices don't accelerate it.
In addition, I noticed that ->screen_pos() was changed to just return vc_origin+offset. fbcon is the only console driver to implement ->screen_pos() and if not implemented, vt defaults to using vc_visible_origin+offset, so it looks like this function isn't needed at all anymore and ->screen_pos() can be removed from struct consw.
Does this make sense or am I talking crazy?
Hi Phillip,
On Fri, Jan 22, 2021 at 8:26 PM Phillip Susi phill@thesusis.net wrote:
Geert Uytterhoeven writes: Judging from some of the comments in the code, it looks like you were one of the original authors of fbcon? I haven't been able to find any
Indeed, a looooong time ago... Before DRM existed.
of these syzbot crash reports, and am not sure how fuzzing syscalls would really affect this code (it's not really handling a bunch of ioctls or otherwise taking arguments from user space), but I am a bit
AFAIU, most of these are triggered by VT ioctls. There is an intimate relation between the VT and fbdev subsystems: VT changes impact fbdev, and vice versa.
Perhaps these should be decoupled, at the expense of worse user experience (i.e. the user needing to change both screen resolution and number of columns/rows of the text console)?
confused as to why the softback was implemented the way that it was.
vgacon simply copies the main buffer to vram in ->set_origin() and then changes the pointers to operate out of the much larger vram while that virtual terminal is active. If I understand it correctly, it looks like fbcon instead opts to operate out of the main buffer but rescue lines as they are scrolled off and relocate them to the softback buffer. This seems to be rather more convoluted.
I'm thinking of re-implementing scrollback more like the way vgacon does it: allocate a big "vram" buffer and operate out of that. Obviously ->scroll() and ->scrolldelta() have to actually repaint the screen rather than simply change the pointer register, but that should be about the only difference.
I'm not that intimately familiar anymore with the current state of the code, but it used to be like this:
- vgacon used a VRAM buffer for the current VC, and multiple shadow buffers to implement virtual consoles,
- fbcon always used the shadow buffers, with each update triggering an update of the frame buffer (see below).
As the text console buffer handling should be the same for vgacon and fbcon, I expect most scrollback bugs (if any) to be present in both.
I have also noticed that there was some code to use hardware panning of the video buffer rather than having to do a block bitblt to scroll the contents of the screen, but that it was disabled because virtually no video drivers actually implemented it? That seems like a shame, but if it is so, then there's no sense carrying the dead code so I think I'll clean that up now.
Now that I look at it again, everything is simply always redrawn now instead of even doing a simple bitblt. Daniel, you mentioned that almost nobody supports hardware acceleration, but even without any specific hardware support, surely even if bitblt() is implemented just as a memcpy(), it has to be faster than redrawing all of the characters doesn't it? Getting rid of the panning if it isn't generally supported I can see, but I don't understand killing bitblt even if most devices don't accelerate it.
There are multiple ways to implement scrolling:
1. If the hardware supports a larger virtual screen and panning, and the virtual screen is enabled, most scrolling can be implemented by panning, with a casual copy when reaching the bottom (or top) of the virtual screen. This mode is (was) available on most graphics hardware with dedicated graphics memory.
2. If a 2D acceleration engine is available, copying (and clearing/filling) can be implemented by rectangle copy/fill operations.
3. Rectangle copy/fill by the CPU is always available.
4. Redrawing characters by the CPU is always available.
Which option was used depended on the hardware: not all options are available everywhere, and some perform better than others. E.g. on PCI graphics cards, reading graphics memory by the CPU is usually very slow, so option 3 is much slower than option 4 (given a sufficiently fast CPU). AFAIU, option 2 is not suitable for modern systems with 3D acceleration. On the older (slower) systems (lacking VGA text mode) for which fbcon was originally written, option 4 is usually the slowest.
Support for options 1-3 was removed in commit 39aead8373b3c20b ("fbcon: Disable accelerated scrolling"), which claimed only 3 (DRM) drivers made use of this, ignoring the other 32 (fbdev) drivers making use of it.
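As an illustration of the difference between options 3 and 4, the following sketch scrolls a simulated framebuffer both ways and counts the bytes read from and written to it. Everything here is hypothetical: `struct fb_sim`, the tiny geometry, and the toy "font" (which fills a cell with the character code) are stand-ins and not fbcon code. It only shows that both approaches produce the same picture, while only the copy has to read the framebuffer back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy geometry: 8x4 text cells of 8x8 pixels, 1 byte per pixel. */
enum { COLS = 8, ROWS = 4, GLYPH_W = 8, GLYPH_H = 8 };
enum { FB_W = COLS * GLYPH_W, FB_H = ROWS * GLYPH_H };

struct fb_sim {
    unsigned char pixels[FB_H * FB_W]; /* simulated framebuffer */
    char text[ROWS * COLS];            /* shadow text buffer in system RAM */
    size_t fb_reads, fb_writes;        /* bytes read from / written to pixels[] */
};

/* Toy "font": fill the whole cell with the character code. */
static void draw_char(struct fb_sim *s, int row, int col)
{
    for (int y = 0; y < GLYPH_H; y++) {
        size_t off = (size_t)(row * GLYPH_H + y) * FB_W + (size_t)col * GLYPH_W;

        memset(&s->pixels[off], s->text[row * COLS + col], GLYPH_W);
        s->fb_writes += GLYPH_W;
    }
}

static void draw_row(struct fb_sim *s, int row)
{
    for (int col = 0; col < COLS; col++)
        draw_char(s, row, col);
}

/* Fill the screen with rows of 'a', 'b', ... and reset the counters. */
static void fb_sim_init(struct fb_sim *s)
{
    memset(s, 0, sizeof(*s));
    for (int row = 0; row < ROWS; row++) {
        memset(&s->text[row * COLS], 'a' + row, COLS);
        draw_row(s, row);
    }
    s->fb_reads = s->fb_writes = 0;
}

/* Scroll the shadow text buffer up by one line and blank the last line. */
static void scroll_text(struct fb_sim *s)
{
    memmove(s->text, s->text + COLS, (size_t)(ROWS - 1) * COLS);
    memset(s->text + (ROWS - 1) * COLS, ' ', COLS);
}

/* Option 3: CPU rectangle copy inside the framebuffer, then draw the new line. */
static void scroll_by_copy(struct fb_sim *s)
{
    size_t moved = (size_t)(FB_H - GLYPH_H) * FB_W;

    scroll_text(s);
    memmove(s->pixels, s->pixels + (size_t)GLYPH_H * FB_W, moved);
    s->fb_reads += moved;  /* every moved byte is read back from the framebuffer */
    s->fb_writes += moved;
    draw_row(s, ROWS - 1);
}

/* Option 4: redraw every character from the shadow text buffer. */
static void scroll_by_redraw(struct fb_sim *s)
{
    scroll_text(s);
    for (int row = 0; row < ROWS; row++)
        draw_row(s, row);
}
```

In this toy setup both variants write the same number of bytes per scroll, but `scroll_by_copy()` additionally reads three quarters of the framebuffer while `scroll_by_redraw()` reads none of it. Which one wins on real hardware depends on how expensive those reads are and on how costly glyph rendering is for the CPU, which this sketch does not model.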
Gr{oetje,eeting}s,
Geert
On Fri, Jan 22, 2021 at 01:55:04PM -0500, Phillip Susi wrote:
Geert Uytterhoeven writes:
Judging from some of the comments in the code, it looks like you were one of the original authors of fbcon? I haven't been able to find any of these syzbot crash reports, and am not sure how fuzzing syscalls would really affect this code (it's not really handling a bunch of ioctls or otherwise taking arguments from user space), but I am a bit confused as to why the softback was implemented the way that it was.
vgacon simply copies the main buffer to vram in ->set_origin() and then changes the pointers to operate out of the much larger vram while that virtual terminal is active. If I understand it correctly, it looks like fbcon instead opts to operate out of the main buffer but rescue lines as they are scrolled off and relocate them to the softback buffer. This seems to be rather more convoluted.
I'm thinking of re-implementing scrollback more like the way vgacon does it: allocate a big "vram" buffer and operate out of that. Obviously ->scroll() and ->scrolldelta() have to actually repaint the screen rather than simply change the pointer register, but that should be about the only difference.
I have also noticed that there was some code to use hardware panning of the video buffer rather than having to do a block bitblt to scroll the contents of the screen, but that it was disabled because virtually no video drivers actually implemented it? That seems like a shame, but if it is so, then there's no sense carrying the dead code so I think I'll clean that up now.
Now that I look at it again, everything is simply always redrawn now instead of even doing a simple bitblt. Daniel, you mentioned that almost nobody supports hardware acceleration, but even without any specific hardware support, surely even if bitblt() is implemented just as a memcpy(), it has to be faster than redrawing all of the characters doesn't it? Getting rid of the panning if it isn't generally supported I can see, but I don't understand killing bitblt even if most devices don't accelerate it.
Just a quick comment on this: most framebuffers are write-combining, and reads from them tend to be ~3 orders of magnitude slower than writes (at least on the pile of machines I looked at here; there are big differences, and some special streaming cpu instructions make the reading side not so slow).
So scrolling by copying tends to be significantly slower than just redrawing everything.
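A back-of-the-envelope model shows how the numbers work out. This is only a sketch under stated assumptions: cost is measured in units of "one byte written to the framebuffer", `read_cost` is the assumed relative price of reading one byte back (1000 for the "~3 orders of magnitude" figure above), and the CPU time for rendering glyphs is ignored. The function name and parameters are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

struct scroll_cost {
    uint64_t copy;   /* scroll by copying inside the framebuffer */
    uint64_t redraw; /* scroll by redrawing the whole screen */
};

/*
 * Relative cost of scrolling the screen up by one text line.
 * One byte written to the framebuffer costs 1 unit; one byte read
 * from it costs read_cost units (an assumed, machine-dependent factor).
 */
static struct scroll_cost estimate_scroll_cost(uint64_t width, uint64_t height,
                                               uint64_t bytes_per_pixel,
                                               uint64_t glyph_height,
                                               uint64_t read_cost)
{
    assert(glyph_height <= height);

    uint64_t line = width * bytes_per_pixel;         /* bytes per scanline */
    uint64_t moved = (height - glyph_height) * line; /* bytes shifted upwards */
    uint64_t fresh = glyph_height * line;            /* bytes of the new text line */
    struct scroll_cost c;

    c.copy = moved * (read_cost + 1) + fresh; /* read + write, then draw new line */
    c.redraw = height * line;                 /* write everything once */
    return c;
}
```

For a 1920x1080 framebuffer with 4 bytes per pixel, 16-pixel-high glyphs and `read_cost = 1000`, `estimate_scroll_cost(1920, 1080, 4, 16, 1000)` puts the copy at roughly 986 times the cost of a full redraw under this model. Even with a much smaller read penalty the copy stays more expensive, since it writes almost as many bytes as the redraw and has to read them first.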
And once you're at that point it's really hard to write a 2d acceleration which is consistently faster than just cpu rendering.
If you're interested in why 2d acceleration is rather hard as a general problem, not just specific to fbcon, I wrote a blog on that a while ago:
https://blog.ffwll.ch/2018/08/no-2d-in-drm.html
Cheers, Daniel
In addition, I noticed that ->screen_pos() was changed to just return vc_origin+offset. fbcon is the only console driver to implement ->screen_pos() and if not implemented, vt defaults to using vc_visible_origin+offset, so it looks like this function isn't needed at all anymore and ->screen_pos() can be removed from struct consw.
Does this make sense or am I talking crazy?
Daniel Vetter writes:
Just a quick comment on this: most framebuffers are write-combining, and reads from them tend to be ~3 orders of magnitude slower than writes (at least on the pile of machines I looked at here; there are big differences, and some special streaming cpu instructions make the reading side not so slow).
So scrolling by copying tends to be significantly slower than just redrawing everything.
I know this was the case years ago with AGP as iirc, it doubled ( 4x, 8x ) the PCI clock rate but only for writes wasn't it? I thought this was no longer an issue with PCIe, but if it is, then I guess I'll go ahead with cleaning up the dead code and having it re-render with the larger text buffer.
On Tue, Feb 02, 2021 at 10:13:14AM -0500, Phillip Susi wrote:
Daniel Vetter writes:
Just a quick comment on this: most framebuffers are write-combining, and reads from them tend to be ~3 orders of magnitude slower than writes (at least on the pile of machines I looked at here; there are big differences, and some special streaming cpu instructions make the reading side not so slow).
So scrolling by copying tends to be significantly slower than just redrawing everything.
I know this was the case years ago with AGP as iirc, it doubled ( 4x, 8x ) the PCI clock rate but only for writes wasn't it? I thought this was no longer an issue with PCIe, but if it is, then I guess I'll go ahead with cleaning up the dead code and having it re-render with the larger text buffer.
Still the same with PCIe, probably gotten worse since uncached reads are still as slow, but write-combined writes have gotten much faster even. There's work going on to have a coherent link to gpus which would allow fully cached reads and writes, early on with NVLink and now as a standard with CXL (https://en.wikipedia.org/wiki/Compute_Express_Link).
But that's aimed at big compute jobs for servers, not really for display.
Also some on-die gpus have become fully coherent, but again only for render/compute buffers, not for the display framebuffer.
So all together 0 signs this is changing going forward, reading from framebuffers is slow.
Ok there's some exceptions: For manual update buffers (defio for fbdev drivers, drm also supports this with an entire set of helpers) the framebuffer used by the cpu is sometimes (but very often still not) cached. Imo not worth optimizing for, since for the drivers where it is cached they either have no blitter, or it's really tiny panels behind spi links and similar, so not going to be fast anyway. -Daniel