https://bugzilla.kernel.org/show_bug.cgi?id=34772
Summary: [radeon] [R300] GPU lockups when KMS is enabled
Product: Drivers
Version: 2.5
Kernel Version: 2.6.38
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
AssignedTo: drivers_video-dri@kernel-bugs.osdl.org
ReportedBy: rbrito@ime.usp.br
Regression: No
Created an attachment (id=57062) --> (https://bugzilla.kernel.org/attachment.cgi?id=57062)
dmesg output right after the lock up, obtained via the network
Hi there.
I have been getting some Oopses/stack traces when I try to use my iBook G4 (with an "ATI Technologies Inc M11 NV/FireGL Mobility T2e" card) and I enable KMS.
The userland here is Debian unstable with the DRM from experimental, but I am willing to test anything that you would like me to.
For example, attached is the last of a series of such Oopses that I got when I tried to test if a video was playing or not with mplayer.
I tried to use 2.6.39-rc{5,6}, but upon boot I get messages telling me that there were failures and that hardware acceleration will be disabled, and all that I get is a desktop with distorted colors, as if there were some endianness issues.
This is, BTW, part of my attempts to get Linux running well on PowerPC, with some of my logs (with photos) present at my homepage:
http://www.ime.usp.br/~rbrito/linux/debug-r300/
Please, if there is anything that I can provide to fix this, let me know and I will do my best.
Thanks, Rogério Brito.
--- Comment #1 from Rogério Brito rbrito@ime.usp.br 2011-05-19 20:34:40 ---
Just for the record, I can provide further messages like these: the problem is reproducible at will.
In fact, I am now able to reproduce it with kernel 2.6.38 if I boot the iBook G4 with the options:
"video=radeonfb:off radeon.agpmode=-1 radeon.modeset=1"
and play a video with mplayer.
If, OTOH, I leave KMS off, then I don't get the GPU lockups that I reported.
Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39 the kernel doesn't even get the colors correctly---everything that should be red becomes blue and so forth (any kind of endianness problem?).
I am attaching here another stacktrace, in case it helps.
Regards,
Rogério Brito.
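When comparing boots, it can help to record exactly which radeon module parameters were in effect. The following is a minimal sketch, not part of the original report: the helper name `radeon_params` is hypothetical, and it only parses a command-line string such as the one quoted in comment #1.

```python
def radeon_params(cmdline):
    """Collect radeon.* module parameters from a kernel command-line string.

    Hypothetical helper for illustration; returns e.g. {"agpmode": "-1"}.
    """
    params = {}
    for token in cmdline.split():
        # Only tokens of the form radeon.<name>=<value> are of interest;
        # video=radeonfb:off does not match and is skipped.
        if token.startswith("radeon.") and "=" in token:
            key, value = token[len("radeon."):].split("=", 1)
            params[key] = value
    return params


# The options quoted in comment #1.
print(radeon_params("video=radeonfb:off radeon.agpmode=-1 radeon.modeset=1"))
# prints {'agpmode': '-1', 'modeset': '1'}
```

On a running Linux system the same string can be read from /proc/cmdline and passed to the helper.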
--- Comment #2 from Rogério Brito rbrito@ime.usp.br 2011-05-19 20:36:24 ---
Created an attachment (id=58602) --> (https://bugzilla.kernel.org/attachment.cgi?id=58602)
A dmesg log from 2.6.39-rc7 showing problems.
--- Comment #3 from Rogério Brito rbrito@ime.usp.br 2011-05-19 20:37:08 ---
Created an attachment (id=58612) --> (https://bugzilla.kernel.org/attachment.cgi?id=58612)
The log of X with the 2.6.39-rc7 kernel
--- Comment #4 from Rogério Brito rbrito@ime.usp.br 2011-05-19 20:38:14 ---
Created an attachment (id=58622) --> (https://bugzilla.kernel.org/attachment.cgi?id=58622)
A dmesg log with 2.6.38 kernel
Please note the GPU hang with kernel 2.6.38.
--- Comment #5 from Rogério Brito rbrito@ime.usp.br 2011-05-19 20:38:56 ---
Created an attachment (id=58632) --> (https://bugzilla.kernel.org/attachment.cgi?id=58632)
Log from X with the kernel 2.6.38
--- Comment #6 from Michel Dänzer michel@daenzer.net 2011-05-20 12:11:38 ---
(In reply to comment #1)
> Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39 the kernel doesn't even get the colors correctly---everything that should be red becomes blue and so forth (any kind of endianness problem?).
That's probably nothing to do with the kernel directly but endianness bugs in the X driver when acceleration is not available.
It would be interesting if you could bisect what broke acceleration with radeon.agpmode=-1. Note that you should boot with radeon.no_wb=1 as well for this, as CP writeback was only fixed during the 2.6.39 cycle (in commit dc66b325f161bb651493c7d96ad44876b629cf6a).
--- Comment #7 from Michel Dänzer michel@daenzer.net 2011-05-20 14:31:00 ---
I was able to reproduce the acceleration initialization failure with the Debian 2.6.39-rc7-powerpc kernel, but not with a self-built 2.6.39 kernel. So this was probably just an intermittent problem during the 2.6.39 cycle, e.g. due to the intermittently broken usage of the DMA API by TTM.
As for the GPU lockups, does radeon.dynclks=1 help for those?
--- Comment #8 from Andreas Schwab schwab@linux-m68k.org 2011-05-20 20:58:03 ---
radeon.dynclks=1 causes the wrong resolution to be selected. It thinks something is connected to the S-video port with a max resolution of 800x600, so it selects this instead of the native resolution (1024x768).
-<6>Console: switching to colour frame buffer device 128x48
+<6>[drm] crtc 1 is connected to a TV
+<6>Console: switching to colour frame buffer device 100x37
+(II) RADEON(0): Printing probed modes for output S-video
+(II) RADEON(0): Modeline "800x600"x59.9 38.25 800 832 912 1024 600 603 607 624 -hsync +vsync (37.4 kHz)
+(II) RADEON(0): Modeline "640x480"x59.9 25.18 640 656 752 800 480 490 492 525 -hsync -vsync (31.5 kHz)
+(II) RADEON(0): Modeline "320x240"x60.1 12.59 320 328 376 400 240 245 246 262 doublescan -hsync -vsync (31.5 kHz)
 (II) RADEON(0): Output LVDS connected
 (II) RADEON(0): Output VGA-0 disconnected
-(II) RADEON(0): Output S-video disconnected
+(II) RADEON(0): Output S-video connected
 (II) RADEON(0): Using exact sizes for initial modes
-(II) RADEON(0): Output LVDS using initial mode 1024x768
+(II) RADEON(0): Output LVDS using initial mode 800x600
+(II) RADEON(0): Output S-video using initial mode 800x600
--- Comment #9 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:16:28 ---
Hi, Michel.
On Fri, May 20, 2011 at 12:11, bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #6 from Michel Dänzer michel@daenzer.net 2011-05-20 12:11:38 ---
> (In reply to comment #1)
> > Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39 the kernel doesn't even get the colors correctly---everything that should be red becomes blue and so forth (any kind of endianness problem?).
> That's probably nothing to do with the kernel directly but endianness bugs in the X driver when acceleration is not available.
OK, then that's a separate issue. Good to know.
> It would be interesting if you could bisect what broke acceleration with radeon.agpmode=-1.
Oooh, I guess that I caused some confusion here, given our other messages. To clear things up: when I use 2.6.38, it works mostly OK if I use radeon.agpmode=-1. It is stable enough that I told you that this setting was OK. But, in fact, if I play a video with mplayer, then it always (so far, 100% reproducible) causes those GPU lockups, but the computer is still accessible via the network, so that I can take the logs etc. If I use agpmode=1 instead of -1, then, even with kernel 2.6.38, I get those lysergide-like :-) pictures that I put on my homepage (but, for documentation purposes, I am thinking of uploading them here as attachments, as I am quite short of space there).
With kernel 2.6.39, I have not been able to get anything working, whether or not I pass any option to the kernel.
Summary:
* 2.6.38 with KMS and agpmode=-1: OK until I try to play a video, then GPU lockups.
* 2.6.38 with KMS and agpmode=1: GPU lockups a few seconds after X loads (it *does* show up, but locks up a few seconds later).
* 2.6.39 with KMS and agpmode=-1: Not OK, even if I don't use anything accelerated (problems with colors and software rendering).
So, I am not quite sure if it would be the case of bisecting or, at least, what would be a good starting point. I can, though, try to boot with many other kernels to see if I can (provided that udev doesn't stop me).
> Note that you should boot with radeon.no_wb=1 as well for
OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.
> this, as CP writeback was only fixed during the 2.6.39 cycle (in commit dc66b325f161bb651493c7d96ad44876b629cf6a).
Right. Thanks for that fix of yours (just read the commit).
Regards,
--- Comment #10 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:23:27 ---
Hi there.
On Sat, May 21, 2011 at 09:16, bugzilla-daemon@bugzilla.kernel.org wrote:
> OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.
Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still locks up the GPU when I play a video with mplayer.
I will try with 2.6.39 with the same settings.
Thanks,
--- Comment #11 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:34:20 ---
Another test.
On Sat, May 21, 2011 at 09:23, bugzilla-daemon@bugzilla.kernel.org wrote:
> On Sat, May 21, 2011 at 09:16, bugzilla-daemon@bugzilla.kernel.org wrote:
> > OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.
> Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still locks up the GPU when I play a video with mplayer.
Just for the record #2, 2.6.38 with KMS + agpmode=-1 + no_wb=1 + dynclks=1 still locks up the GPU when I play a video with mplayer.
Besides that, as Andreas reported, with dynclks=1 the resolution is reduced to 800x600. I didn't have the opportunity to read the X logs regarding the S-Video port, but, at least for the user, iBooks (unlike PowerBooks) don't have user-accessible S-Video ports (but this doesn't prevent Apple from having disabled them somehow).
Thanks,
--- Comment #12 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:42:12 ---
On Sat, May 21, 2011 at 09:34, bugzilla-daemon@bugzilla.kernel.org wrote:
> On Sat, May 21, 2011 at 09:23, bugzilla-daemon@bugzilla.kernel.org wrote:
> > On Sat, May 21, 2011 at 09:16, bugzilla-daemon@bugzilla.kernel.org wrote:
> > > OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.
> > Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still locks up the GPU when I play a video with mplayer.
Wooow! Oopsen galore with 2.6.39 with KMS + agpmode=-1 + no_wb=1... Five in a row.
OK, probably only the first one matters. Then, it stays there and doesn't load the system... Actually, as I am writing this thing, after about 180 seconds, the boot process is continuing and X is being loaded, but with the wrong colors (the "endianness issue"). I will try to see if the network is available and attach here what I get from dmesg.
BTW, I hope that you don't mind me providing copious amounts of testing here (and their results) in the hope to get this fixed... :-)
--- Comment #13 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:47:45 ---
Created an attachment (id=58892) --> (https://bugzilla.kernel.org/attachment.cgi?id=58892)
dmesg log with 2.6.39-rc7 with KMS + agpmode=-1 + no_wb=1
--- Comment #14 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:50:35 ---
Created an attachment (id=58902) --> (https://bugzilla.kernel.org/attachment.cgi?id=58902)
X log with 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1
--- Comment #15 from Rogério Brito rbrito@ime.usp.br 2011-05-21 09:56:33 ---
With 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1 + dynclks=1:
* I don't get the Oopsen.
* the resolution is restricted to 800x600.
* XV is not available to mplayer or other applications.
I think the XV extension not working is something that has always happened with 2.6.39 kernels.
Thanks,
Rogério Brito.
--- Comment #16 from Michel Dänzer michel@daenzer.net 2011-05-21 14:54:25 ---
(In reply to comment #15)
> With 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1 + dynclks=1:
> - XV is not available to mplayer or other applications.
When the kernel radeon driver fails to initialize acceleration, there's no point in trying any functionality that needs acceleration, such as XVideo.
I don't think there's any point doing any more tests with 2.6.39-rc7, as it's obviously suffering from additional issues which only occurred intermittently during the 2.6.39 cycle.
(In reply to comment #12)
> BTW, I hope that you don't mind me providing copious amounts of testing here (and their results) in the hope to get this fixed... :-)
Well, I'm afraid less quantity but more quality would be better... It's becoming rather difficult and time-consuming to find the relevant pieces of information in this mass.
(In reply to comment #11)
> Just for the record #2, 2.6.38 with KMS + agpmode=-1 + no_wb=1 + dynclks=1 still locks up the GPU when I play a video with mplayer.
Has either of you tried agpmode=1 dynclks=1? Does that increase stability at all?
> Besides that, as Andreas reported, with dynclks=1 the resolution is reduced to 800x600. I didn't have the opportunity to read the X logs regarding the S-Video port, but, at least for the user, iBooks (unlike PowerBooks) don't have user-accessible S-Video ports (but this doesn't prevent Apple from having disabled them somehow).
I thought there was some kind of multimedia adapter for the external output.
Anyway, it should be possible to override the incorrect output detection, either on the kernel command line with something like video=S-video-1:d or later in xorg.conf or during X runtime with something like xrandr.
But really, we need to focus on one problem per bug report as much as possible, or things are getting out of hand.
(In reply to comment #9)
> So, I am not quite sure if it would be the case of bisecting or, at least, what would be a good starting point.
No, there's no point in bisecting, as that problem should be gone with 2.6.39 final.
> > Note that you should boot with radeon.no_wb=1 as well for
> OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.
no_wb=1 would only have been important for bisecting, to avoid the writeback endianness bug interfering.
P.S. beware of Debian package udev version 169-1: IME an initrd generated with that installed prevents the radeon module from being loaded automatically, and when trying to load it manually, it fails to load the CP microcode and consequently fails to initialize acceleration.
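The two overrides suggested in comment #16 (a `video=` kernel parameter and an xrandr call) could be written down roughly as follows. This is only a sketch: the helper names are hypothetical, the connector name `S-video-1` is the one given in that comment, and the X output name `S-video` is taken from the X log excerpt in comment #8. Nothing in this report confirms that either override restores 1024x768 on the iBook.

```python
def kernel_disable_param(connector):
    """Kernel command-line token that disables a connector (the ':d' suffix)."""
    return f"video={connector}:d"


def xrandr_disable_argv(output):
    """Argument vector that turns an X output off at runtime via xrandr."""
    return ["xrandr", "--output", output, "--off"]


# Connector and output names as they appear in the thread (assumed to match
# what the kernel and X actually report on the affected machine).
print(kernel_disable_param("S-video-1"))
print(" ".join(xrandr_disable_argv("S-video")))
```

On the affected machine the argument vector could be handed to `subprocess.run`; the names reported by `xrandr` itself should be checked first, since they may differ from the ones assumed here.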
--- Comment #17 from Rogério Brito rbrito@ime.usp.br 2011-05-21 15:34:37 ---
Hi, Michel.
Thank you very much for the attention.
(In reply to comment #16)
> When the kernel radeon driver fails to initialize acceleration, there's no point in trying any functionality that needs acceleration, such as XVideo.
OK.
> I don't think there's any point doing any more tests with 2.6.39-rc7, as it's obviously suffering from additional issues which only occurred intermittently during the 2.6.39 cycle.
Right.
> Well, I'm afraid less quantity but more quality would be better... It's becoming rather difficult and time-consuming to find the relevant pieces of information in this mass.
Indeed, it is getting out of hand pretty quickly. Do you want me to give you some SSH access to this notebook? Or, if that's not feasible/useful, what would you like me to test as the next step, so that I avoid flooding you with so much data?
> Has either of you tried agpmode=1 dynclks=1? Does that increase stability at all?
I will try those. But with which kernel? I have been avoiding compiling kernels lately, since builds take ages on this notebook, but I can set up a cross-compilation environment, if necessary.
BTW, would you mind sharing your .config?
> I thought there was some kind of multimedia adapter for the external output.
The only external adapter is one to a VGA port. No traces of S-video here.
> But really, we need to focus on one problem per bug report as much as possible, or things are getting out of hand.
OK, I can file a separate bug for this S-Video issue, then.
Thank you so much for your patience,
Rogério Brito.
Alex Deucher alexdeucher@gmail.com changed:
What        |Removed     |Added
----------------------------------------------------------------------------
CC          |            |alexdeucher@gmail.com
--- Comment #18 from Alex Deucher alexdeucher@gmail.com 2011-05-21 15:38:57 ---
Apple sells VGA to S-video adapters, so we list both connectors in the driver.
--- Comment #19 from Rogério Brito rbrito@ime.usp.br 2011-05-21 15:42:21 ---
(In reply to comment #18)
> Apple sells VGA to S-video adapters, so we list both connectors in the driver.
Oh, sorry for the ignorance.
--- Comment #20 from Michel Dänzer michel@daenzer.net 2011-05-21 16:47:10 ---
Created an attachment (id=58922) --> (https://bugzilla.kernel.org/attachment.cgi?id=58922)
Allow forcing on all GPU clocks
(In reply to comment #17)
> > Has either of you tried agpmode=1 dynclks=1? Does that increase stability at all?
> I will try those. But with which kernel?
2.6.38 should be fine for this test. But at some point it'll probably be useful for you to be able to try kernel patches. Once you've built a kernel, building the radeon module with a patch shouldn't take long.
E.g., you guys could try this patch, and booting with radeon.dynclks=0, which should force on all GPU clocks. Does that increase stability with agpmode=1 or agpmode=-1?
> BTW, would you mind sharing your .config?
My .config still takes 1-2 hours to build on this 1.6 GHz PowerBook. If that could help you, please ask for it on the debian-powerpc list.
--- Comment #21 from Michel Dänzer michel@daenzer.net 2011-05-21 16:58:59 ---
Would also be interesting if one of you guys could attach dmesg with agpmode=1.
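To keep the remaining tests focused, the agpmode and dynclks combinations asked about in comments #16 and #20 can be enumerated up front, one boot per line. The sketch below is an illustration only: the helper name `candidate_cmdlines` is hypothetical, the base options are the ones quoted in comment #1, and `radeon.dynclks=0` is only expected to force all GPU clocks on with the patch from attachment 58922 applied.

```python
from itertools import product

# Base options from comment #1 (KMS on, radeonfb off).
BASE = "video=radeonfb:off radeon.modeset=1"


def candidate_cmdlines(agpmodes=(-1, 1), dynclks_values=(0, 1)):
    """Return one kernel command line per agpmode/dynclks combination."""
    return [
        f"{BASE} radeon.agpmode={agp} radeon.dynclks={dyn}"
        for agp, dyn in product(agpmodes, dynclks_values)
    ]


for line in candidate_cmdlines():
    print(line)
```

Recording the dmesg output once per printed line would give one data point per configuration, which is easier to compare than a series of ad hoc reports.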
Alan alan@lxorguk.ukuu.org.uk changed:
What        |Removed     |Added
----------------------------------------------------------------------------
Status      |NEW         |RESOLVED
CC          |            |alan@lxorguk.ukuu.org.uk
Resolution  |            |OBSOLETE
Rogério Brito rbrito@ime.usp.br changed:
What        |Removed     |Added
----------------------------------------------------------------------------
Status      |RESOLVED    |REOPENED
Resolution  |OBSOLETE    |
--- Comment #22 from Rogério Brito rbrito@ime.usp.br 2012-10-28 18:51:45 ---
Just for the record,
I can still provide the information, as I am going to reinstall Linux on the iBook.
Thanks in advance,
Rogério Brito.
xerofoify@gmail.com changed:
What        |Removed     |Added
----------------------------------------------------------------------------
CC          |            |xerofoify@gmail.com
--- Comment #23 from xerofoify@gmail.com ---
This bug needs to be tested against a newer kernel to see if it's fixed.
Cheers Nick
--- Comment #24 from Rogério Brito rbrito@ime.usp.br ---
Hi, Nick.
On Jun 25 2014, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=34772
> --- Comment #23 from xerofoify@gmail.com ---
> This bug needs to be tested against a newer kernel to see if it's fixed.
> Cheers Nick
OK, I think that this may be easier to test than the previous issue, but, if I recall correctly, this issue was so fragile that almost anything crashed it.
Again, as in my other e-mail, please ping me if I don't respond, as I am swamped with work.
Thanks,
dri-devel@lists.freedesktop.org