After updating to kernel 3.3-rc1 I have experienced a lockup of my GPU. I left my KDE desktop running until the screensaver turned off the monitors. But on key presses it would not turn back on. Ctrl+Alt+F1 to switch to another virtual console also did not work. Alt+SysRq magic still worked, so I was able to force the syslog to disk and restart the system.
From the log:
Jan 21 19:30:01 thoregon cron[3960]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)
Jan 21 19:39:41 thoregon kernel: [ 6364.620131] radeon 0000:07:00.0: GPU lockup CP stall for more than 10000msec
Jan 21 19:39:41 thoregon kernel: [ 6364.620139] GPU lockup (waiting for 0x0003F1F2 last fence id 0x0003F1F1)
Jan 21 19:39:41 thoregon kernel: [ 6364.636341] radeon 0000:07:00.0: GPU softreset
Jan 21 19:39:41 thoregon kernel: [ 6364.636348] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0xA0003028
Jan 21 19:39:41 thoregon kernel: [ 6364.636354] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Jan 21 19:39:41 thoregon kernel: [ 6364.636359] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 21 19:39:41 thoregon kernel: [ 6364.636370] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 21 19:39:41 thoregon kernel: [ 6364.651219] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 21 19:39:41 thoregon kernel: [ 6364.667212] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0x00003028
Jan 21 19:39:41 thoregon kernel: [ 6364.667217] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Jan 21 19:39:41 thoregon kernel: [ 6364.667223] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 21 19:39:41 thoregon kernel: [ 6364.668226] radeon 0000:07:00.0: GPU reset succeed
Jan 21 19:39:41 thoregon kernel: [ 6364.673142] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 21 19:39:41 thoregon kernel: [ 6364.673177] radeon 0000:07:00.0: WB enabled
Jan 21 19:39:41 thoregon kernel: [ 6364.673184] [drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr 0xffff880328636c00
Jan 21 19:39:41 thoregon kernel: [ 6364.719445] [drm] ring test on 0 succeeded in 1 usecs
Jan 21 19:40:01 thoregon cron[3975]: (root) CMD (test -x /usr/sbin/run-crons && /usr/sbin/run-crons)
Jan 21 19:43:37 thoregon kernel: [ 6600.390150] INFO: task X:3098 blocked for more than 120 seconds.
Jan 21 19:43:37 thoregon kernel: [ 6600.390157] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 21 19:43:37 thoregon kernel: [ 6600.390163] X D ffff880337d50a00 0 3098 3077 0x00400000
Jan 21 19:43:37 thoregon kernel: [ 6600.390174] ffff88031df15080 0000000000000086 ffff8802f5087300 0000000000010a00
Jan 21 19:43:37 thoregon kernel: [ 6600.390185] ffff88031bf79fd8 0000000000010a00 ffff88031bf78000 ffff88031bf79fd8
Jan 21 19:43:37 thoregon kernel: [ 6600.390194] 0000000000010a00 ffff88031df15080 0000000000010a00 0000000000010a00
Jan 21 19:43:37 thoregon kernel: [ 6600.390203] Call Trace:
Jan 21 19:43:37 thoregon kernel: [ 6600.390219] [<ffffffff815eee58>] ? __mutex_lock_slowpath+0xc8/0x140
Jan 21 19:43:37 thoregon kernel: [ 6600.390230] [<ffffffff815eeb4a>] ? mutex_lock+0x1a/0x40
Jan 21 19:43:37 thoregon kernel: [ 6600.390239] [<ffffffff81352be2>] ? radeon_ib_get+0x52/0x230
Jan 21 19:43:37 thoregon kernel: [ 6600.390249] [<ffffffff8136e86a>] ? r600_ib_test+0x5a/0x300
Jan 21 19:43:37 thoregon kernel: [ 6600.390258] [<ffffffff8137246e>] ? rv770_startup+0xf7e/0x1590
Jan 21 19:43:37 thoregon kernel: [ 6600.390267] [<ffffffff81372d5c>] ? rv770_resume+0x2c/0x90
Jan 21 19:43:37 thoregon kernel: [ 6600.390275] [<ffffffff8132bd8e>] ? radeon_gpu_reset+0x11e/0x160
Jan 21 19:43:37 thoregon kernel: [ 6600.390284] [<ffffffff8133ef43>] ? radeon_fence_wait+0x363/0x3b0
Jan 21 19:43:37 thoregon kernel: [ 6600.390293] [<ffffffff8104f340>] ? wake_up_bit+0x40/0x40
Jan 21 19:43:37 thoregon kernel: [ 6600.390301] [<ffffffff81352d77>] ? radeon_ib_get+0x1e7/0x230
Jan 21 19:43:37 thoregon kernel: [ 6600.390310] [<ffffffff81354b4a>] ? radeon_cs_ioctl+0x27a/0x4d0
Jan 21 19:43:37 thoregon kernel: [ 6600.390319] [<ffffffff812f42d4>] ? drm_ioctl+0x3e4/0x490
Jan 21 19:43:37 thoregon kernel: [ 6600.390327] [<ffffffff813548d0>] ? radeon_cs_finish_pages+0xa0/0xa0
Jan 21 19:43:37 thoregon kernel: [ 6600.390336] [<ffffffff81024769>] ? do_page_fault+0x199/0x420
Jan 21 19:43:37 thoregon kernel: [ 6600.390344] [<ffffffff810af30c>] ? mmap_region+0x1dc/0x570
Jan 21 19:43:37 thoregon kernel: [ 6600.390352] [<ffffffff810de446>] ? do_vfs_ioctl+0x96/0x4e0
Jan 21 19:43:37 thoregon kernel: [ 6600.390359] [<ffffffff815efd0c>] ? __schedule+0x28c/0x630
Jan 21 19:43:37 thoregon kernel: [ 6600.390366] [<ffffffff810de8d9>] ? sys_ioctl+0x49/0x90
Jan 21 19:43:37 thoregon kernel: [ 6600.390375] [<ffffffff815f16e2>] ? system_call_fastpath+0x16/0x1b
Jan 21 19:45:08 thoregon kernel: [ 6691.864440] SysRq : Emergency Sync
Jan 21 19:45:08 thoregon kernel: [ 6691.864838] Emergency Sync complete
Jan 21 19:45:14 thoregon kernel: [ 6697.476112] SysRq : Emergency Remount R/O
Jan 21 19:46:33 thoregon kernel: [ 0.000000] Linux version 3.3.0-rc1 (root@thoregon) (gcc version 4.5.3 (Gentoo 4.5.3-r2 p1.0, pie-0.4.6) ) #1 SMP Fri Jan 20 09:54:26 CET 2012
I did not have any trouble with 3.2 or earlier kernels, so it looks like a regression in 3.3-rc1.
Info from my card:
thoregon ~ # lspci -vvs 07:00.0
07:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV730 PRO [Radeon HD 4650] (prog-if 00 [VGA controller])
    Subsystem: Hightech Information System Ltd. Device 2269
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 78
    Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Region 2: Memory at fe9e0000 (64-bit, non-prefetchable) [size=64K]
    Region 4: I/O ports at e000 [size=256]
    Expansion ROM at fe9c0000 [disabled] [size=128K]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ MaxPayload 128 bytes, MaxReadReq 128 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <64ns, L1 <1us ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+ ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee3f00c Data: 4189
    Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
    Kernel driver in use: radeon
Please ask if you need any other information; I will try to provide it.
Torsten
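The log in the report above prints GRBM_STATUS, GRBM_STATUS2 and SRBM_STATUS once before and once after the soft reset. As a rough aid for comparing the two dumps, the following Python sketch pulls the register values out of such log text and lists which bit positions changed. The helper names (register_dumps, changed_bits) are made up for this illustration, and the sketch deliberately assigns no meaning to any bit, because the thread does not describe the register layout.

```python
import re

# Matches dump lines such as "R_008010_GRBM_STATUS=0xA0003028".
REG_RE = re.compile(r"(R_[0-9A-F]{6}_\w+)=0x([0-9A-Fa-f]{8})")


def register_dumps(log_text):
    """Collect every dumped value per register name, in log order."""
    dumps = {}
    for name, value in REG_RE.findall(log_text):
        dumps.setdefault(name, []).append(int(value, 16))
    return dumps


def changed_bits(before, after):
    """Return the bit positions that differ between two 32-bit values."""
    diff = before ^ after
    return [bit for bit in range(32) if (diff >> bit) & 1]


# Status lines copied from the report (syslog prefix trimmed).
sample = """\
radeon 0000:07:00.0: R_008010_GRBM_STATUS=0xA0003028
radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
radeon 0000:07:00.0: R_008010_GRBM_STATUS=0x00003028
radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
"""

for name, values in register_dumps(sample).items():
    # First value is the pre-reset dump, last value the post-reset dump.
    print(name, changed_bits(values[0], values[-1]))
```

For the values in this report, only GRBM_STATUS changes across the reset (0xA0003028 to 0x00003028, that is, bits 29 and 31); the other two registers read the same before and after.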
On Sat, Jan 21, 2012 at 08:03:37PM +0100, Torsten Kaiser wrote:
After updating to kernel 3.3-rc1 I have experienced a lockup of my GPU. I left my KDE desktop running until the screensaver turned off the monitors. But on key presses it would not turn back on. Ctrl+Alt+F1 to switch to another virtual console also did not work. Alt+SysRq magic still worked, so I was able to force the syslog to disk and restart the system.
Can you test whether the attached patch helps in your case?
Of course it would be best if we did not lock up in the first place.
Cheers, Jerome
On Mon, Jan 23, 2012 at 5:57 PM, Jerome Glisse j.glisse@gmail.com wrote:
Can you test whether the attached patch helps in your case?
Patch is installed, but I can't reproduce the hang on demand. It did happen a second time yesterday while letting the screensaver kick in, but only at around the third or fourth try. Just using "xset dpms force standby/suspend/off" did not trigger it.
Of course it would be best if we did not lock up in the first place.
Not sure if this is important: I also upgraded to mesa 8.0-rc1 before the first hang, but after switching back to 3.2 while still using mesa 8.0 I did not have any problems. Except for the KDE desktop effects, there should not have been any OpenGL programs running. The screen saver itself just turns the screens off via the KDE power profile.
I will report again when I have succeeded in triggering the GPU lockup again...
Torsten
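The message above mentions forcing the DPMS states by hand with "xset dpms force standby/suspend/off". Repeating that manually is tedious, so here is a small Python sketch that cycles through the three states and wakes the display after each one. Note that, per the report, forcing DPMS this way did not trigger the hang, so this only automates the attempt and is not a known reproducer. The function names, the cycle count and the hold time are arbitrary choices for this illustration; only the xset invocation itself comes from the thread, plus "xset dpms force on" to wake the display.

```python
import subprocess
import time

# DPMS power-saving states named in the thread.
DPMS_STATES = ("standby", "suspend", "off")


def dpms_commands(cycles):
    """Build the xset command lines for the given number of cycles."""
    commands = []
    for _ in range(cycles):
        for state in DPMS_STATES:
            commands.append(["xset", "dpms", "force", state])
            # Wake the display again before trying the next state.
            commands.append(["xset", "dpms", "force", "on"])
    return commands


def run_cycles(cycles=4, hold_seconds=15.0, runner=subprocess.run):
    """Run each command, pausing hold_seconds after every state change.

    The runner parameter exists so the loop can be exercised without a
    running X server; by default it calls subprocess.run.
    """
    for command in dpms_commands(cycles):
        runner(command, check=True)
        time.sleep(hold_seconds)
```

Calling run_cycles() from a terminal inside the X session would perform four cycles with the defaults above. Watching the kernel log for "GPU lockup" during the run shows whether any cycle coincided with a lockup.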
My two cents here: I'm experiencing the same problem. I've noticed there was a problem earlier in the boot process where the kernel was crying about a deadlock in radeon power management.
I opened a bug and I'm bisecting the kernel right now (https://bugzilla.kernel.org/show_bug.cgi?id=42639).
It may or may not be related, but both problems appeared when moving to 3.3-rc1.
On Mon, Jan 23, 2012 at 7:01 PM, Torsten Kaiser just.for.lkml@googlemail.com wrote:
On Mon, Jan 23, 2012 at 5:57 PM, Jerome Glisse j.glisse@gmail.com wrote:
Can you test whether the attached patch helps in your case?
Patch is installed, but I can't reproduce the hang on demand. It did happen a second time yesterday while letting the screensaver kick in, but only at around the third or fourth try. Just using "xset dpms force standby/suspend/off" did not trigger it.
I think the patch did what it was intended to do, but it did not really help. While the GPU reset did seem to work, X still got stuck and was not able to turn the monitors back on.
From the log:
The GPU lockup happened while the system was idle:
Jan 23 23:53:54 thoregon kernel: [17121.080129] radeon 0000:07:00.0: GPU lockup CP stall for more than 10000msec
Jan 23 23:53:54 thoregon kernel: [17121.080137] GPU lockup (waiting for 0x002080B7 last fence id 0x002080B6)
Jan 23 23:53:54 thoregon kernel: [17121.096334] radeon 0000:07:00.0: GPU softreset
Jan 23 23:53:54 thoregon kernel: [17121.096341] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0xA0003028
Jan 23 23:53:54 thoregon kernel: [17121.096346] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Jan 23 23:53:54 thoregon kernel: [17121.096351] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 23 23:53:54 thoregon kernel: [17121.096362] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
Jan 23 23:53:54 thoregon kernel: [17121.111386] radeon 0000:07:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
Jan 23 23:53:54 thoregon kernel: [17121.127378] radeon 0000:07:00.0: R_008010_GRBM_STATUS=0x00003028
Jan 23 23:53:54 thoregon kernel: [17121.127384] radeon 0000:07:00.0: R_008014_GRBM_STATUS2=0x00000002
Jan 23 23:53:54 thoregon kernel: [17121.127390] radeon 0000:07:00.0: R_000E50_SRBM_STATUS=0x200000C0
Jan 23 23:53:54 thoregon kernel: [17121.128393] radeon 0000:07:00.0: GPU reset succeed
Jan 23 23:53:54 thoregon kernel: [17121.133330] [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
Jan 23 23:53:54 thoregon kernel: [17121.133364] radeon 0000:07:00.0: WB enabled
Jan 23 23:53:54 thoregon kernel: [17121.133370] [drm] fence driver on ring 0 use gpu addr 0x20000c00 and cpu addr 0xffff8803286e5c00
Jan 23 23:53:54 thoregon kernel: [17121.179627] [drm] ring test on 0 succeeded in 1 usecs
Jan 23 23:53:54 thoregon kernel: [17121.179653] [drm] ib test on ring 0 succeeded in 1 usecs
There were no messages about X getting stuck ("blocked for more than 120 seconds"), but after trying to access the system and failing, SysRq+W reported this:
Jan 24 08:08:20 thoregon kernel: [46786.741180] SysRq : Show Blocked State
Jan 24 08:08:20 thoregon kernel: [46786.741190] task PC stack pid father
Jan 24 08:08:20 thoregon kernel: [46786.741270] X D ffff880337d50a00 0 3047 3026 0x00400004
Jan 24 08:08:20 thoregon kernel: [46786.741281] ffff880327eacac0 0000000000000086 ffff880327d52e00 0000000000010a00
Jan 24 08:08:20 thoregon kernel: [46786.741292] ffff88031be9bfd8 0000000000010a00 ffff88031be9a000 ffff88031be9bfd8
Jan 24 08:08:20 thoregon kernel: [46786.741301] 0000000000010a00 ffff880327eacac0 0000000000010a00 0000000000010a00
Jan 24 08:08:20 thoregon kernel: [46786.741310] Call Trace:
Jan 24 08:08:20 thoregon kernel: [46786.741326] [<ffffffff815ee9f7>] ? schedule_timeout+0x157/0x220
Jan 24 08:08:20 thoregon kernel: [46786.741336] [<ffffffff8103fbd0>] ? run_timer_softirq+0x240/0x240
Jan 24 08:08:20 thoregon kernel: [46786.741346] [<ffffffff8133ee39>] ? radeon_fence_wait+0x239/0x3b0
Jan 24 08:08:20 thoregon kernel: [46786.741356] [<ffffffff8104f340>] ? wake_up_bit+0x40/0x40
Jan 24 08:08:20 thoregon kernel: [46786.741364] [<ffffffff81352e07>] ? radeon_ib_get+0x257/0x2e0
Jan 24 08:08:20 thoregon kernel: [46786.741372] [<ffffffff81354d7a>] ? radeon_cs_ioctl+0x27a/0x4d0
Jan 24 08:08:20 thoregon kernel: [46786.741381] [<ffffffff812f42d4>] ? drm_ioctl+0x3e4/0x490
Jan 24 08:08:20 thoregon kernel: [46786.741389] [<ffffffff81354b00>] ? radeon_cs_finish_pages+0xa0/0xa0
Jan 24 08:08:20 thoregon kernel: [46786.741398] [<ffffffff81024769>] ? do_page_fault+0x199/0x420
Jan 24 08:08:20 thoregon kernel: [46786.741406] [<ffffffff810af30c>] ? mmap_region+0x1dc/0x570
Jan 24 08:08:20 thoregon kernel: [46786.741414] [<ffffffff810de446>] ? do_vfs_ioctl+0x96/0x4e0
Jan 24 08:08:20 thoregon kernel: [46786.741422] [<ffffffff810de8d9>] ? sys_ioctl+0x49/0x90
Jan 24 08:08:20 thoregon kernel: [46786.741430] [<ffffffff815f1922>] ? system_call_fastpath+0x16/0x1b
I searched my logs for more GPU lockups and noticed that this also happened with 3.2. The first lockup in my logs occurred on Nov 4 under 3.1. But until 3.3-rc1 X was always able to resume normal operation.
My best guess for the cause of the GPU lockups is the upgrade from xf86-video-ati-6.14.2 to 6.14.3, but 3.3-rc1 seems to have an independent bug that prevents X from recovering from a GPU lockup/reset.
Torsten
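The thread now contains two traces of the blocked X process: the hung-task trace from the first report and the SysRq+W trace taken with the patch applied. To compare them quickly, the Python sketch below extracts the symbol names from "Call Trace" lines and flags any symbol that occurs more than once in a single trace. The helper names (frames, repeated) are invented for this illustration. Every frame in these logs is marked with "?", which means the kernel found the address on the stack but does not vouch that it belongs to the live call chain, so the result is a hint about where to look rather than a diagnosis.

```python
import re
from collections import Counter

# Matches frames such as "[<ffffffff81352be2>] ? radeon_ib_get+0x52/0x230".
FRAME_RE = re.compile(r"\[<[0-9a-f]{16}>\] \? (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace_text):
    """Return the symbol names of all frames, innermost first."""
    return FRAME_RE.findall(trace_text)


def repeated(symbols):
    """Return symbols that occur more than once, in first-seen order."""
    counts = Counter(symbols)
    return [name for name in counts if counts[name] > 1]


# Excerpt of the hung-task trace from the first report.
first_trace = """\
[<ffffffff815eee58>] ? __mutex_lock_slowpath+0xc8/0x140
[<ffffffff815eeb4a>] ? mutex_lock+0x1a/0x40
[<ffffffff81352be2>] ? radeon_ib_get+0x52/0x230
[<ffffffff8136e86a>] ? r600_ib_test+0x5a/0x300
[<ffffffff8137246e>] ? rv770_startup+0xf7e/0x1590
[<ffffffff81372d5c>] ? rv770_resume+0x2c/0x90
[<ffffffff8132bd8e>] ? radeon_gpu_reset+0x11e/0x160
[<ffffffff8133ef43>] ? radeon_fence_wait+0x363/0x3b0
[<ffffffff8104f340>] ? wake_up_bit+0x40/0x40
[<ffffffff81352d77>] ? radeon_ib_get+0x1e7/0x230
[<ffffffff81354b4a>] ? radeon_cs_ioctl+0x27a/0x4d0
"""

# Excerpt of the SysRq+W trace taken with the patch applied.
second_trace = """\
[<ffffffff815ee9f7>] ? schedule_timeout+0x157/0x220
[<ffffffff8103fbd0>] ? run_timer_softirq+0x240/0x240
[<ffffffff8133ee39>] ? radeon_fence_wait+0x239/0x3b0
[<ffffffff8104f340>] ? wake_up_bit+0x40/0x40
[<ffffffff81352e07>] ? radeon_ib_get+0x257/0x2e0
[<ffffffff81354d7a>] ? radeon_cs_ioctl+0x27a/0x4d0
"""

for label, trace in (("first", first_trace), ("second", second_trace)):
    symbols = frames(trace)
    print(label, "innermost:", symbols[0], "repeated:", repeated(symbols))
```

In the first trace radeon_ib_get shows up twice, once near the ioctl entry and once below radeon_gpu_reset, with mutex_lock as the innermost frames. In the second trace no symbol repeats and the innermost frame is schedule_timeout below radeon_fence_wait.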
On Tue, Jan 24, 2012 at 8:34 AM, Torsten Kaiser just.for.lkml@googlemail.com wrote:
I think the patch did what it was intended to do, but it did not really help. While the GPU reset did seem to work, X still got stuck and was not able to turn the monitors back on.
I found the commit (in xf86-video-ati) that causes the lockups and filed a bug at the xorg bugzilla about it: https://bugs.freedesktop.org/show_bug.cgi?id=45329
But that still leaves the regression in 3.3-rc1: even with Jerome's patch the X server is no longer able to recover from the lockup, as shown by the SysRq+W trace in my previous mail.
Torsten
dri-devel@lists.freedesktop.org