https://bugzilla.kernel.org/show_bug.cgi?id=42172
Summary: WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x39f/0x3d0()
Product: Drivers
Version: 2.5
Kernel Version: 3.1-rc3
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
AssignedTo: drivers_video-dri@kernel-bugs.osdl.org
ReportedBy: nissarin@gmail.com
Regression: Yes
Basically, some time after login I experience a GPU stall. After that, the X environment becomes unusable due to further errors ("couldn't schedule IB"), so I have to switch to a console to reboot the PC.
If I'm not mistaken, I hit this as soon as I tried 3.1 (rc2, perhaps even rc1?), but at this point I'm not sure. Kernel 3.0 (+ some stuff from airlied: radeon-testing/fixes) works fine.
I've no idea how to reproduce this. Usually I'm running only Firefox and/or Thunderbird and some terminal windows (ssh sessions, rtorrent). It can happen a few minutes after login or after a few hours; the one below actually took some time, as normally it happens within an hour.
I'm using:
- Radeon 6850
- Gentoo/AMD64 system
- Xfce/xfwm v4.8.1 w/ compositor enabled
- xorg-server 1.10.4/1.11
- ati-driver (r600g), libdrm and mesa from git
I could try to bisect this, but due to the nature of the bug I don't know how long it would take.
Sep 01 15:06:00 [kernel] [15000.167025] radeon 0000:01:00.0: GPU lockup CP stall for more than 10036msec
Sep 01 15:06:00 [kernel] [15000.167032] ------------[ cut here ]------------
Sep 01 15:06:00 [kernel] [15000.167046] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:267 radeon_fence_wait+0x39f/0x3d0()
Sep 01 15:06:00 [kernel] [15000.167053] Hardware name: GA-MA78G-DS3H
Sep 01 15:06:00 [kernel] [15000.167058] GPU lockup (waiting for 0x000A529F last fence id 0x000A529C)
Sep 01 15:06:00 [kernel] [15000.167062] Modules linked in: reiserfs
Sep 01 15:06:00 [kernel] [15000.167073] Pid: 3280, comm: X Not tainted 3.1.0-rc4-00131-g9e79e3e #242
Sep 01 15:06:00 [kernel] [15000.167078] Call Trace:
Sep 01 15:06:00 [kernel] [15000.167092] [<ffffffff810694ab>] ? warn_slowpath_common+0x7b/0xc0
Sep 01 15:06:00 [kernel] [15000.167101] [<ffffffff810695a5>] ? warn_slowpath_fmt+0x45/0x50
Sep 01 15:06:00 [kernel] [15000.167111] [<ffffffff812ea3cf>] ? radeon_fence_wait+0x39f/0x3d0
Sep 01 15:06:00 [kernel] [15000.167119] [<ffffffff81084be0>] ? wake_up_bit+0x40/0x40
Sep 01 15:06:00 [kernel] [15000.167129] [<ffffffff812b47ef>] ? ttm_bo_wait+0x10f/0x1b0
Sep 01 15:06:00 [kernel] [15000.167139] [<ffffffff8130413f>] ? radeon_gem_wait_idle_ioctl+0x8f/0x110
Sep 01 15:06:00 [kernel] [15000.167147] [<ffffffff8129d4e1>] ? drm_ioctl+0x401/0x4a0
Sep 01 15:06:00 [kernel] [15000.167156] [<ffffffff813040b0>] ? radeon_gem_set_tiling_ioctl+0xb0/0xb0
Sep 01 15:06:00 [kernel] [15000.167164] [<ffffffff810773a8>] ? set_current_blocked+0x38/0x60
Sep 01 15:06:00 [kernel] [15000.167172] [<ffffffff81031d2a>] ? do_signal+0x21a/0x770
Sep 01 15:06:00 [kernel] [15000.167181] [<ffffffff8110da7c>] ? do_vfs_ioctl+0x9c/0x540
Sep 01 15:06:00 [kernel] [15000.167188] [<ffffffff810773a8>] ? set_current_blocked+0x38/0x60
Sep 01 15:06:00 [kernel] [15000.167195] [<ffffffff810324f8>] ? sys_rt_sigreturn+0x1e8/0x200
Sep 01 15:06:00 [kernel] [15000.167203] [<ffffffff8110df69>] ? sys_ioctl+0x49/0x80
Sep 01 15:06:00 [kernel] [15000.167212] [<ffffffff815d6b7b>] ? system_call_fastpath+0x16/0x1b
Sep 01 15:06:00 [kernel] [15000.167218] ---[ end trace f6bfd0dc5ce37413 ]---
Sep 01 15:06:00 [kernel] [15000.168398] radeon 0000:01:00.0: GPU softreset
Sep 01 15:06:00 [kernel] [15000.168405] radeon 0000:01:00.0: GRBM_STATUS=0xA0003828
Sep 01 15:06:00 [kernel] [15000.168410] radeon 0000:01:00.0: GRBM_STATUS_SE0=0x00000007
Sep 01 15:06:00 [kernel] [15000.168416] radeon 0000:01:00.0: GRBM_STATUS_SE1=0x00000007
Sep 01 15:06:00 [kernel] [15000.168422] radeon 0000:01:00.0: SRBM_STATUS=0x20020EC0
Sep 01 15:06:00 [kernel] [15000.345258] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.345265] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x00007F6B
Sep 01 15:06:00 [kernel] [15000.345372] radeon 0000:01:00.0: GRBM_STATUS=0x00003828
Sep 01 15:06:00 [kernel] [15000.345377] radeon 0000:01:00.0: GRBM_STATUS_SE0=0x00000007
Sep 01 15:06:00 [kernel] [15000.345383] radeon 0000:01:00.0: GRBM_STATUS_SE1=0x00000007
Sep 01 15:06:00 [kernel] [15000.345388] radeon 0000:01:00.0: SRBM_STATUS=0x200206C0
Sep 01 15:06:00 [kernel] [15000.346396] radeon 0000:01:00.0: GPU reset succeed
Sep 01 15:06:00 [kernel] [15000.556795] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.744306] radeon 0000:01:00.0: Wait for MC idle timedout !
Sep 01 15:06:00 [kernel] [15000.747925] radeon 0000:01:00.0: WB enabled
Sep 01 15:06:00 [kernel] [15000.969041] [drm:r600_ring_test] *ERROR* radeon: ring test failed (scratch(0x8504)=0xCAFEDEAD)
Sep 01 15:06:00 [kernel] [15000.969049] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
Sep 01 15:06:00 [kernel] [15000.977509] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(15).
Sep 01 15:06:00 [kernel] [15000.977518] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.982304] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(0).
Sep 01 15:06:00 [kernel] [15000.982307] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.984280] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(1).
Sep 01 15:06:00 [kernel] [15000.984283] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
Sep 01 15:06:00 [kernel] [15000.984819] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(2).
Sep 01 15:06:00 [kernel] [15000.984821] [drm:radeon_cs_ioctl] *ERROR* Failed to schedule IB !
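For anyone comparing logs like the one above across several lockups, here is a minimal sketch of pulling the two fence IDs out of the "GPU lockup (waiting for ... last fence id ...)" line. The function name find_lockups and the sample list are made up for illustration, and the regular expression only assumes the message format shown in this report; other kernel versions may word it differently.

```python
import re

# Matches the warning text as it appears in the log in this report.
# Assumption: the message format is exactly the one quoted above.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for 0x([0-9A-Fa-f]+) last fence id 0x([0-9A-Fa-f]+)\)"
)


def find_lockups(lines):
    """Return a list of (waiting_for, last_fence, difference) tuples."""
    results = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            waiting_for = int(match.group(1), 16)
            last_fence = int(match.group(2), 16)
            results.append((waiting_for, last_fence, waiting_for - last_fence))
    return results


# Two lines copied from the log above; only the first one should match.
sample = [
    "Sep 01 15:06:00 [kernel] [15000.167058] GPU lockup "
    "(waiting for 0x000A529F last fence id 0x000A529C)",
    "Sep 01 15:06:00 [kernel] [15000.168398] radeon 0000:01:00.0: GPU softreset",
]

lockups = find_lockups(sample)
for waiting_for, last_fence, difference in lockups:
    print(f"waiting for {waiting_for:#x}, last fence {last_fence:#x}, "
          f"difference {difference}")
```

To scan a real log, pass an open file object instead of the sample list; which file holds kernel messages depends on the syslog setup.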
Niels Ole Salscheider niels_ole@salscheider-online.de changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |niels_ole@salscheider-online.de
--- Comment #1 from Niels Ole Salscheider niels_ole@salscheider-online.de 2011-09-01 15:45:56 ---
This might be the same as bug 42162, which I bisected. You could check whether commit b03e7495a862b028294f59fc87286d6d78ee7fa1 is the first bad commit...
--- Comment #2 from nissarin@gmail.com 2011-09-01 16:42:41 ---
OK, after some "extensive" testing (as in glxgears x 20, duh), b03e7495a862b028294f59fc87286d6d78ee7fa1 "crashed" near the 30-minute mark. Currently I'm running 5f66d2b58ca879e70740c82422354144845d6dd3, let's see what happens.
As a side note, I've noticed you are also using a Gigabyte mobo... I might be oversensitive here, but I've encountered some strange bugs before, bugs which can be annoying, yet I haven't seen many people reporting them.
--- Comment #3 from nissarin@gmail.com 2011-09-02 00:59:20 ---
More than 8 hours have passed without a single glitch (same as before: glxgears, web browser, wesnoth, etc.), so yeah, it appears that b03e7495a862b028294f59fc87286d6d78ee7fa1 is most likely the cause.
Alex Deucher alexdeucher@gmail.com changed:
What    |Removed                     |Added
----------------------------------------------------------------------------
CC      |                            |alexdeucher@gmail.com
--- Comment #4 from Alex Deucher alexdeucher@gmail.com 2011-09-02 05:02:08 ---
Should mark this as a dupe of bug 42162 then. Does the patch on bug 42162 help?
--- Comment #5 from nissarin@gmail.com 2011-09-02 09:52:26 ---
Yes, it seems to be the same issue. I'm currently testing the patch and will let you know later whether it works for me.
nissarin@gmail.com changed:
What        |Removed                     |Added
----------------------------------------------------------------------------
Status      |NEW                         |RESOLVED
Resolution  |                            |DUPLICATE
--- Comment #6 from nissarin@gmail.com 2011-09-03 01:18:38 ---
The patch works, thanks.
*** This bug has been marked as a duplicate of bug 42162 ***
dri-devel@lists.freedesktop.org