https://bugzilla.kernel.org/show_bug.cgi?id=52491
Summary: radeon massive screen corruption BARTS HD6870
Product: Drivers
Version: 2.5
Kernel Version: 3.8-rc
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: Video(DRI - non Intel)
AssignedTo: drivers_video-dri@kernel-bugs.osdl.org
ReportedBy: maxijac@free.fr
Regression: No
Created an attachment (id=90801) --> (https://bugzilla.kernel.org/attachment.cgi?id=90801) return to desktop after bad rendering
Hi,
I'm experiencing a major rendering corruption with linux 3.8 and my HD 6870.
software:
latest kernel from linus as of 2013-08-01
latest mesa git as of 2013-08-01
latest llvm from tstellar git as of 2013-08-01
latest DDX from git as of 2013-08-01
libdrm 2.4.40
Symptoms:
I triggered this several times running Heroes of Newerth. When a match starts, sometimes the textures are all black, and sometimes my cursor is missing. (It looks like my mesa builds with LLVM enabled for the GLSL compiler trigger the black textures more often.)
When this happens and I quit the game and return to my desktop, everything is garbled and things do not refresh correctly. See screenshot.
Keeping the same userland and just downgrading to linux 3.7 solves everything.
Nothing gets added to dmesg...
I don't have much time to bisect this. I'll try as soon as I can, but it won't be for a few days, so if someone has similar hardware, please try to reproduce it. HoN is free to play and runs natively on linux. (http://www.heroesofnewerth.com)
--- Comment #1 from Bruno J. maxijac@free.fr 2013-01-08 18:45:10 --- Created an attachment (id=90811) --> (https://bugzilla.kernel.org/attachment.cgi?id=90811) When the rendering is bad inside the game
right before I quit the game and all the bad stuff happens.
Bruno J. maxijac@free.fr changed:
What |Removed |Added
----------------------------------------------------------------------------
Regression|No |Yes
--- Comment #2 from Bruno J. maxijac@free.fr 2013-01-10 23:00:51 --- Still happening in rc3
--- Comment #3 from Bruno J. maxijac@free.fr 2013-01-12 18:27:13 --- I bisected it, though it looks like the behavior changed halfway through: I was getting kernel crashes for this commit, for example.
d2ead3eaf8a4bf92129eda69189ce18a6c1cc8bd is the first bad commit
commit d2ead3eaf8a4bf92129eda69189ce18a6c1cc8bd
Author: Alex Deucher alexander.deucher@amd.com
Date: Thu Dec 13 09:55:45 2012 -0500
drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2)
Allows us to use the DMA ring from userspace. DMA doesn't have a good NOP packet in which to embed the reloc idx, so userspace has to add a reloc for each buffer used and order them to match the command stream.
v2: fix address bounds checking
Signed-off-by: Alex Deucher alexander.deucher@amd.com
:040000 040000 7183de0d56e5c01b40775244d5dc4b5441406786 f3abce52c375cc4598cd23739df825771e6fb46e M drivers
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |alexdeucher@gmail.com
--- Comment #4 from Alex Deucher alexdeucher@gmail.com 2013-01-12 22:27:22 --- Probably a bad bisect. That commit just enables userspace accel drivers to utilize the DMA engine, but the userspace drivers do not take advantage of that yet so the code is currently never called.
--- Comment #5 from Bruno J. maxijac@free.fr 2013-01-12 22:41:42 --- Hm, I was expecting something like that. I'm experiencing 2 bugs in 1. This one was a plain crash, while the first one I opened the report about was a corruption but no crash.
I may re-bisect the kernel and flag "good" when it just crashes but no corruption ?
--- Comment #6 from Bruno J. maxijac@free.fr 2013-01-12 23:15:25 --- After further testing, it seems that there are 2 bugs, as I said. The first one I'm seeing (and the one the report is about) seems to be caused by commit dd54fee7d440c4a9756cce2c24a50c15e4c17ccb.
The crash I also got probably comes from an older commit, but as the crash is pretty random, I got the bisect wrong.
Seeing the commit message of dd54fee7d440c4a9756cce2c24a50c15e4c17ccb, it fixes a kernel crash that looks like mine, I'll attach a screenshot of mine (poor quality :/)
==> So maybe dd54fee DID fix the kernel crash but replaced it with the corruption I'm seeing ?
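A toy model may help show how a bisect can land on a fix commit when a crash hides a second bug. The Python sketch below is not the real kernel history: the commit indices and the names `observe` and `first_bad` are made up for illustration, and it assumes the crash is deterministic (in this report it was random, which makes a bisect even less reliable). It also assumes the corruption starts inside the range of commits where the crash is present.

```python
from typing import Callable

# Hypothetical linear history of 16 commits (indices 0..15).
N = 16
CRASH_FROM = 3       # assumed: first commit that crashes
CORRUPT_FROM = 7     # assumed: first commit that corrupts rendering
CRASH_FIXED_AT = 11  # assumed: commit that fixes the crash


def observe(i: int) -> str:
    """What a tester would see when running commit index i."""
    if CRASH_FROM <= i < CRASH_FIXED_AT:
        return "crash"    # the crash hides any corruption
    if i >= CORRUPT_FROM:
        return "corrupt"
    return "ok"


def first_bad(n: int, is_bad: Callable[[int], bool]) -> int:
    """Binary search, assuming commit 0 is good and commit n-1 is bad."""
    lo, hi = 0, n - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Policy 1: anything that is not "ok" is flagged bad.
crash_as_bad = first_bad(N, lambda i: observe(i) != "ok")
# Policy 2: a crash without visible corruption is flagged good.
crash_as_good = first_bad(N, lambda i: observe(i) == "corrupt")

print("crash flagged bad  ->", crash_as_bad)
print("crash flagged good ->", crash_as_good)
```

In this model, flagging a crash as bad reports the commit that introduced the crash, and flagging it as good reports the commit that fixed the crash. The commit that introduced the corruption is never reported, because every commit between it and the crash fix crashes before the corruption can be seen.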
--- Comment #7 from Bruno J. maxijac@free.fr 2013-01-12 23:18:59 --- Created an attachment (id=91101) --> (https://bugzilla.kernel.org/attachment.cgi?id=91101) kernel crash I got from older commits while bisecting
--- Comment #8 from Michel Dänzer michel@daenzer.net 2013-01-15 14:36:15 --- (In reply to comment #6)
==> So maybe dd54fee DID fix the kernel crash but replaced it with the corruption I'm seeing ?
Does the corruption also occur with dd54fee7d440c4a9756cce2c24a50c15e4c17ccb applied manually on top of 0d0b3e7443bed6b49cb90fe7ddc4b5578a83a88d?
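One way to set up this test is to check out the base commit and cherry-pick the fix on top of it. The Python sketch below only builds and prints the git commands rather than running them; the helper name `apply_on_top` is hypothetical, and it assumes a kernel git checkout that contains both commits.

```python
import shlex
from typing import List

BASE = "0d0b3e7443bed6b49cb90fe7ddc4b5578a83a88d"
FIX = "dd54fee7d440c4a9756cce2c24a50c15e4c17ccb"


def apply_on_top(base: str, fix: str) -> List[List[str]]:
    """Return git commands that detach HEAD at `base`, then cherry-pick `fix`."""
    return [
        ["git", "checkout", base],
        ["git", "cherry-pick", fix],
    ]


# Print the commands so they can be reviewed before being run by hand.
for argv in apply_on_top(BASE, FIX):
    print(shlex.join(argv))
```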
--- Comment #9 from Alex Deucher alexdeucher@gmail.com 2013-01-15 17:37:44 --- Does reverting the following commit fix the issue?
commit d025e9e2b890db679f1246037bf65bd4be512627
Author: Jerome Glisse jglisse@redhat.com
Date: Thu Nov 29 10:35:41 2012 -0500
drm/radeon: do not move bo to different placement at each cs
The bo creation placement is where the bo will be. Instead of trying to move bo at each command stream let this work to another worker thread that will use more advance heuristic.
agd5f: remove leftover unused variable
Signed-off-by: Jerome Glisse jglisse@redhat.com
Reviewed-by: Alex Deucher alexander.deucher@amd.com
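A minimal sketch of the revert, assuming a kernel git checkout on the affected 3.8-rc tree. The Python below only builds and prints the command; the helper name `revert_command` is hypothetical, and `--no-edit` is used so git keeps its default revert message.

```python
import shlex
from typing import List

SUSPECT = "d025e9e2b890db679f1246037bf65bd4be512627"


def revert_command(commit: str) -> List[str]:
    """Return the git command that creates a commit undoing `commit`."""
    return ["git", "revert", "--no-edit", commit]


# Print the command so it can be reviewed before being run by hand.
print(shlex.join(revert_command(SUSPECT)))
```

After the revert, the kernel has to be rebuilt and booted before the game is retested.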
--- Comment #10 from Bruno Jacquet maxijac@free.fr 2013-01-15 19:26:08 --- (In reply to comment #8)
(In reply to comment #6)
==> So maybe dd54fee DID fix the kernel crash but replaced it with the corruption I'm seeing ?
Does the corruption also occur with dd54fee7d440c4a9756cce2c24a50c15e4c17ccb applied manually on top of 0d0b3e7443bed6b49cb90fe7ddc4b5578a83a88d?
With g0d0b3e7 plus patch dd54fee7d applied, I see no corruption.
--- Comment #11 from Bruno Jacquet maxijac@free.fr 2013-01-15 19:38:05 --- (In reply to comment #9)
Does reverting the following commit fix the issue?
commit d025e9e2b890db679f1246037bf65bd4be512627
Author: Jerome Glisse jglisse@redhat.com
Date: Thu Nov 29 10:35:41 2012 -0500
drm/radeon: do not move bo to different placement at each cs
The bo creation placement is where the bo will be. Instead of trying to move bo at each command stream let this work to another worker thread that will use more advance heuristic.
agd5f: remove leftover unused variable
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
It does fix the corruption.
--- Comment #12 from Alex Deucher alexdeucher@gmail.com 2013-01-15 20:57:27 --- Same issue as: https://bugs.freedesktop.org/show_bug.cgi?id=58659
--- Comment #13 from Jérôme Glisse glisse@freedesktop.org 2013-01-16 22:31:46 --- Created an attachment (id=91421) --> (https://bugzilla.kernel.org/attachment.cgi?id=91421) Exclude system placement
Does applying this patch without reverting anything fix the issue ?
Jérôme Glisse glisse@freedesktop.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |glisse@freedesktop.org
--- Comment #14 from Jérôme Glisse glisse@freedesktop.org 2013-01-17 00:20:44 --- Better to try this patch instead first : http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placeme...
--- Comment #15 from Bruno Jacquet maxijac@free.fr 2013-01-17 19:26:36 --- (In reply to comment #14)
Better to try this patch instead first : http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placeme...
With this patch, my game froze before I could even check the rendering. My cursor still moved, I could switch to tty1. I checked dmesg : nothing added. I went back to tty7 (X) and then it was stuck there.
Florian Mickler florian@mickler.org changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |florian@mickler.org
--- Comment #16 from Florian Mickler florian@mickler.org 2013-01-26 10:50:36 --- A patch referencing this bug report has been merged in Linux v3.8-rc5:
commit 20707874fd4fd37e09513f508e642fa8bd06365a
Author: Alex Deucher alexander.deucher@amd.com
Date: Thu Jan 17 13:10:50 2013 -0500
Revert "drm/radeon: do not move bo to different placement at each cs"
--- Comment #17 from Bruno Jacquet maxijac@free.fr 2013-01-27 14:04:31 --- Indeed, linux 3.8-rc5 with no patch applied is working now, I see no corruption.
Bruno Jacquet maxijac@free.fr changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution| |CODE_FIX
--- Comment #18 from Bruno Jacquet maxijac@free.fr 2013-03-05 14:41:47 --- Final 3.8 is working