https://bugs.freedesktop.org/show_bug.cgi?id=87682
Bug ID: 87682
Summary: Horizontal lines in radeon driver on kernel 3.15 and upwards
Product: Mesa
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: Drivers/Gallium/r600
Assignee: dri-devel@lists.freedesktop.org
Reporter: qwrules@gmail.com
Created attachment 111292 --> https://bugs.freedesktop.org/attachment.cgi?id=111292&action=edit Youtube video depicting the artefacts.
This is what I get if I run any kernel newer than 3.14: https://www.youtube.com/watch?v=nx2-Fvihzxg
Those artefacts appear as soon as the new kernel is selected in GRUB, and remain after logging into an X session.
The last kernel working without artefacts is v3.15-rc2-trusty for Ubuntu, and 3.14.7 for Arch. All later kernels have those artefacts.
I updated mesa and xorg to the versions from the ppa:oibaf/graphics-drivers PPA on Ubuntu, and to the git versions on Arch, but it did not change a thing. This regression is solely kernel-related.
GPU: Mobility Radeon HD 3200 (RS780M)
Systems tested: Arch, Ubuntu 14.04, 14.10
Kernels affected: 3.15 and onwards (tested up to 3.19-rc1)
lockheed qwrules@gmail.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
Priority |medium  |high
URL      |        |https://www.youtube.com/watch?v=nx2-Fvihzxg
--- Comment #1 from Michel Dänzer michel@daenzer.net --- Can you isolate the kernel change which introduced the problem with git bisect?
--- Comment #2 from Christian König deathsimple@vodafone.de --- Most likely another problem caused by the PLL rework. I would guess it's one of those patches.
--- Comment #3 from lockheed qwrules@gmail.com --- @Michel Dänzer, I can describe the bug in as much detail as I can, but I don't think I have the necessary combination of time and skill to "bisect" a kernel.
However, since I gave the specific kernel version at which the error emerges, it should be enough information for someone with more knowledge to find the cause.
Alex Deucher alexdeucher@gmail.com changed:
What      |Removed              |Added
----------------------------------------------------------------------------
Component |Drivers/Gallium/r600 |DRM/Radeon
Version   |git                  |unspecified
Product   |Mesa                 |DRI
--- Comment #4 from Alex Deucher alexdeucher@gmail.com --- Possibly related to https://bugzilla.kernel.org/show_bug.cgi?id=83461
--- Comment #5 from Thom madeforspam@telfort.nl --- I can confirm this bug:
Laptop: HP 6735s, 2x Turion + RS780M video chip
I see exactly the same artefacts with any kernel version higher than 3.13.
It affects the built-in LVDS but NOT the VGA output.
I tested kernels up to 4.4.0 (to no avail)
I don't know what "git bisect" is, but I am eager to learn. I also dropped a note on https://bugzilla.kernel.org/show_bug.cgi?id=83461
I used the link to Lockheed's video as illustration on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1479136 where I originally filed a bug.
I am in the fortunate position of being able to dedicate this laptop to any test you want to throw at it.
--- Comment #6 from Felix Schwarz felix.schwarz@oss.schwarz.eu --- (In reply to Thom from comment #5)
> I don't know what "git bisect" is but eager to learn.
"Bisecting" is a way to find out which commit caused a specific regression. It involves compiling the Linux kernel from git and testing the compiled versions. If you can find out which commit is the culprit, chances are pretty good that the problem can be fixed quickly.
To learn more about bisecting, I suggest searching for "git bisect".
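As a rough illustration of the idea (not of how git implements it), bisecting is a binary search over history: each test halves the range that can still contain the culprit. The sketch below assumes a strictly linear list of commits and a hypothetical `is_bad` test that stands in for "build this kernel, boot it, look at the screen". Real kernel history contains merges, which `git bisect` handles for you.

```python
from typing import Callable, Sequence, Tuple

def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return the first bad commit and the number of builds tested.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad (a linear history).
    """
    lo, hi = 0, len(commits) - 1   # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1                 # one kernel build and boot per step
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], tests

# Hypothetical history of 1000 commits in which commit 700 introduced the bug.
history = [f"c{i}" for i in range(1000)]
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 700)
print(culprit, tests)
```

Because the range is halved each time, about a thousand candidate commits need only around ten builds.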
--- Comment #7 from Thom madeforspam@telfort.nl --- Ok, I did my first bisect, it worked out well but I encountered something that puzzles me a bit. Here is the last part of the bisect:
3.15.0-rc3-00725-g1465967 bad
Bisecting: 658 revisions left to test after this (roughly 9 steps) [e9dba837640d960f56bef22ff08611955ff8a5b4] Merge tag 'pm+acpi-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
3.15.0-rc2-00219-ge9dba83 bad
Bisecting: 355 revisions left to test after this (roughly 8 steps) [6e66d5dab5d530a368314eb631201a02aabb075d] Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
3.15.0-rc1-00303-g6e66d5d good
Bisecting: 176 revisions left to test after this (roughly 8 steps) [4d0fa8a0f01272d4de33704f20303dcecdb55df1] Merge tag 'gpio-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
3.15.0-rc2-00042-g4d0fa8a good
Bisecting: 99 revisions left to test after this (roughly 7 steps) [76e7745e8e4330fdb30f049303d524261c0b7a2c] Merge tag 'zynq-dt-fixes-for-3.15' of git://git.xilinx.com/linux-xlnx into fixes
3.15.0-rc2-00077-g76e7745 good (how can this be ??)
Bisecting: 49 revisions left to test after this (roughly 6 steps) [92891ed6b1fdb49655f9a071ef2880a567807375] Merge branch 'fixes_for_v3.15' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
3.15.0-rc2-00092-g92891ed bad
Bisecting: 22 revisions left to test after this (roughly 5 steps) [1aae31c8306e5f1bdeafd87b2cd9e3f0df3709e5] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
3.15.0-rc2-00069-g1aae31c bad
Bisecting: 13 revisions left to test after this (roughly 4 steps) [7740fc52105c9e6d2beac389a9ae0ce7138cf5ab] Input: soc_button_array - fix a crash during rmmod
3.14.0-rc4-00065-g7740fc5 good
Bisecting: 6 revisions left to test after this (roughly 3 steps) [3ed9a335cfc64b2c83545f341cdddf2347b12b97] drm/radeon/pm: don't walk the crtc list before it has been initialized (v2)
3.15.0-rc1-00075-g3ed9a33 bad
Bisecting: 3 revisions left to test after this (roughly 2 steps) [c2fb3094669a3205f16a32f4119d0afe40b1a1fd] drm/radeon: improve PLL limit handling in post div calculation
3.15.0-rc1-00071-gc2fb309 bad
Bisecting: 0 revisions left to test after this (roughly 1 step) [24315814239a3fdb306244c99bd076bc79db4ade] drm/radeon: use fixed PPL ref divider if needed
3.15.0-rc1-00070-g2431581 good
c2fb3094669a3205f16a32f4119d0afe40b1a1fd is the first bad commit
commit c2fb3094669a3205f16a32f4119d0afe40b1a1fd
Author: Christian König christian.koenig@amd.com
Date: Sun Apr 20 13:24:32 2014 +0200
drm/radeon: improve PLL limit handling in post div calculation
This improves the PLL parameters when we work at the limits of the allowed ranges.
Signed-off-by: Christian König christian.koenig@amd.com
:040000 040000 5c3ac5ddf911c2c1f8926ecde2d83fdbcd6bb269 4731ceed6e1c149abd6fda6a06318700750f8
So far so good, but what I'm puzzled about is this:
As far as I understand, 3.15.0-rc2-00077-g76e7745 (good) is a later revision than 3.15.0-rc2-00069-g1aae31c (bad) and an earlier revision than 3.15.0-rc2-00092-g92891ed (bad), which doesn't make sense to me.
It is as if someone wrote a patch to improve on 3.15.0-rc1-00071-gc2fb309 but it was reverted afterwards. Is that possible?
--- Comment #8 from Chris Bainbridge chris.bainbridge@gmail.com ---
> As far as I understand; 3.15.0-rc2-00077-g76e7745 is a later revision (good) than 3.15.0-rc2-00069-g1aae31c (bad)
This is not correct. The 77/69 does not imply a linear ordering because of forks:
$ git merge-base --is-ancestor 3.15.0-rc2-00069-g1aae31c 3.15.0-rc2-00077-g76e7745; echo $?
1
Trust git ;-)
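A small sketch of why the counter is not an ordering: the number in a name such as 3.15.0-rc2-00077-g76e7745 counts the commits on top of the tag, and two branches forked from the same tag each get their own count. The graph below is a made-up toy history, not the real kernel graph, and `is_ancestor` only mimics the question that `git merge-base --is-ancestor` answers.

```python
from typing import Dict, List, Set

# Toy history (hypothetical): child -> list of parents.
# "tag" is the tagged release; "a*" and "b*" are two branches forked from it.
PARENTS: Dict[str, List[str]] = {
    "tag": [],
    "a1": ["tag"], "a2": ["a1"],
    "b1": ["tag"], "b2": ["b1"], "b3": ["b2"],
    "merge": ["a2", "b3"],
}

def ancestors(commit: str) -> Set[str]:
    """All commits reachable from `commit`, including itself."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(PARENTS[c])
    return seen

def commits_since_tag(commit: str) -> int:
    """The counter in a describe-style name: commits reachable from
    `commit` that are not reachable from the tag."""
    return len(ancestors(commit) - ancestors("tag"))

def is_ancestor(a: str, b: str) -> bool:
    """True if commit `a` is contained in the history of commit `b`."""
    return a in ancestors(b)

print(commits_since_tag("a2"), commits_since_tag("b3"))
print(is_ancestor("a2", "b3"))     # lower counter, yet not an ancestor
print(is_ancestor("a2", "merge"))  # the merge contains both branches
```

In this toy graph `a2` has the smaller counter but is not part of the history of `b3`, so a build of `b3` can be good even if `a2` is bad.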
> c2fb3094669a3205f16a32f4119d0afe40b1a1fd is the first bad commit
I'm not familiar with this code, but according to the patch the PLL values are printed out:
DRM_DEBUG_KMS("%d - %d, pll dividers - fb: %d.%d ref: %d, post %d\n", freq, *dot_clock_p * 10, *fb_div_p, *frac_fb_div_p, ref_div, post_div);
So I suggest enabling the debug log and comparing that line between a working and a non-working kernel.
It should also be trivial to check out a recent tag and revert the bad commit (there is a conflict, but just delete the avivo_get_fb_ref_div function to resolve it).
--- Comment #9 from Thom madeforspam@telfort.nl --- (In reply to Chris Bainbridge from comment #8)
Thanks for the update, that explains everything. I hardly know git, and before yesterday I didn't even know what git or bisecting was... it's a bit overwhelming.
That is like magic :-) How did you get git to give you the source of that patch so quickly ? (I googled for hours on this stuff without success)
> So suggest enabling debug log and compare those two lines from a working and non-working kernel.
I assume that I have to enable the debug log via a boot option, because I couldn't find anything in menuconfig that wasn't already marked for inclusion. Which boot option do I have to use to enable the right kind (and the right amount) of debug logging? And after that, where do I find the log output?
> It should also be trivial to checkout a recent tag and revert the bad commit
I don't even know yet what that is or how to do that, even after reading the man pages about checkout, tag, revert and commit; but I'm convinced I'll get there in the end ;-)
--- Comment #10 from Thom madeforspam@telfort.nl --- Hmmm.... I'm afraid I have to enable "debug boot parameters" in menuconfig. Which git command do I use to get the source of a specific kernel version lined up, so I can recompile selected kernels for debugging?
--- Comment #11 from Thom madeforspam@telfort.nl --- ok, some results:
PLL-readings on good working compilations:
3.15.0-rc1-00303-g6e66d5d [drm:radeon_compute_pll_avivo] 69300 - 6949, pll dividers - fb: 165.0 ref: 2, post 17
3.15.0-rc2-00042-g4d0fa8a [drm:radeon_compute_pll_avivo] 69300 - 6930, pll dividers - fb: 329.1 ref: 4, post 17
3.15.0-rc2-00077-g76e7745 [drm:radeon_compute_pll_avivo] 69300 - 6930, pll dividers - fb: 329.1 ref: 4, post 17
3.15.0-rc1-00070-g2431581 no output, system hangs while loading the driver in debug mode (probably because this one didn't have the patch yet); works OK when not in debug mode.
PLL-readings on bad noisy-artefacty compilations:
3.15.0-rc2-00069-g1aae31c [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14
3.15.0-rc1-00071-gc2fb309 [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14
3.15.0-rc1-00075-g3ed9a33 [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14
3.15.0-rc2-00092-g92891ed [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14
Problem is: I haven't the slightest clue what it all means.
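For comparing such lines side by side, a small parser can help. This sketch assumes the log format quoted above (target clock, achieved clock, then the feedback, reference and post dividers). The regular expression and the field names are illustrative only and are not part of the driver.

```python
import re
from typing import Dict

# Matches e.g. "69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14"
LINE = re.compile(
    r"(?P<target>\d+) - (?P<clock>\d+), pll dividers - "
    r"fb: (?P<fb>\d+)\.(?P<frac>\d+) ref: (?P<ref>\d+), post (?P<post>\d+)"
)

def parse(line: str) -> Dict[str, int]:
    """Extract the numeric fields from one PLL debug line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"not a PLL debug line: {line!r}")
    return {key: int(value) for key, value in m.groupdict().items()}

# One good and one bad reading, copied from the lists above.
good = parse("[drm:radeon_compute_pll_avivo] 69300 - 6930, "
             "pll dividers - fb: 329.1 ref: 4, post 17")
bad = parse("[drm:radeon_compute_pll_avivo] 69300 - 69290, "
            "pll dividers - fb: 135.5 ref: 2, post 14")

for key in ("target", "fb", "frac", "ref", "post"):
    print(f"{key:>6}: good={good[key]:<6} bad={bad[key]}")
```

Both kernels aim for the same target clock of 69300; what differs is the combination of dividers chosen to reach it.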
--- Comment #12 from Thom madeforspam@telfort.nl --- (In reply to Chris Bainbridge from comment #8)
> So suggest enabling debug log and compare those two lines from a working and non-working kernel.
Done (see previous message) :-)
> It should also be trivial to checkout a recent tag and revert the bad commit
Done :-) I reverted the bad commit on the current 4.6.0-rc6+, tested it, and it worked like a charm!! No display problems anymore.
> (there is a conflict but just delete the avivo_get_fb_ref_div function to resolve it).
I did, and thanks to your directions it all worked out perfectly :-)
Chris Bainbridge chris.bainbridge@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |chris.bainbridge@gmail.com
--- Comment #13 from Chris Bainbridge chris.bainbridge@gmail.com --- This might be https://bugzilla.kernel.org/show_bug.cgi?id=75241 - there is a one-line patch there from Christian König, but it doesn't look like it was ever merged.
--- Comment #14 from Thom madeforspam@telfort.nl --- (In reply to Chris Bainbridge from comment #13)
I did a git fetch origin followed by git reset --hard origin/master to get a plain, unaltered current kernel again (4.6.0-rc7+).
I changed the one line in ./drivers/gpu/drm/radeon/radeon_display.c:
fb_div_max = pll->max_feedback_div;
to:
fb_div_max = min(pll->max_feedback_div, 512u);
according to: https://bugzilla.kernel.org/attachment.cgi?id=142281 (linked from https://bugzilla.kernel.org/show_bug.cgi?id=75241)
and compiled (make && make modules_install install)
Assuming that I did not make a mistake or overlook something, this patch didn't work: lots of noise/artefacts. Timings seem identical to the other "bad" compilations, i.e. nothing changed:
(bootparam drm.debug=4) [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 135.5 ref: 2, post 14
Too bad, but it was absolutely worth trying. I wonder if "fb" and "post" are consistently too low... is that possible?
--- Comment #15 from Thom madeforspam@telfort.nl --- OK, I created a variation of the one-line patch that works without reverting any of the existing code:
This patch prevents fb from going lower than 140, which prevents the noise/snow on the display (for RS780M + LVDS).
diff:
@@ void radeon_compute_pll_avivo(struct radeon_pll *pll,

     /* determine allowed feedback divider range */
-    fb_div_min = pll->min_feedback_div;
+    fb_div_min = max(pll->min_feedback_div, 140u);
     fb_div_max = pll->max_feedback_div;

     if (pll->flags & RADEON_PLL_USE_FRAC_FB_DIV) {
         fb_div_min *= 10;
results in: [drm:radeon_compute_pll_avivo] 69300 - 69290, pll dividers - fb: 271.0 ref: 4, post 14
This "works for me (TM)"
But it would be good if someone could check whether this patch has any "unforeseen consequences". I don't know much about GPU stuff and I am not familiar with the code. (And yes, I know: hardcoding values is definitely "not done".)
--- Comment #16 from Thom madeforspam@telfort.nl --- An fb lower than 140 is possible: my current stock kernel 3.13.0-86 works flawlessly with
[drm:radeon_compute_pll_avivo], 6928, pll dividers - fb: 125.8 ref: 2, post 13
(sigh) I just wish I understood why some modes work and some don't
--- Comment #17 from Chris Bainbridge chris.bainbridge@gmail.com --- Christian König posted an explanation of the PLL divider values at https://bugzilla.kernel.org/show_bug.cgi?id=91861#c12 (another "no screen after 3.15" bug report)
The various fixes adjust the divider value limits slightly for different displays. The basic formula is commented in the radeon_compute_pll_avivo function:
dot_clock = (ref_freq * feedback_div) / (ref_div * post_div)
So by adjusting the limits of those values you can find something that works for your laptop display. But I don't know which solution is technically correct - if you don't get a reply here you could try emailing Christian König and asking.
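As a worked check of that formula against the readings posted earlier in this thread: the sketch below assumes a reference frequency of 14.32 MHz (1432 in 10 kHz units). That value is not stated anywhere in this thread; it is inferred here only because it reproduces the logged clocks. The integer arithmetic mirrors the formula and is not meant as a copy of the driver's code.

```python
REF_FREQ = 1432  # 10 kHz units; an assumption inferred from the logs, not from the thread

def dot_clock(fb: int, frac: int, ref_div: int, post_div: int,
              ref_freq: int = REF_FREQ) -> int:
    """dot_clock = (ref_freq * feedback_div) / (ref_div * post_div), in 10 kHz units.

    The feedback divider is given as a whole part `fb` and tenths `frac`,
    matching the "fb: 135.5" notation in the debug output.
    """
    fb_tenths = fb * 10 + frac
    return (ref_freq * fb_tenths) // (ref_div * post_div * 10)

# (label, fb, frac, ref, post) taken from log lines quoted in this thread
readings = [
    ("good, 3.15.0-rc2-00042", 329, 1, 4, 17),
    ("bad, 3.15.0-rc1-00071", 135, 5, 2, 14),
    ("with fb_div_min clamped to 140", 271, 0, 4, 14),
]
for label, fb, frac, ref, post in readings:
    print(f"{label}: {dot_clock(fb, frac, ref, post) * 10} kHz")
```

Under that assumption the good dividers give 69300 kHz and the bad ones 69290 kHz. Both are within 10 kHz of the target, which suggests that the resulting frequency alone does not separate working from broken settings.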
--- Comment #18 from Thom madeforspam@telfort.nl --- (In reply to Chris Bainbridge from comment #17)
> if you don't get a reply here you could try emailing Christian König and asking.
I did, and Christian responded almost instantly, so I will be busy for quite a while with testing. Don't close this bug yet....work in progress :-)
Thom madeforspam@telfort.nl changed:
What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
Resolution |---     |FIXED
--- Comment #19 from Thom madeforspam@telfort.nl --- Patch submitted by Christian König
https://lists.freedesktop.org/archives/dri-devel/2016-June/110724.html
This solved the bug. Thanks everyone for all the help.
--- Comment #20 from Gilbert Smith gjsmith3rd@gmail.com --- I have this same problem with an upgrade from 14.04 LTS to 16.04 LTS Ubuntu -
Linux DV7 4.6.4-040604-generic #201607111332 SMP Mon Jul 11 17:34:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
01:05.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] RS780M [Mobility Radeon HD 3200] [1002:9612] (prog-if 00 [VGA controller])
I noticed that a patch was submitted. Can I expect to see this in a future kernel or perhaps a RC version after my 4.6.4 kernel?
--- Comment #21 from Thom madeforspam@telfort.nl --- Gilbert, same with me, also ubuntu 14.04 -> 16.04.
The patch is already in the 4.7+ kernel tree so it should be in the first 4.7 kernel (pre) release.
I'm not familiar with Ubuntu's kernel policy and I don't know anyone who is, but I guess that the 4.7 kernel will land in 16.10 or 17.04. Best to ask the Ubuntu kernel team.
--- Comment #22 from Thom madeforspam@telfort.nl --- addendum:
https://github.com/torvalds/linux/commit/9ef8537e68941d858924a3eacee5a194576...
i.e. kernel 4.7-rc4 and up
--- Comment #23 from Gilbert Smith gjsmith3rd@gmail.com --- (In reply to Thom from comment #21)
Thank you for the helpful information. I'll probably stay on the LTS 16.04, but as soon as I get wind of the release of kernel 4.7+ I will install it.
I was able to get my system working properly by reverting to kernel 3.13.0-92-generic.
Here's a link to a discussion I found which states that users who upgraded may use older kernels from 12.04 and 14.04 on 16.04, even if they are not supported.
http://askubuntu.com/questions/776910/install-old-kernel-in-ubuntu-16-04/801...
--- Comment #24 from Gilbert Smith gjsmith3rd@gmail.com --- (In reply to Thom from comment #21)
I just installed the new kernel 4.7.0-040700-generic but it didn't fix the display problem. Reverting back to 3.13.0-92-generic. :(
--- Comment #25 from Thom madeforspam@telfort.nl --- AFAIK the patch is in since 4.7-RC4. Could it be that your version is older ?
see also: https://github.com/torvalds/linux/commit/9ef8537e68941d858924a3eacee5a194576...