https://bugs.freedesktop.org/show_bug.cgi?id=89980
Bug ID: 89980
Summary: [Regression] Graphical corruption after resuming from suspend (w/ dual monitor configuration)
Product: DRI
Version: XOrg git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Radeon
Assignee: dri-devel@lists.freedesktop.org
Reporter: falaca@gmail.com
Created attachment 115013 --> https://bugs.freedesktop.org/attachment.cgi?id=115013&action=edit Xorg log
This is a regression: Bug is reproducible in Ubuntu 15.04 Beta (Xorg 1.17.1, linux 3.19) and Fedora 22 Alpha (Xorg 1.17.1, linux 4.0) with a Radeon R7 260X video card.
I first noticed the issue in Ubuntu 14.04 after I upgraded from Xorg 1.16 to 1.17.
The problem does not occur with fglrx.
Symptoms: Checkerboard tearing pattern begins to occur in approximately the top 1/8 of the display after resuming from suspend, and does not resolve itself until a reboot.
Demonstration of bug in Ubuntu w/ Unity (highlighting menu entries with the mouse): https://www.dropbox.com/s/ez2v03oetppecgx/VID_20150324_020612.mp4?dl=0
Demonstration of bug in Fedora w/ Gnome 3 (maximizing/restoring a window): https://www.dropbox.com/s/85n2iq27zm00dlo/VID_20150410_033406.mp4?dl=0
Steps to reproduce:
I can only reproduce this when I have 2 displays connected. My primary screen is set to 2560x1440, and the secondary screen in portrait mode is set to 1200x1920 on the left-hand side. I have the landscape monitor centered with respect to the portrait one, so y = 240 in ~/.config/monitors.xml.
I cannot observe the bug when both screens are aligned at the top, i.e., with y=0 in ~/.config/monitors.xml.
I also cannot observe the bug with a single monitor connected, or with both monitors in landscape mode.
After setting up the monitor configuration, all that needs to be done to reproduce the corruption is to suspend the system, resume, and observe the top portion of the primary (landscape) display when the screen is changing, e.g., it is apparent when watching full-screen movies or minimizing/maximizing windows as demonstrated in my demo video.
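For reference, the y = 240 offset mentioned above follows from centering the 1440-pixel-tall landscape screen against the 1920-pixel-tall portrait screen. A minimal Python sketch of that arithmetic (the function name is hypothetical and only for illustration; it is not part of any tool used in this report):

```python
def centered_y_offset(tall_height: int, short_height: int) -> int:
    """Return the y offset that vertically centers the shorter screen
    against the taller one. Hypothetical helper for illustration only."""
    if short_height > tall_height:
        raise ValueError("short_height must not exceed tall_height")
    return (tall_height - short_height) // 2


portrait_height = 1920   # height of the 1200x1920 portrait display
landscape_height = 1440  # height of the 2560x1440 landscape display

y = centered_y_offset(portrait_height, landscape_height)
print(y)  # 240
```

With both screens aligned at the top, the same layout would simply use y = 0.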
Attached: Xorg log and dmesg (w/ kernel parameter drm.debug=14) saved after a suspend/resume cycle
--- Comment #1 from falaca@gmail.com --- Created attachment 115014 --> https://bugs.freedesktop.org/attachment.cgi?id=115014&action=edit dmesg
falaca@gmail.com changed:
What    |Removed  |Added
----------------------------------------------------------------------------
Hardware|Other    |x86-64 (AMD64)
Version |XOrg git |unspecified
OS      |All      |Linux (All)
--- Comment #2 from Alex Deucher alexdeucher@gmail.com --- Since this is a regression, can you narrow down the component (kernel, mesa, ddx) and bisect?
--- Comment #3 from falaca@gmail.com --- Based on my debugging so far, it seems that mesa is the likely culprit.
Step #1: Fresh install of Ubuntu 14.04 (installed on a new, clean partition):
Linux 3.13 kernel -> Manually upgraded to Linux 4.0 mainline
Xorg 1.15.1
Mesa 10.1.3
xf86-video-ati 7.3
libdrm 2.4.56
Status: The bug is not present, which confirms that the issue is not with the kernel.
----------
Step #2: Installed Ubuntu 14.04.2 hardware enablement stack:
Linux 4.0 (mainline)
Xorg 1.16.0
Mesa 10.3.2
xf86-video-ati 7.4
libdrm 2.4.56
Status: The bug is present.
----------
Step #3: I reverted to the original 14.04 packages, but compiled xf86-video-ati 7.4 from git.
Status: The bug is not present, which confirms (?) that the issue is not with the ddx.
----------
Step #4: I had trouble getting Ubuntu to work with Mesa compiled from git (whenever I try to log in, I just get kicked back to the lightdm greeter), and I couldn't upgrade Mesa from the Ubuntu repo without also upgrading Xorg, so I upgraded Mesa from Oibaf PPA:
Linux 4.0 (mainline)
Xorg 1.15.1
Mesa 10.6 (oibaf-ppa)
xf86-video-ati 7.4 (git) (also tested 7.5.99 from oibaf-ppa)
libdrm 2.4.60 (oibaf-ppa)
Status: The bug is present. So it seems likely that the bug was introduced somewhere between Mesa 10.1.3 and 10.3.2.
If I can figure out how to get Ubuntu to play nice with mainline Mesa compiled from git (maybe if I figure out how to apply the Ubuntu patches), I can do a bisect, but that's where I'm stuck as of now.
--- Comment #4 from falaca@gmail.com --- I have found the bad commit. I will attach my bisect log.
I bisected between 10.1-branchpoint and 10.2-branchpoint, and here's the final result:
4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
Author: Marek Olšák marek.olsak@amd.com
Date: Mon Feb 10 01:25:54 2014 +0100

    r600g,radeonsi: set correct initial domain for shared resources

:040000 040000 eafa3cdc6eea908c6ba8861f3d063f6a3161217b 7938f0ed0cdf8c677af35f1b2e67739dc210bda8 M src
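
For readers unfamiliar with bisecting: the search above is a binary search over the commit history between a known-good and a known-bad point. The following Python sketch illustrates the idea only; it is not how git implements it, and the commit names and the `is_bad` check are made up for the example (in this report the "check" was a manual build, suspend/resume, and visual inspection).

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # first bad commit is at mid or earlier
        else:
            good = mid   # first bad commit is after mid
    return commits[bad]


# Hypothetical history of 16 commits where the regression enters at c11.
history = [f"c{i}" for i in range(16)]
result = first_bad_commit(history, lambda c: int(c[1:]) >= 11)
print(result)  # c11
```

Each test halves the remaining range, which is why a range spanning many commits can be narrowed down in a small number of build-and-test rounds.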
--- Comment #5 from falaca@gmail.com --- Created attachment 115071 --> https://bugs.freedesktop.org/attachment.cgi?id=115071&action=edit Mesa bisect log
Michel Dänzer michel@daenzer.net changed:
What    |Removed  |Added
----------------------------------------------------------------------------
CC      |         |maraeo@gmail.com
--- Comment #6 from Michel Dänzer michel@daenzer.net --- (In reply to falaca from comment #4)
> 4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
> commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
> Author: Marek Olšák marek.olsak@amd.com
> Date: Mon Feb 10 01:25:54 2014 +0100
>
>     r600g,radeonsi: set correct initial domain for shared resources
Weird. Marek, any ideas?
--- Comment #7 from falaca@gmail.com --- Created attachment 115102 --> https://bugs.freedesktop.org/attachment.cgi?id=115102&action=edit Different "checkerboard" corruption
This might be totally unrelated, but since I don't know enough to make that judgement, I thought I should share just in case (otherwise I could make a separate bug report for it).
Basically, I get intermittent checkerboard patterns on my screen, as seen in the 2 screenshots I'm attaching (some areas blacked out by me for privacy). I don't know how to reproduce the patterns - they appear intermittently and seem unrelated to suspend/resume. They either look like "noise", like in the first screenshot, or they are remnants from a previous window that was open, like in the second screenshot.
Both of those screenshots are from the portrait display (unlike the behaviour from the video I posted in the original bug report, which only happens on the landscape display). I can't remember if I've seen this happen on the landscape display so far. I can keep collecting screenshots to see if it's confined to specific areas of the screen.
For what it's worth, I have used Catalyst 14.12 for a couple of months with this card, and didn't observe this type of behaviour.
--- Comment #8 from Michel Dänzer michel@daenzer.net --- (In reply to falaca from comment #0)
> I can only reproduce this when I have 2 displays connected. My primary screen is set to 2560x1440, and the secondary screen in portrait mode is set to 1200x1920 on the left-hand side. I have the landscape monitor centered with respect to the portrait one, so y = 240 in ~/.config/monitors.xml.
> I cannot observe the bug when both screens are aligned at the top, i.e., with y=0 in ~/.config/monitors.xml.
Have you tried moving the landscape monitor to y = 0 and back to y = 240 after suspend/resume, while the session is up? Does that fix the problem, or does it stay corrupted?
--- Comment #9 from falaca@gmail.com --- (In reply to Michel Dänzer from comment #8)
> (In reply to falaca from comment #0)
> > I can only reproduce this when I have 2 displays connected. My primary screen is set to 2560x1440, and the secondary screen in portrait mode is set to 1200x1920 on the left-hand side. I have the landscape monitor centered with respect to the portrait one, so y = 240 in ~/.config/monitors.xml.
> > I cannot observe the bug when both screens are aligned at the top, i.e., with y=0 in ~/.config/monitors.xml.
>
> Have you tried moving the landscape monitor to y = 0 and back to y = 240 after suspend/resume, while the session is up? Does that fix the problem, or does it stay corrupted?
I just tried right now, and it doesn't make a difference. But you know what, it turns out that it still *does* happen when y=0, it's just that it's a little less noticeable to me, e.g., I'm having trouble seeing it when maximizing a window, but I'm still seeing it happen in the menus. This is purely based on my eyesight, so it's hardly scientific, but I could make more videos if desired.
I tried to move my landscape screen further up (above the portrait one, but still overlapping, so I presume that would be y=0 for the landscape screen, but y= positive for the portrait screen). That resulted in X becoming unusable. My landscape screen turned white, and restarting X didn't make things much better - I just got an unusable tiled pattern: https://www.dropbox.com/s/sfrxv4owqchyq75/tiledpattern.jpg?dl=0
I rebooted and tried with linux 3.16, and also with Arch Linux + Gnome 3 + linux 3.19 (or maybe it was 4.0). Same result (white screen). So unfortunately I wasn't able to test out what would happen in that scenario.
Is there anybody else who can test this configuration (dual monitors with a portrait display)? It seems like it doesn't take much effort to break something.
--- Comment #10 from falaca@gmail.com --- I wanted to add that I built Mesa 10.1 from git and installed it on Ubuntu 15.04. Along with Xorg 1.17.1 and the latest DDX compiled from git, I can't observe the bug.
Is there anything else that I can do to help this along? I tried cloning the master branch and just reverting Marek's commit (the one that I narrowed the bug down to with my git bisect), but of course that didn't work since there is other newer code which now depends on that.
I also tried disabling hyperz (since I believe 10.2 turned hyperz on by default), and that had no effect.
--- Comment #11 from Marek Olšák maraeo@gmail.com --- (In reply to Michel Dänzer from comment #6)
> (In reply to falaca from comment #4)
> > 4a5519f1e019dbf1103e4f3abe0a695637a87518 is the first bad commit
> > commit 4a5519f1e019dbf1103e4f3abe0a695637a87518
> > Author: Marek Olšák marek.olsak@amd.com
> > Date: Mon Feb 10 01:25:54 2014 +0100
> >
> >     r600g,radeonsi: set correct initial domain for shared resources
>
> Weird. Marek, any ideas?
Sorry, no. The commit just obtains the initial domain from the kernel, so that it can use it for command submission. The idea of the commit is that the driver shouldn't move imported buffers to a domain that is different from the domain where the buffer was originally created.
--- Comment #12 from falaca@gmail.com --- Is there any sort of debugging trace that I can collect, to objectively compare the difference in behaviour before and after a suspend?
falaca@gmail.com changed:
What      |Removed  |Added
----------------------------------------------------------------------------
Status    |NEW      |RESOLVED
Resolution|---      |FIXED
--- Comment #13 from falaca@gmail.com --- Good news! I saw this today: http://lists.x.org/archives/xorg-driver-ati/2015-April/027345.html
So I built and installed Michel's xf86-video-ati repo and enabled TearFree in xorg.conf. The corruption is now gone - so I suppose it was simply some manifestation of tearing, but only after a suspend/resume cycle.
To test it, I installed the module, enabled TearFree, then did a suspend/resume cycle, and I couldn't observe any tearing in the global menus. So then I commented out the TearFree option in xorg.conf and restarted lightdm, and immediately started seeing the really obvious tearing in the menus like in the video I posted. As a final check, I uncommented the TearFree option and restarted lightdm again, and the tearing was gone.
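
For anyone trying the same thing, the change amounts to a Device section in xorg.conf with the TearFree option turned on. The exact section used in this report was not posted, so the following is only a sketch under assumptions: the Identifier value is a placeholder, the driver name "radeon" is assumed for the xf86-video-ati DDX, and the helper function is hypothetical. It just renders the text of such a section in Python:

```python
from typing import Mapping


def device_section(identifier: str, driver: str, options: Mapping[str, str]) -> str:
    """Render an xorg.conf Device section as text (illustrative helper only)."""
    lines = [
        'Section "Device"',
        f'    Identifier "{identifier}"',
        f'    Driver "{driver}"',
    ]
    for name, value in options.items():
        lines.append(f'    Option "{name}" "{value}"')
    lines.append("EndSection")
    return "\n".join(lines)


# "Radeon" is a placeholder identifier, not taken from the report.
snippet = device_section("Radeon", "radeon", {"TearFree": "on"})
print(snippet)
```

Commenting out the Option line in the resulting section and restarting the display manager corresponds to the "TearFree disabled" half of the test described above.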
Thanks Michel! And I hope the TearFree feature will eventually be extended to support rotated displays as well!