https://bugs.freedesktop.org/show_bug.cgi?id=55692
Priority: medium
Bug ID: 55692
Assignee: dri-devel@lists.freedesktop.org
Summary: [KMS][Cayman] Garbled screen and oops with 6950 with linus git from 20121006 (3.7-rc0)
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: serkan@hosca.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: unspecified
Component: DRM/Radeon
Product: DRI
I booted a freshly compiled linus git kernel from 20121006. gdm starts, but the screen is all black, and after a couple of seconds it is all garbage.
When I switch to VT 1 and try restarting gdm, I get the oops.
xf86-video-ati git from 20121004, mesa git from 20121004.
Arch with kernel 3.6 works fine.
--- Comment #1 from Serkan Hosca serkan@hosca.com --- Created attachment 68153 --> https://bugs.freedesktop.org/attachment.cgi?id=68153&action=edit dmesg.3.7.0-rc0
--- Comment #2 from Serkan Hosca serkan@hosca.com --- Created attachment 68154 --> https://bugs.freedesktop.org/attachment.cgi?id=68154&action=edit dmesg.3.7.0-rc0 with irqpoll
--- Comment #3 from Serkan Hosca serkan@hosca.com --- Created attachment 68155 --> https://bugs.freedesktop.org/attachment.cgi?id=68155&action=edit oops pic
--- Comment #4 from Serkan Hosca serkan@hosca.com --- Created attachment 68156 --> https://bugs.freedesktop.org/attachment.cgi?id=68156&action=edit Xorg.0.log with 3.7-rc0
--- Comment #5 from Serkan Hosca serkan@hosca.com --- Created attachment 68157 --> https://bugs.freedesktop.org/attachment.cgi?id=68157&action=edit Xorg.0.log with 3.6
--- Comment #6 from Serkan Hosca serkan@hosca.com --- Created attachment 68159 --> https://bugs.freedesktop.org/attachment.cgi?id=68159&action=edit dmesg with 3.6
--- Comment #7 from Alex Deucher agd5f@yahoo.com --- Can you bisect to locate the problematic commit?
--- Comment #8 from Serkan Hosca serkan@hosca.com --- Here it is:
2a6f1abbb48f1d90f20b8198c4894c0469468405 is the first bad commit
commit 2a6f1abbb48f1d90f20b8198c4894c0469468405
Author: Christian König deathsimple@vodafone.de
Date: Sat Aug 11 15:00:30 2012 +0200
drm/radeon: make page table updates async v2
Currently doing the update with the CP.
v2: Rebased on Jeromes bugfix. Make validity comparison more human readable.
Signed-off-by: Christian König deathsimple@vodafone.de
:040000 040000 3ed3f64bd42f5f1000ab9e957df08f53e81e09d9 c5143cbc30add8e3472366fbdb84756d9cdcd035 M drivers
Christian König deathsimple@vodafone.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |ASSIGNED
--- Comment #9 from Christian König deathsimple@vodafone.de --- Mhm, interesting. You get a GPU lockup, but not a pagefault.
Need to look deeper into it, but this looks rather strange to me.
--- Comment #10 from Christian König deathsimple@vodafone.de --- Created attachment 68515 --> https://bugs.freedesktop.org/attachment.cgi?id=68515&action=edit Possible fix.
Could you try the attached patch on top of Alex's latest drm-next-3.7 branch (git://people.freedesktop.org/~agd5f/linux)?
I'm not 100% sure that it's this problem, but it might be it.
Thanks, Christian.
Christian König deathsimple@vodafone.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #68515 is obsolete|0 |1
--- Comment #11 from Christian König deathsimple@vodafone.de --- Created attachment 68516 --> https://bugs.freedesktop.org/attachment.cgi?id=68516&action=edit Possible fix rebased on correct branch.
--- Comment #12 from Serkan Hosca serkan@hosca.com --- Yes the patch works.
--- Comment #13 from Serkan Hosca serkan@hosca.com --- (In reply to comment #12)
> Yes the patch works.
I'm sorry, I spoke too soon; same problem.
--- Comment #14 from Serkan Hosca serkan@hosca.com --- Created attachment 68519 --> https://bugs.freedesktop.org/attachment.cgi?id=68519&action=edit dmesg linus git with patch
--- Comment #15 from Serkan Hosca serkan@hosca.com --- I've tried the patch on the drm-next-3.7 branch of git://people.freedesktop.org/~agd5f/linux and it doesn't work. gdm sets the blue background image and freezes: no top bar or login dialog. I ssh in from another computer and dmesg is clean at this point. When I try to stop gdm, it displays some garbage (a mostly black screen with some vertical purple bars about 4 cm thick, about 2 cm from the top of the screen), then the GPU crash messages appear in the log, and then the console comes back.
--- Comment #16 from Serkan Hosca serkan@hosca.com --- Created attachment 68531 --> https://bugs.freedesktop.org/attachment.cgi?id=68531&action=edit dmesg with alex's drm-next-3.7 branch with patch
--- Comment #17 from Serkan Hosca serkan@hosca.com --- It works with linus git without the patch, using the Arch packages for mesa 9.0-1 and -ati 6.14.6-2.
I tried with -ati git and mesa 9 and it worked. Then I tried with mesa git and it failed. I started to bisect mesa but got the following:
$ git bisect bad
Bisecting: a merge base must be tested
[2d2f1fd164218eacf2b142bc808be1f25f66e72c] docs: Add some missing features to 9.0 release notes and GL3.txt
$ git bisect bad
The merge base 2d2f1fd164218eacf2b142bc808be1f25f66e72c is bad.
This means the bug has been fixed between 2d2f1fd164218eacf2b142bc808be1f25f66e72c and [e5fdeef1e08b55acd48dc68f0cc8fe213f2820b8].
So I ran git log --graph --oneline --all and started checking out commits between those two: everything from 2d2f1fd to de92b7a is bad, and with commit "ef557ea winsys/radeon: disable virtual memory on Cayman" it started working.
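The manual search described above amounts to looking for the first commit at which the behaviour flips from bad to good. A minimal sketch of that idea, assuming a strictly linear history and a single bad-to-good flip; `is_good` is a hypothetical test callback, the labels `c1`, `c2` and `c3` are placeholders rather than real Mesa commits, and this is not how `git bisect` itself is implemented:

```python
from typing import Callable, Optional, Sequence


def first_good_commit(commits: Sequence[str],
                      is_good: Callable[[str], bool]) -> Optional[str]:
    """Return the first commit for which is_good() holds, or None.

    Assumes commits are ordered oldest to newest and that once a commit
    is good, every later commit is good too.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            hi = mid          # the flip is at mid or earlier
        else:
            lo = mid + 1      # the flip is after mid
    return commits[lo] if lo < len(commits) else None


# Short hashes from the comment above; c1, c2, c3 are hypothetical fillers.
history = ["2d2f1fd", "c1", "c2", "de92b7a", "ef557ea", "c3"]
good = {"ef557ea", "c3"}
print(first_good_commit(history, lambda c: c in good))
```

With six commits this needs three tests instead of up to six sequential checkouts.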
--- Comment #18 from Alexandre Demers alexandre.f.demers@gmail.com --- Is VM enabled or disabled on your system? I'm experiencing a similar bug with kernel 3.7-rc1, but it is working fine with 3.6. VM is enabled on my system; I'll try to disable it when I get home to see if that helps, and I'll also try to bisect the kernel commit that screwed things up for me.
--- Comment #19 from Serkan Hosca serkan@hosca.com --- (In reply to comment #18)
> Is VM enabled or disabled on your system? I'm experiencing a similar bug with kernel 3.7-rc1, but it is working fine with 3.6. VM is enabled on my system, I'll try to disable it when I'll get home to see if that helps and I'll also try to bisect the kernel commit that screwed things for me.
I don't know, how can I check?
--- Comment #20 from Serkan Hosca serkan@hosca.com --- mesa-git is working fine on Linux 3.6, and that mesa-git doesn't have the "ef557ea winsys/radeon: disable virtual memory on Cayman" commit.
--- Comment #21 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to comment #19)
> (In reply to comment #18)
> > Is VM enabled or disabled on your system? I'm experiencing a similar bug with kernel 3.7-rc1, but it is working fine with 3.6. VM is enabled on my system, I'll try to disable it when I'll get home to see if that helps and I'll also try to bisect the kernel commit that screwed things for me.
> I don't know, how can i check?
Use "setenv" in a terminal and look for "RADEON_VA".
--- Comment #22 from Serkan Hosca serkan@hosca.com --- (In reply to comment #21)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > Is VM enabled or disabled on your system? I'm experiencing a similar bug with kernel 3.7-rc1, but it is working fine with 3.6. VM is enabled on my system, I'll try to disable it when I'll get home to see if that helps and I'll also try to bisect the kernel commit that screwed things for me.
> > I don't know, how can i check?
> Use "setenv" in a terminal and look for "RADEON_VA".
Oh, I have nothing like that in env.
--- Comment #23 from Christian König deathsimple@vodafone.de --- Created attachment 68623 --> https://bugs.freedesktop.org/attachment.cgi?id=68623&action=edit Test patch.
VM is definitely enabled, otherwise you wouldn't get that error in the first place.
OK, let's try to narrow down that bug a bit more: please apply the attached test patch and see what happens.
If the GPU hang vanishes, we indeed have a syncing issue, but not the PFP sync.
--- Comment #24 from Serkan Hosca serkan@hosca.com --- (In reply to comment #23)
> Created attachment 68623 Test patch.
> VM is definitely enabled, otherwise you won't got that error in the first place.
> Ok let's try to narrow down that bug a bit more, please apply the attached test patch and see what happens.
> If the GPU hang vanished we indeed have a syncing issue, but not the PFP sync.
With the patch, the GPU resets constantly, even without X, on both linus git and the agd5f drm-next-3.7 branch, with mesa git.
--- Comment #25 from Serkan Hosca serkan@hosca.com --- Created attachment 68655 --> https://bugs.freedesktop.org/attachment.cgi?id=68655&action=edit dmesg.3.7-rc1 with test patch
--- Comment #26 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to comment #23)
> Created attachment 68623 Test patch.
> VM is definitely enabled, otherwise you won't got that error in the first place.
> Ok let's try to narrow down that bug a bit more, please apply the attached test patch and see what happens.
> If the GPU hang vanished we indeed have a syncing issue, but not the PFP sync.
It is and it is not. What I mean concerns comment 17: "So i did a git log --graph --oneline --all and started to git checkout between those two commits, starting from 2d2f1fd to de92b7a are bad and with commit "ef557ea winsys/radeon: disable virtual memory on Cayman" it started working."
If the variable "RADEON_VA" is not set or doesn't exist, then from commit "ef557ea" onward VM gets disabled. Before that commit, VM is always enabled; from that point on, we must be careful. If we want to test after commit "ef557ea" with VM enabled, "RADEON_VA" MUST be set, otherwise VM will be disabled and that will hide the bug.
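A quick way to see whether the variable is present is to inspect the environment directly. A minimal sketch, assuming only what is stated above (that the variable is named RADEON_VA); the helper name `radeon_va_setting` is hypothetical, and which values Mesa treats as "enabled" is not covered in this thread, so the raw value is reported without interpretation:

```python
import os
from typing import Mapping, Optional


def radeon_va_setting(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the raw value of RADEON_VA, or None if it is not set."""
    if env is None:
        env = os.environ
    return env.get("RADEON_VA")


value = radeon_va_setting()
if value is None:
    print("RADEON_VA is not set in this process's environment")
else:
    print(f"RADEON_VA={value}")
```

Note that environment variables are per process: this only reports what the current shell would pass on, so the variable has to be set in the environment of whatever starts X and the session for it to reach the driver.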
--- Comment #27 from Christian König deathsimple@vodafone.de --- Well that's interesting, according to the logs you are running out of GART memory (which is 512MB in size) just 7 seconds after boot, and that is really odd.
Could you please tell me what the heck you're doing to run out of memory? Is there some kind of animated splash screen running or something like that?
I think that this problem shows up when you're tight on memory AND try to use VM at the same time. Probably we're missing some return value check or something like this.
Anyway, as Alexandre Demers pointed out simply disabling VM should also help.
In the meantime I will try to test the VM implementation under memory pressure, maybe that will yield some results.
Cheers, Christian.
--- Comment #28 from Serkan Hosca serkan@hosca.com --- (In reply to comment #27)
> Well that's interesting, according to the logs you are running out of GART memory (which is 512MB in size) just 7 seconds after boot, and that is really odd.
> Could you please tell me what the heck you're doing to run out of memory? Is there some kind of animated splash screen running or something like that?
> I think that this problem shows up when you're tight on memory AND try to use VM at the same time. Probably we're missing some return value check or something like this.
> Anyway, as Alexandre Demers pointed out simply disabling VM should also help.
> In the meantime I will try to test the VM implementation under memory pressure, maybe that will yield some results.
> Cheers, Christian.
I don't have anything graphical running during boot. I have radeon in mkinitcpio MODULES, no plymouth or anything just console, that sets up the mode then straight to gdm.
--- Comment #29 from Alexandre Demers alexandre.f.demers@gmail.com --- I haven't had time to dig into it, but just to let you know, I'm pretty much in the same situation as Serkan with a very similar config. I don't think it has to do with something using too much memory, but rather with memory not being released/allocated correctly in the first place. Otherwise, why would it work with kernel 3.6 and not 3.7 if only the kernel version differs?
I should have time to look at it tonight.
--- Comment #30 from Jerome Glisse glisse@freedesktop.org --- Well, the log from comment #25 shows out of memory, which should not happen. It looks like it's the framebuffer that tries to go into GTT, but that doesn't make sense (16M is the fb size according to the log).
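To put the two figures from the thread side by side (a 512MB GART per comment #27 and a 16M framebuffer per the log), a small worked calculation; it assumes both figures are binary megabytes, which the thread does not state explicitly:

```python
MIB = 1024 * 1024

gart_bytes = 512 * MIB   # GART size quoted in comment #27
fb_bytes = 16 * MIB      # framebuffer size quoted from the log

# How many framebuffer-sized buffers would be needed to fill the GART.
buffers_to_fill_gart = gart_bytes // fb_bytes
print(buffers_to_fill_gart)
```

A single framebuffer of that size accounts for only one of 32 such slots, which is why one framebuffer alone should not exhaust the GART.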
--- Comment #31 from Serkan Hosca serkan@hosca.com --- (In reply to comment #29)
> I haven't had time to dig it, but just to let you know I'm pretty much in the same situation as Serkan with a very similar config. I don't think it has to do with something using too much memory, but more about not releasing/attributing it correctly in the first place. Otherwise, why would it work with kernel 3.6 and not 3.7 if only kernel version is in the balance?
> I should have time to look at it tonight.
I think the GART memory issue is because of my recent update to gnome 3.6; I didn't see that with gdm 3.4. The machine also boots very fast now after the systemd upgrade: from grub to gdm I would say it's about 5~7 seconds. Also, when grub starts, the screen stays at the console login prompt with the mouse cursor available, and it takes about 2~3 seconds until gdm starts doing its fading thing to the login prompt.
I will try to revert it and test again when I get home.
--- Comment #32 from Jerome Glisse glisse@freedesktop.org --- Another explanation might be that the gdm admin queues a bunch of animations in the form of big BOs and thus fills up the GART before the first GPU lockup has a chance to be detected.
--- Comment #33 from Serkan Hosca serkan@hosca.com --- (In reply to comment #32)
> Other explanation might be that the gdm admin queue a bunch of animation in form of big bo and thus fill up the gart before the first gpu lockup had a chance to be detected.
I'll try lightdm or straight startx from console too
--- Comment #34 from Alexandre Demers alexandre.f.demers@gmail.com --- If Serkan and I are experiencing the same problem, as I suspect, I would say this is unlikely to be related to Gnome 3.6 because I'm still using 3.4 (with both kernel 3.6 and 3.7-rc1). We have the same GPU and we are not using plymouth. We are experiencing similar visual problems (I can't confirm with a remote connection for now) when moving to kernel 3.7-rcX, but not with 3.6.
I'll bisect the kernel tonight and keep you updated when I'm done.
--- Comment #35 from Alexandre Demers alexandre.f.demers@gmail.com --- I've been playing a bit (booting and restarting with kernel 3.7-rc1) and strangely, what I see is very similar to what I was observing in bug 43655. It was then merged with bug 42373. At the time, attachment 64759 was proposed and a similar patch ended up being committed that fixed bug 43655 for me (but it never fixed bug 42373 on NI CAICOS).
I'll try the workaround used at the time to see if it is really related to bug 43655 (comments 8 and 10) and I'll begin bisecting the kernel right after.
--- Comment #36 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #35)
> I've been playing a bit (booting and restarting with kernel 3.7-rc1) and strangely, what I see is very similar to what I was observing in bug 43655. It was then merged with bug 42373. At the time, attachment 64759 was proposed and a similar patch ended up being commited that fixed bug 43655 for me (but it never fixed bug 42373 on NI CAICOS).
> I'll try the workaround used at the time to see if it is really related to bug 43655 (comments 8 and 10) and I'll begin bisecting kernel right after.
I think what you really want for your caicos is this patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=6244...
--- Comment #37 from Serkan Hosca serkan@hosca.com --- Created attachment 68728 --> https://bugs.freedesktop.org/attachment.cgi?id=68728&action=edit dmesg.3.7-rc1 with testpatch with mesa-git
I removed gdm and installed slim as the login manager. I also installed cinnamon as a replacement for gnome, and it works fine the first round with linus git with the test patch and mesa git. I restarted slim and logged in again and there were some font corruptions; I restarted cinnamon and they were gone. I tried Google Maps with WebGL enabled and it was working fine.
After that I edited my .xinitrc to start gnome, restarted slim and logged in, but it failed and I got the error window saying "oh no something has gone wrong" with a log out button. I checked dmesg at that point and saw the TTM GART memory error. I switched back to cinnamon, logged in and got the same font corruptions; restarting cinnamon fixed them.
--- Comment #38 from Serkan Hosca serkan@hosca.com --- Using the same linus git kernel with test patch and mesa git, I've reverted gnome to 3.4, kept slim as login manager, logged in to gnome, it worked fine, no errors in dmesg. I stopped slim, installed gdm and started it and logged in without any errors.
I disabled slim, enabled gdm instead and rebooted the computer. The gdm login came up, I logged in and it worked fine.
--- Comment #39 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to comment #36)
> (In reply to comment #35)
> > I've been playing a bit (booting and restarting with kernel 3.7-rc1) and strangely, what I see is very similar to what I was observing in bug 43655. It was then merged with bug 42373. At the time, attachment 64759 was proposed and a similar patch ended up being commited that fixed bug 43655 for me (but it never fixed bug 42373 on NI CAICOS).
> > I'll try the workaround used at the time to see if it is really related to bug 43655 (comments 8 and 10) and I'll begin bisecting kernel right after.
> I think what you really want for your caicos is this patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=62444b7462a2b98bc78d68736c03a7c4e66ba7e2
You misunderstood me. I'm using a 6950 (not CAICOS) and it was working great with commit http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=... for my Cayman (but not for CAICOS according to reporter of bug 42373) included in kernel 3.6. What I'm saying is that the symptoms I'm now seeing with 3.7-rc1 are similar to what I was seeing at the time, but it was fixed in 3.6.
Now, about the patch you propose, it is already included in kernel 3.7-rc1 according to commit history. Since I'm experiencing bug 55692 with 3.7-rc1, the proposed patch can't be the cure. I'm bisecting right now between kernel 3.6 and 3.7-rc1. If it appears to be a different bug than 55692, I'll open a new one.
--- Comment #40 from Alexandre Demers alexandre.f.demers@gmail.com --- (In reply to comment #36)
> (In reply to comment #35)
> > I've been playing a bit (booting and restarting with kernel 3.7-rc1) and strangely, what I see is very similar to what I was observing in bug 43655. It was then merged with bug 42373. At the time, attachment 64759 was proposed and a similar patch ended up being commited that fixed bug 43655 for me (but it never fixed bug 42373 on NI CAICOS).
> > I'll try the workaround used at the time to see if it is really related to bug 43655 (comments 8 and 10) and I'll begin bisecting kernel right after.
> I think what you really want for your caicos is this patch: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=62444b7462a2b98bc78d68736c03a7c4e66ba7e2
Kernel bisected. Here is the culprit commit from what I see here:
62444b7462a2b98bc78d68736c03a7c4e66ba7e2 is the first bad commit
commit 62444b7462a2b98bc78d68736c03a7c4e66ba7e2
Author: Alex Deucher alexander.deucher@amd.com
Date: Wed Aug 15 17:18:42 2012 -0400
drm/radeon: properly handle mc_stop/mc_resume on evergreen+ (v2)
- Stop the displays from accessing the FB
- Block CPU access
- Turn off MC client access
This should fix issues some users have seen, especially with UEFI, when changing the MC FB location that result in hangs or display corruption.
v2: fix crtc enabled check noticed by Luca Tettamanti
Signed-off-by: Alex Deucher alexander.deucher@amd.com
:040000 040000 3e0d33c9b4eda29ced814fe9a863efe63e53f14c 4932561607b160734ec1eade927a9fe18c9f3f1b M drivers
So I may not be hitting the same bug as Serkan. Where should I track this faulty commit/bug? In the NI CAICOS bug or in a new one?
--- Comment #41 from Christian König deathsimple@vodafone.de --- (In reply to comment #37)
> Created attachment 68728 dmesg.3.7-rc1 with testpatch with mesa-git
> I removed gdm and installed slim as login manager. Also installed cinnamon as a replacement for gnome and it works fine the first round with linus git with the test patch and mesa git. Restarted slim and logged in again and there were some font corruptions, i restarted cinnamon and they were gone. I tried google maps with webgl enabled and it was working fine.
> After that i edited my .xinitrc to startup gnome, restarted slim and logged in but it failed and got the error window saying oh no something has gone wrong and a log out button. I checked dmesg at that point and saw the ttm gart memory error. i switched back to cinnamon logged in and got the same font corruptions, restarting cinnamon fixed them.
Thanks a lot for your additional testing. As I suspected, we are really facing two problems here:
1. The new gnome/gdm versions seem to trigger an out-of-memory situation in the GART memory area. That's probably because of some miscalculation or memory leak or something like this, and should be handled as a separate bug.
BTW: You can take a look at the current memory allocations with:
sudo cat /sys/kernel/debug/dri/0/radeon_gtt_mm
and
sudo cat /sys/kernel/debug/dri/0/radeon_vram_mm
2. Properly updating the page table asynchronously somehow fails under high memory pressure.
I will try to look into problem 2 first, since that got added with my patch. But problem number 1 is equally bad.
I don't think we just spool up a lot of drawing operations like Jerome suspected, because in that case TTM would just block waiting for previous render operations to complete. It looks more like we are submitting a single draw operation with multiple ~16MB chunks of memory that is so big that it just won't fit into the GART memory altogether.
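The two debugfs files mentioned above can also be captured from a script, for example to attach both dumps to a report. A minimal sketch, assuming debugfs is mounted at /sys/kernel/debug and that the card is DRI device 0 as in the commands above; the helper name `read_mm_dump` is hypothetical, and the dump contents are passed through unparsed because their format is not described in this thread:

```python
from pathlib import Path
from typing import Optional

# Directory taken from the commands quoted above; device index 0 is an assumption.
DEBUGFS_DRI = Path("/sys/kernel/debug/dri/0")


def read_mm_dump(name: str, base: Path = DEBUGFS_DRI) -> Optional[str]:
    """Return the raw text of a debugfs file, or None if it cannot be read."""
    try:
        return (base / name).read_text()
    except OSError:
        # Covers a missing file as well as insufficient permissions.
        return None


for name in ("radeon_gtt_mm", "radeon_vram_mm"):
    dump = read_mm_dump(name)
    if dump is None:
        print(f"{name}: not readable (root and a mounted debugfs are needed)")
    else:
        print(f"--- {name} ---")
        print(dump, end="")
```

Like the `sudo cat` commands, this normally has to run as root.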
--- Comment #42 from Christian König deathsimple@vodafone.de --- (In reply to comment #40) [SNIP]
> kernel bisected. Here is the culprit commit from what I see here: 62444b7462a2b98bc78d68736c03a7c4e66ba7e2 is the first bad commit commit 62444b7462a2b98bc78d68736c03a7c4e66ba7e2 Author: Alex Deucher alexander.deucher@amd.com Date: Wed Aug 15 17:18:42 2012 -0400
> drm/radeon: properly handle mc_stop/mc_resume on evergreen+ (v2) - Stop the displays from accessing the FB - Block CPU access - Turn off MC client access This should fix issues some users have seen, especially with UEFI, when changing the MC FB location that result in hangs or display corruption. v2: fix crtc enabled check noticed by Luca Tettamanti Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> :040000 040000 3e0d33c9b4eda29ced814fe9a863efe63e53f14c 4932561607b160734ec1eade927a9fe18c9f3f1b M drivers
> So it may not be the same bug I'm hitting as Serkan is. Where should I track this faulty commit/bug? In the NI CAICOS bug or in a new one?
That indeed looks like a separate bug to me, so I suggest opening a new bug.
--- Comment #43 from Christian König deathsimple@vodafone.de --- Good news! I figured out what it is (the crash not the memory problem) and can reproduce it.
A patch fixing this shouldn't be too much of a problem any more, but I don't think I will have time to fix it before Monday.
So please be patient for a couple more days.
--- Comment #44 from Serkan Hosca serkan@hosca.com --- (In reply to comment #43)
> Good news! I figured out what it is (the crash not the memory problem) and can reproduce it.
> A patch fixing this shouldn't be to much of a problem any more, but I don't think I will have time to fix it before Monday.
> So please be patient for a couple of more days.
That's cool. I found out what triggers the GART error. I had gtk-redshift in session startup. After removing that, the TTM error is gone. It redshifts the screen colors so that they are easy on the eyes, and when it starts it applies the redshift gradually.
Also, I have been playing around with the RADEON_VA variable but I can't trigger the GPU stall anymore; I get some graphical corruption and a couple of these instead:
[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!
After a shell restart, the glitches go away.
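For reading the numeric code in that message: kernel code conventionally reports failures as negative errno values, and 12 is ENOMEM on Linux, which would be consistent with the out-of-memory situation discussed earlier in this thread. A minimal sketch that decodes such a line; the helper name `decode_relocation_error` and the regular expression are illustrative assumptions, not part of any driver tooling:

```python
import errno
import os
import re
from typing import Optional, Tuple

# Matches the message text quoted above and captures the numeric code.
RELOC_ERROR = re.compile(r"Failed to parse relocation (-?\d+)!")


def decode_relocation_error(line: str) -> Optional[Tuple[str, str]]:
    """Return (errno name, description) for a relocation error line, else None."""
    match = RELOC_ERROR.search(line)
    if match is None:
        return None
    code = abs(int(match.group(1)))  # the kernel reports -errno
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


line = "[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -12!"
print(decode_relocation_error(line))
```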
--- Comment #45 from Serkan Hosca serkan@hosca.com ---
> Thats cool. I found out what triggers the gart error. I had gtk-redshift on session start up. After removing that the ttm error is gone. It redshifts the screen colors so that it is easy on the eyes and when its started it starts the redshifting gradually.
Scratch that: I removed redshift but the GART error happened again. It's not the gdm startup though; it happens during gnome session startup.
Christian König deathsimple@vodafone.de changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #68516 is obsolete|0 |1
Attachment #68623 is obsolete|0 |1
--- Comment #46 from Christian König deathsimple@vodafone.de --- Created attachment 68906 --> https://bugs.freedesktop.org/attachment.cgi?id=68906&action=edit Possible fix.
Ok, please try the attached patch. It should fix the issue with the original "async page table updates patch".
Please note that Alex's current drm-fixes-3.7 branch already contains another patch that also masks this problem, so please test with the original drm-next-3.7 branch.
I've submitted a series of patches that should fix and clean up the code.
--- Comment #47 from Serkan Hosca serkan@hosca.com --- (In reply to comment #46)
> Created attachment 68906 Possible fix.
> Ok, please try the attached patch. It should fix the issue with the original "async page table updates patch".
> Please note that Alex current drm-fixes-3.7 branch already contains another patch that is also masquerading this problem, so please test with the original drm-next-3.7 branch.
> I've submitted a series of patches that should fix and cleanup the code.
Yes, the patch works. I checked out v3.6, merged Alex's drm-next-3.7 branch on top, and tested with mesa-git and ati-git. Because of the gnome update I don't get the exact same dmesg errors, but the result is the same: the GPU just stalls when you try to log in.
After the patch I am able to log in. I still get a couple of relocation errors and some glitches, which disappear after restarting gnome shell.
--- Comment #48 from Serkan Hosca serkan@hosca.com --- Created attachment 68932 --> https://bugs.freedesktop.org/attachment.cgi?id=68932&action=edit dmesg-3.6+drm-next-3.7
--- Comment #49 from Serkan Hosca serkan@hosca.com --- Created attachment 68933 --> https://bugs.freedesktop.org/attachment.cgi?id=68933&action=edit dmesg-3.6+drm-next-3.7+patch
--- Comment #50 from Alex Deucher agd5f@yahoo.com --- You'll probably want the updated version of the patch here: http://lists.freedesktop.org/archives/dri-devel/2012-October/029292.html
--- Comment #51 from Alexandre Demers alexandre.f.demers@gmail.com --- Since the patch was submitted and applied on kernel 3.7, should this bug be closed?
Serkan Hosca serkan@hosca.com changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #52 from Serkan Hosca serkan@hosca.com --- Yes this is fixed.
Alexandre Demers alexandre.f.demers@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also| |https://bugs.freedesktop.org/show_bug.cgi?id=57567
Alexandre Demers alexandre.f.demers@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
See Also|https://bugs.freedesktop.org/show_bug.cgi?id=57567 |