https://bugs.freedesktop.org/show_bug.cgi?id=79980
Priority: medium
Bug ID: 79980
Assignee: dri-devel@lists.freedesktop.org
Summary: Random radeonsi crashes
Severity: normal
Classification: Unclassified
OS: All
Reporter: darkbasic@linuxsystems.it
Hardware: Other
Status: NEW
Version: XOrg CVS
Component: DRM/Radeon
Product: DRI
Created attachment 100978 --> https://bugs.freedesktop.org/attachment.cgi?id=100978&action=edit dmesg
Kernel 3.15.0-rc8 + PTE patches
--- Comment #1 from Alex Deucher agd5f@yahoo.com --- What specific app were you using that caused the GPU hang? Also, if this is a regression, can you bisect?
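For readers unfamiliar with the bisect request above, here is a minimal sketch of the search that `git bisect` automates. The commit names and the `is_bad` predicate are hypothetical stand-ins, not real kernel commits; with an intermittent hang like this one, each "good" verdict in practice needs a long soak test, which is why bisecting is expensive here.

```python
from typing import Callable, Sequence, Tuple


def first_bad_commit(
    commits: Sequence[str], is_bad: Callable[[str], bool]
) -> Tuple[str, int]:
    """Binary-search for the first bad commit.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good and commits[-1] is known bad. Returns the first bad commit and
    the number of build-and-test rounds that were needed.
    """
    lo, hi = 0, len(commits) - 1  # lo is known good, hi is known bad
    rounds = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        rounds += 1
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi], rounds


# Hypothetical history: 1001 commits, regression introduced at "c600".
history = [f"c{i}" for i in range(1001)]
culprit, rounds = first_bad_commit(history, lambda c: int(c[1:]) >= 600)
print(culprit, rounds)
```

Each round halves the candidate range, so about a thousand commits need roughly ten test kernels.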
--- Comment #2 from darkbasic darkbasic@linuxsystems.it --- No specific app (not counting KDE desktop effects). If the problem is the kernel it's a regression because I didn't have any problem with -rc5. Unfortunately it's not easy to trigger the crash so there is no chance to bisect given how busy I actually am.
--- Comment #3 from Alex Deucher agd5f@yahoo.com --- Does it still happen if you drop the PTE patches?
--- Comment #4 from darkbasic darkbasic@linuxsystems.it --- Didn't try, but PTE worked flawlessly on -rc5.
--- Comment #5 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #3)
Does it still happen if you drop the PTE patches?
Is that stop poisoning the GART TLB?
Whatever - it could be a separate issue, but I am now getting sort of random crashes on your drm-next-3.16 with my pitcairn.
I am stable on deathsimple 3.15 fixes + hdmi patches.
--- Comment #6 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #5)
(In reply to comment #3)
Does it still happen if you drop the PTE patches?
Is that stop poisoning the GART TLB?
Ok ignore that :-) I didn't spot the rs600
--- Comment #7 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #6)
(In reply to comment #5)
(In reply to comment #3)
Does it still happen if you drop the PTE patches?
Is that stop poisoning the GART TLB?
Ok ignore that :-) I didn't spot the rs600
It applies to all asics from rs600 forward.
--- Comment #8 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #7)
(In reply to comment #6)
(In reply to comment #5)
(In reply to comment #3)
Does it still happen if you drop the PTE patches?
Is that stop poisoning the GART TLB?
Ok ignore that :-) I didn't spot the rs600
It applies to all asics from rs600 forward.
Ahh, in the meantime I've now built with
optimize SI VM handling + use lower_32_bits where appropriate reverted - the latter just so I could revert the former.
I'll see if I am stable over the next couple of days like this.
--- Comment #9 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #8)
(In reply to comment #7)
(In reply to comment #6)
(In reply to comment #5)
(In reply to comment #3)
Does it still happen if you drop the PTE patches?
Is that stop poisoning the GART TLB?
Ok ignore that :-) I didn't spot the rs600
It applies to all asics from rs600 forward.
Ahh, in the meantime I've now built with
optimize SI VM handling + use lower_32_bits where appropriate reverted - the latter just so I could revert the former.
I'll see if I am stable over the next couple of days like this.
I am stable so far with the above reverted.
--- Comment #10 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #9)
(In reply to comment #8)
optimize SI VM handling + use lower_32_bits where appropriate reverted - the latter just so I could revert the former.
I'll see if I am stable over the next couple of days like this.
I am stable so far with the above reverted.
Spoke too soon, I just locked. Wasn't quite the same as before in that screen stayed on displaying normal rather than off/on + junk.
Wasn't doing anything GPU related (accepting I always am with glamor), was doing a big compile, so memory pressure I guess.
Also just add to the mix, after thinking I was stable yesterday I upgraded gcc and updated llvm and mesa so they were different in several ways, though I haven't rebuilt kernel.
--- Comment #11 from darkbasic darkbasic@linuxsystems.it --- Created attachment 101226 --> https://bugs.freedesktop.org/attachment.cgi?id=101226&action=edit gray screen
This is what I often get, I was simply syncing my portage tree while it happened.
agapito tcxjy@vomoto.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
Priority |medium  |high
--- Comment #12 from agapito tcxjy@vomoto.com --- First of all: excuse my bad english.
I have the same problem with my HD 7950: when using Hangouts, playing Left 4 Dead 2, or watching a Flash video, my screen goes crazy with vertical lines or grey fog. It started when I upgraded to the testing repo (Arch Linux) and downloaded the newest linux-firmware package, which includes TAHITI_mc2.bin. I have hit this bug on kernels 3.14 and 3.15. For now I am using kernel 3.15.1 with the old Tahiti firmware, and it seems stable.
--- Comment #13 from darkbasic darkbasic@linuxsystems.it ---
Wasn't doing anything GPU related (accepting I always am with glamor), was doing a big compile, so memory pressure I guess.
You're right, I was compiling too when it crashed. Nothing GPU related anyway.
--- Comment #14 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #10)
(In reply to comment #9)
(In reply to comment #8)
optimize SI VM handling + use lower_32_bits where appropriate reverted - the latter just so I could revert the former.
I'll see if I am stable over the next couple of days like this.
I am stable so far with the above reverted.
Spoke too soon, I just locked. Wasn't quite the same as before in that screen stayed on displaying normal rather than off/on + junk.
Wasn't doing anything GPU related (accepting I always am with glamor), was doing a big compile, so memory pressure I guess.
Also just add to the mix, after thinking I was stable yesterday I upgraded gcc and updated llvm and mesa so they were different in several ways, though I haven't rebuilt kernel.
I got another lock last thing, this one was "typical" happened when closing seamonkey, this is the third time closing it has locked. Of course it doesn't do it if I try. I must be using gl someway/sometimes, as the last thing I see is the xterm from where it was started and there is a mesa message about default setting for s3tc being overridden by env (and that's not by me - I don't have drirc anywhere).
I think this is going to be a pain to find - I just tried reset --hard onto
add large PTE support for NI, SI and CIK v5
that failed to resume from mem 1st try, though it wasn't locked. just corrupt (mouse cursor large block of junk, fluxbox desktop black, but toolbar still visible) so maybe a different issue fixed by a later commit. I could SysRq - the log was normal.
--- Comment #15 from agapito tcxjy@vomoto.com --- This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
--- Comment #16 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #15)
This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
Well I haven't tried without it, but I have so far failed to reproduce this bug on a slightly older 3.15 drm fixes also using TAHITI_mc2.bin.
--- Comment #17 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #15)
This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
Did you test a new kernel with the old firmware or an old kernel without the new firmware patch? It could be some other change if you did the latter.
--- Comment #18 from Alex Deucher agd5f@yahoo.com --- If it's the same problem Marek is seeing it's probably this: 6d2f294 - drm/radeon: use normal BOs for the page tables v4
--- Comment #19 from agapito tcxjy@vomoto.com --- (In reply to comment #17)
(In reply to comment #15)
This bug is caused by TAHITI_mc2.bin firmware. The old firmware works good.
Did you test a new kernel with the old firmware or an old kernel without the new firmware patch? It could be some other change if you did the latter.
3.14 or 3.15 + New firmware = Crashes
3.14 or 3.15 + Old firmware = No problems!
--- Comment #20 from agapito tcxjy@vomoto.com --- OK forget it. It's not a firmware related problem. I had this bug with old firmware on kernel 3.15.1. I resized a flash video window (vdpau accelerated) and lost my screen.
--- Comment #21 from agapito tcxjy@vomoto.com --- It happened again, this time with 3.16-rc2, while resizing a Firefox window with Flash content (vdpau on).
--- Comment #22 from darkbasic darkbasic@linuxsystems.it --- It happened on 3.16-rc1 too while doing a video call with skype.
--- Comment #23 from agapito tcxjy@vomoto.com --- Kernel 3.10.44 is also affected! I am using my Intel graphics card for now. I hit this bug every 15 minutes when watching Flash content.
My graphic card is HD 7950 using HDMI output.
--- Comment #24 from agapito tcxjy@vomoto.com --- This bug is still present in 3.16 rc4, and 3.15.4.
Aaron B aaronbottegal@gmail.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |aaronbottegal@gmail.com
--- Comment #25 from Aaron B aaronbottegal@gmail.com --- *** Bug 80141 has been marked as a duplicate of this bug. ***
--- Comment #26 from Aaron B aaronbottegal@gmail.com --- (In reply to comment #24)
This bug is still present in 3.16 rc4, and 3.15.4.
This sounds exactly like the bug I talk about in Bug #80141. I'll mark my bug as duplicate of it.
Could Mesa commit c8011c1885003b79c9f0c6530e46ae6cb0e69575 have anything to do with what made 370184e813b25b463ad3dc9ca814231c98b95864 need to happen? Think that could be re-enabled for our GPU's now or not?
Also, would the geometry shaders have any effect on our GPU's as Mesa just patched a couple leaks on those.
These 2 fixes look like good ones for this problem, as this problem was very random and sporadic, and that is the definition of a good, small leak.
--- Comment #27 from darkbasic darkbasic@linuxsystems.it --- This bug is so annoying that I switched to Catalyst :-(
--- Comment #28 from Aaron B aaronbottegal@gmail.com --- *** Bug 80141 has been marked as a duplicate of this bug. ***
--- Comment #29 from agapito tcxjy@vomoto.com --- I can reproduce this bug using the mesa-git repo from Arch Linux under kernel-lts 3.14.12. The Unigine Valley engine ALWAYS crashes my display when the 3D scene starts. If I use normal Mesa (10.2) I can run Unigine Valley OK! But the bug is always present. Like I said in my previous posts, watching Flash content increases the chances that the bug appears.
--- Comment #30 from Marek Olšák maraeo@gmail.com --- (In reply to comment #29)
I can reproduce this bug using the mesa-git repo from Arch Linux under kernel-lts 3.14.12. The Unigine Valley engine ALWAYS crashes my display when the 3D scene starts. If I use normal Mesa (10.2) I can run Unigine Valley OK! But the bug is always present. Like I said in my previous posts, watching Flash content increases the chances that the bug appears.
You're talking about a different bug. See: https://bugs.freedesktop.org/show_bug.cgi?id=79659
--- Comment #31 from Lukas Kahnert openproggerfreak@gmail.com --- On my machine I have no Flash player installed, but watching HTML5 videos (qtwebkit with gstreamer) increases the chance of hitting this bug too. Unigine Valley always crashes with a black screen and a GPU hang. Unigine Heaven works but with a white screen (the FPS counter is visible), though I'm not sure whether that has anything to do with this bug. Using Linux 3.16-rc4 and mesa-git.
--- Comment #32 from Christian König deathsimple@vodafone.de --- Created attachment 102784 --> https://bugs.freedesktop.org/attachment.cgi?id=102784&action=edit Possible fix
Please try if the attached patch (based on 3.15.5) fixes the stability issues with 3.15 and 3.16.
Thanks in advance, Christian.
--- Comment #33 from Lukas Kahnert openproggerfreak@gmail.com --- compile error on 3.16-rc5 with this patch
drivers/gpu/drm/radeon/radeon_gem.c: In function 'radeon_gem_object_close':
drivers/gpu/drm/radeon/radeon_gem.c:183:10: error: 'struct radeon_cs_reloc' has no member named 'domain'
  bo_reloc.domain = RADEON_GEM_DOMAIN_VRAM;
          ^
drivers/gpu/drm/radeon/radeon_gem.c:184:10: error: 'struct radeon_cs_reloc' has no member named 'alt_domain'
  bo_reloc.alt_domain = RADEON_GEM_DOMAIN_VRAM;
          ^
scripts/Makefile.build:257: recipe for target 'drivers/gpu/drm/radeon/radeon_gem.o' failed
Christian König deathsimple@vodafone.de changed:
What               |Removed |Added
----------------------------------------------------------------------------
Attachment #102784 |0       |1
is obsolete        |        |
--- Comment #34 from Christian König deathsimple@vodafone.de --- Created attachment 102867 --> https://bugs.freedesktop.org/attachment.cgi?id=102867&action=edit Possible fix v2.
As noted in the comment the last patch was for 3.15.
Here is an updated patch based on alex drm-fixes-3.16-wip branch.
--- Comment #35 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #34)
Created attachment 102867 Possible fix v2.
As noted in the comment the last patch was for 3.15.
Here is an updated patch based on alex drm-fixes-3.16-wip branch.
Tried this on alex drm-fixes-3.16-wip with my R9 270X and it didn't go well.
When doing nothing I am getting errors (attached), later I was transcoding some vids so I guess memory pressure, and then I managed to lock the screen by trying to use uvd. I could SysRq OK - so different for me from before in that respect.
--- Comment #36 from Andy Furniss adf.lists@gmail.com --- Created attachment 102925 --> https://bugs.freedesktop.org/attachment.cgi?id=102925&action=edit Kernel errors with Possible fix v2
--- Comment #37 from Aaron B aaronbottegal@gmail.com --- (In reply to comment #34)
Created attachment 102867 Possible fix v2.
As noted in the comment the last patch was for 3.15.
Here is an updated patch based on alex drm-fixes-3.16-wip branch.
Applied to 3.16-rc5 and after about 12 hours, most of which I've been watching youtube videos, all is well here. Also gamed fine, lots of screen switches and movement. I'll report any crashes/errors I encounter, though. But this is much more stable for me, no problems at all.
--- Comment #38 from Michel Dänzer michel@daenzer.net --- Created attachment 102960 --> https://bugs.freedesktop.org/attachment.cgi?id=102960&action=edit Fixups for Christian's patch
This patch on top of Christian's patch has been working very well for me.
--- Comment #39 from Aaron B aaronbottegal@gmail.com --- (In reply to comment #38)
Created attachment 102960 Fixups for Christian's patch
This patch on top of Christian's patch has been working very well for me.
Both of these patches together on top of a 3.16-rc5 kernel make an unbootable kernel for me. It has a null pointer dereference somewhere very early along the lines of loading and setting up everything.
Jul 17 01:16:17 aaron-desktop kernel: [ 4.084761] Switched to clocksource tsc
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000793] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000822] IP: [<ffffffffc055764d>] radeon_vm_bo_set_addr+0x23d/0x440 [radeon]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000879] PGD 41ee4a067 PUD 41e4de067 PMD 0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000897] Oops: 0000 [#1] SMP
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000911] Modules linked in: hid_generic usbhid hid uas usb_storage mxm_wmi radeon i2c_algo_bit psmouse ttm drm_kms_helper r8169 drm mii ahci libahci ohci_pci wmi
Jul 17 01:16:17 aaron-desktop kernel: [ 5.000979] CPU: 5 PID: 280 Comm: plymouthd Not tainted 3.16.0-rc5-rc99-RadeonSIFixV2 #1
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001003] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A99FX PRO R2.0, BIOS 2301 01/06/2014
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001032] task: ffff88041d905b20 ti: ffff88041f710000 task.ti: ffff88041f710000
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001053] RIP: 0010:[<ffffffffc055764d>] [<ffffffffc055764d>] radeon_vm_bo_set_addr+0x23d/0x440 [radeon]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001093] RSP: 0018:ffff88041f713b38 EFLAGS: 00010203
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001109] RAX: ffff88041d9a0000 RBX: 0000000000000002 RCX: ffff88041e719560
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001129] RDX: ffff880424834400 RSI: 0000000000000003 RDI: ffff8800367ca438
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001150] RBP: ffff88041f713b80 R08: 0000000000000000 R09: ffff88041e718150
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001170] R10: 0000000000000000 R11: ffffffffc0503cc5 R12: 0000000000000000
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001190] R13: 0000000000000002 R14: 0000000000000001 R15: ffff88041f9494e0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001211] FS: 00007f5419413740(0000) GS:ffff88043ed40000(0000) knlGS:0000000000000000
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001234] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001251] CR2: 0000000000000078 CR3: 000000041ef62000 CR4: 00000000000407e0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001271] Stack:
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001278] ffff88041f713b50 ffff88041e718000 ffff8800367ca438 ffff880424834400
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001304] 0000000000000000 ffff88041e718000 ffff88041f998c00 ffff8800367ca400
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001338] ffff88041f95c800 ffff88041f713bc8 ffffffffc048b693 ffff880424f30800
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001363] Call Trace:
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001379] [<ffffffffc048b693>] radeon_driver_open_kms+0x133/0x230 [radeon]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001408] [<ffffffffc03c8367>] drm_open+0x1b7/0x4d0 [drm]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001428] [<ffffffffc03c8725>] drm_stub_open+0xa5/0x100 [drm]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001448] [<ffffffff811d416f>] chrdev_open+0x9f/0x1d0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001465] [<ffffffff811cceff>] do_dentry_open+0x1ff/0x350
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001482] [<ffffffff811da7f2>] ? __inode_permission+0x52/0xc0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001500] [<ffffffff811d40d0>] ? cdev_put+0x30/0x30
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001516] [<ffffffff811cd221>] finish_open+0x31/0x40
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001532] [<ffffffff811de99a>] do_last+0xa7a/0x1210
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001548] [<ffffffff811dad21>] ? link_path_walk+0x71/0x870
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001566] [<ffffffff811b3d56>] ? kmem_cache_alloc_trace+0x1c6/0x1f0
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001586] [<ffffffff81342383>] ? apparmor_file_alloc_security+0x23/0x40
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001606] [<ffffffff811df1eb>] path_openat+0xbb/0x670
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001622] [<ffffffff811da789>] ? putname+0x29/0x40
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001637] [<ffffffff811dfebf>] ? user_path_at_empty+0x5f/0x90
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001655] [<ffffffff811dffaa>] do_filp_open+0x3a/0x90
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001672] [<ffffffff811ecb17>] ? __alloc_fd+0xa7/0x130
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001688] [<ffffffff811ceaa8>] do_sys_open+0x128/0x220
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001705] [<ffffffff81021b15>] ? syscall_trace_enter+0x145/0x250
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001724] [<ffffffff811cebbe>] SyS_open+0x1e/0x20
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001739] [<ffffffff817725ff>] tracesys+0xe1/0xe6
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001754] Code: f4 ff 4c 89 ef 41 89 dd e8 a1 91 21 c1 4d 39 ee 0f 83 4f ff ff ff 0f 1f 84 00 00 00 00 00 48 8b 7d c8 e8 47 8f 21 c1 4d 8b 67 48 <41> 8b 44 24 78 4d 8d 6c 24 48 85 c0 0f 84 8b 01 00 00 49 8b bc
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001891] RIP [<ffffffffc055764d>] radeon_vm_bo_set_addr+0x23d/0x440 [radeon]
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001924] RSP <ffff88041f713b38>
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001934] CR2: 0000000000000078
Jul 17 01:16:17 aaron-desktop kernel: [ 5.001945] ---[ end trace e59240e65015cb90 ]---
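A small sketch of how the faulting location can be pulled out of an oops like the one above. The regular expression and the `FaultSite` type are illustrative assumptions written against the `IP:` line format seen in this log, not a general oops parser. In this log the faulting address (CR2) is 0x78 and R12 is zero, which is consistent with a field being read at offset 0x78 from a NULL pointer.

```python
import re
from typing import NamedTuple, Optional


class FaultSite(NamedTuple):
    address: int           # instruction pointer at the time of the fault
    function: str          # symbol the address falls in
    offset: int            # byte offset into that function
    size: int              # total size of the function in bytes
    module: Optional[str]  # kernel module name, if any


# Matches e.g. "IP: [<ffffffffc055764d>] radeon_vm_bo_set_addr+0x23d/0x440 [radeon]"
IP_RE = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]+)>\] "
    r"(?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<mod>\w+)\])?"
)


def parse_fault_site(log_text: str) -> Optional[FaultSite]:
    """Return the first fault site found in the log text, or None."""
    m = IP_RE.search(log_text)
    if m is None:
        return None
    return FaultSite(
        address=int(m.group("addr"), 16),
        function=m.group("func"),
        offset=int(m.group("off"), 16),
        size=int(m.group("size"), 16),
        module=m.group("mod"),
    )


line = ("Jul 17 01:16:17 aaron-desktop kernel: [ 5.000822] IP: "
        "[<ffffffffc055764d>] radeon_vm_bo_set_addr+0x23d/0x440 [radeon]")
site = parse_fault_site(line)
print(site.function, hex(site.offset), site.module)
```

The function name plus offset is what a developer would typically look up against a kernel built with debug information to find the source line.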
Michel Dänzer michel@daenzer.net changed:
What               |Removed |Added
----------------------------------------------------------------------------
Attachment #102960 |0       |1
is obsolete        |        |
--- Comment #40 from Michel Dänzer michel@daenzer.net --- Created attachment 102966 --> https://bugs.freedesktop.org/attachment.cgi?id=102966&action=edit Fixups for Christian's patch v2
v2: Fix use-after-free and unprotected list manipulations
--- Comment #41 from Aaron B aaronbottegal@gmail.com --- This is with only the 3.16-rc5 patch without fix-ups, which was working okay. But when I clicked on the top-right of facebook to open up an event, it went out just like old times. But if you see from the time, it had a good run this time for sure. Youtube/Video players in general never crashed it once. I have the fixed kernel building now so soon I'll jump on the fixed one, it looks like code related to this has changed (Error message output a little different.) so I'll try it out.
Christian König deathsimple@vodafone.de changed:
What               |Removed |Added
----------------------------------------------------------------------------
Attachment #102867 |0       |1
is obsolete        |        |
Attachment #102925 |0       |1
is obsolete        |        |
Attachment #102966 |0       |1
is obsolete        |        |
--- Comment #42 from Christian König deathsimple@vodafone.de --- Created attachment 102992 --> https://bugs.freedesktop.org/attachment.cgi?id=102992&action=edit Possible fix v3.
Updated and largely simplified patch.
I'm running the third piglit test with it now and so far the system seems to be stable.
--- Comment #43 from Aaron B aaronbottegal@gmail.com --- Built, testing. Played youtube videos, chrome, multiple tabs, all while playing Portal 2 and not a single hiccup on the output, outside of the casual VBlank update problems I see you guys working on for 3-17. I did get a crash on the old patch, as said, but we'll give it more time and I'll post any negative results. For now, this is much more stable than before, though.
--- Comment #44 from Aaron B aaronbottegal@gmail.com --- (In reply to comment #42)
Created attachment 102992 Possible fix v3.
Updated and largely simplified patch.
I'm running the third piglit test with it now and so far the system seems to be stable.
Just had a crash happen, was opening a Yahoo page. Very normal to crash on it TBH from the old version too, but it shows that this patch may only delay the problem, not be an actual fix. I don't really know what to say about it, same old same old. :/
--- Comment #45 from Aaron B aaronbottegal@gmail.com --- Also seems, by looking at my xorg log, many problems are happening along the way.
--- Comment #46 from Lukas Kahnert openproggerfreak@gmail.com --- I tried to run piglit with all tests, and every time (I tried 3 times) I get a black screen and the system hangs. I don't know piglit well enough to say which test hangs the GPU. It looks like the same bug that also appears randomly when watching videos/Flash. I used the latest patch (v3).
--- Comment #47 from Aaron B aaronbottegal@gmail.com --- Created attachment 103013 --> https://bugs.freedesktop.org/attachment.cgi?id=103013&action=edit Crash on V3
Just posting another crash here. This one was caused, possibly by youtube as it was on in the background, but clicking on a facebook chat, it just lost it. So, there it is.
Aaron B aaronbottegal@gmail.com changed:
What               |Removed |Added
----------------------------------------------------------------------------
Attachment #103013 |0       |1
is obsolete        |        |
--- Comment #48 from Aaron B aaronbottegal@gmail.com --- Created attachment 103014 --> https://bugs.freedesktop.org/attachment.cgi?id=103014&action=edit Crash on V3.
Pulled the wrong part out of the log, this is the correct crash. Times happened to be identical, forgot it was in MT.
Aaron B aaronbottegal@gmail.com changed:
What               |Removed    |Added
----------------------------------------------------------------------------
Attachment #103014 |text/plain |application/x-7z-compressed
mime type          |           |
--- Comment #49 from agapito tcxjy@vomoto.com --- I am now using a vanilla 3.16-rc5 kernel, xserver 1.16, llvm 3.4.2 and the latest mesa-git code. I had another crash, but this time I didn't lose my screen. I could run "dmesg" and saw a lot of:
radeon 0000:01:00.0: failed to get a new IB (-35)
Then I could REISUB (SysRq) my computer.
--------------------------------------------------------------------------
Now when I try to run unigine-valley I don't lose my screen like before, but I get this error:
LLVM ERROR: Cannot select: 0x6b00970: i32 = truncate 0x6aec240 [ORD=19] [ID=146]
  0x6aec240: i128 = srl 0x6afc660, 0x6b00470 [ORD=19] [ID=126]
    0x6afc660: i128,ch = load 0x6a31ff8, 0x6ae57c0, 0x6ae6bd0<LD16[%30](tbaa=!"const")> [ORD=19] [ID=116]
      0x6ae57c0: i64,ch = CopyFromReg 0x6a31ff8, 0x6ae56c0 [ID=108]
        0x6ae56c0: i64 = Register %vreg100 [ID=2]
      0x6ae6bd0: i64 = undef [ID=8]
    0x6b00470: i32 = Constant<96> [ID=103]
In function: main
--- Comment #50 from agapito tcxjy@vomoto.com --- mesa-git compiled against llvm-svn = unigine-valley working again :)
--- Comment #51 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #42)
Created attachment 102992 Possible fix v3.
Updated and largely simplified patch.
I'm running the third piglit test with it now and so far the system seems to be stable.
Been running (not piglit) for a few days now without crashing.
I see it and a couple more fixes are now in agd5f drm-fixes-3.16-wip, so will try that.
--- Comment #52 from Aaron B aaronbottegal@gmail.com --- I can confirm that the bug still isn't fixed. But, it does seem to be much more delayed now, though. I can run youtube for a while, but now Chromium seems to crash it more often in general. Been running for a few days, have had at least 4 crashes. All with about the same fail logs as before. But as side, there's a few VM and IB fixes extra in the 3.16-wip and 3.16 branch, so I'll wait until those to care about this problem more. :)
--- Comment #53 from Michel Dänzer michel@daenzer.net --- (In reply to comment #52)
I can run youtube for a while, but now Chromium seems to crash it more often in general. Been running for a few days, have had at least 4 crashes. All with about the same fail logs as before.
Your Youtube / Chromium issue is probably separate and should be tracked somewhere else. This report is about a stability regression in 3.15/6-rc kernels, which seems to be addressed by Christian's fixes.
--- Comment #54 from Christian König deathsimple@vodafone.de --- (In reply to comment #53)
(In reply to comment #52)
I can run youtube for a while, but now Chromium seems to crash it more often in general. Been running for a few days, have had at least 4 crashes. All with about the same fail logs as before.
Your Youtube / Chromium issue is probably separate and should be tracked somewhere else. This report is about a stability regression in 3.15/6-rc kernels, which seems to be addressed by Christian's fixes.
Yeah, agree. Your log doesn't show any VM faults at all.
That looks more like a userspace problem triggered by some Chromium operations.
--- Comment #55 from darkbasic darkbasic@linuxsystems.it --- Finally, it's time to purge Catalyst once again :)
--- Comment #56 from Aaron B aaronbottegal@gmail.com --- (In reply to comment #54)
(In reply to comment #53)
(In reply to comment #52)
I can run youtube for a while, but now Chromium seems to crash it more often in general. Been running for a few days, have had at least 4 crashes. All with about the same fail logs as before.
Your Youtube / Chromium issue is probably separate and should be tracked somewhere else. This report is about a stability regression in 3.15/6-rc kernels, which seems to be addressed by Christian's fixes.
Yeah, agree. Your log doesn't show any VM faults at all.
That looks more like a userspace problem triggered by some Chromium operations.
Any idea where I should file the bug report? Would it be the Cinnamon back end, or glamor?
--- Comment #57 from Michel Dänzer michel@daenzer.net --- (In reply to comment #56)
Any idea where I should file the bug report? Would it be the Cinnamon back end, or glamor?
The first candidate is the Mesa radeonsi driver.
--- Comment #58 from agapito tcxjy@vomoto.com --- 3.16 rc7 solved this bug! but i need more testing.
--- Comment #59 from darkbasic darkbasic@linuxsystems.it --- Created attachment 103680 --> https://bugs.freedesktop.org/attachment.cgi?id=103680&action=edit dmesg_3.16-rc7
Far from being fixed with 3.16-rc7 I simply watched a Facebook flash video full screen.
--- Comment #60 from darkbasic darkbasic@linuxsystems.it --- Created attachment 103681 --> https://bugs.freedesktop.org/attachment.cgi?id=103681&action=edit crash photo
--- Comment #61 from jackdachef@gmail.com --- thanks for all the fixes !
currently using http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-rebased-on-fix... and it looks pretty stable so far (radeonsi, R9 270X)
even with 3.14.14 there were some (pretty seldom) but occasional crashes of X:
vanilla 3.16-rc6 was pretty much unusable (X + box hardlocks)
and especially http://cgit.freedesktop.org/~agd5f/linux/?h=drm-next-3.17-wip (*without* the fixes) had dozens of gpu crashes yesterday evening/night which were recovering (probably thanks to radeon.hard_reset=1) without crashing X and hardlocking the system:
ring 0 stalled for more than 10010msec
GPU lockup waiting for ... last fence id ... on ring 0
GPU softreset ...

[drm] UVD initialized successfully.
[drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
[drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35)
radeon 0000:01:00.0: ib ring test failed (-35)
couldn't schedule ib
still active bo inside vm
GPU softreset ...

radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
radeon 0000:01:00.0: GPU lockup (waiting for ... last fence id ... on ring 5
[drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
[drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35)
[drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed)

radeon 0000:01:00.0: ring 5 stalled for more than 3438516msec
radeon 0000:01:00.0: GPU lockup (waiting for ... last fence id ... on ring 5)
radeon 0000:01:00.0: ffff8807cfa63000 pin failed
[drm:radeon_crtc_page_flip] *ERROR* failed to pin new rbo buffer before flip
and messages in Xorg.0.log with
(WW) RADEON(0): radeon_dri2_flip_event_handler: Pageflip completion event has impossible msc 89175 < target_msc 89176
and
(WW) RADEON(0): flip queue failed: Invalid argument
(WW) RADEON(0): Page flip failed: Invalid argument
currently still running with radeon.hard_reset=1 - but that probably shouldn't be needed anymore
best trigger for those occasional crashes was using chromium/chrome, watching videos on youtube, switching apps via alt+tab (using a composited desktop with compiz-fusion 0.8.8) and from time to time opening up gmail (which, if I remember correctly, caused the gpu to crash at least twice out of countless times yesterday)
--- Comment #62 from jackdachef@gmail.com --- Created attachment 103685 --> https://bugs.freedesktop.org/attachment.cgi?id=103685&action=edit crash during watching of a flash-video on youtube
video during crash was: https://www.youtube.com/watch?v=9QIlB3ZVJes
before that: https://www.youtube.com/watch?v=fMKe89zcvHU
and before that: 1 video in 1080p, 3 videos in 720p (mainly slowly moving classical music, not sure if the type of the video's content [slow, fast-moved, etc.] makes any difference)
like mentioned in https://bugs.freedesktop.org/show_bug.cgi?id=81612 will try running with zswap & zram disabled,
but the frequent crashes on drm-next-3.17-wip also occurred with zram, zswap disabled
--- Comment #63 from jackdachef@gmail.com --- Created attachment 103686 --> https://bugs.freedesktop.org/attachment.cgi?id=103686&action=edit dmesg before the crash occured (clean boot)
hope this is fixed soon-ish & you guys know why this occurs,
otherwise I'll have to swap this card for production out
--- Comment #64 from jackdachef@gmail.com --- Created attachment 103687 --> https://bugs.freedesktop.org/attachment.cgi?id=103687&action=edit Xorg.0.log with some (hopefully useful) information about the crashes
--- Comment #65 from jackdachef@gmail.com --- Created attachment 103695 --> https://bugs.freedesktop.org/attachment.cgi?id=103695&action=edit Xorg.0.log from the same session (several lockups and partially successful recovery attempts, finally no X startup possible anymore)
--- Comment #66 from jackdachef@gmail.com --- Created attachment 103696 --> https://bugs.freedesktop.org/attachment.cgi?id=103696&action=edit this is the Xorg.0.log after X finally crashed after all the recoveries and xdm/slim/startx couldn't be successfully launched up anymore (it stayed in VT)
it's basically the same behavior as with drm-next-3.17-wip - only that this time it took longer until X couldn't be started up anymore
--- Comment #67 from Alex Deucher agd5f@yahoo.com --- I think the main issue this bug was for (VM page table stability regression) is fixed at this point. The remaining issues seem to be due to video playback as that seems to be the common trigger in the last few comments. You might try not using vdpau for video playback to see if that helps to narrow it down. Someone should probably open a new bug for the video playback stability as this bug is starting to get unwieldy and has become a dumping ground for anything.
--- Comment #68 from jackdachef@gmail.com --- Created attachment 103697 --> https://bugs.freedesktop.org/attachment.cgi?id=103697&action=edit whole dmesg output, at the end ("still active bo inside vm", "couldn't schedule ib") X can't be launched up anymore - it simply stays in VT
--- Comment #69 from jackdachef@gmail.com --- (In reply to comment #67)
I think the main issue this bug was for (VM page table stability regression) is fixed at this point. The remaining issues seem to be due to video playback as that seems to be the common trigger in the last few comments. You might try not using vdpau for video playback to see if that helps to narrow it down. Someone should probably open a new bug for the video playback stability as this bug is starting to get unwieldy and has become a dumping ground for anything.
even with those last lines from #68 ?
I've suspected for some time, that it must be related to videoplayback and uvd - so it should improve with vdpau disabled, right ?
but those last lines got me thinking that it could be something else
in the end (after those recovery attempts)
triggers (where the screen turned black) were
- switching between programs (alt+tab)
- simply entering text in a note (gnote) [at least twice]
- ... others I currently don't remember
--- Comment #70 from jackdachef@gmail.com --- not sure but the flash version in chromium/google-chrome isn't using vdpau at all, right ?
can't remember having viewed any video in a video-player accelerated with vdpau during those crashes (not even before)
so it's not related to vdpau, at least in my case
--- Comment #71 from Alex Deucher agd5f@yahoo.com --- This bug was originally about a stability regression due to some GPUVM changes in 3.15. If 3.14 is stable for you but newer kernels are not, then it may be related. Otherwise, it's probably another issue. See bug 81644 about stability issues with Chromium specifically.
--- Comment #72 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #70)
not sure but the flash version in chromium/google-chrome isn't using vdpau at all, right ?
To be sure start chrome with the env VDPAU_TRACE=1 and play something, there will be lots of debugging to see if it does use it.
I've tried quite hard to crash and so far failed, but then I am using seamonkey, and it seems I am also a few commits off head now.
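The VDPAU_TRACE=1 check suggested above could be scripted roughly as follows. This is a sketch under assumptions: the helper names are invented for illustration, the browser command is whatever your distribution installs, and the test for "vdp_" in the output assumes the trace library prints the names of the VDPAU entry points it intercepts.

```python
import os
import subprocess


def run_with_vdpau_trace(command):
    """Run `command` with VDPAU_TRACE=1 set and return its combined output."""
    env = dict(os.environ, VDPAU_TRACE="1")
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # capture both streams together
        universal_newlines=True,
    )
    return result.stdout


def mentions_vdpau_calls(output):
    """Heuristic: does any output line name a vdp_* function? (assumption)"""
    return any("vdp_" in line for line in output.splitlines())
```

For example, `mentions_vdpau_calls(run_with_vdpau_trace(["chromium"]))` after playing a video and closing the browser; the binary name is a placeholder.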
--- Comment #73 from darkbasic darkbasic@linuxsystems.it --- Alex I didn't use vdpau when playing back, also I got an X freeze even without playing back anything: I was just starting Android Studio, PyCharm and Netbeans (I often get freezes when starting these editors).
--- Comment #74 from jackdachef@gmail.com --- (In reply to comment #71)
This bug was originally about a stability regression due to some GPUVM changes in 3.15. If 3.14 is stable for you but newer kernels are not, then it may be related. Otherwise, it's probably another issue. See bug 81644 about stability issues with Chromium specifically.
unfortunately 3.14 isn't entirely stable either: it mostly is (99%), but very seldom X simply crashes. The gpu doesn't just turn black and recover, so important data can be lost (when being worked on). I couldn't pinpoint the reason yet - the gpu just "reboots"
no additional error messages in dmesg or Xorg.0.log as far as I know
can't say anything about kernel versions prior to that since this card is still a few days old
gcc 4.9 also couldn't be the cause, at least with 3.16-rc* kernels and drm-next-3.17-rebased-on-fixes, since I've already added the patch manually and the kernel also is *NOT* compiled with -Os (optimize for size)
it seems to be more of a general instability going towards 3.15 and 3.16 - but who am I to say, I'm just an enthusiast user :P
having read about stability issues with the new firmware somewhere
how could I test and revert to the old firmware ? would it still work ?
simply removing the PITCAIRN_mc2.bin (e.g. for the R9 270X) and leaving the other PITCAIRN files in /lib/firmware ?
the graphics driver is loading as a module and not compiled into the kernel or included in initramfs
I'd really like to upgrade to 3.16-rc* due to recent changes (especially in connection with Btrfs)
an option to disable UVD entirely would have been nice (when still using 5850 and during the new introduction of DPM there was a patch which offered a module parameter to turn it manually off) - would that be an option to further troubleshoot this issue and to exclude UVD from the list of potential causes ?
Thanks
--- Comment #75 from agapito tcxjy@vomoto.com --- Well, still not fixed in 3.16 rc7 :(
I was using steam (wine) and the bug reappeared. Grey garbage on my screen and a hard lockup. I had to reboot by pressing the reset button.
--- Comment #76 from jackdachef@gmail.com --- Created attachment 103776 --> https://bugs.freedesktop.org/attachment.cgi?id=103776&action=edit DVD watching (150 minutes) on fluxbox with only konsole & smplayer with vdpau running
so obviously vdpau is *NOT* the problem here: the whole movie played fine and the box didn't lock up, with no suspicious messages, data or behavior in dmesg
--- Comment #77 from jackdachef@gmail.com --- Created attachment 103777 --> https://bugs.freedesktop.org/attachment.cgi?id=103777&action=edit output of Xorg.0.log during uvd-test with DVD watching in fluxbox via smplayer
note the "extra" explicitly set settings:
[ 253.915] (**) RADEON(0): Option "EnablePageFlip" "off"
[ 253.915] (**) RADEON(0): Option "ColorTiling" "on"
[ 253.915] (**) RADEON(0): Option "AccelMethod" "Glamor"
[ 253.915] (**) RADEON(0): Option "EXAVSync" "off"
[ 253.915] (**) RADEON(0): Option "EXAPixmaps" "on"
[ 253.915] (**) RADEON(0): Option "SwapbuffersWait" "on"
I read about commits from 3.14 to 3.16-rc* that mentioned pageflipping changes, but disabling that didn't make a difference
*before* watching the DVD
I did another test
and accidentally watched a short portion of a HTML video on youtube with chromium (actually only had intended to read lkml, but there was a reference link and didn't mention that it was a video /facepalm), did some browsing, had gnote running (and some more which I currently don't remember)
adobe flash was explicitly turned *off* via about:plugins
and X hardlocked while playing music (an mp3, via audacious) & browsing via chromium
so at least in my case more and more signs seem to lead towards bug 81644 (https://bugs.freedesktop.org/show_bug.cgi?id=81644) and that chromium + html video; or chromium and all sorts of video (vdpau doesn't apply here ?!) triggers instability & (hard)locks
--- Comment #78 from jackdachef@gmail.com --- just saw that there are new updates in
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-wip &
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-rebased-on-fix...
hopefully the changes make a difference, will try out the new drm-next-3.17-rebased-on-fixes in approximately a day or later
thanks guys !
--- Comment #79 from Andy Furniss adf.lists@gmail.com --- Created attachment 103818 --> https://bugs.freedesktop.org/attachment.cgi?id=103818&action=edit Oops on fbcon load drm-next-3.17-rebased-on-fixex
--- Comment #80 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #78)
just saw that there are new updates in
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-wip &
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-rebased-on-fixes
hopefully the changes make a difference, will try out the new drm-next-3.17-rebased-on-fixes in approximately a day or later
thanks guys !
Well I just tried drm-next-3.17-rebased-on-fixes and it died as soon as fbcon loaded. Screen full of junk and an oops logged.
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
--- Comment #81 from Michel Dänzer michel@daenzer.net --- (In reply to comment #80)
Well I just tried drm-next-3.17-rebased-on-fixes and it died as soon as fbcon loaded. Screen full junk and an oops logged.
Please file a new report for that, it's nothing to do with random crashes in X.
--- Comment #82 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #80)
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
When did you download it? The header format changed last week and I uploaded a new version.
--- Comment #83 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #82)
(In reply to comment #80)
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
When did you download it? The header format changed last week and I uploaded a new version.
Ahh, it was longer ago than that, will try newer.
Andy Furniss adf.lists@gmail.com changed:
           What    |Removed |Added
----------------------------------------------------------------------------
 Attachment #103818|0       |1
        is obsolete|        |
--- Comment #84 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #83)
(In reply to comment #82)
(In reply to comment #80)
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
When did you download it? The header format changed last week and I uploaded a new version.
Ahh, it was longer ago than that, will try newer.
Booting OK with new firmware.
--- Comment #85 from jackdachef@gmail.com --- (In reply to comment #82)
(In reply to comment #80)
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
When did you download it? The header format changed last week and I uploaded a new version.
does this apply to PITCAIRN gpus, too ?
I only see firmware up to April (PITCAIRN_mc2.bin)
kindly link to the new firmware files so that I can update to it, too
update:
posted new information concerning this case (or the issue with chromium) in #81644 (https://bugs.freedesktop.org/show_bug.cgi?id=81644)
--- Comment #86 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #85)
(In reply to comment #82)
(In reply to comment #80)
I have got the new firmware (unless there's an even newer one) and have booted into the other 3.17 before - but it eventually crashed due to missing fixes.
When did you download it? The header format changed last week and I uploaded a new version.
does this apply to PITCAIRN gpus, too ?
I only see firmware up to April (PITCAIRN_mc2.bin)
kindly link to the new firmware files so that I can update to it, too
http://people.freedesktop.org/~agd5f/radeon_ucode/ucode.tar.gz
The new firmwares have lowercase names so you can keep the old ones in place for kernels < 3.17.
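To see which naming scheme is installed for a given chip, something like the following could list both sets side by side. It is a hedged sketch: the function name is invented, and the firmware directory you pass in (commonly a `radeon` subdirectory under `/lib/firmware`) is an assumption about your system layout.

```python
from pathlib import Path


def split_firmware_by_case(directory, chip="pitcairn"):
    """Group a chip's *.bin files into (old UPPERCASE names, new lowercase names)."""
    legacy, new = [], []
    for path in sorted(Path(directory).glob("*.bin")):
        # The chip name is the part before the first underscore,
        # e.g. "PITCAIRN" in "PITCAIRN_mc2.bin".
        prefix = path.name.split("_", 1)[0]
        if prefix.lower() != chip.lower():
            continue
        if prefix.isupper():
            legacy.append(path.name)
        else:
            new.append(path.name)
    return legacy, new
```

Calling `split_firmware_by_case("/lib/firmware/radeon")` would return two lists; if the second is empty, the new lowercase files have probably not been installed yet.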
--- Comment #87 from jackdachef@gmail.com --- (In reply to comment #86)
(In reply to comment #85)
(In reply to comment #82)
does this apply to PITCAIRN gpus, too ?
I only see firmware up to April (PITCAIRN_mc2.bin)
kindly link to the new firmware files so that I can update to it, too
http://people.freedesktop.org/~agd5f/radeon_ucode/ucode.tar.gz
The new firmwares have lowercase names so you can keep the old ones in place for kernels < 3.17.
thanks a lot
running now with latest firmware, unfortunately
the trigger still seems to be chromium & the whole box hardlocks ...
--- Comment #88 from jackdachef@gmail.com --- just crashed yesterday with chromium only being started up,
and browsing some random wallpapers with firefox 31 (not even fully opened, only previews)
both firefox & chromium had hardware acceleration/webgl disabled
Desktop: Xfce4 + compiz-fusion (opengl composited desktop)
unfortunately no dmesg info
--- Comment #89 from agapito tcxjy@vomoto.com --- Still present in the final 3.16 kernel. This bug is really crazy. I have a lot of hard lockups playing Age of Empires HD using windows steam. I can't provide any log or debug info because my machine dies completely.
My hardware is: Gigabyte HD 7950, using HDMI output on Archlinux + KDE + Mesa 10.2.5 + xserver 1.16.
--- Comment #90 from jackdachef@gmail.com --- (In reply to comment #89)
Still present in the final 3.16 kernel. This bug is really crazy. I have a lot of hard lockups playing Age of Empires HD using windows steam. I can't provide any log or debug info because my machine dies completely.
My hardware is: Gigabyte HD 7950, using HDMI output on Archlinux + KDE + Mesa 10.2.5 + xserver 1.16.
make sure you try out the very latest & greatest
http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-3.17-rebased-on-fix...
with userptr support
these changes are supposed to be performance improvements (only ?) but also seem to raise stability for me ( bug #81612 ):
information on this: http://lists.freedesktop.org/archives/intel-gfx/2014-February/040513.html
I'll go ahead and test other use cases, to see whether those are also stable now
--- Comment #91 from agapito tcxjy@vomoto.com --- Created attachment 104145 --> https://bugs.freedesktop.org/attachment.cgi?id=104145&action=edit dmesg output drm-next-3.17
Using drm-next-3.17 my xorg crashes. I don't know if it is the same bug. But here is my dmesg output (sorry about the quality)
--- Comment #92 from Chernovsky Oleg adonai@xaker.ru --- I've successfully reproduced this bug
I'm on 3.16 kernel, mesa 10.2.5.
I start Qt5 QtCreator (as I know it uses OpenGL acceleration for some QML elements), then I start any 3D-demanding app (like one of Valve's game titles). After that I close app. And when I close QtCreator, this bug occurs, total system hang, ring 0 freezes and does not wake up on soft reset.
Where should I look to fix it? I assume it hides somewhere in GPUVM? (Or is that no longer needed, or is it already fixed?)
--- Comment #93 from farmboy0+freedesktop@googlemail.com --- With 3.17-rc1 my desktop has been mostly stable.
Haven't experienced any more random deadlocks besides one that left this in Xorg.log:
(EE) Backtrace:
(EE) 0: /usr/bin/X (QueuePointerEvents+0x52) [0x454ad2]
(EE) 1: /usr/lib64/xorg/modules/input/evdev_drv.so (_init+0x2dd4) [0x7f53cf900254]
(EE) 2: /usr/bin/X (DPMSSupported+0xd8) [0x47c738]
(EE) 3: /usr/bin/X (xf86SerialModemClearBits+0x1ca) [0x4a6e6a]
(EE) 4: /lib64/libpthread.so.0 (funlockfile+0x70) [0x7f53daabf73f]
(EE) 5: /lib64/libc.so.6 (ioctl+0x7) [0x7f53d97dc1c7]
(EE) 6: /usr/lib64/libdrm.so.2 (drmIoctl+0x30) [0x7f53da8a5310]
(EE) 7: /usr/lib64/libdrm.so.2 (drmCommandWrite+0x1b) [0x7f53da8a799b]
(EE) 8: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x4c3824) [0x7f53d4be7eb5]
(EE) 9: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x4c4573) [0x7f53d4be9810]
(EE) 10: /usr/lib64/dri/radeonsi_dri.so (radeon_drm_winsys_create+0x1b46) [0x7f53d47364f2]
(EE) 11: /usr/lib64/dri/radeonsi_dri.so (radeon_drm_winsys_create+0xc909) [0x7f53d474bbde]
(EE) 12: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x3f3888) [0x7f53d4a47f8d]
(EE) 13: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x1e695b) [0x7f53d462e122]
(EE) 14: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x1e7f7f) [0x7f53d4630be6]
(EE) 15: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x114a46) [0x7f53d448a250]
(EE) 16: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x1164d1) [0x7f53d448d749]
(EE) 17: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x1eb007) [0x7f53d4636764]
(EE) 18: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x116e11) [0x7f53d448e487]
(EE) 19: /usr/lib64/dri/radeonsi_dri.so (__driDriverGetExtensions_swrast+0x116e6a) [0x7f53d448eb5e]
(EE) 20: /usr/lib64/libglamor.so.0 (glamor_poly_segment_nf+0x2d22) [0x7f53d7f36ea2]
(EE) 21: /usr/lib64/libglamor.so.0 (glamor_poly_segment_nf+0x335b) [0x7f53d7f37abb]
(EE) 22: /usr/lib64/libglamor.so.0 (glamor_add_traps_nf+0x36d) [0x7f53d7f2f22d]
(EE) 23: /usr/bin/X (miFillUniqueSpanGroup+0x1a08) [0x58ce28]
(EE) 24: /usr/bin/X (xf86I2CGetScreenBuses+0x1e3a) [0x4cf17a]
(EE) 25: /usr/bin/X (dixDestroyPixmap+0x19d9) [0x43ae19]
(EE) 26: /usr/bin/X (SendErrorToClient+0x2ff) [0x43c8ff]
(EE) 27: /usr/bin/X (remove_fs_handlers+0x42d) [0x440bcd]
(EE) 28: /lib64/libc.so.6 (__libc_start_main+0xf5) [0x7f53d971add5]
(EE) 29: /usr/bin/X (_start+0x29) [0x42a7f1]
(EE) 30: ? (?+0x29) [0x29]
(EE)
(EE) [mi] EQ overflow continuing. 1000 events have been dropped.
(EE) [mi] No further overflow reports will be reported until the clog is cleared.
I am using xorg 1.15.1 and glamor from git.
Haven't tried many opengl applications yet besides pale moon (firefox clone) with hardware acceleration and some short wine session.
--- Comment #94 from agapito tcxjy@vomoto.com --- I had this bug again using 3.17-rc1. Still not fixed.
--- Comment #95 from farmboy0+freedesktop@googlemail.com --- Is there some way to debug this? Debug messages from the kernel/mesa/whatever to activate to find out what goes on with the radeon driver. Some /sys files to monitor or something?
Please let us help you fix this bug, but for me at least it was always a complete deadlock with nothing in the logs to indicate what went wrong.
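When a lockup does leave something in dmesg, the "GPU lockup (waiting for ... last fence id ... on ring N)" lines quoted in this report can at least be summarized per ring. The following is a minimal sketch, not a debugging facility of the driver: the function name is invented, and reading the two hex values as "fence being waited for" and "last fence that completed" is an interpretation of the message text.

```python
import re

# Matches e.g. "GPU lockup (waiting for 0x000000000020375d
# last fence id 0x0000000000203756 on ring 0)".
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for 0x([0-9a-fA-F]+) "
    r"last fence id 0x([0-9a-fA-F]+) on ring (\d+)\)"
)


def summarize_lockups(log_text):
    """Return one dict per lockup line with the ring and fence numbers."""
    summary = []
    for waited, last, ring in LOCKUP_RE.findall(log_text):
        waited_seq = int(waited, 16)
        last_seq = int(last, 16)
        summary.append({
            "ring": int(ring),
            "waiting_for": waited_seq,
            "last_signaled": last_seq,
            # How far the ring is behind the fence being waited on.
            "pending": waited_seq - last_seq,
        })
    return summary
```

If `last_signaled` stays the same across consecutive lockups on one ring, that ring made no progress between resets, which may be worth mentioning when attaching the log.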
--- Comment #96 from Maciej gutigen@outlook.com --- Yes, I would love to help debug this issue. Atm mesa git is completely unusable on radeonsi.
--- Comment #97 from darkbasic darkbasic@linuxsystems.it --- I do not even use radeonsi anymore because of this bug, please don't underestimate the effect it has on your userbase.
Christian König deathsimple@vodafone.de changed:
           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |mmstickman@gmail.com
--- Comment #98 from Christian König deathsimple@vodafone.de --- *** Bug 82886 has been marked as a duplicate of this bug. ***
--- Comment #99 from Malte Schröder maltesch@gmx.de --- I am not sure if this is related, but what I see _a lot_ recently in my kernel log is this:
[ 554.747835] [TTM] Illegal buffer object size
[ 554.747837] [TTM] Illegal buffer object size
[ 554.747838] [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (0, 6, 4096, -22)
This is on X.org 1.16.0, radeon_drv 7.4.99 with glamor enabled, mesa 10.2.5 and drm-next-3.17.
--- Comment #100 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #99)
I am not sure if this is related, but what I see _a lot_ recently in my kernel log is this:
[ 554.747835] [TTM] Illegal buffer object size
[ 554.747837] [TTM] Illegal buffer object size
[ 554.747838] [drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (0, 6, 4096, -22)
Unrelated. You are seeing bug 82162 which is already fixed.
--- Comment #101 from Maximilian Böhm winlux@gmail.com --- Hey, my system is running nearly crash free for three days now with Linux 3.17 RC1. The crashes were related to VLC and VDPAU and something got stuck but my system feels pretty stable now. If you are on Arch Linux compile linux-mainline from AUR and try for yourself (don't forget updating GRUB).
--- Comment #102 from darkbasic darkbasic@linuxsystems.it --- 3.17-rc1 + drm-fixes seems MUCH more stable than ever. I will let you know if things keep going well.
--- Comment #103 from darkbasic darkbasic@linuxsystems.it --- I just saw this in dmesg: radeon 0000:01:00.0: Packet0 not allowed!
--- Comment #104 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #103)
I just saw this in dmesg: radeon 0000:01:00.0: Packet0 not allowed!
Some userspace component is generating an invalid command stream. Probably a bad packet count somewhere.
--- Comment #105 from Chernovsky Oleg adonai@xaker.ru --- So, is this bug finally fixed? I saw some activity, but was it a real fix or only a workaround?
--- Comment #106 from Alex Deucher agd5f@yahoo.com --- (In reply to comment #105)
So, is this bug finally fixed? I saw some activity, but was it a real fix or only a workaround?
I think this bug has become largely useless. It's become a general dumping ground for any sort of problem with radeonsi.
--- Comment #107 from agapito tcxjy@vomoto.com --- 3.17 rc1 and 3.16.1 are still affected by this bug, but they are more stable than previous kernels.
--- Comment #108 from darkbasic darkbasic@linuxsystems.it --- The original bug wasn't fixed in 3.16-rc but it seems fixed in 3.17+drm-fixes. I will let you know after a couple of days, I'm really happy I can use radeonsi once more. With 3.16-rc I had crashes after a few minutes of usage.
--- Comment #109 from Chernovsky Oleg adonai@xaker.ru --- Yep. I tried to repeat previously described crashes (including my own) on 3.17-rc1 with your drm-fixes branch and failed. No gfx ring lockup anymore.
--- Comment #110 from Chernovsky Oleg adonai@xaker.ru --- Alex, I'm still curious, what was the original problem that caused this bug?
--- Comment #111 from Christian König deathsimple@vodafone.de --- (In reply to comment #110)
Alex, I'm still curious, what was the original problem that caused this bug?
Well that was the problem: A couple of different things!
We have a long-outstanding issue with TLB poisoning, a couple of bugs related to dynamically allocating page tables, and I think one or two userspace issues, all mixed into a single bug report.
--- Comment #112 from Chernovsky Oleg adonai@xaker.ru --- Thanks, Christian, so I assume all those issues were fixed and TLB poisoning was worked around for now?
I'm asking because I'm currently digging through the source trying to figure out the full picture.
--- Comment #113 from mmstickman@gmail.com --- I've tried kernel 3.17-rc1 and I still get Xorg crashing with my Radeon HD 7950. After idling in KDE overnight this error happened:
24726.971466] radeon 0000:01:00.0: ring 0 stalled for more than 10000msec [24726.971476] radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000020375d last fence id 0x0000000000203756 on ring 0) [24727.506525] radeon 0000:01:00.0: Saved 493 dwords of commands on ring 0. [24727.506572] radeon 0000:01:00.0: GPU softreset: 0x0000006C [24727.506574] radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028 [24727.506576] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 [24727.506578] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 [24727.506580] radeon 0000:01:00.0: SRBM_STATUS = 0x20000AC0 [24727.506614] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [24727.506616] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [24727.506618] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010000 [24727.506620] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000006 [24727.506622] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80018647 [24727.506624] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44483106 [24727.506626] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C84206 [24727.506628] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000 [24727.506630] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000 [24727.506634] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c03d6dc0 flags=0x0010] [24727.506646] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c03d6df0 flags=0x0030] [24727.506651] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c0000100 flags=0x0030] [24727.506655] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c03d6c00 flags=0x0010] [24727.506659] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c03d6c80 flags=0x0010] [24727.506663] AMD-Vi: Event logged [IO_PAGE_FAULT device=01:00.0 domain=0x0016 address=0x00000000c03d6c40 flags=0x0010] [24728.034963] radeon 
0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF [24728.035016] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140 [24728.036173] radeon 0000:01:00.0: GRBM_STATUS = 0x00003028 [24728.036175] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 [24728.036177] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 [24728.036179] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 [24728.036213] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [24728.036215] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [24728.036217] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 [24728.036219] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 [24728.036221] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 [24728.036223] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [24728.036224] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [24728.036322] radeon 0000:01:00.0: GPU reset succeeded, trying to resume [24728.085248] [drm] probing gen 2 caps for device 1002:5a16 = 31cd02/0 [24728.085250] [drm] PCIE gen 2 link speeds already enabled [24728.086416] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000). 
[24728.086542] radeon 0000:01:00.0: WB enabled [24728.086545] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000c0000c00 and cpu addr 0xffff880420943c00 [24728.086547] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x00000000c0000c04 and cpu addr 0xffff880420943c04 [24728.086549] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x00000000c0000c08 and cpu addr 0xffff880420943c08 [24728.086550] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x00000000c0000c0c and cpu addr 0xffff880420943c0c [24728.086552] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x00000000c0000c10 and cpu addr 0xffff880420943c10 [24728.086936] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc90011d35a18 [24728.256690] [drm] ring test on 0 succeeded in 1 usecs [24728.256694] [drm] ring test on 1 succeeded in 1 usecs [24728.256698] [drm] ring test on 2 succeeded in 1 usecs [24728.256758] [drm] ring test on 3 succeeded in 2 usecs [24728.256765] [drm] ring test on 4 succeeded in 2 usecs [24728.454199] [drm] ring test on 5 succeeded in 2 usecs [24728.454204] [drm] UVD initialized successfully. [24728.454233] radeon 0000:01:00.0: GPU fault detected: 146 0x03ee700c [24728.454235] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x0000019F [24728.454236] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0E07000C [24728.454238] VM fault (0x0c, vmid 7) at page 415, read from CP (112) [24738.469792] radeon 0000:01:00.0: ring 0 stalled for more than 10000msec [24738.469801] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000203762 last fence id 0x0000000000203756 on ring 0) [24738.469823] [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35). [24738.469829] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35). [24738.469833] radeon 0000:01:00.0: ib ring test failed (-35). 
[24738.976051] radeon 0000:01:00.0: GPU softreset: 0x00000048 [24738.976054] radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028 [24738.976056] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 [24738.976058] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 [24738.976060] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 [24738.976094] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [24738.976096] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [24738.976098] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010100 [24738.976100] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000086 [24738.976102] radeon 0000:01:00.0: R_008680_CP_STAT = 0x80018647 [24738.976104] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [24738.976106] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [24738.976108] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000 [24738.976110] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000 [24739.475375] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF [24739.475428] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100 [24739.476585] radeon 0000:01:00.0: GRBM_STATUS = 0x00003028 [24739.476587] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 [24739.476588] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 [24739.476590] radeon 0000:01:00.0: SRBM_STATUS = 0x200000C0 [24739.476625] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 [24739.476627] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 [24739.476629] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 [24739.476631] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 [24739.476632] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 [24739.476634] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 [24739.476636] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 [24739.476720] radeon 0000:01:00.0: GPU reset succeeded, trying to resume [24739.493947] [drm] probing gen 2 caps for device 1002:5a16 = 31cd02/0 
[24739.493950] [drm] PCIE gen 2 link speeds already enabled [24739.495109] [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000). [24739.495231] radeon 0000:01:00.0: WB enabled [24739.495233] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x00000000c0000c00 and cpu addr 0xffff880420943c00 [24739.495235] radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x00000000c0000c04 and cpu addr 0xffff880420943c04 [24739.495237] radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x00000000c0000c08 and cpu addr 0xffff880420943c08 [24739.495239] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x00000000c0000c0c and cpu addr 0xffff880420943c0c [24739.495241] radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x00000000c0000c10 and cpu addr 0xffff880420943c10 [24739.495655] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc90011d35a18 [24739.665510] [drm] ring test on 0 succeeded in 1 usecs [24739.665515] [drm] ring test on 1 succeeded in 1 usecs [24739.665519] [drm] ring test on 2 succeeded in 1 usecs [24739.665578] [drm] ring test on 3 succeeded in 2 usecs [24739.665586] [drm] ring test on 4 succeeded in 2 usecs [24739.863021] [drm] ring test on 5 succeeded in 1 usecs [24739.863027] [drm] UVD initialized successfully. [24739.863059] [drm] ib test on ring 0 succeeded in 0 usecs [24739.863077] [drm] ib test on ring 1 succeeded in 0 usecs [24739.863095] [drm] ib test on ring 2 succeeded in 0 usecs [24739.863113] [drm] ib test on ring 3 succeeded in 0 usecs [24739.863156] [drm] ib test on ring 4 succeeded in 0 usecs [24750.048352] radeon 0000:01:00.0: ring 5 stalled for more than 10000msec [24750.048362] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5) [24750.048368] [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35). [24750.048375] [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35). 
[24750.048395] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed [25118.213731] radeon 0000:01:00.0: ring 5 stalled for more than 377576msec [25118.213741] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000002 on ring 5)
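For readers of the log above: the "VM fault (0x0c, vmid 7) at page 415, read from CP (112)" line appears to be a decoded form of the two raw register values printed just before it. A minimal Python sketch of that decoding follows. The bit layout is inferred from this single log entry, not taken from the driver source, so the masks and the name `decode_vm_fault` are assumptions.

```python
def decode_vm_fault(addr, status):
    """Split the raw fault registers into the fields of the 'VM fault' log line.

    Bit positions are inferred from one log entry and are an assumption.
    """
    return {
        "page": addr,                          # FAULT_ADDR 0x19F is printed as page 415
        "protections": status & 0xF,           # low bits, printed as 0x0c
        "client_id": (status >> 12) & 0xFF,    # memory client id, printed as 112
        "access": "write" if (status >> 24) & 0x1 else "read",
        "vmid": (status >> 25) & 0xF,          # VM context id, printed as 7
    }


# Values copied from the log: FAULT_ADDR 0x0000019F, FAULT_STATUS 0x0E07000C
fault = decode_vm_fault(0x0000019F, 0x0E07000C)
print(fault)
```

With the values from the log this reproduces every field of the kernel's own summary line, which is the only check available here.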
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #114 from Chernovsky Oleg adonai@xaker.ru --- (In reply to comment #113)
I've tried kernel 3.17-rc1 and I still get Xorg crashing with my Radeon HD
Did you try vanilla rc1 or with Alex's drm-fixes branch applied?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #115 from darkbasic darkbasic@linuxsystems.it --- Please use drm-fixes because it's what we are talking about.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #116 from farmboy0+freedesktop@googlemail.com --- Will the drm-fixes branch be part of 3.17-rc{2,}? Where is the repo for the branch located?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #117 from darkbasic darkbasic@linuxsystems.it --- Here is drm-fixes and yes, it will be part of -rc2: http://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-fixes-3.17
Unfortunately I've been too eager to declare drm-fixes stable because I just got an X freeze: [139998.633243] traps: chrome[11324] general protection ip:7f8f8f1790b0 sp:7fffba3d5610 error:0 in .radeonsi_dri.so._portage_merge_.20817 (deleted)[7f8f8ef8c000+5de000]
I was simply running an "emerge --sync" while browsing a couple of simple web pages.
Any chance to fix this? If yes I'm willing to open a new bug for this.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #118 from darkbasic darkbasic@linuxsystems.it --- [121054.909144] radeon 0000:01:00.0: ring 3 stalled for more than 10000msec [121054.909150] radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000983653 last fence id 0x000000000098364e on ring 3)
Just got another crash with drm-fixes-3.17: ironically I was replying to a Phoronix user telling him that drm-fixes-3.17 is finally stable LOL
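When comparing lockup reports like the one above, it can help to pull the ring number and the two fence ids out of the message automatically. The following is a small Python sketch; reading the first id as the fence being waited on and the second as the last fence the ring completed is an assumption based on the wording of the message, and `parse_lockups` is a hypothetical helper name. It only handles the "waiting for ... last fence id ..." wording.

```python
import re

# Matches the "GPU lockup (waiting for ... last fence id ... on ring N)" wording only.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def parse_lockups(text):
    """Return one dict per lockup message found in a dmesg dump."""
    lockups = []
    for match in LOCKUP_RE.finditer(text):
        waiting = int(match.group(1), 16)
        last = int(match.group(2), 16)
        lockups.append({
            "ring": int(match.group(3)),
            "waiting": waiting,
            "last": last,
            "gap": waiting - last,  # difference between the two ids
        })
    return lockups


SAMPLE = (
    "[121054.909150] radeon 0000:01:00.0: GPU lockup (waiting for "
    "0x0000000000983653 last fence id 0x000000000098364e on ring 3)"
)
lockups = parse_lockups(SAMPLE)
print(lockups)
```

For the sample line this reports ring 3 with a gap of 5 between the two ids.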
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #119 from darkbasic darkbasic@linuxsystems.it --- I was browsing while compiling and once again it crashed: [57534.191174] Watchdog[29216]: segfault at 0 ip 00007f402637ae58 sp 00007f40131e1810 error 6 in chrome[7f40228fa000+590d000] [57544.227442] Watchdog[10690]: segfault at 0 ip 00007f1ffab93e58 sp 00007f1fe79fa810 error 6 in chrome[7f1ff7113000+590d000]
Does it happen with browsers other than chrome/chromium? Does anyone use firefox?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #120 from darkbasic darkbasic@linuxsystems.it --- Created attachment 105282 --> https://bugs.freedesktop.org/attachment.cgi?id=105282&action=edit Xorg.0.log after X freeze
I found a backtrace in Xorg.0.log
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #121 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #119)
I was browsing while compiling and once again it crashed: [57534.191174] Watchdog[29216]: segfault at 0 ip 00007f402637ae58 sp 00007f40131e1810 error 6 in chrome[7f40228fa000+590d000] [57544.227442] Watchdog[10690]: segfault at 0 ip 00007f1ffab93e58 sp 00007f1fe79fa810 error 6 in chrome[7f1ff7113000+590d000]
Does it happen with browsers other than chrome/chromium? Does anyone use firefox?
I've been stable for some time now using seamonkey, but then I use flashblock and don't usually use it for vid. Saying that, I have deliberately tried and failed to crash playing vids since this bug started.
I think there's another bug for chromium issues.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #122 from langkamp@tomblog.de --- (In reply to comment #119)
I was browsing while compiling and once again it crashed: [57534.191174] Watchdog[29216]: segfault at 0 ip 00007f402637ae58 sp 00007f40131e1810 error 6 in chrome[7f40228fa000+590d000] [57544.227442] Watchdog[10690]: segfault at 0 ip 00007f1ffab93e58 sp 00007f1fe79fa810 error 6 in chrome[7f1ff7113000+590d000]
Does it happen with browsers other than chrome/chromium? Does anyone use firefox?
Yes, me. We just spoke at github. Never had x crashes or lockups.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #123 from langkamp@tomblog.de ---
Yes, me. We just spoke at github. Never had x crashes or lockups.
My specs (was in a hurry yesterday): HD7950 Core2Quad Kernel 3.16 (openSUSE Factory, x64, KDE+desktopeffects) Rest from git (mesa, llvm, ati, xserver)
I did not test chrome, but firefox with its old flash (11.2) and steam games never gave me X crashes or lockups on radeonSI. Earlier kernel versions (openSUSE Tumbleweed) caused no problems either.
cheers tomtomme
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #124 from Grigori Goronzy greg@chown.ath.cx --- I can very quickly, almost deterministically, hang the GPU (radeonsi, Cape Verde) with the following command:
LIBGL_ALWAYS_SOFTWARE=1 mpv --fs --vo=opengl:sw /path/to/some_video
this works on both 3.16.0 and 3.17rc3. Try seeking, it often happens directly after a seek. In most cases, the hang is unrecoverable and crashes the kernel after some "atombios stuck in a loop" messages. Very strange indeed, software rendered glxgears doesn't cause this.
Can anyone verify? A somewhat reliable test case might be a good start to finally fixing this.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #125 from Christian König deathsimple@vodafone.de --- (In reply to comment #124)
I can very quickly, almost deterministically, hang the GPU (radeonsi, Cape Verde) with the following command:
LIBGL_ALWAYS_SOFTWARE=1 mpv --fs --vo=opengl:sw /path/to/some_video
this works on both 3.16.0 and 3.17rc3. Try seeking, it often happens directly after a seek. In most cases, the hang is unrecoverable and crashes the kernel after some "atombios stuck in a loop" messages. Very strange indeed, software rendered glxgears doesn't cause this.
Can anyone verify? A somewhat reliable test case might be a good start to finally fixing this.
Well this is interesting, so you're saying that using software rendering on the client side can crash the GPU? That only leaves glamor and maybe the compositor as the only ones using the hardware driver.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #126 from darkbasic darkbasic@linuxsystems.it --- (In reply to comment #124)
I can very quickly, almost deterministically, hang the GPU (radeonsi, Cape Verde) with the following command:
LIBGL_ALWAYS_SOFTWARE=1 mpv --fs --vo=opengl:sw /path/to/some_video
I'm sorry but I can't reproduce it with Tahiti (HD7950).
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #127 from Andy Furniss adf.lists@gmail.com --- (In reply to comment #126)
(In reply to comment #124)
I can very quickly, almost deterministically, hang the GPU (radeonsi, Cape Verde) with the following command:
LIBGL_ALWAYS_SOFTWARE=1 mpv --fs --vo=opengl:sw /path/to/some_video
I'm sorry but I can't reproduce it with Tahiti (HD7950).
I also can't reproduce with pitcairn (R9270X)
On agd5f drm-next-3.18-wip, git mesa,llvm,ddx,glamor. Xorg couple of months old.
Tried with mplayer and 2 versions of mpv.
Running fluxbox, so no compositor.
Other differences - I guess mesa/llvmpipe uses different sse for me on older CPU (phenom II x4 965).
Screen res? I am testing 1920x1080@60Hz.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #128 from Grigori Goronzy greg@chown.ath.cx --- You might want to try the patch in https://bugs.freedesktop.org/show_bug.cgi?id=83500
Maybe some of these issues have a common cause.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #129 from AdrianG raziel_theripper@yahoo.com --- Radeon 8550g/8670m - doesn't get past the login screen with 3.17-rc3. At least in rc1 I could get to the desktop, but then it would almost immediately hang. (distro: Ubuntu 14.04 standard + Gnome 3.2).
Works like a charm on kernel 3.14*
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #130 from Marti Raudsepp marti@juffo.org --- Created attachment 105926 --> https://bugs.freedesktop.org/attachment.cgi?id=105926&action=edit double-hang after "failed to get a new IB (-35)"
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #131 from Marti Raudsepp marti@juffo.org --- Created attachment 105927 --> https://bugs.freedesktop.org/attachment.cgi?id=105927&action=edit GPU lockup followed by "GPU fault detected: 147"
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #132 from darkbasic darkbasic@linuxsystems.it --- This is 100% reproducible: just start 3DMark2003 with a gallium nine enabled wine (and mesa of course) and it will crash your whole system: https://bugs.freedesktop.org/show_bug.cgi?id=83800
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #133 from darkbasic darkbasic@linuxsystems.it --- When radeonsi is in the right mood (I guess something triggers a higher instability), Chrome becomes completely unusable: opening simple urls without videos is enough to trigger segfaults. The system hangs for a few seconds and then it keeps working normally until the next segfault.
Kernel is drm-next-3.18, but I had similar behaviour with any kernel >= 3.15
[ 2122.073875] Watchdog[3418]: segfault at 0 ip 00007f6ceb0e90f8 sp 00007f6cd820e810 error 6 in chrome[7f6ce744e000+5c1c000]
[ 3579.978613] Watchdog[7831]: segfault at 0 ip 00007f05578ef0f8 sp 00007f0544a14810 error 6 in chrome[7f0553c54000+5c1c000]
[ 3665.106245] Watchdog[8079]: segfault at 0 ip 00007f18609c70f8 sp 00007f184daec810 error 6 in chrome[7f185cd2c000+5c1c000]
[ 4352.155130] Watchdog[10289]: segfault at 0 ip 00007fcf005000f8 sp 00007fceed625810 error 6 in chrome[7fcefc865000+5c1c000]
[ 4387.874001] Watchdog[26191]: segfault at 0 ip 00007fbcf4cca0f8 sp 00007fbce1def810 error 6 in chrome[7fbcf102f000+5c1c000]
[ 4434.438550] Watchdog[4605]: segfault at 0 ip 00007fa1f585c0f8 sp 00007fa1e2981810 error 6 in chrome[7fa1f1bc1000+5c1c000]
[16362.244095] Watchdog[25058]: segfault at 0 ip 00007f8e3cf6a0f8 sp 00007f8e2a08f810 error 6 in chrome[7f8e392cf000+5c1c000]
[16386.333329] Watchdog[25301]: segfault at 0 ip 00007fd1e34500f8 sp 00007fd1d0575810 error 6 in chrome[7fd1df7b5000+5c1c000]
[16495.014110] Watchdog[25410]: segfault at 0 ip 00007f7b9bc4b0f8 sp 00007f7b88d70810 error 6 in chrome[7f7b97fb0000+5c1c000]
[16581.203809] Watchdog[25675]: segfault at 0 ip 00007f9489d2e0f8 sp 00007f9476e53810 error 6 in chrome[7f9486093000+5c1c000]
[16679.412852] Watchdog[25702]: segfault at 0 ip 00007fa3338c50f8 sp 00007fa3209ea810 error 6 in chrome[7fa32fc2a000+5c1c000]
[16758.080893] Watchdog[25824]: segfault at 0 ip 00007fc4d3d8e0f8 sp 00007fc4c0eb3810 error 6 in chrome[7fc4d00f3000+5c1c000]
[16782.192107] Watchdog[26032]: segfault at 0 ip 00007ffe9d6af0f8 sp 00007ffe8a7d4810 error 6 in chrome[7ffe99a14000+5c1c000]
[16796.275309] Watchdog[26161]: segfault at 0 ip 00007f75faa130f8 sp 00007f75e7b38810 error 6 in chrome[7f75f6d78000+5c1c000]
Next week I will be able to provide remote access if needed.
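One thing that can be checked from the segfault lines alone is whether the crashes hit the same place in the binary: the absolute instruction pointer changes per process because of address randomisation, but `ip` minus the mapping base printed in brackets does not. A short Python sketch of that arithmetic is below; `crash_offsets` is a hypothetical helper, and the two sample lines are copied from the list above.

```python
import re

# Fields: fault address, ip, sp, error code, module name, mapping base, mapping size.
SEGV_RE = re.compile(
    r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
    r"error (\d+) in ([^\[\s]+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def crash_offsets(text):
    """Return (module, offset) pairs, where offset = ip - mapping base."""
    results = []
    for match in SEGV_RE.finditer(text):
        ip = int(match.group(2), 16)
        base = int(match.group(6), 16)
        results.append((match.group(5), ip - base))
    return results


SAMPLE = """\
[ 2122.073875] Watchdog[3418]: segfault at 0 ip 00007f6ceb0e90f8 sp 00007f6cd820e810 error 6 in chrome[7f6ce744e000+5c1c000]
[ 3579.978613] Watchdog[7831]: segfault at 0 ip 00007f05578ef0f8 sp 00007f0544a14810 error 6 in chrome[7f0553c54000+5c1c000]
"""
offsets = crash_offsets(SAMPLE)
for module, offset in offsets:
    print(module, hex(offset))
```

Both sample lines resolve to the same offset inside `chrome`, which would be consistent with a single fixed code path in the Watchdog thread rather than random memory corruption. That interpretation is a guess; the log itself does not say why the thread crashes.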
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #134 from edward.81@gmail.com --- I get several crashes every day. Right now I'm using kernel 3.17 with mesa git from the ~lcarlier repository (arch linux + kde4) on my radeon hd7970. Switching to a stable kernel and mesa doesn't change the situation. I don't know if this is the same error but I leave my dmesg log from the latest crash. The system stops responding and the monitor outputs garbage. Sometimes I can switch to a TTY to reboot. Most of the time I can't. http://pastebin.com/nkKNQE1f
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #135 from agapito tcxjy@vomoto.com --- This bug still happens in kernel 3.17 rc7 with mesa 10.4
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #136 from agapito tcxjy@vomoto.com --- With the kernel 3.17 rc7 the crashes are a lot more frequent than 3.16.4 kernel.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #137 from Jacob jacobsvenningsen15@hotmail.com --- About three months ago I stumbled upon this very same bug. The system will just totally lock up, and I pretty much have to force restart the machine. It seems to happen on any kernel newer than 3.13, at least in my case. I've tried with mesa 10.4, 10.3, 10.2 and even 10.1 and nothing seems to fix the problem. It was only when I started changing the kernel versions that I finally found 3.13 to be stable.
Been using the Oibaf PPA and Linux 3.13 for the past month and a half, and I've not experienced a single crash, but when I try any new kernel releases it won't be long until I experience yet another system lockup.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #138 from darkbasic darkbasic@linuxsystems.it --- Oct 7 17:17:30 gentoo-desktop kernel: [11674.296906] radeon 0000:01:00.0: ring 0 stalled for more than 10000msec Oct 7 17:17:30 gentoo-desktop kernel: [11674.296912] radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000022e490 last fence id 0x000000000022e49f on ring 0) Oct 7 17:17:31 gentoo-desktop kernel: [11674.706707] radeon 0000:01:00.0: Saved 657 dwords of commands on ring 0. Oct 7 17:17:31 gentoo-desktop kernel: [11674.706750] radeon 0000:01:00.0: GPU softreset: 0x000000EC Oct 7 17:17:31 gentoo-desktop kernel: [11674.706751] radeon 0000:01:00.0: GRBM_STATUS = 0xA0003028 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706752] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706753] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706754] radeon 0000:01:00.0: SRBM_STATUS = 0x200040C0 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706788] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706789] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706790] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00010000 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706791] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00400002 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706793] radeon 0000:01:00.0: R_008680_CP_STAT = 0x84010243 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706794] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x60C83146 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706795] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44E84246 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706796] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11674.706797] radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: 
[11675.106010] radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF Oct 7 17:17:31 gentoo-desktop kernel: [11675.106071] radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00108140 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107217] radeon 0000:01:00.0: GRBM_STATUS = 0x00003028 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107218] radeon 0000:01:00.0: GRBM_STATUS_SE0 = 0x00000006 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107219] radeon 0000:01:00.0: GRBM_STATUS_SE1 = 0x00000006 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107220] radeon 0000:01:00.0: SRBM_STATUS = 0x20000AC0 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107254] radeon 0000:01:00.0: SRBM_STATUS2 = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107255] radeon 0000:01:00.0: R_008674_CP_STALLED_STAT1 = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107256] radeon 0000:01:00.0: R_008678_CP_STALLED_STAT2 = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107257] radeon 0000:01:00.0: R_00867C_CP_BUSY_STAT = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107258] radeon 0000:01:00.0: R_008680_CP_STAT = 0x00000000 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107259] radeon 0000:01:00.0: R_00D034_DMA_STATUS_REG = 0x44C83D57 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107260] radeon 0000:01:00.0: R_00D834_DMA_STATUS_REG = 0x44C83D57 Oct 7 17:17:31 gentoo-desktop kernel: [11675.107336] radeon 0000:01:00.0: GPU reset succeeded, trying to resume Oct 7 17:17:31 gentoo-desktop kernel: [11675.135080] Watchdog[12138]: segfault at 0 ip 00007f493c6f60f8 sp 00007f492981b810 error 6 in chrome[7f4938a5b000+5c1c000]
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #139 from Edward edward.81@gmail.com --- The other day I noticed that the onboard video (i5-3570K - HD4000) was enabled after a BIOS upgrade. By disabling it the radeon driver crashes have stopped. I write only because it could be of help to someone.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #140 from darkbasic darkbasic@linuxsystems.it --- Do you disable the onboard video in the bios or somewhere else?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
Michel Dänzer michel@daenzer.net changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |jacobsvenningsen15@hotmail.com
--- Comment #141 from Michel Dänzer michel@daenzer.net --- (In reply to Jacob from comment #137)
It seems to happen on any kernel newer than 3.13, at least in my case.
Can you bisect which change between 3.13 and 3.14 caused the instability for you?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #142 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to Michel Dänzer from comment #141)
(In reply to Jacob from comment #137)
It seems to happen on any kernel newer than 3.13, at least in my case.
Can you bisect which change between 3.13 and 3.14 caused the instability for you?
I've only downloaded kernel images as .deb files from http://kernel.ubuntu.com/~kernel-ppa/mainline/ How would I go about bisecting which change caused the instability?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #143 from Edward edward.81@gmail.com --- (In reply to darkbasic from comment #140)
Do you disable the onboard video in the bios or somewhere else?
In the bios. In my case (asrock z77 extreme6) I have to disable the entry "IGPU Multi-Monitor"
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #144 from darkbasic darkbasic@linuxsystems.it --- Unfortunately it doesn't help; on the contrary, I just got a new instability world record: it crashed in *KDM*! Yes of course, I didn't even have time to type my username! I never checked GTX 970's price so often: one day or another I will find a good offer and I will say goodbye to radeoncrash forever.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #145 from Daniel Kozak kozzi11@gmail.com --- Same issue with my HD 7770 on 3.16 and 3.17 (much more often). Even trying to write this comment here causes a crash. So I am unable to write more details on this pc, because the time to crash is really tiny :(
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #146 from Daniel Kozak kozzi11@gmail.com --- I am able to reproduce it: just start VLC with some video and wait. After some seconds or a few minutes it happens.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #147 from mmstickman@gmail.com --- I'm getting VM Faults within minutes of idling in i3wm now with my 7950 in 3.17. My AMD A4-5000 laptop is unaffected by these bugs, however.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #148 from darkbasic darkbasic@linuxsystems.it --- I just had to book my flight 3 times with HD 7950 because it loves crashing when I use Chromium (plugins disabled).
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #149 from mmstickman@gmail.com --- Perhaps you should revert to 3.14 LTS until this issue is fixed. I don't have any issues running 3.14 with my 7950.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #150 from darkbasic darkbasic@linuxsystems.it --- 3.14 is stable with my HD 7950 but some users reported they are stable only with 3.13 (I don't remember their card)
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #151 from initzero@gmail.com --- Same issues with my OLAND card here. Archlinux, Mesa 10.3, Xorg 1.16.1 and Radeon 7.5.0.
Kernel 3.14.20 is still stable, 3.16.X and 3.17 may run for approx. 1hr and finally crash during normal desktop usage (Gnome 3.14 + Browser + ...). Didn't check 3.15 recently.
Sooner or later someone needs to bisect that sucker! :)
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #152 from agapito tcxjy@vomoto.com --- IMPORTANT
This was my first message in this bug report:
I have the same problem with my HD 7950; using hangouts, playing Left for Dead 2, or watching a flash video my screen goes crazy with vertical lines or grey fog. It started when I upgraded to the testing repo (Archlinux) and downloaded the newest linux-firmware package, which includes TAHITI_mc2.bin. I suffered this bug on kernels 3.14 and 3.15.
--------------------------------------------------------------------------
In Archlinux I was stable with kernel 3.14, and the problem started when I was using the new firmware. I thought that the new firmware was the cause of this bug, but NO, because I had the same bug using the old firmware, so this bug was caused by one of the radeon commits backported to kernel 3.14.6 (the first kernel using the newest firmware). I am 100% sure.
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?id...
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #153 from Jacob jacobsvenningsen15@hotmail.com --- I've tried using different kernel versions the past few days and I've failed to trigger the crash with any kernel prior to 3.15-rc3. Today after a few hours of use, my system just locked up again and my screens went black, forcing me to reboot the machine, the same thing I've experienced with 3.16 and 3.17, so I believe this is where the bug originated. However, after I booted up again, nothing had been written to kern.log, no errors show up in xorg.log, and dmesg shows me nothing. I got the image from: http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc3-utopic/ which also shows what changes were made to this release. The issue might have originated here, but it could very well be that I've missed something.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #154 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to agapito from comment #152)
IMPORTANT
This was my first message in this bug report:
I have the same problem with my HD 7950; using hangouts, playing Left for Dead 2, or watching a flash video my screen goes crazy with vertical lines or grey fog. It started when I upgraded to the testing repo (Archlinux) and downloaded the newest linux-firmware package, which includes TAHITI_mc2.bin. I suffered this bug on kernels 3.14 and 3.15.
In Archlinux I was stable with kernel 3.14, and the problem started when I was using the new firmware. I thought that the new firmware was the cause of this bug, but NO, because I had the same bug using the old firmware, so this bug was caused by one of the radeon commits backported to kernel 3.14.6 (the first kernel using the newest firmware). I am 100% sure.
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?id=refs/tags/v3.14.21&ofs=1300
I've just compared the git messages from your link and from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc3-utopic/CHANGES and it seems like the commits made to drm/radeon, are the only commits these two kernel versions have in common. Only seven of them are part of 3.15-rc3, which crashed on me yesterday, so it would seem like the crashes are caused by one of those commits
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #155 from agapito tcxjy@vomoto.com --- With kernel 3.16.5 i have this bug every 2 hours approximately. With kernel 3.17 every 20 minutes.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #156 from Malte Schröder maltesch@gmx.de --- Hi, with kernel v3.17 these crashes were much more frequent for me too. Now I've set aspm=0 for the radeon module and the system has been running for some hours straight.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #157 from agapito tcxjy@vomoto.com --- (In reply to Malte Schröder from comment #156)
I've set aspm=0 for the radeon module and the system has been running for
some hours straight.
Not for me.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #158 from Malte Schröder maltesch@gmx.de --- (In reply to agapito from comment #157)
(In reply to Malte Schröder from comment #156)
I've set aspm=0 for the radeon module and the system has been running for
some hours straight.
Not for me.
Yeah, it just crashed on me again. So I was just lucky. I also tried disabling dynclks and dpm, no effect. What I did differently yesterday is I had very little browser (Debian Iceweasel) usage. Today I had some Youtube running when the crash happened. In fact the crashes happen most of the time when watching stuff on Youtube, i.e. when Iceweasel uses vdpau through gstreamer. I have now removed the mesa vdpau drivers. I will report back if this changes anything.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #159 from Malte Schröder maltesch@gmx.de --- VDPAU doesn't make a difference, still crashes.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #160 from Michel Dänzer michel@daenzer.net --- (In reply to Jacob from comment #153)
I've tried using different kernel versions the past few days and I've failed to trigger the crash with any kernel prior to 3.15-rc3.
What was the closest earlier version you tried?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #161 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to Michel Dänzer from comment #160)
(In reply to Jacob from comment #153)
I've tried using different kernel versions the past few days and I've failed to trigger the crash with any kernel prior to 3.15-rc3.
What was the closest earlier version you tried?
The last image I tried was 3.15-rc2, which didn't crash on me during 18-hours of uptime
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #162 from Michel Dänzer michel@daenzer.net --- (In reply to Jacob from comment #161)
The last image I tried was 3.15-rc2, which didn't crash on me during 18-hours of uptime
Can you try 3.15-rc2 again for even longer, to make sure it wasn't just luck?
If it's consistent, it would be really helpful if you could bisect between 3.15-rc2 and 3.15-rc3.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #163 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to Michel Dänzer from comment #162)
(In reply to Jacob from comment #161)
The last image I tried was 3.15-rc2, which didn't crash on me during 18-hours of uptime
Can you try 3.15-rc2 again for even longer, to make sure it wasn't just luck?
If it's consistent, it would be really helpful if you could bisect between 3.15-rc2 and 3.15-rc3.
I'll do that: try rc2 for a couple of days to see if I can get it to crash, then try rc3 again to see if I can crash it again, just for good measure. If rc2 turns out to be stable and rc3 does not, I'll then do a bisection between the two.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #164 from Jacob jacobsvenningsen15@hotmail.com --- So after testing 3.15-rc2 for about 3 days without any crashes, I decided to once again test 3.15-rc3 to see if it would crash on me again, which it did. The OS just stopped responding to anything, then my monitors went black, just like it has done for me on 3.16 and 3.17 as well.
I ran a bisection between the two releases, and the result was the following:
Bisecting: 120 revisions left to test after this (roughly 7 steps)
[3fe89d2e768792a924d3c1e9310ba0b4448cb78e] Merge tag 'fixes-3.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Seems weird that arm has anything to do with this issue, but even after running "git bisect bad" until the end, it doesn't pick out anything committed to drm/radeon. Nonetheless, I compiled the kernel, and I'll now test it to see if it crashes or not, then run git bisect bad until the kernel is stable once again.
I'll report back whether or not I manage to compile a stable kernel.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #165 from Marti Raudsepp marti@juffo.org --- After upgrading to kernel 3.17.1, these GPU hangs/crashes still occur, but now it doesn't hang the whole machine any more. Sometimes it recovers from the GPU hang completely, sometimes it just drops me into a text console. Thanks, that's an improvement.
I am using Radeon R9 270 on Arch Linux.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #166 from Michel Dänzer michel@daenzer.net --- (In reply to Jacob from comment #164)
I ran a bisection between the two releases, and the result was the following: Bisecting: 120 revisions left to test after this (roughly 7 steps)
That's not the result but just an early step of the bisection. :) As the above says, Git estimates that you'll need to test around 7 more kernels before there is a result. BTW, make sure to only run 'bisect good' after you've tested a kernel long enough to be sure it's not affected by the problem. If you mark a commit as good which is actually bad, the bisection will fail.
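To illustrate the estimate quoted above: with about 120 candidate revisions, halving the range on every test takes roughly log2(120), i.e. about 7, builds. The sketch below is a simplified model that assumes a linear history (real kernel history contains merges, so Git's step count is only an estimate). The names `bisect_first_bad` and `is_bad` and the position of the bad revision are hypothetical.

```python
import math


def bisect_first_bad(revisions, is_bad):
    """Return (first bad revision, number of tests run).

    Assumes revisions are in commit order, revisions[0] is known good,
    revisions[-1] is known bad, and every revision after the first bad
    one is bad too (a linear-history simplification).
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(revisions[mid]):
            hi = mid  # the first bad revision is at mid or earlier
        else:
            lo = mid  # the first bad revision is after mid
    return revisions[hi], tests


# Two known endpoints plus 120 untested revisions in between.
revisions = list(range(122))
first_bad, tests = bisect_first_bad(revisions, lambda rev: rev >= 87)
estimate = math.ceil(math.log2(120))
```

In this toy run the search finds the first bad revision after 7 tests, which matches the rounded-up logarithm. It also shows why a wrong "good" verdict is so costly: the search moves `lo` past the real culprit and can never come back to it.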
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #167 from agapito tcxjy@vomoto.com --- 3.18 rc1 still affected.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #168 from DJ Dunn djdunn.safety@gmail.com --- if this helps any, I've been seeing the same error on my gentoo box, its near constant on 3.16+ kernels within 2 or 3 min of loging in X but, ive been seeing it very rarely (once every few hours) on 3.14.22 but still seeing it, and I never seen it happen on 3.14.19
gentoo box with mesa-10.3.1 xorg-server-1.16.1
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #169 from DJ Dunn djdunn.safety@gmail.com --- my card is HD7870
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #170 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to DJ Dunn from comment #168)
if this helps any, I've been seeing the same error on my gentoo box, its near constant on 3.16+ kernels within 2 or 3 min of loging in X but, ive been seeing it very rarely (once every few hours) on 3.14.22 but still seeing it, and I never seen it happen on 3.14.19
gentoo box with mesa-10.3.1 xorg-server-1.16.1
Could you test 3.14.20 and 3.14.21 to see if the issue also occurs in either of those, then do a bisection between the first version where the crashes started to occur and the version prior to it?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #171 from darkbasic darkbasic@linuxsystems.it --- On my system with 3.17+ sometimes it takes days to crash while sometimes it crashes after a few minutes. Only 3.15 did *always* crash in a couple of minutes. I remember 3.15-rc1-pre didn't crash, so the bad commit should be somewhere around 3.15-rc0 and 3.15-rc2.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #172 from darkbasic darkbasic@linuxsystems.it --- Sorry I meant 3.15-rc0 and 3.15-rc*3*.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #173 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to darkbasic from comment #171)
On my system with 3.17+ sometimes it takes days to crash while sometimes it crashes after a few minutes. Only 3.15 did *always* crash in a couple of minutes. I remember 3.15-rc1-pre didn't crash, so the bad commit should be somewhere around 3.15-rc0 and 3.15-rc2.
Takes about a couple hours for me to crash 3.15-rc3, "sadly" not minutes. 3.15-rc2 didn't crash on me, so I'm currently doing a bisection
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #174 from Jacob jacobsvenningsen15@hotmail.com --- Considering the crash occurs within an hour or two on the last kernel, but only occurs every few hours on 3.15, it makes you wonder if the two issues are even related. Nonetheless, I'll report the result from the bisection once I got it
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #175 from Christian König deathsimple@vodafone.de --- (In reply to Jacob from comment #174)
Considering the crash occurs within an hour or two on the last kernel, but only occurs every few hours on 3.15, it makes you wonder if the two issues are even related. Nonetheless, I'll report the result from the bisection once I got it
Kernel 3.15 has some known VM issues which are only fixed in 3.16. Independent of that, I think we indeed have multiple different issues that seem to be hard to distinguish.
In general you can split the issues into two categories: those with VM faults in the logs and those without.
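As a rough illustration of that split, a saved dmesg or kern.log can be sorted into the two categories with a small script. This is only a sketch: the regular expressions are assumptions about how radeon VM faults show up in the kernel log (the wording can differ between kernel versions), the first sample line uses made-up values, and `classify_log` is a hypothetical helper name.

```python
import re

# Assumed patterns: adjust these to match what your own dmesg actually shows.
VM_FAULT_PATTERNS = [
    re.compile(r"GPU fault detected"),
    re.compile(r"VM_CONTEXT\d+_PROTECTION_FAULT"),
]


def classify_log(log_text):
    """Return 'vm-fault' if any line looks like a VM fault, else 'no-vm-fault'."""
    for line in log_text.splitlines():
        if any(pattern.search(line) for pattern in VM_FAULT_PATTERNS):
            return "vm-fault"
    return "no-vm-fault"


# Illustrative sample with made-up values, not taken from a real log.
sample_with_fault = "radeon 0000:01:00.0: GPU fault detected: 146 0x0c035014\n"
# Message of the kind reported in this thread for a dpm failure (no VM fault).
sample_without = "[drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed\n"

category_a = classify_log(sample_with_fault)
category_b = classify_log(sample_without)
```

An empty log, as in the hangs where nothing gets written at all, also ends up in the "no-vm-fault" group.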
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #176 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to Christian König from comment #175)
(In reply to Jacob from comment #174)
Considering the crash occurs within an hour or two on the last kernel, but only occurs every few hours on 3.15, it makes you wonder if the two issues are even related. Nonetheless, I'll report the result from the bisection once I got it
Kernel 3.15 has some known VM issues which are only fixed in 3.16. Independent of that I think we indeed have multiple different issues that seems to be hard to distinct.
In general you can split the issues into two categories one is with VM faults in the logs and the other ones are without.
Seems the one I'm bisecting for now would be the one without then. Whenever it crashes, I get nothing in the logs at all. Nothing in dmesg, in xorg.log or in the kern.log. Nothing at all
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #177 from sam tygier samtygier@yahoo.co.uk --- Seeing this lock up on Debian Jessie when watching youtube HTML5 videos in firefox. kernel Linux oberon 3.16-3-amd64 #1 SMP Debian 3.16.5-1 (2014-10-10) x86_64 GNU/Linux Mesa 10.2.8-1 libdrm-radeon1 2.4.58-2 xserver-xorg-video-radeon 1:7.5.0-1 On a [AMD/ATI] Cape Verde PRO [Radeon HD 7750 / R7 250E]
I have logs which i can post, but it looks like you already have quite a few.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #178 from Jacob jacobsvenningsen15@hotmail.com --- The "supposedly" random crashes I encountered with 3.15-rc3 weren't really random at all. I came to a somewhat sad realization that only one application actually crashed with it, so the bisection has mostly been a waste.
I've personally been stable on 3.14.20, and according to DJ Dunn, the issue has now hit 3.14.22, so I'm gonna check 21 and 22 to see if it crashes for me as well, and this time make sure it's application independent.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #179 from Maximilian Böhm winlux@gmail.com --- Just want to remind you that there is a Mesa connection somehow. Either it's a kernel call only later Mesa versions implement or it's a Mesa issue – I'm stable for *months* now on Linux 3.16/3.17 and this downgraded packages on Arch Linux with a Radeon HD 7770: ati-dri-10.1.4-1-x86_64.pkg.tar.xz clang-3.4.1-2-x86_64.pkg.tar.xz lib32-llvm-libs-3.4.1-1-x86_64.pkg.tar.xz lib32-mesa-10.1.4-1-x86_64.pkg.tar.xz lib32-mesa-libgl-10.1.4-1-x86_64.pkg.tar.xz llvm-3.4.1-2-x86_64.pkg.tar.xz llvm-libs-3.4.1-2-x86_64.pkg.tar.xz mesa-10.1.4-1-x86_64.pkg.tar.xz mesa-demos-8.1.0-2-x86_64.pkg.tar.xz mesa-libgl-10.1.4-1-x86_64.pkg.tar.xz
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #180 from agapito tcxjy@vomoto.com --- I know it's early to say this but 3.18 rc2 solved this bug for me.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #181 from agapito tcxjy@vomoto.com --- After 5 hours i am still stable. I've played L4D2, unigine valley, watched vdpau content, flash videos, Google Earth, chromium with a lot of tabs, kwin effects...
HD7950 under Archlinux with 3.18 rc2 kernel and mesa 10.3.2
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #182 from Michel Dänzer michel@daenzer.net --- (In reply to Maximilian Böhm from comment #179)
Just want to remind you that there is a Mesa connection somehow.
I've seen that mentioned before, but the answer is always the same: Please bisect Mesa.
(In reply to agapito from comment #180)
I know it's early to say this but 3.18 rc2 solved this bug for me.
Can you bisect which commit in 3.18-rc2 fixed it for you?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #183 from Aaron B aaronbottegal@gmail.com --- 3.18 is still crashing for me, I doubt it is fixed.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #184 from agapito tcxjy@vomoto.com --- (In reply to Michel Dänzer from comment #182)
Can you bisect which commit in 3.18-rc2 fixed it for you?
Sorry, I do not know how to do it. But these are the changes between RC1 (still crashing) and RC2 (stable):
drm/radeon: reduce sparse false positive warnings
Revert "drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table"
Revert "drm/radeon/dpm: drop clk/voltage dependency filters for SI"
drm/radeon: initialize sadb to NULL in the audio code
drm/radeon: fix speaker allocation setup
drm/radeon: use gart memory for DMA ring tests
drm/radeon: fix vm page table block size calculation
I am not an expert, but probably: drm/radeon: use gart memory for DMA ring tests; could be the good commit.
(In reply to Aaron B from comment #183)
3.18 is still crashing for me, I doubt it is fixed.
rc2 crashed for you? After 24 hours I am still stable.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #185 from Michel Dänzer michel@daenzer.net --- (In reply to agapito from comment #184)
Can you bisect which commit in 3.18-rc2 fixed it for you?
Sorry, I do not know how to do it.
Search the web for 'git bisect howto'. One gotcha is that you'll need to run 'git bisect good' for bad kernels and vice versa, because git bisect can only isolate good -> bad transitions.
I am not an expert, but probably: drm/radeon: use gart memory for DMA ring tests; could be the good commit.
That should have no effect once the driver is initialized. None of the changes between rc1 and rc2 seem like obvious candidates.
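The swapped labels mentioned earlier in this comment can be sketched as follows. This is a toy model under stated assumptions: a linear history in which early commits crash and later ones are fixed, with the hypothetical helpers `verdict_for_fix_hunt` and `find_fixing_revision`. It is not how git itself is implemented.

```python
def verdict_for_fix_hunt(kernel_crashes):
    """Map a test result to the label to give git bisect when hunting a fix.

    git bisect looks for a good -> bad transition, so when searching for the
    commit that FIXED a problem the labels are swapped: a crashing kernel
    (old behaviour) is marked 'good', a stable (fixed) kernel is marked 'bad'.
    """
    return "good" if kernel_crashes else "bad"


def find_fixing_revision(revisions, crashes):
    """Linear-history sketch: revisions[0] crashes, revisions[-1] is fixed."""
    lo, hi = 0, len(revisions) - 1  # lo is labelled 'good', hi is labelled 'bad'
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if verdict_for_fix_hunt(crashes(revisions[mid])) == "bad":
            hi = mid  # already fixed here, so the fix is at mid or earlier
        else:
            lo = mid  # still crashing, so the fix comes later
    return revisions[hi]


# Hypothetical history: commits 0-4 crash, commit 5 onwards is fixed.
fixing = find_fixing_revision(list(range(10)), lambda rev: rev < 5)
```

With the labels reversed, the "first bad commit" reported at the end is really the first commit at which the crash no longer happens.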
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #186 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to agapito from comment #184)
(In reply to Michel Dänzer from comment #182)
Can you bisect which commit in 3.18-rc2 fixed it for you?
Sorry, I do not know how to do it. But these are the changes between RC1 (still crashing) and RC2 (stable):
drm/radeon: reduce sparse false positive warnings Revert "drm/radeon: drop btc_get_max_clock_from_voltage_dependency_table" Revert "drm/radeon/dpm: drop clk/voltage dependency filters for SI" drm/radeon: initialize sadb to NULL in the audio code drm/radeon: fix speaker allocation setup drm/radeon: use gart memory for DMA ring tests drm/radeon: fix vm page table block size calculation
I am not an expert, but probably: drm/radeon: use gart memory for DMA ring tests; could be the good commit.
(In reply to Aaron B from comment #183)
3.18 is still crashing for me, I doubt it is fixed.
rc2 crashed for you? After 24 hours I am still stable.
https://wiki.ubuntu.com/Kernel/KernelBisection This guide helped me. Might help you too.
In short, you just have to clone the Linux repository, run "git bisect start <BAD> <GOOD>", then compile the kernel and test it. If it crashes, then run "git bisect bad", and recompile. If you think you've tested it long enough and the version is stable, then run "git bisect good", and recompile. Continue to do so until no revisions are left to be tested
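The loop described above can be written down as a small decision helper. It is only a sketch: `min_stable_hours` is a hypothetical threshold for how long a build has to survive before it is trusted, the tag names are assumed to follow the usual kernel naming, and the functions only build the git command lines without running them.

```python
def start_command(bad, good):
    """Command line that starts a bisection between a bad and a good revision."""
    return ["git", "bisect", "start", bad, good]


def bisect_command(crashed, hours_tested, min_stable_hours):
    """Pick the git bisect subcommand for one tested kernel build.

    Returns None when the build has neither crashed nor run long enough,
    meaning: keep testing instead of guessing.
    """
    if crashed:
        return ["git", "bisect", "bad"]
    if hours_tested >= min_stable_hours:
        return ["git", "bisect", "good"]
    return None


# Assumed tag names for the two release candidates discussed in this thread.
start = start_command("v3.15-rc3", "v3.15-rc2")
after_crash = bisect_command(crashed=True, hours_tested=2, min_stable_hours=72)
after_three_days = bisect_command(crashed=False, hours_tested=72, min_stable_hours=72)
too_early = bisect_command(crashed=False, hours_tested=18, min_stable_hours=72)
```

The `None` case is there because of the warning earlier in this report: a build that merely has not crashed yet should not be marked good too early, or the bisection will end up at the wrong commit.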
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #187 from Daniel Kozak kozzi11@gmail.com --- After reinstall my arch workstation, I am unable to reproduce this issue anymore. Even with same mesa, linux, and linux-firmware versions as before.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #188 from Aaron B aaronbottegal@gmail.com --- It's random for a reason, it acts like a buffer over run or leak or something that isn't easily produced, as it changes how often it happens every installed package. But I've had worse luck on 3.18-rc2, it's just my install is more prone to it this build where it seems others haven't crashed yet, but give it time. Make sure you run HTML5 youtube for an hour or so. ;) :)
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #189 from agapito tcxjy@vomoto.com --- This is strange because 5 days ago i was starting to use intel graphic card because i had a lot of lock-ups with 3.17.1 and 3.18 rc1 kernels. When 3.18 rc2 was launched i returned to radeon driver and this bug disappeared under 3.18 rc2, but now i am using 3.17.1 and it seems stable... Maybe this bug is a mesa problem, and not a kernel problem. Mesa 10.3.2 arrived to archlinux 3 days ago.
Changes from 10.3.1 to 10.3.2:
Brian Paul (3):
  mesa: fix spurious wglGetProcAddress / GL_INVALID_OPERATION error
  st/wgl: add WINAPI qualifiers on wgl function typedefs
  glsl: fix several use-after-free bugs
Daniel Manjarres (1):
  glx: Fix glxUseXFont for glxWindow and glxPixmaps
Dave Airlie (1):
  mesa: fix GetTexImage for 1D array depth textures
Emil Velikov (3):
  docs: Add sha256 sums for the 10.3.1 release
  Update VERSION to 10.3.2
  Add release notes for the 10.3.2 release
Ilia Mirkin (4):
  gm107/ir: add dnz emission for fmul
  gk110/ir: add dnz flag emission for fmul/fmad
  nouveau: 3d textures are unsupported, limit 3d levels to 1
  st/gbm: fix order of arguments passed to is_format_supported
Kenneth Graunke (3):
  i965: Add a BRW_MOCS_PTE #define.
  i965: Use BDW_MOCS_PTE for renderbuffers.
  i965: Fix register write checks.
Marek Olšák (2):
  st/mesa: use pipe_sampler_view_release for releasing sampler views
  glsl_to_tgsi: fix the value of gl_FrontFacing with native integers
Michel Dänzer (4):
  radeonsi: Clear sampler view flags when binding a buffer
  r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers
  winsys/radeon: Use separate caching buffer manager for each set of flags
  r600g: Drop references to destroyed blend state
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #190 from Daniel Kozak kozzi11@gmail.com --- (In reply to agapito from comment #189)
This is strange because 5 days ago i was starting to use intel graphic card because i had a lot of lock-ups with 3.17.1 and 3.18 rc1 kernels. When 3.18 rc2 was launched i returned to radeon driver and this bug disappeared under 3.18 rc2, but now i am using 3.17.1 and it seems stable... Maybe this bug is a mesa problem, and not a kernel problem. Mesa 10.3.2 arrived to archlinux 3 days ago.
I don't think so. I tried to downgrade mesa, linux-firmware and lots of other packages, but even with vdpau vlc, HTML5 YouTube videos or flash videos I am unable to freeze my system again (I really tried hard all day). It must be some HW problem or some weird HW state or something completely different.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #191 from agapito tcxjy@vomoto.com --- 3.17.1 still affected :S I had a crash just 5 minutes ago.
Well, i will use 3.18 rc2 because i didn't have any crash yet.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #192 from Aaron B aaronbottegal@gmail.com --- I guess it is possible there are many different crash types. I'm still crashing left and right. Is everyone else still stable? If so, looks like I'll leave you guys here alone to mark your problem fixed....and find which one I need to be living in for bug reports again. :)
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #193 from Alex Deucher agd5f@yahoo.com --- (In reply to Aaron B from comment #192)
I guess it is possible there are many different crash types. I'm still crashing left and right. Is everyone else still stable? If so, looks like I'll leave you guys here alone to mark your problem fixed....and find which one I need to be living in for bug reports again. :)
Unfortunately, this bug has become a dumping ground for any kind of stability issue with radeonsi so I'm not really sure how useful it is anymore. I suspect there are actually multiple issues that are now all mixed up.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #194 from Marti Raudsepp marti@juffo.org --- (In reply to Alex Deucher from comment #193)
Unfortunately, this bug has become a dumping ground for any kind of stability issue with radeonsi so I'm not really sure how useful it is anymore. I suspect there are actually multiple issues that are now all mixed up.
What's the way forward? Shouldn't it be up to the developers to try and make sense of the reports and split up the bug entry appropriately?
Should there be one report per affected user, or is there a better way to group them together?
Christian König from comment #175 made one suggestion, is that what we should be doing?
In general you can split the issues into two categories one is with VM faults in the logs and the other ones are without.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #195 from Aaron B aaronbottegal@gmail.com --- Assuming my issue is separate, and your fixes were fixed, ever since my issue was the only issue pertaining to random crashes, mostly by video players/web browsers, it seemed the AMD guys never could reproduce it. After some time with nobody else on the bug report, a few appeared with the same problem, and reported the exact same results. Somewhere after that though, I think there were more "Random crashes" bugs. I'd bet the ones who joined in a little later have the same bug as me, the more current random crashes are probably not the same though.
Maybe we should kill these reports, and make a couple with titles more appropriate, funnel people there, and start over. :)
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #196 from Alex Deucher agd5f@yahoo.com --- There are other bug reports related to stability issues specifically with chrome, firefox, and video playback in certain cases which may not be related. Those bugs may be better fits depending on the exact nature of the issue you are seeing.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #197 from agapito tcxjy@vomoto.com --- After 2 days,3.18 rc2 crashed... Arggg this bug is crazy.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #198 from agapito tcxjy@vomoto.com --- 2 days without crashes... Now 2 crashes in 5 minutes. I will start using my intel graphic card again.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #199 from Jacob jacobsvenningsen15@hotmail.com --- (In reply to agapito from comment #198)
2 days without crashes... Now 2 crashes in 5 minutes. I will start using my intel graphic card again.
Do you know what exactly you're doing when the crashes occur?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #200 from darkbasic darkbasic@linuxsystems.it --- I had the very same behaviour with any 3.17+ kernel: sometimes it doesn't crashes for days, others it crashes multiple times per minute. It doesn't matter what you do, it just crashes (even starting your desktop environment is enough sometimes). I will probably buy a Nvidia GTX 970 while waiting for the new unified driver, then I will try the open source path once again: hopefully having the proprietary driver using the very same kernel code will force AMD to take stability into higher consideration.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #201 from Aaron B aaronbottegal@gmail.com --- Not that it is very prominent, but I also plan on switching to a 780 Ti or similar, if the AMD guys can show their management how many people are not only going to hurt the company, but support the competition it might show AMD it's worth it to get you guys more help over all.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #202 from farmboy0+freedesktop@googlemail.com --- I am declaring kernel 3.18-rc2 preliminary stable for me again. My card is an HD 7750 Pro Cape Verde. I am using 3.18-rc2 with the lower-cased firmware for verde + TAHITI_uvd. Mesa and Llvm is from recent git.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #203 from agapito tcxjy@vomoto.com --- (In reply to Jacob from comment #199)
(In reply to agapito from comment #198)
2 days without crashes... Now 2 crashes in 5 minutes. I will start using my intel graphic card again.
Do you know what exactly you're doing when the crashes occur?
Yeah, on both occasions, I was trying to write a message in a forum. (Chromium)
(In reply to farmboy0+freedesktop from comment #202)
I am declaring kernel 3.18-rc2 preliminary stable for me again. My card is an HD 7750 Pro Cape Verde. I am using 3.18-rc2 with the lower-cased firmware for verde + TAHITI_uvd. Mesa and Llvm is from recent git.
I thought 3.18-rc2 was stable, but it is not...
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #204 from Alex Deucher agd5f@yahoo.com --- (In reply to agapito from comment #203)
(In reply to Jacob from comment #199)
(In reply to agapito from comment #198)
2 days without crashes... Now 2 crashes in 5 minutes. I will start using my intel graphic card again.
Do you know what exactly you're doing when the crashes occur?
Yeah, on both occasions, I was trying to write a message in a forum. (Chromium)
If it happens mostly with chromium, it may be bug 81644. When you say crash, what do you mean? Segfault? System hang? GPU hang? GPU page fault? Something else?
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #205 from agapito tcxjy@vomoto.com --- (In reply to Alex Deucher from comment #204)
crash what do you mean? Segfault? System hang? GPU hang? GPU page fault? Something else?
My bug is not Chromium related. 4 months ago my browser was Firefox and I had the same bug. The behaviour is always the same. Sometimes it happens with videos, or games, or flash content... It's totally random.
I think it happens often when I click anywhere or resize a window with vdpau content. Then my system freezes for 5 seconds (I can move the mouse, but the windows or programs are not responding). After 5 seconds, my screen shows garbage like this: https://bugs.freedesktop.org/attachment.cgi?id=101226 or my monitor turns off completely. Sometimes I can reboot with reisub, sometimes I need a hard reset. Some months ago I posted a picture of my dmesg output: https://bugs.freedesktop.org/attachment.cgi?id=104145
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #206 from Aaron B aaronbottegal@gmail.com --- I believe Firefox and Chromium both suffer from the same issue myself, we've been treating it that way at least, and Firefox users have never reported any changes different with patches and updates, so I believe the same issue is being talked about as with Chromium. I think it has to do with video being sent to the GPU at all, which with RadeonSI and any modern browsers, any accelerated browser probably will have the same problems.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #207 from Michel Dänzer michel@daenzer.net --- (In reply to Marti Raudsepp from comment #194)
Shouldn't it be up to the developers to try and make sense of the reports and split up the bug entry appropriately?
We are doing that all the time. However, users tend to focus too much on some symptom(s) they have as well and ignore any differences. It's understandable, but unfortunate.
Should there be one report per affected user, or is there a better way to group them together?
I can't think of anything better than that. In general, it's much better to track things separately. Once several unrelated issues are mixed up in a single report, it's very hard to untangle and keep track of it.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #208 from Marti Raudsepp marti@juffo.org --- (In reply to Michel Dänzer from comment #207)
In general, it's much better to track things separately. Once several unrelated issues are mixed up in a single report, it's very hard to untangle and keep track of it.
So *TELL* users that clearly, to make individual bug reports. Close this bug if necessary. Direct your users instead of going "oh well, users can't report bugs and we can't do anything about it".
When I was beginning to see these issues, I asked around in #radeon whether I should report a new bug, and I was told to see this bug instead. Of course that was another misled user.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #209 from Michel Dänzer michel@daenzer.net --- (In reply to Marti Raudsepp from comment #208)
So *TELL* users that clearly, to make individual bug reports.
What do you think I'm doing what feels like every day? :}
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #210 from Michel Dänzer michel@daenzer.net --- One thing I find interesting is that only Southern Islands seems affected. At least I can't see any mentions of Bonaire, Kaveri, Kabini or Hawaii being affected in this report or other related ones.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #211 from Jacob jacobsvenningsen15@hotmail.com --- I looked back through kern.log and found the last time I encountered a crash, which happened to be with kernel 3.15.10 from the Ubuntu repo. That crash seemed to have been caused by dpm: [drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed
The 3.15.10 version isn't part of the source, so I instead looked through the list of changes and found that the last set of changes made to drm landed in kernel 3.16-rc6.
So I installed the rc6 image and have now been running it for some time, but ran into yet another issue, which is unrelated to this bug.
[ 6533.114483] alloc_contig_range test_pages_isolated(1bc800, 1bca8c) failed
[ 6533.114492] alloc_contig_range test_pages_isolated(1bc800, 1bca8d) failed
[ 6533.114500] alloc_contig_range test_pages_isolated(1bc800, 1bca8e) failed
[ 6533.114506] alloc_contig_range test_pages_isolated(1bc800, 1bca8f) failed
[ 6533.114511] alloc_contig_range test_pages_isolated(1bc800, 1bca90) failed
[ 6533.114516] alloc_contig_range test_pages_isolated(1bc800, 1bca91) failed
And so on. It pretty much causes frequent 2-5 second hangs, even while I'm writing this message. Just moving the cursor causes such a hang. When I change workspace, the hang lasts 40 seconds. In other words, bisecting the dpm issue would be very difficult, since I would more than likely run into this issue as well.
The last changes made to drm/radeon/dpm were merged into 3.16-rc1, and they only affect SI hardware.
Suppose this issue could be moved to another bug entry, if it hasn't already been fixed.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #212 from Marti Raudsepp marti@juffo.org --- (In reply to Jacob from comment #211)
I looked back through kern.log and found the last time I encountered a crash, which happened to be with kernel 3.15.10 from the Ubuntu repo. That crash seemed to have been caused by dpm: [drm:si_dpm_set_power_state] *ERROR* si_set_sw_state failed
Jacob, please report a separate bug about your symptoms.
(In reply to Michel Dänzer from comment #210)
One thing I find interesting is that only Southern Islands seems affected.
And Pitcairn too.
(In reply to Michel Dänzer from comment #207)
In general, it's much better to track things separately.
Should the two bugs resolved as "duplicate" be de-duplicated then? Bug 80141 and Bug 82886.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #213 from Michel Dänzer michel@daenzer.net --- (In reply to Marti Raudsepp from comment #212)
(In reply to Michel Dänzer from comment #210)
One thing I find interesting is that only Southern Islands seems affected.
And Pitcairn too.
Pitcairn is Southern Islands, just like Cape Verde and Tahiti.
(In reply to Michel Dänzer from comment #207)
In general, it's much better to track things separately.
Should the two bugs resolved as "duplicate" be de-duplicated then? Bug 80141 and Bug 82886.
I reopened 82886, but Aaron already has his own report (about Chromium, but it may or may not turn out to be the same problem).
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #214 from Aaron B aaronbottegal@gmail.com --- I'll duplicate my Bugs to Bug #85647 to start over, I KNOW I have that bug at minimum. I'll stay off of other "Random RadeonSI" crash reports until we resole it there.
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #215 from initzero@gmail.com --- For me it's also Southern Islands related. Up2date Archlinux + Oland: unstable Up2date Archlinux + Kaveri: stable
https://bugs.freedesktop.org/show_bug.cgi?id=79980
--- Comment #216 from Cilyan Olowen gaknar@gmail.com --- Not sure if it is related, but I have the same log on dmesg while playing Minecraft with Radeon 6970 (Northern Island, if I'm not mistaken). Linux 3.17.1, temp sensor around 57°C, not critical.
Cilyan Olowen gaknar@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gaknar@gmail.com
--- Comment #217 from Cilyan Olowen gaknar@gmail.com --- Created attachment 108795 --> https://bugs.freedesktop.org/attachment.cgi?id=108795&action=edit Last 300 lines of dmesg on a Radeon 6970
--- Comment #218 from Sean Rhone Espionage724@gmail.com --- Just a bit of feedback, but my 7850 seems relatively stable under Xubuntu 14.10 + 3.18rc3 + Paulo's mesa PPA.
General desktop usage and web browsing over the past week resulted in no crashes or GPU hangs, but I did have a slightly weird issue with fullscreen Flash video: once the player OSD disappeared, moving the mouse would freeze the video, but double-clicking to leave fullscreen worked fine and it played back normally.
While watching a fullscreen video through Plex's web interface (I think videos play back with HTML5?), I had a couple of GPU hangs (right term?) and restarts, but they were really quick (black screen for about 2 seconds, then everything was restored as if nothing had happened). If I recall correctly, there were about 2 or 3 hangs over a 37-minute period.
I'm using Google Chrome (not Chromium) 40.0.2202.3 dev (64-bit) with --ignore-gpu-blacklist enabled. Just checking chrome://gpu, I noticed:
Log Messages
[2523:2523:1104/211347:ERROR:sandbox_linux.cc(301)] : InitializeSandbox() called with multiple threads in process gpu-process
[2523:2523:1104/211636:WARNING:x11_util.cc(1490)] : X error received: serial 59083, error_code 3 (BadWindow (invalid Window parameter)), request_code 4, minor_code 0 (X_DestroyWindow)
[2523:2523:1104/221802:ERROR:gpu_video_decode_accelerator.cc(299)] : Not implemented reached in void content::GpuVideoDecodeAccelerator::Initialize(const media::VideoCodecProfile, IPC::Message *)HW video decode acceleration not available.
[2523:2529:1104/225251:ERROR:gpu_watchdog_thread.cc(253)] : The GPU process hung. Terminating after 10000 ms.
GpuProcessHostUIShim: The GPU process crashed!
GpuProcessHostUIShim: The GPU process crashed!
--- Comment #219 from Michel Dänzer michel@daenzer.net --- Might be worth trying the Mesa patches I attached to bug 85647.
--- Comment #220 from agapito tcxjy@vomoto.com --- Since the Mesa 10.3.4 update I don't have this bug anymore on Arch Linux. I've been "stable" for more than two weeks.
http://www.mesa3d.org/relnotes/10.3.4.html
--- Comment #221 from darkbasic darkbasic@linuxsystems.it --- Of course since 'radeonsi: Disable asynchronous DMA except for PIPE_BUFFER' the vast majority of crashes disappeared.
--- Comment #222 from fdb4c415@opayq.com --- I would like to report that I have the same problem. My box freezes randomly. The keyboard is almost dead, the mouse sometimes works (the pointer moves, but I cannot click on anything) and usually I can only ssh into the box to reboot it. Sometimes Ctrl-Alt-F1 works and I can log in, but sometimes not. The monitor switches repeatedly: no signal/black screen/no signal/black screen/... In the log I can see quite similar messages:
radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000002258 last fence id 0x0000000000002255 on ring 0)
VM fault (0x04, vmid 1) at page 30481, read from DMA1 (61)
I have the latest Debian testing (64-bit) installed:
Linux <host> 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt2-1 (2014-12-08) x86_64 GNU/Linux
and I have:
VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750 / R7 250E]
Is there any way to fix it?
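For anyone trying to compare logs across the comments here, a minimal Python sketch that pulls the fields out of a "VM fault" line may help. The pattern is inferred only from the lines quoted in this report, so other kernel versions may print something different; the function name `parse_vm_fault` is hypothetical, and the "write" alternative is an assumption (only "read" appears in this report).

```python
import re

# Pattern inferred only from the "VM fault" lines quoted in this bug report.
# The "write" alternative is an assumption; only "read" appears here.
VM_FAULT_RE = re.compile(
    r"VM fault \((?P<prot>0x[0-9a-fA-F]+), vmid (?P<vmid>\d+)\) "
    r"at page (?P<page>\d+), (?P<access>read|write) from "
    r"(?P<client>.+?) \((?P<mc_id>\d+)\)\s*$"
)


def parse_vm_fault(line):
    """Return the fields of a 'VM fault' dmesg line as a dict, or None."""
    m = VM_FAULT_RE.search(line)
    if m is None:
        return None
    return {
        "protections": int(m.group("prot"), 16),  # printed in hex
        "vmid": int(m.group("vmid")),
        "page": int(m.group("page")),             # printed in decimal
        "access": m.group("access"),
        "client": m.group("client"),              # memory client name as printed
        "mc_id": int(m.group("mc_id")),
    }


if __name__ == "__main__":
    sample = "VM fault (0x04, vmid 1) at page 30481, read from DMA1 (61)"
    print(parse_vm_fault(sample))
```

Grouping reports by `client` and `mc_id` is one rough way to tell apart the different issues mixed together in this bug, although matching fields do not prove a common cause.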
--- Comment #223 from Michel Dänzer michel@daenzer.net --- (In reply to fdb4c415 from comment #222)
VM fault (0x04, vmid 1) at page 30481, read from DMA1 (61)
I have the latest Debian testing (64-bit) installed:
There's a good chance that a newer upstream version of Mesa would help with your problem, if not fix it completely.
For those still having problems, the kernel patches http://lists.freedesktop.org/archives/dri-devel/2015-January/074968.html and http://lists.freedesktop.org/archives/dri-devel/2015-January/074969.html might be worth a try.
--- Comment #224 from Gedalya gedalya@gedalya.net --- Filed debian bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=774784
Might need to file a separate bug for the linux package.
--- Comment #225 from Tilman Sauerbeck tilman@code-monkey.de --- (In reply to Michel Dänzer from comment #223)
For those still having problems, the kernel patches http://lists.freedesktop.org/archives/dri-devel/2015-January/074968.html and http://lists.freedesktop.org/archives/dri-devel/2015-January/074969.html might be worth a try.
I applied http://lists.freedesktop.org/archives/dri-devel/2015-January/074969.html on top of kernel 3.18.4, and got:
radeon 0000:01:00.0: GPU fault detected: 146 0x0008080c
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
radeon 0000:01:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0800800C
VM fault (0x0c, vmid 4) at page 0, read from 'TC0' (0x54433000) (8)
(followed by an unsuccessful attempt to unwedge the GPU, but I guess the lines above are what's really interesting).
This is with Mesa built from 8d2542fc9d5af4db355b67cc2a1ff2f413685a27 on a bonaire xtx.
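A side note on the 'TC0' (0x54433000) part of the fault line: the hex value looks like the block name itself packed as ASCII bytes. The sketch below assumes a big-endian, NUL-padded 4-byte tag, which is only what this one sample suggests; `block_name` is a hypothetical helper, not a driver function.

```python
def block_name(packed: int) -> str:
    """Unpack a 32-bit value into an ASCII tag.

    Assumption: the tag is stored big-endian and padded with NUL bytes,
    as the single sample in this comment suggests.
    """
    raw = packed.to_bytes(4, "big")
    return raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    print(block_name(0x54433000))
```

So the quoted name and the hex number in that log line carry the same information.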
--- Comment #226 from Tilman Sauerbeck tilman@code-monkey.de --- (In reply to Tilman Sauerbeck from comment #225)
(In reply to Michel Dänzer from comment #223)
For those still having problems, the kernel patches http://lists.freedesktop.org/archives/dri-devel/2015-January/074968.html and http://lists.freedesktop.org/archives/dri-devel/2015-January/074969.html might be worth a try.
I applied http://lists.freedesktop.org/archives/dri-devel/2015-January/074969.html on top of kernel 3.18.4, and got:
Oops, I tested with 3.18.2 (the latest stable release as of today).
--- Comment #227 from Andrew maaniv@gmail.com --- The monitor switches repeatedly: no signal/black screen/no signal/black screen/... In StarCraft 2 under Wine with Gallium Nine this is 100% reproducible (7 of 7 launches). Sorry for my English.
--- Comment #228 from Michel Dänzer michel@daenzer.net --- (In reply to Andrew from comment #227)
The monitor switches repeatedly: no signal/black screen/no signal/black screen/... In StarCraft 2 under Wine with Gallium Nine this is 100% reproducible (7 of 7 launches).
That's not a random crash but a reproducible one, probably a Mesa bug. Please file a separate report for that.
--- Comment #229 from Andrew maaniv@gmail.com --- My dmesg output is similar to the attachments on this bug. Do I need to create a new bug in this case?
--- Comment #230 from Liss lissamour@gmail.com --- Looks like I have a similar issue with a Radeon 8850M. I already filed bug 88364, but I'm not sure whether I should mark it as a duplicate, because I'm not sure it is the same problem.
--- Comment #231 from Marti Raudsepp marti@juffo.org --- (In reply to Liss from comment #230)
Looks like I have a similar issue with a Radeon 8850M. I already filed bug 88364, but I'm not sure whether I should mark it as a duplicate
(In reply to Andrew from comment #229)
My dmesg output is similar to the attachments on this bug. Do I need to create a new bug in this case?
Please report all issues you have as separate bugs! This bug mixes together multiple issues and symptoms, so it's almost useless.
--- Comment #232 from Morgan Jones integ3rs@gmail.com --- Same symptoms with a Hawaii device (R9 290X).
dmesg:
[20174.016203] Watchdog[15659]: segfault at 0 ip 00007fa1902fbb0b sp 00007fa17a6dd560 error 6 in chromium[7fa18c040000+6497000]
lspci:
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii XT [Radeon R9 290X]
--- Comment #233 from Morgan Jones integ3rs@gmail.com --- Also, it's worth noting that my crashes are pretty reproducible when running Chromium without --disable-gpu if compton is running. Disabled compton and haven't had any so far.
--- Comment #234 from Marti Raudsepp marti@juffo.org --- (In reply to Morgan Jones from comment #232)
Same symptoms with a Hawaii device (R9 290X).
There are no "same symptoms" in this bug report, it's a mix of multiple different symptoms and issues. Please report a new bug for your problem.
--- Comment #235 from Tom Guder kontakt@ib-guder.de --- Hello,
I get random freezes only in Dota 2. Other OpenGL applications run well.
Archlinux, 3.18.6-1-ARCH #1 SMP PREEMPT Sat Feb 7 08:44:05 CET 2015 x86_64 GNU/Linux
[11008.894953] radeon 0000:02:00.0: GPU fault detected: 147 0x000c4402
[11008.894956] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00100000
[11008.894958] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C044002
[11008.894959] VM fault (0x02, vmid 6) at page 1048576, read from TC (68)
[11008.894961] radeon 0000:02:00.0: GPU fault detected: 147 0x058c4801
[11008.894962] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x000AA85A
[11008.894963] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0C0C8002
[11008.894964] VM fault (0x02, vmid 6) at page 698458, read from TC (200)
[11019.062287] radeon 0000:02:00.0: ring 4 stalled for more than 10000msec
[11019.062291] radeon 0000:02:00.0: GPU lockup (current fence id 0x0000000000013609 last fence id 0x000000000001360a on ring 4)
[11019.062312] radeon 0000:02:00.0: failed to get a new IB (-35)
[11019.062315] [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
[11019.556350] radeon 0000:02:00.0: Saved 780 dwords of commands on ring 0.
. . .
--- Comment #236 from Tom Guder kontakt@ib-guder.de --- Same with kernel 3.14.35-1-lts. Dota 2 crashes every time within one minute of spectating a game, freezing the screen and keyboard. Networking still works.
Bests Tom
[ 129.619475] radeon 0000:02:00.0: GPU fault detected: 147 0x000a4401
[ 129.619479] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x01000000
[ 129.619480] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0A044001
[ 129.619482] VM fault (0x01, vmid 5) at page 16777216, read from TC (68)
[ 129.619484] radeon 0000:02:00.0: GPU fault detected: 146 0x020a440c
[ 129.619485] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_ADDR 0x00000000
[ 129.619486] radeon 0000:02:00.0: VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
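The "VM fault" summary line seems to be derived from the two register dumps printed just before it. A rough Python sketch of that relationship follows; the bit positions are an assumption inferred from the STATUS values and summary lines quoted in this report (they reproduce the printed vmid, protections and client id), not taken from a register reference, and `decode_fault_status` is a hypothetical name.

```python
# ASSUMPTION: field positions inferred from log lines quoted in this bug
# report, not from hardware documentation. Other bits are not decoded.
PROTECTIONS_MASK = 0xF   # bits 0-3: first number in "VM fault (0x.., ...)"
MC_ID_SHIFT = 12         # bits 12-19: number in parentheses at end of line
MC_ID_MASK = 0xFF
VMID_SHIFT = 25          # bits 25-28: "vmid N"
VMID_MASK = 0xF


def decode_fault_status(status: int) -> dict:
    """Split a VM_CONTEXT1_PROTECTION_FAULT_STATUS value into three fields."""
    return {
        "protections": status & PROTECTIONS_MASK,
        "mc_id": (status >> MC_ID_SHIFT) & MC_ID_MASK,
        "vmid": (status >> VMID_SHIFT) & VMID_MASK,
    }


if __name__ == "__main__":
    # STATUS values copied from the logs in comments 225, 235 and 236.
    for value in (0x0800800C, 0x0C044002, 0x0C0C8002, 0x0A044001):
        print(hex(value), decode_fault_status(value))
```

The "at page" number is the FAULT_ADDR value printed in decimal; for instance 0x00100000 is 1048576, as in comment 235. A STATUS of 0x00000000, as in the last line above, decodes to all zeros and carries no fault information under this reading.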
--- Comment #237 from Marti Raudsepp marti@juffo.org --- (In reply to Tom Guder from comment #235)
I get random freezes only in Dota 2. Other OpenGL applications run well.
Please report a separate bug for your exact circumstances. This one is being ignored by developers because there are many different causes.
--- Comment #238 from agapito tcxjy@vomoto.com --- I had this bug again :S
Using KDE 5, I had 2 crashes when I changed the animation speed in systemsettings5, under screens and monitor, compositor options.
My system is Arch Linux 64-bit, kernel 3.19.3 and Mesa 10.5.2. I can't provide any dmesg output or log because my system freezes completely.
--- Comment #239 from Marti Raudsepp marti@juffo.org --- (In reply to agapito from comment #238)
I had this bug again :S
There is no "this bug". Please report a separate bug for your exact circumstances. This one is being ignored by developers because there are many different causes.
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #240 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/506.