https://bugs.freedesktop.org/show_bug.cgi?id=93895
Bug ID: 93895
Summary: GPU lockup on AMD A4-3400 APU when starting X server on opensource drivers. (works fine with fglrx)
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: critical
Priority: medium
Component: DRM/Radeon
Assignee: dri-devel@lists.freedesktop.org
Reporter: azari4096@gmail.com
I've had this lockup on this machine for the past few years, across several different kernel versions, different distributions, etc. Booting with KMS works fine, but the second a graphical environment starts (whether X or wayland-based), it locks up.
Booting Ubuntu with user-space mode-setting works, and from there I can install FGLRX and everything works fine. When I spoke with airlied on IRC, they suggested it could be a workaround that AMD has put into FGLRX that never made it into the open-source drivers, and that AMD might have to look into it.
CPU/GPU : A4-3400
Motherboard : GA-A75M-D2H ( http://www.gigabyte.com/products/product-page.aspx?pid=3930#ov )
journalctl log of the lockup:
------------------------------------------------------------
Jan 27 18:28:32 miku dbus-daemon[374]: Successfully activated service 'org.freedesktop.systemd1'
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10000msec
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x0000000000000001 last fence id 0x0000000000000003 on ring 0)
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: Saved 55 dwords of commands on ring 0.
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GPU softreset: 0x00000009
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS = 0xB1403828
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x28000007
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000840
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x40000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00008000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x80228643
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00000100
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS = 0x00003828
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS_SE0 = 0x00000007
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GRBM_STATUS_SE1 = 0x00000007
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: SRBM_STATUS = 0x20000040
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: SRBM_STATUS2 = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008674_CP_STALLED_STAT1 = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008678_CP_STALLED_STAT2 = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_00867C_CP_BUSY_STAT = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: GPU reset succeeded, trying to resume
Jan 27 18:28:42 miku kernel: [drm] Found smc ucode version: 0x00011100
Jan 27 18:28:42 miku kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: WB enabled
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000020000c00 and cpu addr 0xffff8800c613fc00
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000020000c0c and cpu addr 0xffff8800c613fc0c
Jan 27 18:28:42 miku kernel: radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90002432118
Jan 27 18:28:42 miku kernel: [drm] ring test on 0 succeeded in 1 usecs
Jan 27 18:28:42 miku kernel: [drm] ring test on 3 succeeded in 3 usecs
Jan 27 18:28:42 miku kernel: [drm] ring test on 5 succeeded in 1 usecs
Jan 27 18:28:42 miku kernel: [drm] UVD initialized successfully.
Jan 27 18:28:52 miku kernel: radeon 0000:00:01.0: ring 0 stalled for more than 10370msec
Jan 27 18:28:52 miku kernel: radeon 0000:00:01.0: GPU lockup (current fence id 0x0000000000000002 last fence id 0x0000000000000004 on ring 0)
Jan 27 18:28:52 miku kernel: [drm:r600_ib_test [radeon]] *ERROR* radeon: fence wait failed (-35).
Jan 27 18:28:52 miku kernel: [drm:radeon_ib_ring_tests [radeon]] *ERROR* radeon: failed testing IB on GFX ring (-35).
Jan 27 18:29:22 miku systemd[1]: Started Getty on tty2.
------------------------------------------------------------
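The log above dumps the same status registers twice, once before and once after the soft reset. A minimal Python sketch for comparing the two dumps follows. The helper names (split_dumps, changed_registers) and the regular expression are illustrative assumptions, not part of any radeon tooling, and the sketch does not interpret individual register bits.

```python
import re

# Matches register dump entries such as
# "radeon 0000:00:01.0: GRBM_STATUS = 0xB1403828" (spaces around "=" optional).
# Assumption: every dump entry has the form NAME = 0xVALUE, as in the log above.
REG_RE = re.compile(r"radeon \S+: (\w+)\s*=\s*(0x[0-9A-Fa-f]+)")

def split_dumps(lines):
    """Return (before, after): the first and second value seen per register."""
    before, after = {}, {}
    for line in lines:
        match = REG_RE.search(line)
        if not match:
            continue
        name, value = match.group(1), int(match.group(2), 16)
        if name not in before:
            before[name] = value
        elif name not in after:
            after[name] = value
    return before, after

def changed_registers(before, after):
    """Return {name: (old, new)} for registers whose value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and after[name] != before[name]
    }

# A few entries copied from the journalctl excerpt above.
SAMPLE = [
    "radeon 0000:00:01.0: GPU softreset: 0x00000009",
    "radeon 0000:00:01.0: GRBM_STATUS = 0xB1403828",
    "radeon 0000:00:01.0: R_008680_CP_STAT = 0x80228643",
    "radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57",
    "radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B",
    "radeon 0000:00:01.0: GRBM_STATUS = 0x00003828",
    "radeon 0000:00:01.0: R_008680_CP_STAT = 0x00000000",
    "radeon 0000:00:01.0: R_00D034_DMA_STATUS_REG = 0x44C83D57",
]

if __name__ == "__main__":
    before, after = split_dumps(SAMPLE)
    for name, (old, new) in changed_registers(before, after).items():
        print(f"{name}: 0x{old:08X} -> 0x{new:08X}")
```

Feeding the full journalctl output through split_dumps instead of SAMPLE should work the same way, assuming only one reset cycle is present in the input.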
--- Comment #1 from Azari azari4096@gmail.com ---
Created attachment 121337
  --> https://bugs.freedesktop.org/attachment.cgi?id=121337&action=edit
dmesg log from the machine in question.
--- Comment #2 from Azari azari4096@gmail.com ---
Created attachment 121338
  --> https://bugs.freedesktop.org/attachment.cgi?id=121338&action=edit
Xorg log file from the machine in question.
--- Comment #3 from Azari azari4096@gmail.com ---
Created attachment 121339
  --> https://bugs.freedesktop.org/attachment.cgi?id=121339&action=edit
lspci output with pci IDs for everything, from the machine in question.
I forgot to add the PCI ID for the GPU in the initial report; it's 1002:9644.
I have also added an attachment with lspci output for all the other devices, as well as attachments of the dmesg output and the Xorg log.
--- Comment #4 from Alex Deucher alexdeucher@gmail.com ---
Created attachment 121340
  --> https://bugs.freedesktop.org/attachment.cgi?id=121340&action=edit
possible fix
Does this kernel patch help?
--- Comment #5 from Azari azari4096@gmail.com ---
(In reply to Alex Deucher from comment #4)
> Created attachment 121340
> possible fix
>
> Does this kernel patch help?
I just finished compiling and testing a kernel with that patch; it didn't help, it still has the same issues. =(
Thanks for the prompt reply by the way.
--- Comment #6 from Azari azari4096@gmail.com ---
I have something new to report after doing more testing.
It seems that launching weston with the pixman backend (weston --use-pixman) works, whereas startx with 'exec twm' in .xinitrc fails on the first attempt and causes the lockup.
However, after the lockup, if I try startx again (still with twm), it suddenly works, and I can start applications in twm and use the desktop. I managed to reproduce this with Xfce as well: the first 'startxfce4' after bootup fails and locks up the GPU, and after the GPU resets, a second attempt works.
One thing of note is that when I finally do manage to get a DE started (after the GPU has locked up and reset once), glxinfo shows only "gallium on llvmpipe"; no hardware acceleration available.
So whatever is causing the lockup is something that X does at startup (even when a minimal X window manager like twm is used), but weston-pixman doesn't do.
--- Comment #7 from Alex Deucher alexdeucher@gmail.com ---
Check your xorg log and make sure acceleration is enabled.
--- Comment #8 from Azari azari4096@gmail.com ---
Created attachment 121447
  --> https://bugs.freedesktop.org/attachment.cgi?id=121447&action=edit
startx with 'exec twm' in .xinitrc right after bootup; causes lockup.
--- Comment #9 from Azari azari4096@gmail.com ---
Created attachment 121448
  --> https://bugs.freedesktop.org/attachment.cgi?id=121448&action=edit
second attempt after lockup; startx with 'exec twm' in .xinitrc suddenly works fine.
--- Comment #10 from Azari azari4096@gmail.com ---
(In reply to Alex Deucher from comment #7)
> Check your xorg log and make sure acceleration is enabled.
It seems that acceleration disables itself after the first startx attempt locks up the GPU:
[ 326.027] (--) RADEON(0): Chipset: "SUMO2" (ChipID = 0x9644)
[ 326.027] (II) RADEON(0): GPU accel disabled or not working, using shadowfb for KMS
[ 326.027] (II) Loading sub module "shadow"
[ 326.027] (II) LoadModule: "shadow"
[ 326.027] (II) Loading /usr/lib/xorg/modules/libshadow.so
[ 326.047] (II) Module shadow: vendor="X.Org Foundation"
[ 326.047]    compiled for 1.18.0, module version = 1.1.0
[ 326.047]    ABI class: X.Org ANSI C Emulation, version 0.4
[ 326.047] (II) RADEON(0): KMS Color Tiling: disabled
[ 326.047] (II) RADEON(0): KMS Color Tiling 2D: disabled
[...]
[ 326.247] (WW) RADEON(0): Direct rendering disabled
[ 326.247] (II) RADEON(0): Acceleration disabled
[...]
[ 326.252] (II) AIGLX: Screen 0 is not DRI2 capable
[ 326.252] (EE) AIGLX: reverting to software rendering
[ 326.329] (II) AIGLX: enabled GLX_MESA_copy_sub_buffer
[ 326.331] (II) AIGLX: Loaded and initialized swrast
[ 326.331] (II) GLX: Initialized DRISWRAST GL provider for screen 0
The full log is in this attachment: https://bugs.freedesktop.org/attachment.cgi?id=121448
I also uploaded another xorg log from the first attempt that causes the lockup, in case you want to see the difference between the two or something.
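For comparing the two Xorg logs, a small Python sketch along these lines could flag the lines that indicate a fallback to software rendering. The marker strings are copied from the excerpt above; other driver or server versions may word them differently, and the function name find_fallback_lines is a hypothetical helper, not part of Xorg.

```python
# Marker substrings taken from the Xorg log excerpt above.
# Assumption: a log containing any of these is running without GPU acceleration.
FALLBACK_MARKERS = (
    "GPU accel disabled or not working",
    "Direct rendering disabled",
    "Acceleration disabled",
    "AIGLX: reverting to software rendering",
)

def find_fallback_lines(log_text):
    """Return the log lines that contain one of the fallback markers."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line for marker in FALLBACK_MARKERS)
    ]

# Lines copied from the excerpt above.
SAMPLE_LOG = """\
[ 326.027] (--) RADEON(0): Chipset: "SUMO2" (ChipID = 0x9644)
[ 326.027] (II) RADEON(0): GPU accel disabled or not working, using shadowfb for KMS
[ 326.247] (WW) RADEON(0): Direct rendering disabled
[ 326.247] (II) RADEON(0): Acceleration disabled
[ 326.252] (EE) AIGLX: reverting to software rendering
"""

if __name__ == "__main__":
    for hit in find_fallback_lines(SAMPLE_LOG):
        print(hit)
```

Running the same function over the text of each attached log would show whether the first attempt starts with acceleration enabled and only the second falls back.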
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #11 from Martin Peres martin.peres@free.fr ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/694.