https://bugs.freedesktop.org/show_bug.cgi?id=107950
Bug ID: 107950
Summary: Delayed freeze with DRI_PRIME=1 on Topaz
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: nmset@netcourrier.com
Host: laptop with Kaveri iGPU and Topaz dGPU
Kernel: 4.18.4
Xorg: 1.20.1
Mesa: 18.2.0
When running 'DRI_PRIME=1 glmark2', the system hangs after about 60 seconds. I have to force a reboot with the power button or magic SysRq; the latter may not completely power off the laptop.
The iGPU is driven by the radeon module, and the dGPU by amdgpu. The result is the same with no module options and with the following options: amdgpu cik_support=0 si_support=1; radeon cik_support=1 si_support=0.
I can't say it started with 4.18.4. It's observed on 4.19-rc2/3 also. This never happened with older kernels.
No such event occurs when using the iGPU.
I cannot bisect, because the last crash badly corrupted the home partition, and my home directory simply vanished after fsck recreated the ext4 journal. I could recover from backup fortunately.
Maybe it's not related to amdgpu, but rather to Xorg, Mesa or something else. I am reporting it here in case amdgpu is at fault in this offloading context.
Regards.
--- Comment #1 from Alex Deucher alexdeucher@gmail.com ---
Can you attach the xorg log and dmesg output from your system?
--- Comment #2 from SET nmset@netcourrier.com ---
Created attachment 141623
  --> https://bugs.freedesktop.org/attachment.cgi?id=141623&action=edit
Xorg log
--- Comment #3 from SET nmset@netcourrier.com ---
Created attachment 141624
  --> https://bugs.freedesktop.org/attachment.cgi?id=141624&action=edit
dmesg after reboot
--- Comment #4 from SET nmset@netcourrier.com ---
Created attachment 141625
  --> https://bugs.freedesktop.org/attachment.cgi?id=141625&action=edit
kernel log during problems
--- Comment #5 from SET nmset@netcourrier.com ---
Please see attachments.
Michel Dänzer michel@daenzer.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #141623|text/x-log                  |text/plain
          mime type|                            |
Michel Dänzer michel@daenzer.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #141625|text/x-log                  |text/plain
          mime type|                            |
--- Comment #6 from Michel Dänzer michel@daenzer.net ---
Does updating xf86-video-ati to 18.1.0 or using EXA instead of glamor help by any chance? Your system is affected by bug 105381.
--- Comment #7 from SET nmset@netcourrier.com ---
With EXA, the sddm login screen does not show up.
xf86-video-ati 18.1.0 is in the testing branch of the Arch repositories. I will test it when it becomes available as stable.
--- Comment #8 from SET nmset@netcourrier.com ---
(In reply to Michel Dänzer from comment #6)
Tried with xf86-video-ati 18.1.0: same delayed freeze.
I think the host is overheating. The last line in kernel.log is:
Sep 19 20:44:44 hp2 kernel: [ 337.131484] amdgpu: [powerplay] GPU over temperature range detected on PCIe 0:0.0!
I was monitoring the temperature with the 'sensors' command. The last output for the amdgpu sensor was:
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: +0.82 V
fan1: N/A
temp1: +190.0°C (crit = +104000.0°C, hyst = -273.1°C)
power1: 1.04 kW (cap = 30.00 W)
Perhaps powerplay needs a fix?
Regards.
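The temp1 line quoted above can be checked mechanically. The following is a minimal sketch, not part of the original report, that parses one temperature line of 'sensors' output and lists the fields that fall outside a sanity range. The function names and the bounds (-50 °C to 150 °C) are assumptions chosen for illustration; they do not come from amdgpu, powerplay or lm-sensors.

```python
import re

# One signed decimal number followed by the °C unit, as printed by `sensors`.
_NUM = r"([+-]?\d+(?:\.\d+)?)\s*°C"

# Hypothetical sanity bounds in °C; illustrative only, not taken from any driver.
PLAUSIBLE_MIN_C = -50.0
PLAUSIBLE_MAX_C = 150.0


def parse_temp_line(line):
    """Extract reading, crit and hyst (in °C) from one `sensors` temperature line.

    Fields that are absent from the line are returned as None.
    """
    _label, _, rest = line.partition(":")
    main, _, limits = rest.partition("(")
    fields = {"reading": re.search(_NUM, main)}
    for name in ("crit", "hyst"):
        fields[name] = re.search(name + r"\s*=\s*" + _NUM, limits)
    return {k: float(m.group(1)) if m else None for k, m in fields.items()}


def implausible_fields(values):
    """Return the names of fields whose value lies outside the sanity bounds."""
    return [
        name
        for name, value in values.items()
        if value is not None
        and not (PLAUSIBLE_MIN_C <= value <= PLAUSIBLE_MAX_C)
    ]


# The temp1 line from the report above.
line = "temp1: +190.0°C (crit = +104000.0°C, hyst = -273.1°C)"
values = parse_temp_line(line)
print(values)
print(implausible_fields(values))
```

With these assumed bounds, all three fields of the reported line are flagged, which suggests the values may not be trustworthy as actual temperatures. It does not show where the bad values originate.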
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED

--- Comment #9 from Martin Peres martin.peres@free.fr ---
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/531.