https://bugs.freedesktop.org/show_bug.cgi?id=103370
Bug ID: 103370
Summary: `DRI_PRIME=1 glxgears -info` halts the system with Intel Graphics [8086:5917] + AMD Graphics [1002:6665].
Product: DRI
Version: XOrg git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Radeon
Assignee: dri-devel@lists.freedesktop.org
Reporter: fourdollars@gmail.com
When I run the tests `DRI_PRIME=1 glxgears -info` and `DRI_PRIME=0 glxgears -info` with AC power plugged in, the system halts and is then forced to shut down automatically. I tried mainline kernels from 4.10rc7 to v4.14rc5 and they all have the same problem.
--- Comment #1 from Michel Dänzer michel@daenzer.net --- Please attach the corresponding dmesg output.
(In reply to Shih-Yuan Lee from comment #0)
> While I am doing the tests with AC plugged in by `DRI_PRIME=1 glxgears -info` and `DRI_PRIME=0 glxgears -info`, the system halts and then is forced to shutdown automatically.
To clarify: `DRI_PRIME=1 glxgears` works, and the problem only occurs with `DRI_PRIME=0 glxgears`?
--- Comment #2 from Shih-Yuan Lee fourdollars@gmail.com --- If I execute the command in battery mode, it doesn't halt the system.
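Since the halt depends on the power source, it may help to record whether the machine is on AC before each test run. The following is a minimal sketch, assuming the kernel exposes power supplies under /sys/class/power_supply with `type` and `online` attributes. The helper names `mains_online` and `read_power_supplies` are hypothetical and do not come from this report. Some adapters (for example USB-C chargers) may report a type other than `Mains`, which this sketch does not handle.

```python
from pathlib import Path


def mains_online(supplies):
    """Return True if any supply of type 'Mains' reports online == '1'.

    `supplies` is an iterable of (type, online) string pairs.
    """
    return any(kind == "Mains" and online == "1" for kind, online in supplies)


def read_power_supplies(root="/sys/class/power_supply"):
    """Read (type, online) pairs from sysfs.

    Entries without an 'online' file (typically batteries) get ''.
    The default path is the usual sysfs location, assumed here.
    """
    pairs = []
    base = Path(root)
    if not base.is_dir():
        return pairs
    for entry in sorted(base.iterdir()):
        type_file = entry / "type"
        online_file = entry / "online"
        if not type_file.is_file():
            continue
        kind = type_file.read_text().strip()
        online = online_file.read_text().strip() if online_file.is_file() else ""
        pairs.append((kind, online))
    return pairs
```

Calling `mains_online(read_power_supplies())` before a run gives a simple AC/battery flag to log next to the test result.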
--- Comment #3 from Shih-Yuan Lee fourdollars@gmail.com --- Using DRI_PRIME=0 is just to switch between Intel and AMD Graphics. Of course, we can omit it completely.
The system halt issue happens when executing `DRI_PRIME=1 glxgears -info`.
u@u:~$ DRI_PRIME=1 glxgears -info
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
GL_RENDERER = Gallium 0.4 on AMD HAINAN (DRM 2.46.0, LLVM 3.8.0)
GL_VERSION = 3.0 Mesa 11.2.0
GL_VENDOR = X.Org
...
u@u:~$ DRI_PRIME=0 glxgears -info
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
GL_RENDERER = Mesa DRI Intel(R) Kabylake GT1.5
GL_VERSION = 3.0 Mesa 11.2.0
GL_VENDOR = Intel Open Source Technology Center
...
--- Comment #4 from Shih-Yuan Lee fourdollars@gmail.com --- Sorry, I pasted the wrong logs.
GL_VERSION should be "3.0 Mesa 17.0.7" instead.
--- Comment #5 from Shih-Yuan Lee fourdollars@gmail.com --- There is no problem in battery mode, and because the system halts, I am unable to collect any dmesg output.
$ DRI_PRIME=1 glxgears -info
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
GL_RENDERER = Gallium 0.4 on AMD HAINAN (DRM 2.50.0 / 4.14.0-041400rc5-generic, LLVM 4.0.0)
GL_VERSION = 3.0 Mesa 17.0.7
GL_VENDOR = X.Org
...
$ DRI_PRIME=0 glxgears -info
Running synchronized to the vertical refresh. The framerate should be approximately the same as the monitor refresh rate.
GL_RENDERER = Mesa DRI Intel(R) Kabylake GT1.5
GL_VERSION = 3.0 Mesa 17.0.7
GL_VENDOR = Intel Open Source Technology Center
...
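Because an earlier comment pasted logs from the wrong Mesa version, it can be useful to extract the renderer and version fields automatically rather than by eye. The sketch below parses the `GL_*` lines printed by `glxgears -info`. The function names are hypothetical, and the "AMD" substring check is only an assumption based on the renderer strings shown in this thread.

```python
import re

# Matches lines such as "GL_VERSION = 3.0 Mesa 17.0.7".
_FIELD = re.compile(r"^(GL_RENDERER|GL_VERSION|GL_VENDOR)\s*=\s*(.*)$")


def parse_glxgears_info(text):
    """Return a dict with the GL_RENDERER, GL_VERSION and GL_VENDOR fields found in `text`."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def looks_like_amd(fields):
    """Heuristic: True if the renderer string mentions AMD (assumption, not a Mesa API)."""
    return "AMD" in fields.get("GL_RENDERER", "")
```

Feeding it the captured output of each run makes it easy to confirm that `DRI_PRIME=1` really selected the AMD GPU and which Mesa version was in use.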
--- Comment #6 from Michel Dänzer michel@daenzer.net --- (In reply to Shih-Yuan Lee from comment #5)
> There is no problem under the battery mode, and because the system halts that makes unable to collect any dmesg.
Please attach dmesg captured in battery mode or before the problem occurs.
--- Comment #7 from Shih-Yuan Lee fourdollars@gmail.com --- Created attachment 134936 --> https://bugs.freedesktop.org/attachment.cgi?id=134936&action=edit dmesg by drm.debug=0xe
The messages just before the system halt.
Michel Dänzer michel@daenzer.net changed:
              What |Removed     |Added
----------------------------------------------------------------------------
Attachment #134936 |text/x-log  |text/plain
         mime type |            |
Mike Lothian mike@fireburn.co.uk changed:
What |Removed |Added
----------------------------------------------------------------------------
  CC |        |mike@fireburn.co.uk
--- Comment #8 from Mike Lothian mike@fireburn.co.uk --- Hmm, I notice these errors:
[ 2.050887] [drm:radeon_acpi_init [radeon]] Call to ATCS verify_interface failed: -5
[ 2.050994] [drm:radeon_acpi_init [radeon]] Call to ATIF verify_interface failed: -5
which I think are ACPI calls. It might be worth checking that your BIOS/EFI is up to date, and if that doesn't fix things, maybe play around with the acpi_osi= options.
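To check quickly whether a dmesg capture contains these messages, a small parser can pull out the timestamp, the ACPI interface name, and the return code. This is a minimal sketch matched against the two lines quoted above. The function name is hypothetical, and translating the code through Python's `errno` table assumes the kernel's return value is a negated standard errno (5 is `EIO`).

```python
import errno
import re

# Pattern built from the two radeon_acpi_init lines quoted in this comment.
_PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:radeon_acpi_init \[radeon\]\]\s+"
    r"Call to (?P<iface>\w+) verify_interface failed: (?P<code>-?\d+)"
)


def find_verify_interface_failures(dmesg_text):
    """Return a list of (timestamp, interface, code, errno_name) tuples."""
    results = []
    for match in _PATTERN.finditer(dmesg_text):
        code = int(match.group("code"))
        name = errno.errorcode.get(abs(code), "unknown")
        results.append((float(match.group("ts")), match.group("iface"), code, name))
    return results
```

Running it on the attached log should list one entry each for ATCS and ATIF if the same failures are present.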
--- Comment #9 from Shih-Yuan Lee fourdollars@gmail.com --- I have tried acpi_osi="Windows 2009", "Windows 2012", "Windows 2013" and "Windows 2015" on the latest mainline kernel 4.14rc6, and they all have the same errors and halt the system. The BIOS is also up to date.
--- Comment #10 from Mike Lothian mike@fireburn.co.uk --- Did this ever work for you?
--- Comment #11 from Shih-Yuan Lee fourdollars@gmail.com --- (In reply to Mike Lothian from comment #10)
> Did this ever work for you?
What do you mean by this?
--- Comment #12 from Shih-Yuan Lee fourdollars@gmail.com --- BTW, this is a new Dell laptop that is still in development.
--- Comment #13 from Mike Lothian mike@fireburn.co.uk --- I meant: is this a regression, as in it used to work with an older kernel or Mesa? If it's a new system, perhaps not.
--- Comment #14 from Shih-Yuan Lee fourdollars@gmail.com --- Yup, this is a new system. `DRI_PRIME=1 glxgears` never worked properly before.
--- Comment #15 from Mike Lothian mike@fireburn.co.uk --- Are there any changes when you boot the system with radeon.runpm=0? This will mean the card never powers down.
What distro are you running?
You mention trying older kernel versions; did you try older Mesa versions too?
Can you attach your Xorg.0.log too?
--- Comment #16 from Mike Lothian mike@fireburn.co.uk --- Do you also see the issue with amdgpu rather than using the radeon kernel driver?
--- Comment #17 from Shih-Yuan Lee fourdollars@gmail.com --- Created attachment 135027 --> https://bugs.freedesktop.org/attachment.cgi?id=135027&action=edit Xorg.0.log
(In reply to Mike Lothian from comment #15)
> Are there any changes when you boot the system with radeon.runpm=0, this will mean the card never powers down
> What distro are you running?
> You mention trying older kernel version, did you try older mesa versions too?
> Can you attach your Xorg.0.log too
radeon.runpm=0 doesn't change anything.
I am running Ubuntu 16.04 LTS, which used Linux kernel 4.4 and Mesa 11.2.0 before I upgraded the system. After the upgrade, it uses Mesa 17.0.7 instead.
--- Comment #18 from Shih-Yuan Lee fourdollars@gmail.com --- (In reply to Mike Lothian from comment #16)
> Do you also see the issue with amdgpu rather than using the radeon kernel driver?
amdgpu doesn't work with this AMD GPU when using the kernel parameters "amdgpu.si_support=1 radeon.si_support=0" on Linux kernel 4.14rc6. The X Window System cannot start up.
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Jet PRO [Radeon R5 M230] [1002:6665] (rev c3)
    Subsystem: Dell Jet PRO [Radeon R5 M230] [1028:0844]
    Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 129
    Region 0: Memory at c0000000 (64-bit, prefetchable) [size=256M]
    Region 2: Memory at d0000000 (64-bit, non-prefetchable) [size=256K]
    Region 4: I/O ports at e000 [size=256]
    Expansion ROM at d0040000 [disabled] [size=128K]
    Capabilities: <access denied>
    Kernel driver in use: radeon
    Kernel modules: radeon, amdgpu
--- Comment #19 from Mike Lothian mike@fireburn.co.uk --- You have to blacklist radeon to use amdgpu as both modules try and claim the device
--- Comment #20 from Shih-Yuan Lee fourdollars@gmail.com --- Created attachment 135048 --> https://bugs.freedesktop.org/attachment.cgi?id=135048&action=edit blacklist radeon dmesg
--- Comment #21 from Shih-Yuan Lee fourdollars@gmail.com --- Created attachment 135049 --> https://bugs.freedesktop.org/attachment.cgi?id=135049&action=edit blacklist radeon Xorg.0.log
--- Comment #22 from Shih-Yuan Lee fourdollars@gmail.com --- (In reply to Mike Lothian from comment #19)
> You have to blacklist radeon to use amdgpu as both modules try and claim the device
After I blacklist radeon, there is no AMD graphics provider in the output of `xrandr --listproviders`.
[ 1.937326] amdgpu 0000:01:00.0: enabling device (0000 -> 0003)
[ 1.937633] amdgpu 0000:01:00.0: SI support provided by radeon.
[ 1.937635] amdgpu 0000:01:00.0: Use radeon.si_support=0 amdgpu.si_support=1 to override.
After I use 'radeon.si_support=0 amdgpu.si_support=1', the X Window System cannot start up.
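When switching between the radeon and amdgpu drivers, it is easy to boot with only one of the two parameters set. The sketch below parses a kernel command line (as found in /proc/cmdline) and checks that both `radeon.si_support=0` and `amdgpu.si_support=1` are present, as the dmesg hint above suggests. The function names are hypothetical. `shlex.split` is used so that quoted values such as `acpi_osi="Windows 2015"` stay together, which is an approximation of the kernel's own parsing, not a copy of it.

```python
import shlex


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def si_handled_by_amdgpu(params):
    """True if the parameters hand Southern Islands support from radeon to amdgpu."""
    return (
        params.get("radeon.si_support") == "0"
        and params.get("amdgpu.si_support") == "1"
    )
```

On a running system, `si_handled_by_amdgpu(parse_cmdline(open("/proc/cmdline").read()))` gives a quick check before looking at dmesg or Xorg.0.log.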
--- Comment #23 from Mike Lothian mike@fireburn.co.uk --- Can you show the dmesg and Xorg.0.log with radeon.si_support=0 amdgpu.si_support=1?
--- Comment #24 from Shih-Yuan Lee fourdollars@gmail.com --- If I use radeon.dpm=0, the issue does not occur.
--- Comment #25 from Shih-Yuan Lee fourdollars@gmail.com --- The issue did not occur when I used Mesa 11.2.0 on Ubuntu 16.04. I found it on Mesa 17.0.7, and Mesa 17.2.4 has it as well.
Timo Aaltonen tjaalton@ubuntu.com changed:
What |Removed |Added
----------------------------------------------------------------------------
  CC |        |tjaalton@ubuntu.com
--- Comment #26 from Timo Aaltonen tjaalton@ubuntu.com --- This was tested to regress between Mesa 12.0.3 and 12.0.5, and the bisect points to

commit d3d33918c79d9e87aedaf6f70ed39f75eed262a0
Author: Michel Dänzer michel.daenzer@amd.com
Date: Wed Aug 17 17:02:04 2016 +0900

    loader/dri3: Overhaul dri3_update_num_back

as the first bad commit.
--- Comment #27 from Michel Dänzer michel@daenzer.net --- Thanks for bisecting, but I don't think that commit can be directly responsible for a GPU hang. Before that commit, the DRI3 code in Mesa would only use one back buffer for glxgears, which means that the GPU could only start rendering a new frame after the previous one had finished presenting. Maybe that somehow prevented the hang.
A possible test for this theory is running
vblank_mode=0 DRI_PRIME=1 glxgears
with Mesa 12.0.3; does that also trigger the hang?
--- Comment #28 from Shih-Yuan Lee fourdollars@gmail.com --- `vblank_mode=0 DRI_PRIME=1 glxgears` also triggers the GPU lockup. However, with radeon.dpm=0 the lockup doesn't happen, but there is tearing all the time.
Shih-Yuan Lee fourdollars@gmail.com changed:
   What |Removed                     |Added
----------------------------------------------------------------------------
Summary |`DRI_PRIME=1 glxgears       |`DRI_PRIME=1 glxgears
        |-info` halts the system     |-info` halts the system
        |with Intel Graphics         |with Intel Graphics
        |[8086:5917] + AMD Graphics  |[8086:5917] + AMD Graphics
        |[1002:6665].                |[1002:6665] (rev c3)
Shih-Yuan Lee fourdollars@gmail.com changed:
   What |Removed                     |Added
----------------------------------------------------------------------------
Summary |`DRI_PRIME=1 glxgears       |`vblank_mode=0 DRI_PRIME=1
        |-info` halts the system     |glxgears` will introduce
        |with Intel Graphics         |GPU lock up on Intel
        |[8086:5917] + AMD Graphics  |Graphics [8086:5917] + AMD
        |[1002:6665] (rev c3)        |Graphics [1002:6665] (rev
        |                            |c3)
--- Comment #29 from Michel Dänzer michel@daenzer.net --- Tearing is expected with vblank_mode=0.
--- Comment #30 from Shih-Yuan Lee fourdollars@gmail.com --- Tearing doesn't happen on battery power; it only happens when AC power is plugged in. Is this behavior also expected?
--- Comment #31 from Michel Dänzer michel@daenzer.net --- With vblank_mode=0, the only thing that can prevent tearing is luck.
--- Comment #32 from Timo Aaltonen tjaalton@ubuntu.com --- Forwarding a comment from an engineer:
"While reading the source code of the radeon module, I found a bug [1] related to dpm and clocks, so I decided to do some experiments. I tried setting different max_sclk and max_mclk values to see if the issue goes away.
1. max_sclk: 70000, max_mclk: 75000 --> have the same issue
2. max_sclk: 50000, max_mclk: 60000 --> pass multi-run test (more than 50 runs)

[1] https://bugs.freedesktop.org/show_bug.cgi?id=76490 "
--- Comment #33 from Alex Deucher alexdeucher@gmail.com --- (In reply to Michel Dänzer from comment #27)
> Thanks for bisecting, but I don't think that commit can be directly responsible for a GPU hang. Before that commit, the DRI3 code in Mesa would only use one back buffer for glxgears, which means that the GPU could only start rendering a new frame after the previous one had finished presenting. Maybe that somehow prevented the hang.
That commit "fixed" a performance regression at the time because it ended up causing enough of a delay that the clocks didn't ramp up. So it probably exposed a kernel dpm issue. Without it, the clocks never ramped up enough to cause an issue. With it, they did.
(In reply to Timo Aaltonen from comment #32)
> forwarding a comment from an engineer:
> "During viewing the source code of radeon module, I found there is a bug [1] related to the dpm and clocks. So I decided to do some experiments. Tried to set different max_sclk and max_mclk to see if the issue is gone.
> - max_sclk: 70000, max_mclk: 75000 --> have the same issue
> - max_sclk: 50000, max_mclk: 60000 --> pass multi-run test (more than 50 runs)
I think Sonny fixed this. It was due to using the wrong firmware.
[ 1.827060] [drm] initializing kernel modesetting (HAINAN 0x1002:0x6665 0x1028:0x0844 0xC3).
This chip should be using radeon/banks_k_2_smc.bin smc firmware. Is that available on the test system and kernel?
--- Comment #34 from Alex Deucher alexdeucher@gmail.com --- The following commits are relevant:
abb2e3c1ce64c8bba678973800c34ea1dc97c42c
6458bd4dfd9414cba5804eb9907fe2a824278c34
ef736d394e85b1bf1fd65ba5e5257b85f6c82325
4e6e98b1e48c9474aed7ce03025ec319b941e26e
--- Comment #35 from Alex Deucher alexdeucher@gmail.com --- Does reverting a628392cf03e0eef21b345afbb192cbade041741 fix the issue?
--- Comment #36 from Robert Liu tsunghanliu@gmail.com --- (In reply to Alex Deucher from comment #33)
> I think Sonny fixed this. It was due to using the wrong firmware. [ 1.827060] [drm] initializing kernel modesetting (HAINAN 0x1002:0x6665 0x1028:0x0844 0xC3). This chip should be using radeon/banks_k_2_smc.bin smc firmware. Is that available on the test system and kernel?
The firmware radeon/banks_k_2_smc.bin is on the test system. With Ubuntu kernel 4.4.0-101-generic, I am not sure whether the radeon driver is using this firmware. With Ubuntu kernel 4.13.0-16-generic, I tried both the amdgpu and radeon drivers, but the system hung. As soon as the system hung, amdgpu_pm_info showed 'invalid dpm profile 15'.
(In reply to Alex Deucher from comment #34)
> The following commits are relevant: abb2e3c1ce64c8bba678973800c34ea1dc97c42c 6458bd4dfd9414cba5804eb9907fe2a824278c34 ef736d394e85b1bf1fd65ba5e5257b85f6c82325 4e6e98b1e48c9474aed7ce03025ec319b941e26e
These commits should already be included in Ubuntu kernel 4.13.0-16-generic.
(In reply to Alex Deucher from comment #35)
> Does reverting a628392cf03e0eef21b345afbb192cbade041741 fix the issue?
Removing this commit does not fix the issue.
BTW, with 4.13.0-16-generic, I changed max_sclk in drm/radeon/si_dpm.c (as we did with Ubuntu kernel 4.4.0-101-generic) from 75000 to 65000, but still hit the hang.
--- Comment #37 from Robert Liu tsunghanliu@gmail.com --- (In reply to Robert Liu from comment #36)
> BTW, with 4.13.0-16-generic, I change the max_sclk in drm/radeon/si_dpm.c (what we did with Ubuntu kernel 4.4.0-101-generic) from 75000 to 65000, but still met the hang issue.
With max_sclk restricted to 65000 and max_mclk to 80000, neither radeon nor amdgpu has the issue.
--- Comment #38 from Alex Deucher alexdeucher@gmail.com --- Created attachment 135647 --> https://bugs.freedesktop.org/attachment.cgi?id=135647&action=edit workaround for radeon
Workarounds for radeon and amdgpu to fix the issue.
--- Comment #39 from Alex Deucher alexdeucher@gmail.com --- Created attachment 135648 --> https://bugs.freedesktop.org/attachment.cgi?id=135648&action=edit workaround for amdgpu
--- Comment #40 from Shih-Yuan Lee fourdollars@gmail.com --- Created attachment 135662 --> https://bugs.freedesktop.org/attachment.cgi?id=135662&action=edit dmesg
(In reply to Alex Deucher from comment #38)
> Created attachment 135647 [details] [review] workaround for radeon
> workarounds for radeon and amdgpu to fix the issue.
I applied this patch on top of the Ubuntu-4.4.0-101.124 Linux kernel, and it seemed to fix the issue at first, but problems show up later on.
$ seq 20 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n 5; done
Loop 1
radeon: Failed to allocate virtual address for buffer:
radeon: size : 65536 bytes
radeon: alignment : 4096 bytes
radeon: domains : 4
radeon: va : 0x0000000000800000
radeon: Failed to deallocate virtual address for buffer:
radeon: size : 65536 bytes
radeon: va : 0x800000
radeon: Failed to allocate virtual address for buffer:
radeon: size : 65536 bytes
radeon: alignment : 4096 bytes
radeon: domains : 4
radeon: va : 0x0000000000800000
radeon: Failed to deallocate virtual address for buffer:
radeon: size : 65536 bytes
radeon: va : 0x800000
radeonsi: Failed to create a context.
Loop 2
...
--- Comment #41 from Robert Liu tsunghanliu@gmail.com --- So far, with max_sclk set to 60000 and max_mclk set to 80000, the system has passed a 24-hour burn-in test (`vblank_mode=0 DRI_PRIME=1 glmark2 --run-forever`).
Another issue I found is that the system suspends when the adapter is removed. After I wake it up, it continues running the benchmark.
--- Comment #42 from Michel Dänzer michel@daenzer.net --- (In reply to Robert Liu from comment #41)
> Another issue found is when removing the adapter, the system goes to suspend.
That's not directly related to graphics drivers.
--- Comment #43 from Shih-Yuan Lee fourdollars@gmail.com --- I can still reproduce the issue after setting max_sclk to 60000 and max_mclk to 80000.
--- Comment #44 from Shih-Yuan Lee fourdollars@gmail.com --- I tried max_sclk = 50000 and max_mclk = 60000 on Ubuntu-4.4.0-112.135, but I can still reproduce the GPU lockup. The first run of `seq 100 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n 3; done` passed, but a second run of the same command failed.
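The shell loop above can also be driven from a script, which makes it easier to add logging or a run counter. The following is a rough Python equivalent, offered as a sketch only: the function names are hypothetical, and unlike `| head -n 3` (which lets glxgears exit on a closed pipe) it stops the process explicitly after reading the first lines. `DRI_PRIME` and `vblank_mode` are the environment variables already used in this thread.

```python
import os
import subprocess


def prime_env(offload=True, vblank_mode=None, base=None):
    """Build an environment with DRI_PRIME (and optionally vblank_mode) set."""
    env = dict(os.environ if base is None else base)
    env["DRI_PRIME"] = "1" if offload else "0"
    if vblank_mode is not None:
        env["vblank_mode"] = str(vblank_mode)
    return env


def run_once(lines=3, wait_timeout=30, command=("glxgears", "-info")):
    """Start the command, read its first `lines` output lines, then stop it."""
    proc = subprocess.Popen(
        command, env=prime_env(), stdout=subprocess.PIPE, text=True
    )
    try:
        head = [proc.stdout.readline().rstrip("\n") for _ in range(lines)]
    finally:
        proc.terminate()
        proc.wait(timeout=wait_timeout)
    return head


def run_loop(count=100):
    """Repeat run_once `count` times, printing a loop marker like the shell version."""
    for i in range(1, count + 1):
        print(f"Loop {i}")
        for line in run_once():
            print(line)
```

Calling `run_loop(100)` twice in a row mirrors the two runs described above. Nothing here changes how the GPU is exercised, so it should not be expected to behave differently from the shell loop.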
--- Comment #45 from Shih-Yuan Lee fourdollars@gmail.com --- I can still reproduce this issue on Ubuntu 18.04 with `seq 100 | while read i; do echo Loop $i; DRI_PRIME=1 glxgears -info|head -n2; done`.
Shih-Yuan Lee fourdollars@gmail.com changed:
      What |Removed |Added
----------------------------------------------------------------------------
Resolution |---     |FIXED
    Status |NEW     |RESOLVED
--- Comment #46 from Shih-Yuan Lee fourdollars@gmail.com --- The Linux kernel in comment 45 is 4.15.0-10.11 from Ubuntu 18.04. With the later version 4.15.0-12.13, I cannot reproduce this issue on Ubuntu 18.04. 4.15.0-12.13 contains the following commit.
commit 239b5f64e12b1f09f506c164dff0374924782979
Author: Alex Deucher alexander.deucher@amd.com
Date: Tue Nov 21 12:09:38 2017 -0500

    drm/radeon: Add dpm quirk for Jet PRO (v2)

    Fixes stability issues.

    v2: clamp sclk to 600 Mhz

    Bug: https://bugs.freedesktop.org/show_bug.cgi?id=103370
    Acked-by: Christian König christian.koenig@amd.com
    Signed-off-by: Alex Deucher alexander.deucher@amd.com
    Cc: stable@vger.kernel.org

diff --git a/drivers/gpu/drm/radeon/si_dpm.c b/drivers/gpu/drm/radeon/si_dpm.c
index ee3e742..97a0a63 100644
--- a/drivers/gpu/drm/radeon/si_dpm.c
+++ b/drivers/gpu/drm/radeon/si_dpm.c
@@ -2984,6 +2984,11 @@ static void si_apply_state_adjust_rules(struct radeon_device *rdev,
 		    (rdev->pdev->device == 0x6667)) {
 			max_sclk = 75000;
 		}
+		if ((rdev->pdev->revision == 0xC3) ||
+		    (rdev->pdev->device == 0x6665)) {
+			max_sclk = 60000;
+			max_mclk = 80000;
+		}
 	} else if (rdev->family == CHIP_OLAND) {
 		if ((rdev->pdev->revision == 0xC7) ||
 		    (rdev->pdev->revision == 0x80) ||
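To make the effect of the added lines easier to follow, here is a small Python model of just that hunk. It is a sketch under stated assumptions, not the driver's logic: the function names are hypothetical, the clock values are assumed to be in units of 10 kHz (consistent with the commit message's "clamp sclk to 600 Mhz" for the value 60000), and 0 is assumed to mean "no limit". The surrounding Hainan checks in si_dpm.c are not modeled.

```python
def quirk_limits(revision, device):
    """Return (max_sclk, max_mclk) as set by the added lines, or (0, 0) if they do not apply.

    Mirrors the condition in the patch, which uses a logical OR:
    it matches when either the revision is 0xC3 or the device ID is 0x6665.
    """
    if revision == 0xC3 or device == 0x6665:
        return 60000, 80000
    return 0, 0


def clamp(clock, limit):
    """Clamp a clock to `limit`; a limit of 0 is treated as 'no limit' (assumption)."""
    return min(clock, limit) if limit else clock


def to_mhz(value_10khz):
    """Convert a clock value in assumed 10 kHz units to MHz."""
    return value_10khz / 100
```

For the GPU in this report (device 0x6665, revision 0xC3), `quirk_limits` yields 60000 and 80000, which `to_mhz` converts to 600 MHz and 800 MHz under the unit assumption above.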