https://bugs.freedesktop.org/show_bug.cgi?id=110865
Bug ID: 110865
Summary: Rx480 consumes 20w more power in idle than under Windows
Product: Mesa
Version: 19.1
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/radeonsi
Assignee: dri-devel@lists.freedesktop.org
Reporter: mwolf@adiumentum.com
QA Contact: dri-devel@lists.freedesktop.org
Created attachment 144485 --> https://bugs.freedesktop.org/attachment.cgi?id=144485&action=edit logfiles as requested in the amd bugreport guide
First, I am not sure where to file this bug, so please be gentle with me if I selected the wrong component.
For a while I have noticed higher temperatures on my video card when my PC was just idling in GNOME. Then I dug deeper and found that my "zero fan" video card does not stop its fan when I run Linux.
So I ran this command:
    watch -n 0.5 cat /sys/kernel/debug/dri/0/amdgpu_pm_info
and it showed me that the MCLK does not clock down to 300 MHz as it does under Windows 10:
    GFX Clocks and Power:
        2000 MHz (MCLK)
        300 MHz (SCLK)
        300 MHz (PSTATE_SCLK)
        300 MHz (PSTATE_MCLK)
        1000 mV (VDDGFX)
        24.75 W (average GPU)
    GPU Temperature: 45 C
    GPU Load: 0 %
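Editor's note: the amdgpu_pm_info readings quoted above can be checked programmatically rather than by eye. The following is a minimal Python sketch, assuming the file keeps the "value unit (label)" shape shown in this report. The helper names parse_pm_info and mclk_is_idle and the 300 MHz idle threshold are illustrative assumptions, not part of any amdgpu tooling.

```python
import re

# Sample text in the shape quoted in this report. On a real system it would be
# read from /sys/kernel/debug/dri/0/amdgpu_pm_info (needs root and debugfs).
SAMPLE = """GFX Clocks and Power:
    2000 MHz (MCLK)
    300 MHz (SCLK)
    300 MHz (PSTATE_SCLK)
    300 MHz (PSTATE_MCLK)
    1000 mV (VDDGFX)
    24.75 W (average GPU)

GPU Temperature: 45 C
GPU Load: 0 %
"""

# Matches lines such as "2000 MHz (MCLK)" or "24.75 W (average GPU)".
READING = re.compile(r"^\s*([\d.]+)\s+(MHz|mV|W)\s+\((.+?)\)\s*$")


def parse_pm_info(text):
    """Return {label: (value, unit)} for every 'value unit (label)' line."""
    readings = {}
    for line in text.splitlines():
        match = READING.match(line)
        if match:
            value, unit, label = match.groups()
            readings[label] = (float(value), unit)
    return readings


def mclk_is_idle(readings, idle_mhz=300.0):
    """True when the memory clock is at or below the assumed idle level."""
    return readings["MCLK"][0] <= idle_mhz


readings = parse_pm_info(SAMPLE)
print("MCLK:", readings["MCLK"])
print("average GPU power:", readings["average GPU"])
print("memory clock idle:", mclk_is_idle(readings))
```

With the sample from this report, the memory clock is reported as not idle, which is the symptom being described.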
I have a multi-monitor setup with two 1920x1200 screens. Under Windows 10, the MCLK does not go above 300 MHz when the desktop is idle (measured with HWMonitor). When I power off one screen under Linux, the average GPU power goes down to 8-10 W and the MCLK drops to 300 MHz, so the card can clock down but is somehow prevented from doing so by the driver or the configuration?
I followed this bug report guide from amd: https://www.amd.com/en/support/kb/faq/amdgpu-installation#faq-Reporting-Bugs and attached several logfiles.
Timothy Arceri t_arceri@yahoo.com.au changed:
What       |Removed                         |Added
----------------------------------------------------------------------------
Version    |19.1                            |DRI git
Product    |Mesa                            |DRI
QA Contact |dri-devel@lists.freedesktop.org |
Component  |Drivers/Gallium/radeonsi        |DRM/AMDgpu
--- Comment #1 from Martin mwolf@adiumentum.com --- My bug is now two months old. Do you need more information, or is there something else I can do to get your attention?
I think this is a serious issue, because it seems to affect many, maybe even all, Polaris cards (I tested two more in the last few weeks).
Shouldn't it be a priority to stop wasting so much energy?
Alex Deucher alexdeucher@gmail.com changed:
What     |Removed |Added
----------------------------------------------------------------------------
Severity |normal  |enhancement
--- Comment #2 from Alex Deucher alexdeucher@gmail.com --- This is the expected behavior for multiple monitors on Linux. mclk switching must happen in the monitors' blanking period. Since they likely don't align, especially if the monitors have different timing, we have to use a fixed mclk. The DC modesetting code can lock the timing of multiple monitors if they are using the exact same timing so that the blanking periods align, but I don't think the Linux power management code takes this into account at the moment.
--- Comment #3 from Martin mwolf@adiumentum.com --- Thank you for your explanation. How do I find out the blanking periods?
--- Comment #4 from Alex Deucher alexdeucher@gmail.com --- (In reply to Martin from comment #3)
> Thank you for your explanation. How do I find out the blanking periods?
They are based on the timing for the mode on the display. As for the relevant driver code, take a look at smu7_apply_state_adjust_rules().
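Editor's note: to make "based on the timing for the mode" concrete, the length of the vertical blanking period can be derived from a mode's pixel clock and its horizontal and vertical totals. The sketch below is Python and uses the common CVT reduced-blanking timing for 1920x1200 as sample input. That timing is an assumption chosen because it yields the 59.95 Hz mentioned later in the thread; it was not taken from the reporter's logs, and the Mode class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    """The few modeline fields needed to compute blanking time."""
    pixel_clock_hz: float  # pixel clock in Hz
    htotal: int            # pixels per line, including horizontal blanking
    vdisplay: int          # visible lines
    vtotal: int            # lines per frame, including vertical blanking

    @property
    def line_time_s(self) -> float:
        # Time to scan one full line.
        return self.htotal / self.pixel_clock_hz

    @property
    def refresh_hz(self) -> float:
        # Frames per second.
        return self.pixel_clock_hz / (self.htotal * self.vtotal)

    @property
    def vblank_s(self) -> float:
        # Lines that are not displayed, times the time per line.
        return (self.vtotal - self.vdisplay) * self.line_time_s


# Assumed sample: 154.00 MHz, htotal 2080, 1200 visible lines, vtotal 1235.
mode = Mode(pixel_clock_hz=154.00e6, htotal=2080, vdisplay=1200, vtotal=1235)

print(f"refresh: {mode.refresh_hz:.2f} Hz")
print(f"vblank:  {mode.vblank_s * 1e6:.1f} us per frame")
```

For this sample the blanking window is under half a millisecond per frame, which gives a sense of how small the window for a memory clock switch is.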
--- Comment #5 from Alex Deucher alexdeucher@gmail.com --- Created attachment 144978 --> https://bugs.freedesktop.org/attachment.cgi?id=144978&action=edit possible fix
Does this patch fix the issue?
--- Comment #6 from Martin mwolf@adiumentum.com --- Sadly it did not help. The MCLK is still fixed at 2000 MHz.
How can I verify that I did everything correctly? I just rebuilt kernel 5.2.6 from Fedora's SRPM and added the patch in the spec file.
Or could the cause be that I have two different 1920x1200 screens, one from HP and one from Dell?
--- Comment #7 from Alex Deucher alexdeucher@gmail.com --- (In reply to Martin from comment #6)
> Sadly it did not help. The MCLK is still fixed at 2000 MHz.
> How can I verify that I did everything correctly?
You can add a printk to the patch to verify that it's being applied. Maybe print the value of hwmgr->display_config->multi_monitor_in_sync to see if the monitors are synced or not.
> I just rebuilt kernel 5.2.6 from Fedora's SRPM and added the patch in the spec file.
> Or could the cause be that I have two different 1920x1200 screens, one from HP and one from Dell?
That is likely the issue. If the timings for the displays are slightly different, they won't be synced. It could also be that the DC code doesn't set the multi_monitor_in_sync flag properly.
--- Comment #8 from Alex Deucher alexdeucher@gmail.com --- It looks like the DC code does not set up the multi_monitor_in_sync flag properly.
--- Comment #9 from Alex Deucher alexdeucher@gmail.com --- Created attachment 144983 --> https://bugs.freedesktop.org/attachment.cgi?id=144983&action=edit fix DC code
Can you try applying both of these patches? Assuming both of your monitors have the same timing this might work.
--- Comment #10 from Martin mwolf@adiumentum.com --- Sorry for the delay. I had to figure out which kernel to use, because only kernel 5.3.0-rc3 accepts your second patch; stable 5.2.6 and 5.2.7 generate errors at :25. The build should be done in about 2 hours. You don't happen to have a spare Ryzen 9 3900X for me to speed it up? Building kernels shows me quite drastically that my Haswell i7 is out of date ;)
--- Comment #11 from Martin mwolf@adiumentum.com --- Kernel 5.3.0-rc3 does not boot on my system. It hangs while detecting the disks.
--- Comment #12 from Dieter Nützel Dieter@nuetzel-hh.de --- (In reply to Alex Deucher from comment #9)
> Created attachment 144983 fix DC code
> Can you try applying both of these patches? Assuming both of your monitors have the same timing this might work.
It didn't apply on amd-staging-drm-next either.
--- Comment #13 from Dieter Nützel Dieter@nuetzel-hh.de --- (In reply to Dieter Nützel from comment #12)
> (In reply to Alex Deucher from comment #9)
> > Created attachment 144983 fix DC code
> > Can you try applying both of these patches? Assuming both of your monitors have the same timing this might work.
> It didn't apply on amd-staging-drm-next either.
BTW
Alex, is this the same problem? My card has never gone below ~32 W (even with a single monitor, although I have two identical HDMI 1920x1080 displays). PSTATE_xxxx is much higher than Martin's, and I have never seen "zero fan" / zero core (no spinning fans).
Polaris 20 / 8GB Sapphire Radeon RX 580 Nitro+ single monitor
GFX Clocks and Power:
    300 MHz (MCLK)
    300 MHz (SCLK)
    600 MHz (PSTATE_SCLK)
    1000 MHz (PSTATE_MCLK)
    750 mV (VDDGFX)
    32.17 W (average GPU)
GPU Temperature: 31 C
GPU Load: 0 %
amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: +0.75 V
fan1: 909 RPM (min = 0 RPM, max = 3200 RPM)
temp1: +30.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 32.09 W (cap = 175.00 W)
--- Comment #14 from Alex Deucher alexdeucher@gmail.com --- (In reply to Dieter Nützel from comment #13)
> Alex, is this the same problem?
No.
> GFX Clocks and Power:
>     300 MHz (MCLK)
>     300 MHz (SCLK)
Your mclk is going to a lower state when it's idle.
--- Comment #15 from Martin mwolf@adiumentum.com --- Finally, with rc5 of kernel 5.3 I was able to boot. Sadly, your two patches did not lower the power consumption.
Alex Deucher alexdeucher@gmail.com changed:
What                           |Removed |Added
----------------------------------------------------------------------------
Attachment #144983 is obsolete |0       |1
--- Comment #16 from Alex Deucher alexdeucher@gmail.com --- Created attachment 145136 --> https://bugs.freedesktop.org/attachment.cgi?id=145136&action=edit fix DC code
Can you try this patch along with attachment 144978?
--- Comment #17 from Alex Deucher alexdeucher@gmail.com --- Note that it will only work if your monitors have identical timing.
--- Comment #18 from Martin mwolf@adiumentum.com --- Hello,
Sorry that it took me so long. I was at a historic cycling event in Germany.
Your patch did indeed do something. The power consumption sometimes drops to 70 W, but now both screens flicker and show artifacts similar to those of dying video memory.
GFX Clocks and Power:
    300 MHz (MCLK)
    308 MHz (SCLK)
    300 MHz (PSTATE_SCLK)
    300 MHz (PSTATE_MCLK)
    800 mV (VDDGFX)
    12.222 W (average GPU)
--- Comment #19 from Martin mwolf@adiumentum.com --- By 70 W I mean total system power consumption, of course. This is roughly the same as, or a little more than, under Windows. So I think we are on a good path.
If I do "echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level" the flickering stops. So the flickering is caused by the automatic power management / reclocking.
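Editor's note: the echo workaround above can be wrapped in a small helper for anyone who switches levels often. This is a minimal Python sketch, assuming the sysfs path quoted in the comment and root privileges. The function names are illustrative, and the accepted values are limited to "auto", "low" and "high" for this sketch (the driver accepts further values that this thread does not use).

```python
from pathlib import Path

# Path quoted in the comment above; card0 may differ on other systems.
SYSFS_LEVEL = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

# Deliberately restricted set for this sketch.
LEVELS = {"auto", "low", "high"}


def set_performance_level(level, path=SYSFS_LEVEL):
    """Write a performance level to the given file (root needed for sysfs)."""
    if level not in LEVELS:
        raise ValueError(f"unsupported level {level!r}, expected one of {sorted(LEVELS)}")
    Path(path).write_text(level + "\n")


def get_performance_level(path=SYSFS_LEVEL):
    """Read the current performance level back as a stripped string."""
    return Path(path).read_text().strip()
```

Calling set_performance_level("low") as root would pin the clocks the way the echo command does, and set_performance_level("auto") would hand control back to automatic reclocking. The path parameter exists so the helper can be tried against an ordinary file first.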
Alex Deucher alexdeucher@gmail.com changed:
What                           |Removed |Added
----------------------------------------------------------------------------
Attachment #145136 is obsolete |0       |1
--- Comment #20 from Alex Deucher alexdeucher@gmail.com --- Created attachment 145157 --> https://bugs.freedesktop.org/attachment.cgi?id=145157&action=edit fix DC code
Updated patch.
--- Comment #21 from Martin mwolf@adiumentum.com --- Sadly, the screen still flickers.
--- Comment #22 from Dieter Nützel Dieter@nuetzel-hh.de --- Hello Alex and Martin,
I've tried both on my
Polaris 20, RX580 8 GB Sapphire Technology Limited Nitro+ Radeon RX 580
- v2 patched into amd-staging-drm-next (before inclusion of v3)
- v3 with amd-staging-drm-next https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&...
Both flicker with green/black (?) horizontal lines across both screens, mostly during power level switches, for example during mouse movement/interaction (wheel) and when the mouse pointer moves from konsole etc. to the desktop (KDE Plasma 5.xx here).
UVD load (mplayer etc.) is not enough to stop it, and neither is radv (vkcube).
Other gfx load (vkmark, glmark2, etc.) does stop it. When gfx demand drops during those tests (glmark2 -b buffer), the flicker comes back.
Martin's observation
[-] If i do "echo low > /sys/class/drm/card0/device/power_dpm_force_performance_level" the flickering stops. So the flickering is caused by the automatic powermanagement / reclocking. [-]
Works here, too (tested with v3).
But I never could go below ~32 W !!! Tested with both Nitro+ BIOS modes.
The PSTATE_xxxx wouldn't change on my card. They stay @ 600/1000 all the time!?
GFX Clocks and Power:
    300 MHz (MCLK)
    300 MHz (SCLK)
    600 MHz (PSTATE_SCLK)
    1000 MHz (PSTATE_MCLK)
    750 mV (VDDGFX)
    32.76 W (average GPU)
GPU Temperature: 31 C
GPU Load: 0 %
MEM Load: 3 %
Any hints?
And sorry for my bad English this time - my best friend from beginning of German Gymnasium died after 6 years of fight against cancer. He aged only 52. Leaving a wife and two little girls...
--- Comment #23 from Dieter Nützel Dieter@nuetzel-hh.de --- Oh, BTW Martin, how are your 2 identical displays connected? HDMI (like mine) or DisplayPort?
I use sound over HDMI, too, and only one display presents it.
--- Comment #24 from Martin mwolf@adiumentum.com --- I am so sorry to hear that Dieter, this is really terrible. But I can assure you your English is fine.
I also think your problem is somewhere else. You mentioned it yourself, and I think Alex Deucher did as well, that your P-states are not changing.
I have two different screens: an HP ZR2440W connected via DVI and a Dell U2412M connected via DisplayPort. Both run at 1920x1200 @ 59.95 Hz.
--- Comment #25 from Alex Deucher alexdeucher@gmail.com --- The patches I posted only affect multiple monitors with identical timing. That means identical modelines, not just the same resolution and refresh rate. In practice this generally means you need to use identical monitors. If you are using a single monitor or multiple different monitors, the patches are not relevant for you.
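Editor's note: the distinction between "same resolution and refresh rate" and "identical modelines" can be shown with a short comparison. The Python sketch below uses two 1920x1200 modelines as sample data, a CVT reduced-blanking timing and a standard CVT timing. Both are assumptions for illustration and are not the modelines of the monitors in this thread. The field-by-field equality check is also only a stand-in: the actual driver comparison may look at more or different fields, such as sync polarities.

```python
from typing import NamedTuple


class Modeline(NamedTuple):
    """Numeric fields of an xrandr-style modeline, flags omitted."""
    pixel_clock_mhz: float
    hdisplay: int
    hsync_start: int
    hsync_end: int
    htotal: int
    vdisplay: int
    vsync_start: int
    vsync_end: int
    vtotal: int


def parse_modeline(fields: str) -> Modeline:
    """Parse 'clock hdisp hss hse htot vdisp vss vse vtot' into a Modeline."""
    parts = fields.split()
    return Modeline(float(parts[0]), *(int(p) for p in parts[1:9]))


def same_resolution(a: Modeline, b: Modeline) -> bool:
    return (a.hdisplay, a.vdisplay) == (b.hdisplay, b.vdisplay)


def refresh_hz(m: Modeline) -> float:
    return m.pixel_clock_mhz * 1e6 / (m.htotal * m.vtotal)


def identical_timing(a: Modeline, b: Modeline) -> bool:
    # Every numeric field must match, not just the visible resolution.
    return a == b


# Assumed sample modelines for two 1920x1200 displays.
reduced = parse_modeline("154.00 1920 1968 2000 2080 1200 1203 1209 1235")
standard = parse_modeline("193.25 1920 2056 2256 2592 1200 1203 1209 1245")

print("same resolution: ", same_resolution(reduced, standard))
print(f"refresh rates:    {refresh_hz(reduced):.2f} Hz vs {refresh_hz(standard):.2f} Hz")
print("identical timing:", identical_timing(reduced, standard))
```

Both samples describe a roughly 60 Hz 1920x1200 mode, yet their totals and pixel clocks differ, so their blanking periods would not line up.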
--- Comment #26 from Dieter Nützel Dieter@nuetzel-hh.de --- Created attachment 145180 --> https://bugs.freedesktop.org/attachment.cgi?id=145180&action=edit xrandr -q
--- Comment #27 from Dieter Nützel Dieter@nuetzel-hh.de --- This is going nuts...
Martin has
2 different displays (both 59.95 Hz @ 1920x1200), an RX 480 and _very nice_ numbers (only 12.222 W) now:
GFX Clocks and Power:
    300 MHz (MCLK) <= !!!
    308 MHz (SCLK)
    300 MHz (PSTATE_SCLK)
    300 MHz (PSTATE_MCLK)
    800 mV (VDDGFX) <= !!!
    12.222 W (average GPU)
=> working (?!) but flicker
I have
2 identical displays, BenQ GL2440H (both 60.00 Hz @ 1920x1080), an RX 580 and 'normal' numbers (~32 W, but too high?!) now:
GFX Clocks and Power:
    300 MHz (MCLK) <= !!!
    300 MHz (SCLK)
    600 MHz (PSTATE_SCLK)
    1000 MHz (PSTATE_MCLK)
    750 mV (VDDGFX) <= !!! mine is better, but...
    32.76 W (average GPU)
=> working (?!) but flicker, too.
This
    600 MHz (PSTATE_SCLK)
    1000 MHz (PSTATE_MCLK)
must be a different problem (compare with Martin's RX 480). I will open another ticket for it.
--- Comment #28 from Dieter Nützel Dieter@nuetzel-hh.de --- I've tried solving the flicker with both fixes (sent by magist3r) from this bug
Bug 102646 - Screen flickering under amdgpu-experimental [buggy auto power profile] https://bugs.freedesktop.org/show_bug.cgi?id=102646
But no success.
--- Comment #29 from Martin mwolf@adiumentum.com --- @Alex Deucher Is there a fix for the graphical glitches I experience? They seem to be similar to the glitches I get when I enable overclocking with amdgpu.ppfeaturemask=0xffffffff
--- Comment #30 from Alex Deucher alexdeucher@gmail.com --- (In reply to Martin from comment #29)
> @Alex Deucher Is there a fix for the graphical glitches I experience? They seem to be similar to the glitches I get when I enable overclocking with amdgpu.ppfeaturemask=0xffffffff
It would appear that the monitors don't actually quite sync up in your case otherwise you wouldn't see the flicker.
--- Comment #31 from Martin mwolf@adiumentum.com --- Well, the flickering goes away if I lock the clocks to "low" or "high".
--- Comment #32 from Alex Deucher alexdeucher@gmail.com --- (In reply to Martin from comment #31)
> Well, the flickering goes away if I lock the clocks to "low" or "high".
Exactly. In that case the mclk never changes so there is no flicker. The mclk has to change during the vblank period otherwise you see flickering. If the vblank periods are not synced up across monitors, you see flickering.
--- Comment #33 from Martin mwolf@adiumentum.com --- Thank you for the clarification. Right now I switch manually between low and high when necessary, so I can work around the glitches. Do you think it will be possible to achieve feature parity with Windows soon?
--- Comment #34 from Alex Deucher alexdeucher@gmail.com --- (In reply to Martin from comment #33)
> Thank you for the clarification. Right now I switch manually between low and high when necessary, so I can work around the glitches. Do you think it will be possible to achieve feature parity with Windows soon?
I don't think Windows enables mclk switching with multiple monitors either. Unfortunately, it's not clear what's different between Windows and Linux on your board.
--- Comment #35 from Martin mwolf@adiumentum.com --- Thank you for the clarification. Do you think there is a solution for the problem on the Linux side, since it works absolutely fine on Windows?
--- Comment #36 from tempel.julian@gmail.com --- (In reply to Dieter Nützel from comment #28)
> I've tried solving the flicker with both fixes (sent by magist3r) from this bug
> Bug 102646 - Screen flickering under amdgpu-experimental [buggy auto power profile] https://bugs.freedesktop.org/show_bug.cgi?id=102646
> But no success.
Have you also applied Ahzo's patch, just in case?
--- Comment #37 from Dieter Nützel Dieter@nuetzel-hh.de --- (In reply to tempel.julian from comment #36)
> (In reply to Dieter Nützel from comment #28)
> > I've tried solving the flicker with both fixes (sent by magist3r) from this bug
> > Bug 102646 - Screen flickering under amdgpu-experimental [buggy auto power profile] https://bugs.freedesktop.org/show_bug.cgi?id=102646
> > But no success.
> Have you also applied Ahzo's patch, just in case?
Thanks for the hint.
v2 is already in 'amd-staging-drm-next':
f659bb6dae58c113805f92822e4c16ddd3156b79 drm/amd/powerplay/smu7: enforce minimal VBITimeout (v2)
Martin Peres martin.peres@free.fr changed:
What       |Removed |Added
----------------------------------------------------------------------------
Status     |NEW     |RESOLVED
Resolution |---     |MOVED
--- Comment #38 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/817.