https://bugzilla.kernel.org/show_bug.cgi?id=213561
Bug ID: 213561
Summary: [bisected] AMD GPU can no longer idle state after commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3
Product: Drivers
Version: 2.5
Kernel Version: 5.13rc7
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: untaintableangel@hotmail.co.uk
Regression: No
Nature of the problem: RX 5700 is unable to enter low power state at idle (see below for usual behaviour)
Sensors at idle prior to the commit:
amdgpu-pci-0f00
Adapter: PCI adapter
vddgfx:      775.00 mV
fan1:           0 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +48.0°C  (crit = +100.0°C, hyst = -273.1°C)
                       (emerg = +105.0°C)
junction:     +48.0°C  (crit = +110.0°C, hyst = -273.1°C)
                       (emerg = +115.0°C)
mem:          +52.0°C  (crit = +105.0°C, hyst = -273.1°C)
                       (emerg = +110.0°C)
power1:        8.00 W  (cap = 165.00 W)
After the commit, the lowest is:
amdgpu-pci-0f00
Adapter: PCI adapter
vddgfx:        1.03 V
fan1:           0 RPM  (min =    0 RPM, max = 3200 RPM)
edge:         +54.0°C  (crit = +100.0°C, hyst = -273.1°C)
                       (emerg = +105.0°C)
junction:     +56.0°C  (crit = +110.0°C, hyst = -273.1°C)
                       (emerg = +115.0°C)
mem:          +52.0°C  (crit = +105.0°C, hyst = -273.1°C)
                       (emerg = +110.0°C)
power1:       31.00 W  (cap = 165.00 W)
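For anyone wanting to compare the two readings numerically, here is a minimal sketch that parses `sensors`-style text and prints the per-sensor change. It is not part of the original report: the helper name `parse_sensors` and the regular expression are assumptions made for illustration, and the sample strings are trimmed copies of the readings quoted above.

```python
import re

# Trimmed copies of the idle readings quoted in this report.
BEFORE = """\
vddgfx:      775.00 mV
edge:         +48.0°C  (crit = +100.0°C, hyst = -273.1°C)
junction:     +48.0°C  (crit = +110.0°C, hyst = -273.1°C)
power1:        8.00 W  (cap = 165.00 W)
"""

AFTER = """\
vddgfx:        1.03 V
edge:         +54.0°C  (crit = +100.0°C, hyst = -273.1°C)
junction:     +56.0°C  (crit = +110.0°C, hyst = -273.1°C)
power1:       31.00 W  (cap = 165.00 W)
"""

# Matches "label: value unit" at the start of a line; limits in
# parentheses and indented continuation lines are ignored.
LINE = re.compile(r"^(\w+):\s+\+?(-?[\d.]+)\s*(mV|V|W|RPM|°C)", re.MULTILINE)


def parse_sensors(text):
    """Hypothetical helper: return {label: (value, unit)}, with mV converted to V."""
    readings = {}
    for label, value, unit in LINE.findall(text):
        value = float(value)
        if unit == "mV":
            value, unit = value / 1000.0, "V"
        readings[label] = (value, unit)
    return readings


before = parse_sensors(BEFORE)
after = parse_sensors(AFTER)

# Report the change for every sensor present in the "before" reading.
for label, (old, unit) in before.items():
    new = after[label][0]
    print(f"{label}: {old:g} -> {new:g} {unit} ({new - old:+.3g})")

delta_w = after["power1"][0] - before["power1"][0]
```

With the values from this report, `delta_w` is the extra idle power draw in watts (31.00 W against 8.00 W).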
This problem wasn't present in 5.13-rc6 but is present in 5.13-rc7, and bisects to:
1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3 is the first bad commit
commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3
Author: Yifan Zhang <yifan1.zhang@amd.com>
Date:   Thu Jun 10 10:10:07 2021 +0800
drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The device is a Sapphire Pulse RX 5700, and the problem occurs even with a single monitor running at 60 Hz.
GPU: 0f:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] [1002:731f] (rev c4)