https://bugs.freedesktop.org/show_bug.cgi?id=107277
Bug ID: 107277 Summary: Raven: pci_pm_suspend takes over 1 second Product: DRI Version: DRI git Hardware: Other OS: All Status: NEW Severity: normal Priority: medium Component: DRM/AMDgpu Assignee: dri-devel@lists.freedesktop.org Reporter: pmenzel+bugs.freedesktop@molgen.mpg.de
Created attachment 140698 --> https://bugs.freedesktop.org/attachment.cgi?id=140698&action=edit Linux 4.18-rc5+ messages
Profiling the suspend and resume time [1] on a MSI B350M MORTAR (MS-7A37) with AMD Ryzen 3 2200G with Radeon Vega Graphics with Linux 4.18-rc5+, `pci_pm_suspend()` takes over one second, which is too long.
[ 74.694600] amdgpu 0000:38:00.0: calling pci_pm_suspend+0x0/0x120 @ 1664, parent: 0000:00:08.1 [ 74.700292] sd 8:0:0:0: scsi_bus_suspend+0x0/0x20 [scsi_mod] returned 0 after 25514 usecs [ 74.700326] scsi target8:0:0: calling scsi_bus_suspend+0x0/0x20 [scsi_mod] @ 175, parent: host8 [ 74.700346] scsi target8:0:0: scsi_bus_suspend+0x0/0x20 [scsi_mod] returned 0 after 6 usecs [ 74.700382] scsi host8: calling scsi_bus_suspend+0x0/0x20 [scsi_mod] @ 181, parent: ata9 [ 74.700404] scsi host8: scsi_bus_suspend+0x0/0x20 [scsi_mod] returned 0 after 6 usecs [ 74.700439] ata9: calling ata_port_pm_suspend+0x0/0x40 [libata] @ 173, parent: 0000:39:00.0 [ 74.700522] ata9: ata_port_pm_suspend+0x0/0x40 [libata] returned 0 after 67 usecs [ 74.700544] ahci 0000:39:00.0: calling pci_pm_suspend+0x0/0x120 @ 7, parent: 0000:00:08.2 [ 74.700565] ahci 0000:39:00.0: pci_pm_suspend+0x0/0x120 returned 0 after 14 usecs [ 74.700588] pcieport 0000:00:08.2: calling pci_pm_suspend+0x0/0x120 @ 1666, parent: pci0000:00 [ 74.700620] pcieport 0000:00:08.2: pci_pm_suspend+0x0/0x120 returned 0 after 25 usecs [ 74.712063] usb 1-10: usb_dev_suspend+0x0/0x10 [usbcore] returned 0 after 36874 usecs [ 74.712092] usb usb1: calling usb_dev_suspend+0x0/0x10 [usbcore] @ 174, parent: 0000:15:00.0 [ 74.712156] usb usb1: usb_dev_suspend+0x0/0x10 [usbcore] returned 0 after 51 usecs [ 74.712173] xhci_hcd 0000:15:00.0: calling pci_pm_suspend+0x0/0x120 @ 1669, parent: 0000:00:01.2 [ 74.714213] xhci_hcd 0000:15:00.0: pci_pm_suspend+0x0/0x120 returned 0 after 1985 usecs [ 74.714233] pcieport 0000:00:01.2: calling pci_pm_suspend+0x0/0x120 @ 1671, parent: pci0000:00 [ 74.714248] pcieport 0000:00:01.2: pci_pm_suspend+0x0/0x120 returned 0 after 11 usecs [ 75.670011] amdgpu 0000:38:00.0: 000000007bb4b3e1 unpin not necessary [ 75.736060] amdgpu 0000:38:00.0: pci_pm_suspend+0x0/0x120 returned 0 after 1017036 usecs
[1]: https://github.com/01org/pm-graph/
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |pmenzel+bugs.freedesktop@mo | |lgen.mpg.de
--- Comment #1 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- Created attachment 140699 --> https://bugs.freedesktop.org/attachment.cgi?id=140699&action=edit HTML output of `sudo ./sleepgraph.py -config config/suspend-callgraph.cfg -maxdepth=5`
According to the trace, most of the time is spent in the functions below.
amdgpu_bo_evict_vram [amdgpu] (306.331 ms @ 74.694620) amdgpu_fence_driver_suspend [amdgpu] (0.023 ms @ 75.000953) amdgpu_device_ip_suspend [amdgpu] (694.390 ms @ 75.000977) amdgpu_bo_evict_vram [amdgpu] (24.217 ms @ 75.695369)
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Christian König ckoenig.leichtzumerken@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |NEEDINFO
--- Comment #2 from Christian König ckoenig.leichtzumerken@gmail.com --- There isn't much you can do here:
amdgpu_bo_evict_vram [amdgpu] (306.331 ms @ 74.694620)
This is evacuating the content of VRAM to RAM/disk to make sure we don't lose screen content while suspended.
amdgpu_fence_driver_suspend [amdgpu] (0.023 ms @ 75.000953)
Waiting for the evacuation to be completed.
amdgpu_device_ip_suspend [amdgpu] (694.390 ms @ 75.000977)
This is hardware teardown and rather interesting and the only point we could actually do something. Can you figure out what takes so long here?
amdgpu_bo_evict_vram [amdgpu] (24.217 ms @ 75.695369)
Again evacuating VRAM which was locked before because the hardware was still using it.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #3 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- (In reply to Christian König from comment #2)
There isn't much you can do here:
amdgpu_bo_evict_vram [amdgpu] (306.331 ms @ 74.694620)
This is evacuating the content of VRAM to RAM/disk to make sure we don't lose screen content while suspended.
I do not understand that. The integrated graphics device uses the system RAM as VRAM doesn’t it? So why does it have to be evicted at all? Also, I believe it’s 1 GB of VRAM. That means the speed would be 3 GB/s, where it should be much higher with DDR4 shouldn’t it?
amdgpu_fence_driver_suspend [amdgpu] (0.023 ms @ 75.000953)
Waiting for the evacuation to be completed.
amdgpu_device_ip_suspend [amdgpu] (694.390 ms @ 75.000977)
This is hardware teardown and rather interesting and the only point we could actually do something. Can you figure out what takes so long here?
I’ll try to figure that out.
amdgpu_bo_evict_vram [amdgpu] (24.217 ms @ 75.695369)
Again evacuating VRAM which was locked before because the hardware was still using it.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |enhancement
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #4 from Alex Deucher alexdeucher@gmail.com --- (In reply to Paul Menzel from comment #3)
(In reply to Christian König from comment #2)
There isn't much you can do here:
amdgpu_bo_evict_vram [amdgpu] (306.331 ms @ 74.694620)
This is evacuating the content of VRAM to RAM/disk to make sure we don't lose screen content while suspended.
I do not understand that. The integrated graphics device uses the system RAM as VRAM doesn’t it? So why does it have to be evicted at all? Also, I believe it’s 1 GB of VRAM. That means the speed would be 3 GB/s, where it should be much higher with DDR4 shouldn’t it?
It's not a problem with S3 (suspend to ram), but it is for S4 (suspend to disk) because power will be lost and the vram carve out area is not managed by the OS. Currently S3 and S4 share the same code paths, so they would need to be reworked to handle S3 and S4 differently. As for the time it takes, depending on how much system memory is in use, some stuff may have to be swapped out to disk to make room for all of the buffers in vram, so a lot of shuffling may need to take place.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #5 from Alex Deucher alexdeucher@gmail.com --- Created attachment 140702 --> https://bugs.freedesktop.org/attachment.cgi?id=140702&action=edit quick hack for s#
Here's a quick hack for S3, but it will need more work to not break S4 support as well.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Alex Deucher alexdeucher@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #140702|quick hack for s# |quick hack for S3 description| |
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEEDINFO |NEW
--- Comment #6 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- (In reply to Paul Menzel from comment #3)
(In reply to Christian König from comment #2)
[…]
amdgpu_device_ip_suspend [amdgpu] (694.390 ms @ 75.000977)
This is hardware teardown and rather interesting and the only point we could actually do something. Can you figure out what takes so long here?
I’ll try to figure that out.
I increased the maximum depth to 10, and according to the trace the loop in `gfx_v9_0_enter_rlc_safe_mode()` is the culprit. Also, in all the function is called three times.
static void gfx_v9_0_enter_rlc_safe_mode(struct amdgpu_device *adev) { uint32_t rlc_setting, data; unsigned i;
if (adev->gfx.rlc.in_safe_mode) return;
/* if RLC is not enabled, do nothing */ rlc_setting = RREG32_SOC15(GC, 0, mmRLC_CNTL); if (!(rlc_setting & RLC_CNTL__RLC_ENABLE_F32_MASK)) return;
if (adev->cg_flags & (AMD_CG_SUPPORT_GFX_CGCG | AMD_CG_SUPPORT_GFX_MGCG | AMD_CG_SUPPORT_GFX_3D_CGCG)) { data = RLC_SAFE_MODE__CMD_MASK; data |= (1 << RLC_SAFE_MODE__MESSAGE__SHIFT); WREG32_SOC15(GC, 0, mmRLC_SAFE_MODE, data);
/* wait for RLC_SAFE_MODE */ for (i = 0; i < adev->usec_timeout; i++) { if (!REG_GET_FIELD(SOC15_REG_OFFSET(GC, 0, mmRLC_SAFE_MODE), RLC_SAFE_MODE, CMD)) break; udelay(1); } adev->gfx.rlc.in_safe_mode = true; } }
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #7 from Michel Dänzer michel@daenzer.net --- (In reply to Paul Menzel from comment #6)
I increased the maximum depth to 10,
Can you attach the resulting HTML output, or is it too large?
and according to the trace the loop in `gfx_v9_0_enter_rlc_safe_mode()` is the culprit.
Can you find out if the loop times out or not?
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #8 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- (In reply to Michel Dänzer from comment #7)
(In reply to Paul Menzel from comment #6)
I increased the maximum depth to 10,
Can you attach the resulting HTML output, or is it too large?
With a max-depth of 10 it’s too large, and even with 16 GB memory the browser takes ages to render it. I can try to trim the trace file. It should be possible to filter stuff, but I haven’t figured out how to pass the device.
Though, you can easily try it yourself.
``` $ git clone https://github.com/01org/pm-graph $ cd pm-graph $ vim config/suspend-callgraph.cfg # change maxdepth to 10 $ sudo ./sleepgraph.py -config config/suspend-callgraph.py # wait $ ls -ltr # shows you the output directory ```
It normally easier to look at the trace file directly.
and according to the trace the loop in `gfx_v9_0_enter_rlc_safe_mode()` is the culprit.
Can you find out if the loop times out or not?
I’ll try.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #9 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- Created attachment 140713 --> https://bugs.freedesktop.org/attachment.cgi?id=140713&action=edit Trimmed ftrace output
Here is the trimmed ftrace output.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #10 from Michel Dänzer michel@daenzer.net --- Created attachment 140714 --> https://bugs.freedesktop.org/attachment.cgi?id=140714&action=edit drm/amdgpu: Fix RLC safe mode test in gfx_v9_0_enter_rlc_safe_mode
Does this patch help?
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #11 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- Created attachment 140734 --> https://bugs.freedesktop.org/attachment.cgi?id=140734&action=edit HTML output of `sudo ./sleepgraph.py -config config/suspend-callgraph.cfg` with filter for amdgpu
(In reply to Michel Dänzer from comment #10)
Created attachment 140714 [details] [review] drm/amdgpu: Fix RLC safe mode test in gfx_v9_0_enter_rlc_safe_mode
Does this patch help?
It looks like it.
Tested-by: Paul Menzel pmenzel@molgen.mpg.de
amdgpu @ 0000:38:00.0 {amdgpu} async_device (Total Suspend: 140.661 ms Total Resume: 1339.330 ms)
(Filtering with amdgpu works now, so I upload the HTML file.)
Now the remaining hot spots are listed below.
amdgpu_device_ip_suspend [amdgpu] (102.888 ms @ 534.324902) → … amdgpu_vcn_suspend [amdgpu] (23.804 ms @ 534.325065) → … drm_atomic_helper_suspend [drm_kms_helper] (53.051 ms @ 534.349412) → __drm_atomic_helper_disable_all.constprop.28 [drm_kms_helper] (52.919 ms @ 534.349543) → … psp_v10_0_ring_stop [amdgpu] (19.894 ms @ 534.406785) […] amdgpu_bo_evict_vram [amdgpu] (17.507 ms @ 534.427790) → fixed by *quick hack for S3* […] pci_set_power_state (20.071 ms @ 534.445402)
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #12 from Michel Dänzer michel@daenzer.net --- (In reply to Paul Menzel from comment #11)
Does this patch help?
It looks like it.
Tested-by: Paul Menzel pmenzel@molgen.mpg.de
Cool, thanks for testing. Looks like that cut down suspend time by ~600 ms. Would that and a solution for VRAM eviction (reducing suspend time to 140 ms) be enough to resolve this report?
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #13 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- (In reply to Michel Dänzer from comment #12)
(In reply to Paul Menzel from comment #11)
Does this patch help?
It looks like it.
Tested-by: Paul Menzel pmenzel@molgen.mpg.de
Cool, thanks for testing. Looks like that cut down suspend time by ~600 ms. Would that and a solution for VRAM eviction (reducing suspend time to 140 ms) be enough to resolve this report?
Yes, that would be enough. For the remaining optimization possibilities separate issues/tickets should be created.
I can also create a new issue/ticket for the VRAM eviction issue, if you like. Then this issue/ticket can be closed directly after your commit is submitted.
https://bugs.freedesktop.org/show_bug.cgi?id=107277
--- Comment #14 from Paul Menzel pmenzel+bugs.freedesktop@molgen.mpg.de --- For the record, bug #100941 [1] is the same for radeon (like Fusion devices).
[1]: https://bugs.freedesktop.org/show_bug.cgi?id=100941
https://bugs.freedesktop.org/show_bug.cgi?id=107277
Martin Peres martin.peres@free.fr changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |MOVED
--- Comment #15 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/454.
dri-devel@lists.freedesktop.org