https://bugs.freedesktop.org/show_bug.cgi?id=110886
Bug ID: 110886
Summary: After S3 resume, kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:57:crtc-0] flip_done timed out
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: kai.heng.feng@canonical.com
System: HP ProBook 645 G4
APU: Ryzen 3 PRO 2300U
After system S3 resume, the system may freeze:
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:57:crtc-0] flip_done timed out
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:57:crtc-0] flip_done timed out
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:65:eDP-1] flip_done timed out
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:50:plane-3] flip_done timed out
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: WARNING: CPU: 1 PID: 1058 at drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:5580 amdgpu_dm_atomic_commit_tail+0x19f4/0x1a80 [amdgpu]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: Modules linked in: ccm nls_iso8859_1 amdgpu snd_hda_codec_conexant arc4 iwlmvm snd_hda_codec_generic amd_iommu_v2 ledtrig_audio snd_hda_codec_hdmi gpu_sched kvm_amd snd_hda_intel i2c_algo_bit snd_hda_codec ccp ttm snd_hwdep kvm snd_hda_core drm_kms_helper mac80211 snd_pcm irqbypass syscopyarea snd_seq sysfillrect iwlwifi snd_timer sysimgblt snd_seq_device snd fb_sys_fops drm crct10dif_pclmul crc32_pclmul soundcore cfg80211 ghash_clmulni_intel rtsx_pci_ms aesni_intel hp_wmi sparse_keymap k10temp wmi_bmof memstick aes_x86_64 ucsi_acpi glue_helper hp_accel typec_ucsi typec crypto_simd cryptd video hp_wireless wmi joydev input_leds lis3lv02d mac_hid input_polldev serio_raw sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 rtsx_pci_sdmmc psmouse i2c_piix4 ahci rtsx_pci libahci r8169 realtek i2c_hid hid
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: CPU: 1 PID: 1058 Comm: kworker/u32:6 Not tainted 5.2.0-rc1+ #2
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: Hardware name: HP HP ProBook 645 G4/8401, BIOS Q82 Ver. 01.07.01 05/06/2019
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: Workqueue: events_unbound async_run_entry_fn
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: RIP: 0010:amdgpu_dm_atomic_commit_tail+0x19f4/0x1a80 [amdgpu]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: Code: ff ff 8b b0 90 04 00 00 48 c7 c7 61 bc bf c0 e8 c2 0a b5 ff 0f b6 85 06 fe ff ff 88 85 08 fe ff ff 49 8b 45 08 e9 f9 f1 ff ff <0f> 0b e9 1d f3 ff ff 0f 0b 48 8b 06 0f b6 8e e0 01 00 00 bf 04 00
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: RSP: 0018:ffffb1e4c243b8e0 EFLAGS: 00010002
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: RAX: 0000000000000002 RBX: 0000000000000202 RCX: ffff9a8fd18b6970
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffff9a8fd02a5958
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: RBP: ffffb1e4c243bb20 R08: ffffb1e4c243b7f4 R09: 0000000000000000
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: R10: 0000000000000000 R11: ffffb1e4c243b838 R12: ffff9a8fe2ba0400
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: R13: ffff9a8fe1495f80 R14: ffff9a8fd18b6800 R15: ffff9a8fd2280000
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: FS: 0000000000000000(0000) GS:ffff9a8fe7c40000(0000) knlGS:0000000000000000
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: CR2: 0000000000000000 CR3: 000000020f434000 CR4: 00000000003406e0
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: Call Trace:
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: commit_tail+0x42/0x70 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ? commit_tail+0x42/0x70 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_atomic_helper_commit+0x113/0x120 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: amdgpu_dm_atomic_commit+0xb1/0xf0 [amdgpu]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_atomic_commit+0x4a/0x50 [drm]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: restore_fbdev_mode_atomic+0x1bf/0x1d0 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: restore_fbdev_mode+0x4e/0x160 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ? _cond_resched+0x19/0x30
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_fb_helper_restore_fbdev_mode_unlocked+0x4e/0xa0 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_fb_helper_hotplug_event.part.41+0x97/0xc0 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: drm_kms_helper_hotplug_event+0x2a/0x40 [drm_kms_helper]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: amdgpu_device_resume+0x319/0x3a0 [amdgpu]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: amdgpu_pmops_resume+0x31/0x60 [amdgpu]
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: pci_pm_resume+0x6d/0xa0
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ? pci_pm_suspend_late+0x40/0x40
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: dpm_run_callback+0x5b/0x150
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: device_resume+0xb8/0x200
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: async_resume+0x1d/0x30
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: async_run_entry_fn+0x3c/0x150
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: process_one_work+0x20f/0x410
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: worker_thread+0x34/0x400
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: kthread+0x120/0x140
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ? process_one_work+0x410/0x410
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ? __kthread_parkme+0x70/0x70
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ret_from_fork+0x22/0x40
Jun 11 01:40:21 u-HP-ProBook-645-G4 kernel: ---[ end trace 55daf5798b2f5f1a ]---
Tested on the latest amdgpu/amd-staging-drm-next, which is at commit 40cc64619a2580b26f924bcabdefd555e7831a14 as of now.
--- Comment #1 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 144498 --> https://bugs.freedesktop.org/attachment.cgi?id=144498&action=edit Full kernel log
--- Comment #2 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 144502 --> https://bugs.freedesktop.org/attachment.cgi?id=144502&action=edit Another kind of fail
Jun 11 03:02:41 u-HP-ProBook-645-G4 kernel: [drm] psp command failed and response status is (-65529)
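As a reading aid: the status in this message is printed as a signed decimal. Reinterpreted as an unsigned 32-bit value, -65529 is 0xFFFF0007, which may be easier to compare against firmware status codes. The thread does not say what this code means, so the sketch below only shows the conversion. The helper names status_as_u32 and status_as_hex are hypothetical and not part of amdgpu.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of amdgpu: reinterpret a status that
 * was logged as a signed decimal as the raw 32-bit value. The
 * conversion is well defined in C (reduction modulo 2^32). */
uint32_t status_as_u32(int32_t status)
{
    return (uint32_t)status;
}

/* Format the raw value as "0xXXXXXXXX" into buf and return buf.
 * buf must hold at least 11 bytes (10 characters plus the NUL). */
const char *status_as_hex(int32_t status, char *buf, size_t len)
{
    assert(buf != NULL && len >= 11);
    snprintf(buf, len, "0x%08" PRIX32, status_as_u32(status));
    return buf;
}
```

Calling status_as_hex(-65529, buf, sizeof buf) with an 11-byte buffer fills it with the hex form of the logged status.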
--- Comment #3 from Alex Deucher alexdeucher@gmail.com --- Is this a regression? If so, can you bisect?
--- Comment #4 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Alex Deucher from comment #3)
> Is this a regression? If so, can you bisect?
No, this is not a regression.
This issue (S3 resume failure) also happens on previous kernel versions, but without any stack trace logged. On amd-staging-drm-next we can observe the same issue along with a stack trace.
--- Comment #5 from Alex Deucher alexdeucher@gmail.com --- Does disabling the IOMMU help?
--- Comment #6 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 145044 --> https://bugs.freedesktop.org/attachment.cgi?id=145044&action=edit failed log when iommu is disabled.
--- Comment #7 from Kai-Heng Feng kai.heng.feng@canonical.com --- I also tried disabling GFXOFF but the same issue still happens:
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index a24beaa4fb01..62a8394b1f5f 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -173,6 +173,7 @@ int hwmgr_early_init(struct pp_hwmgr *hwmgr)
 	case AMDGPU_FAMILY_RV:
 		switch (hwmgr->chip_id) {
 		case CHIP_RAVEN:
+			hwmgr->feature_mask &= ~PP_GFXOFF_MASK;
 			hwmgr->od_enabled = false;
 			hwmgr->smumgr_funcs = &smu10_smu_funcs;
 			smu10_init_function_pointers(hwmgr);
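For readers unfamiliar with the idiom, the added line in the patch above clears one feature bit and leaves the others alone. A minimal user-space sketch of that operation follows. The name clear_feature and the value of EXAMPLE_GFXOFF_BIT are assumptions for illustration only; the real PP_GFXOFF_MASK is defined in the kernel's amdgpu headers and its value is not quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative value only (assumption); the real PP_GFXOFF_MASK comes
 * from the kernel's amdgpu headers. */
#define EXAMPLE_GFXOFF_BIT 0x8000u

/* Return mask with the given feature bit cleared and all other bits
 * unchanged, the same "&= ~BIT" idiom the patch uses. */
uint32_t clear_feature(uint32_t mask, uint32_t bit)
{
    assert(bit != 0);
    return mask & ~bit;
}
```

Clearing a bit that is already clear is a no-op, so applying the change twice is harmless.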
--- Comment #8 from Andrey Grodzovsky andrey.grodzovsky@amd.com --- (In reply to Kai-Heng Feng from comment #6)
> Created attachment 145044 [details]
> failed log when iommu is disabled.
What was the failure with IOMMU disabled? Is it the same as with IOMMU enabled? In the log I only see PSP errors on resume. Can you confirm that this is the only failure/error you observed in the log in that use case?
Can you please provide your FW versions by running cat /sys/kernel/debug/dri/0/amdgpu_firmware_info
--- Comment #9 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Andrey Grodzovsky from comment #8)
> (In reply to Kai-Heng Feng from comment #6)
> > Created attachment 145044 [details]
> > failed log when iommu is disabled.
> What was the failure with IOMMU disabled?
Blanked screen. Graphics no longer works.
> Is it the same as with IOMMU enabled?
Yes.
> In the log I only see PSP errors on resume. Can you confirm that this is the only failure/error you observed in the log in that use case?
Yes. I haven't seen "[drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:57:crtc-0] flip_done timed out" for a while.
Now it always shows PSP fail.
> Can you please provide your FW versions by running cat /sys/kernel/debug/dri/0/amdgpu_firmware_info
VCE feature version: 0, firmware version: 0x00000000
UVD feature version: 0, firmware version: 0x00000000
MC feature version: 0, firmware version: 0x00000000
ME feature version: 40, firmware version: 0x00000099
PFP feature version: 40, firmware version: 0x000000ae
CE feature version: 40, firmware version: 0x0000004d
RLC feature version: 1, firmware version: 0x00000213
RLC SRLC feature version: 1, firmware version: 0x00000001
RLC SRLG feature version: 1, firmware version: 0x00000001
RLC SRLS feature version: 1, firmware version: 0x00000001
MEC feature version: 40, firmware version: 0x0000018b
MEC2 feature version: 40, firmware version: 0x0000018b
SOS feature version: 0, firmware version: 0x00000000
ASD feature version: 0, firmware version: 0x001ad4d4
TA XGMI feature version: 0, firmware version: 0x00000000
TA RAS feature version: 0, firmware version: 0x00000000
SMC feature version: 0, firmware version: 0x00001e4f
SDMA0 feature version: 41, firmware version: 0x000000a9
VCN feature version: 0, firmware version: 0x0110901c
DMCU feature version: 0, firmware version: 0x00000000
VBIOS version: SWBRT32481.001
--- Comment #10 from Samantha McVey samantham@posteo.net --- I am getting this same issue (at least I believe the same). It is in the 5.2 series but not in the 5.1 series of the kernel. If needed I can post my logs. I have Lenovo A485 w/ 2700U
--- Comment #11 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Samantha McVey from comment #10)
> I am getting this same issue (at least I believe the same). It is in the 5.2 series but not in the 5.1 series of the kernel. If needed I can post my logs. I have Lenovo A485 w/ 2700U
Can you please build a kernel from branch [1], reproduce the issue, and attach the output of `journalctl -b -1 -k` so we can check whether it is really the same issue?
[1] https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
--- Comment #12 from Kai-Heng Feng kai.heng.feng@canonical.com ---
> Now it always shows PSP fail.
I've dug up more info about this issue. It always times out in psp_cmd_submit_buf(). Particularly, this code section:
	while (*((unsigned int *)psp->fence_buf) != index) {
		if (--timeout == 0)
			break;
		msleep(1);
	}
psp->fence_buf stays stuck at 406 while index is stuck at 407, so the loop eventually times out. This _always_ happens on the 27th S3 cycle, and the 28th S3 attempt freezes the whole system.
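To make the reported behaviour concrete, here is a user-space model of the wait loop quoted above. It is only a sketch: wait_for_fence is a hypothetical name, the iteration budget is whatever the caller passes, and the kernel's msleep(1) is reduced to a comment so the model runs instantly. With the fence stuck at 406 and the expected index at 407, the loop can only exit through the timeout branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space model (assumption, not driver code) of the polling loop.
 * fence points at the value the firmware is expected to write back and
 * index is the value being waited for. Returns true if the fence
 * matched index before the iteration budget ran out. */
bool wait_for_fence(const volatile uint32_t *fence, uint32_t index,
                    int timeout)
{
    assert(timeout > 0);
    while (*fence != index) {
        if (--timeout == 0)
            return false;   /* fence never advanced: give up */
        /* the kernel code sleeps for 1 ms here via msleep(1) */
    }
    return true;
}
```

In this model a fence value of 406 with index 407 returns false after the budget is used up, while a fence already at 407 returns true immediately.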
--- Comment #13 from Samantha McVey samantham@posteo.net --- Created attachment 145085 --> https://bugs.freedesktop.org/attachment.cgi?id=145085&action=edit amd-staging-drm-net dmesg log
--- Comment #14 from Samantha McVey samantham@posteo.net --- Created attachment 145086 --> https://bugs.freedesktop.org/attachment.cgi?id=145086&action=edit amd-staging-drm-next xorg log
--- Comment #15 from Samantha McVey samantham@posteo.net --- I have uploaded my dmesg log and xorg log from amd-staging-drm-next
--- Comment #16 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Samantha McVey from comment #13)
> Created attachment 145085 [details]
> amd-staging-drm-net dmesg log
Doesn't look like the same one.
--- Comment #17 from Alex Deucher alexdeucher@gmail.com --- Does this system support conventional S3 or is it a reduced ACPI platform that only supports suspend to idle?
--- Comment #18 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Alex Deucher from comment #17)
> Does this system support conventional S3 or is it a reduced ACPI platform that only supports suspend to idle?
This system defaults to S3, and the issue happens under S3. Is there any first-gen Raven Ridge system that supports s2idle?
Kai-Heng Feng kai.heng.feng@canonical.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|After S3 resume, kernel:    |After S3 resume, kernel:
                   |[drm:drm_atomic_helper_wait |[drm] psp command failed
                   |_for_flip_done              |and response status is
                   |[drm_kms_helper]] *ERROR*   |(-65529) at 27th time of
                   |[CRTC:57:crtc-0] flip_done  |S3. 28th time of S3 freeze
                   |timed out                   |the system.
--- Comment #19 from Andrey Grodzovsky andrey.grodzovsky@amd.com --- (In reply to Kai-Heng Feng from comment #9)
Can you please confirm that the issue happens regardless of whether graphics is enabled? Boot the system in console mode and verify that you still observe the problem.
(In reply to Kai-Heng Feng from comment #12)
> > Now it always shows PSP fail.
> I've dug up more info about this issue. It always times out in psp_cmd_submit_buf(). Particularly, this code section:
> 	while (*((unsigned int *)psp->fence_buf) != index) {
> 		if (--timeout == 0)
> 			break;
> 		msleep(1);
> 	}
> psp->fence_buf stays stuck at 406 while index is stuck at 407, so the loop eventually times out. This _always_ happens on the 27th S3 cycle, and the 28th S3 attempt freezes the whole system.
Does it also happen when there is no acceleration in the system? I mean, if you do S3 cycles from console mode?
--- Comment #20 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Andrey Grodzovsky from comment #19)
> Can you please confirm that the issue happens regardless of whether graphics is enabled? Boot the system in console mode and verify that you still observe the problem.
I guess you mean without a graphical session? Yes, I already tested that.
1. If amdgpu.ko is loaded, the issue happens under both console and graphical sessions.
2. If amdgpu.ko is not loaded, the issue doesn't happen, regardless of console or graphical session.
> Does it also happen when there is no acceleration in the system? I mean, if you do S3 cycles from console mode?
Please refer to point 2 above.
--- Comment #21 from Andrey Grodzovsky andrey.grodzovsky@amd.com --- In fact, please rebase onto the latest drm-next from here:
https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
There are 2 changes by Alex in the communication with the PSP which might help:
drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response
drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp
See if the PSP errors go away with that.
--- Comment #22 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Andrey Grodzovsky from comment #21)
> In fact, please rebase onto the latest drm-next from here:
> https://cgit.freedesktop.org/~agd5f/linux/log/?h=amd-staging-drm-next
> There are 2 changes by Alex in the communication with the PSP which might help:
> drm/amdgpu/psp: invalidate the hdp read cache before reading the psp response
> drm/amdgpu/psp: flush HDP write fifo after submitting cmds to the psp
> See if the PSP errors go away with that.
A slightly different error message still appeared after the 27th S3, and the 28th S3 attempt froze the system:
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: [drm:psp_hw_start.cold [amdgpu]] *ERROR* PSP load asd failed!
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP resume failed
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -22
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: [drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-22).
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -22
Sep 28 05:38:44 u-HP-ProBook-645-G4 kernel: PM: Device 0000:04:00.0 failed to resume async: error -22
$ journalctl -b -1 -k | grep "suspend entry (deep)" | wc -l
28
--- Comment #23 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 145576 --> https://bugs.freedesktop.org/attachment.cgi?id=145576&action=edit journalctl last boot kernel message
--- Comment #24 from Andrey Grodzovsky andrey.grodzovsky@amd.com --- (In reply to Kai-Heng Feng from comment #23)
> Created attachment 145576 [details]
> journalctl last boot kernel message
Can you retry with the latest FW from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
and also boot the kernel with drm.debug=1? There seems to be a failure in PSP command submission during FW loading, but the actual failure code is now only printed at debug log level.
--- Comment #25 from Kai-Heng Feng kai.heng.feng@canonical.com --- (In reply to Andrey Grodzovsky from comment #24)
> (In reply to Kai-Heng Feng from comment #23)
> > Created attachment 145576 [details]
> > journalctl last boot kernel message
> Can you retry with the latest FW from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
Still the same issue.
> and also boot the kernel with drm.debug=1? There seems to be a failure in PSP command submission during FW loading, but the actual failure code is now only printed at debug log level.
I can reproduce the issue on the latest firmware ("amdgpu: update vega20 ucode for 19.30") and the latest amd-staging-drm-next ("drm/amdgpu: remove redundant variable r and redundant return statement").
I don't think that repeatedly trying the latest kernel/firmware is getting us anywhere. If you need physical hardware to debug, please just let me know.
--- Comment #26 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 145666 --> https://bugs.freedesktop.org/attachment.cgi?id=145666&action=edit PSP failed with drm.debug=1
--- Comment #27 from Kai-Heng Feng kai.heng.feng@canonical.com --- Created attachment 145667 --> https://bugs.freedesktop.org/attachment.cgi?id=145667&action=edit ring test failed with drm.debug=1
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #28 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/822.