https://bugs.freedesktop.org/show_bug.cgi?id=107880
Bug ID: 107880
Summary: Regression: System fails to boot on raven ridge 4.18 vs 4.19 rc
Product: DRI
Version: DRI git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: blocker
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: marvin.damschen@gullz.de
Created attachment 141500
  --> https://bugs.freedesktop.org/attachment.cgi?id=141500&action=edit
Successful boot with 4.18.7 on Raven Ridge
System boots fine with kernel 4.18 on a Raven Ridge (AMD Ryzen 5 2500U, Lenovo E485, latest firmware from linux-firmware.git), but boot fails with kernel 4.19 (tested rc2 and rc3). System hangs after "fb: switching to amdgpudrmfb from EFI VGA". I am unable to obtain any logs of the crash (LUKS encryption might be the reason?). I will attach a log of a working boot with 4.18.7, please let me know how to provide more info.
Thank you Marvin
Michel Dänzer michel@daenzer.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #141500|text/x-log                  |text/plain
          mime type|                            |
--- Comment #1 from Michel Dänzer michel@daenzer.net ---
(In reply to Marvin Damschen from comment #0)
> System hangs after "fb: switching to amdgpudrmfb from EFI VGA".
How long have you waited for? E.g. if a microcode file is missing, the attempt to load it can hang for one or several minutes before timing out.
> I am unable to obtain any logs of the crash (LUKS encryption might be the reason?).
One possibility is to prevent the driver from loading by passing
modprobe.blacklist=amdgpu
on the kernel command line, then you can try manually loading it with
sudo modprobe amdgpu
and should get the full dmesg output.
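The blacklist step above can be verified before loading the module by hand. The following is a minimal sketch in Python, not part of the original advice: the helper name `blacklisted_modules` is hypothetical, and it assumes that `modprobe.blacklist=` accepts a comma-separated list of module names. On a running system the input string would come from /proc/cmdline.

```python
def blacklisted_modules(cmdline):
    """Return the set of module names given in modprobe.blacklist= options."""
    modules = set()
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == "modprobe.blacklist":
            # Assumption: several modules may be listed, separated by commas.
            modules.update(name for name in value.split(",") if name)
    return modules


if __name__ == "__main__":
    # Example string only; a real check would read /proc/cmdline instead.
    example = "ro quiet modprobe.blacklist=amdgpu"
    if "amdgpu" in blacklisted_modules(example):
        print("amdgpu is blacklisted; load it manually with: sudo modprobe amdgpu")
```

If amdgpu is not in the returned set, the driver was probably loaded automatically during boot and the manual modprobe would not reproduce the hang.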
--- Comment #2 from Marvin Damschen marvin.damschen@gullz.de ---
(In reply to Michel Dänzer from comment #1)
> (In reply to Marvin Damschen from comment #0)
> > System hangs after "fb: switching to amdgpudrmfb from EFI VGA".
> How long have you waited for? E.g. if a microcode file is missing, the attempt to load it can hang for one or several minutes before timing out.
Waited for ~5min now, but nothing changed.
> > I am unable to obtain any logs of the crash (LUKS encryption might be the reason?).
> One possibility is to prevent the driver from loading by passing
> modprobe.blacklist=amdgpu
> on the kernel command line, then you can try manually loading it with
> sudo modprobe amdgpu
> and should get the full dmesg output.
Thank you, this worked. Full output is attached, but appears fine. Still, the video output freezes.
--- Comment #3 from Marvin Damschen marvin.damschen@gullz.de ---
Created attachment 141502
  --> https://bugs.freedesktop.org/attachment.cgi?id=141502&action=edit
Modprobe amdgpu freezes video output with 4.19-rc3
--- Comment #4 from Michel Dänzer michel@daenzer.net ---
Looks like there may be an issue with the VCN microcode loading. Does
amdgpu.ip_block_mask=0xff
on the kernel command line avoid the problem?
Can you bisect?
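To illustrate why a mask of 0xff could sidestep a VCN problem, here is a small Python sketch. It assumes that bit i of amdgpu.ip_block_mask enables the i-th IP block the driver registers. The block names and their order in `EXAMPLE_BLOCKS` are hypothetical and chosen for illustration only; the real list depends on the ASIC and the kernel version.

```python
# Hypothetical IP block order, for illustration only.
EXAMPLE_BLOCKS = ["common", "gmc", "ih", "psp", "smu", "dm", "gfx", "sdma", "vcn"]


def enabled_blocks(mask, blocks):
    """Return the blocks whose bit is set in mask (bit i <-> blocks[i])."""
    return [name for i, name in enumerate(blocks) if mask & (1 << i)]


if __name__ == "__main__":
    # 0xff sets bits 0..7 only, so a ninth block (index 8) stays disabled.
    print(enabled_blocks(0xFF, EXAMPLE_BLOCKS))
```

Under these assumptions 0xff keeps the first eight blocks and skips anything registered after them, which would leave VCN uninitialised if it is the ninth block.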
--- Comment #5 from Marvin Damschen marvin.damschen@gullz.de ---
(In reply to Michel Dänzer from comment #4)
> Looks like there may be an issue with the VCN microcode loading. Does
> amdgpu.ip_block_mask=0xff
> on the kernel command line avoid the problem?
It does! dmesg contains a lot of call traces though (I will attach).
> Can you bisect?
I can, but I will probably only find time by the end of the week.
Thank you
Marvin
Marvin Damschen marvin.damschen@gullz.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #141500|0                           |1
        is obsolete|                            |
--- Comment #6 from Marvin Damschen marvin.damschen@gullz.de ---
Created attachment 141503
  --> https://bugs.freedesktop.org/attachment.cgi?id=141503&action=edit
4.19-rc3 with amdgpu.ip_block_mask=0xff
--- Comment #7 from jamesz@amd.com jamesz@amd.com ---
The dmesg shows "Found VCN firmware Version: 1.24 Family ID: 18", which is really old. Please update to the latest VCN firmware for Raven (1.73): https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/....
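The version check described in this comment can be scripted. The sketch below is in Python and assumes the dmesg line has exactly the form quoted above, and that the version is a major.minor pair that compares numerically. The function name `vcn_firmware_version` and the constant `MINIMUM` are hypothetical.

```python
import re

# Matches the line quoted in the comment, e.g.
# "Found VCN firmware Version: 1.24 Family ID: 18"
FW_LINE = re.compile(r"Found VCN firmware Version: (\d+)\.(\d+)")

# Version named in the comment as the latest for Raven at the time.
MINIMUM = (1, 73)


def vcn_firmware_version(dmesg_text):
    """Return (major, minor) from the first matching dmesg line, or None."""
    match = FW_LINE.search(dmesg_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = "[drm] Found VCN firmware Version: 1.24 Family ID: 18"
    version = vcn_firmware_version(sample)
    if version is not None and version < MINIMUM:
        print("VCN firmware %d.%d is older than %d.%d" % (version + MINIMUM))
```

Comparing tuples of integers rather than strings avoids ordering mistakes such as "1.9" sorting after "1.24".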
Marvin Damschen marvin.damschen@gullz.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID
--- Comment #8 from Marvin Damschen marvin.damschen@gullz.de ---
I overlooked that the Ubuntu kernel builds bring their own firmware for each version. In my case, I had the recent firmware in /lib/firmware/amdgpu, but the files in /lib/firmware/4.19.0-041900rc3-generic/amdgpu/ were actually used. I moved the files accordingly, and 4.19-rc3 now boots perfectly fine without any extra parameters. Thank you, and sorry for the trouble.
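The shadowing described in this comment can be checked with a short Python sketch. It assumes the kernel firmware loader's default search order (the updates directories first, then the kernel-release directory, then /lib/firmware itself) and ignores any custom firmware path or initramfs copy. The helper names are hypothetical, and amdgpu/raven_vcn.bin is assumed to be the relevant file name.

```python
import os


def firmware_candidates(name, release, root="/lib/firmware"):
    """List the paths that would be tried for a firmware file, in order."""
    return [
        os.path.join(root, "updates", release, name),
        os.path.join(root, "updates", name),
        os.path.join(root, release, name),
        os.path.join(root, name),
    ]


def resolve_firmware(name, release, root="/lib/firmware"):
    """Return the first candidate path that exists, or None."""
    for path in firmware_candidates(name, release, root):
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    # os.uname().release gives the running kernel's release string on Linux.
    print(resolve_firmware("amdgpu/raven_vcn.bin", os.uname().release))
```

If the printed path contains the kernel release, a per-kernel copy is taking precedence over the files in /lib/firmware/amdgpu.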
--- Comment #9 from jamesz@amd.com jamesz@amd.com ---
Hi Marvin,
That is great! I would like to find out where this old VCN firmware came from.
Did you previously install an old AMD ROCm package on this system?
Best Regards!
James Zhu
--- Comment #10 from Marvin Damschen marvin.damschen@gullz.de ---
(In reply to jamesz@amd.com from comment #9)
> Did you install old AMD ROCm package on this system before?
Yes, I did. I now believe that removing all traces of rocm-dkms eliminated the root of the problem.
Best regards Marvin
--- Comment #11 from jamesz@amd.com jamesz@amd.com ---
I think the latest ROCm package has fixed this issue. You are welcome to try it.
Best Regards!
James Zhu
--- Comment #12 from Marvin Damschen marvin.damschen@gullz.de ---
rocm-dkms (http://repo.radeon.com/rocm/apt/debian/pool/main/r/rock-dkms/rock-dkms_1.8-1...) currently still contains the old firmware. I will open an issue on ROCm's GitHub repo.
Best regards Marvin