https://bugzilla.kernel.org/show_bug.cgi?id=208981
--- Comment #9 from florian.laroche@googlemail.com ---
Hello,
On Wed, 14 Oct 2020 at 11:44, bugzilla-daemon@bugzilla.kernel.org wrote:
- Kernel 5.8.14 and 5.9 with mostly Gentoo kernel config
- AMD Ryzen 7 PRO 4750G CPU+iGPU
- ASRock A520M-ITX/ac mainboard + ECC UDIMM memory
The trace mentioned above disappeared when I updated the BIOS (v1.20 from 2020/9/18, which contains AGESA 1.0.8.0). However, I'm still not able to run ROCm
I have updated my motherboard Gigabyte B550I AORUS PRO AX to BIOS F10 from 09/18/2020 with AMD AGESA ComboV2 1.0.8.1.
The trace is still present, so this issue is still open for me.
OpenCL (tried various versions, including 3.7 and 3.8): the system either hangs or, if the program is killed early, dmesg shows
Evicting PASID 0x8001 queues
BTW, clinfo causes GPU resets, and leaves 99% GPU utilization, while dmesg shows something like
qcm fence wait loop timeout expired
The cp might be in an unrecoverable state due to an unsuccessful queues preemption
amdgpu: Failed to evict process queues
amdgpu: Failed to quiesce KFD
amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
[drm] free PSP TMP buffer
amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
...(and similarly for kernel 5.9.0)
This is probably off-topic, but it seems to be related to the amdgpu driver, and I don't know how to move forward (somebody reported that the ROCk 3.7 driver works well with the Renoir APU).
This all seems to be unrelated to my bug report.
best regards,
Florian La Roche