https://bugs.freedesktop.org/show_bug.cgi?id=108585
Bug ID: 108585
Summary: *ERROR* hw_init of IP block <gfx_v8_0> failed -22
Product: DRI
Version: unspecified
Hardware: PowerPC
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: dan@danny.cz
CC: bcrocker@redhat.com
Created attachment 142253
  --> https://bugs.freedesktop.org/attachment.cgi?id=142253&action=edit
full dmesg output
The amdgpu driver fails to initialize a Radeon Pro WX 4100 on my Talos POWER9 system with kernel 4.19 (GA). The problem does not occur with 4.19-rc8 or earlier.
...
[ 2.421393] [drm] amdgpu kernel modesetting enabled.
[ 2.421512] amdgpu 0000:01:00.0: enabling device (0540 -> 0542)
[ 2.421732] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67E3 0x1002:0x0B0D 0x00).
[ 2.421776] [drm] register mmio base: 0x00000000
[ 2.421781] [drm] register mmio size: 262144
[ 2.421787] [drm] PCI I/O BAR is not found.
[ 2.421798] [drm] add ip block number 0 <vi_common>
[ 2.421801] [drm] add ip block number 1 <gmc_v8_0>
[ 2.421805] [drm] add ip block number 2 <tonga_ih>
[ 2.421808] [drm] add ip block number 3 <powerplay>
[ 2.421811] [drm] add ip block number 4 <dce_v11_0>
[ 2.421814] [drm] add ip block number 5 <gfx_v8_0>
[ 2.421818] [drm] add ip block number 6 <sdma_v3_0>
[ 2.421821] [drm] add ip block number 7 <uvd_v6_0>
[ 2.421824] [drm] add ip block number 8 <vce_v3_0>
[ 2.421837] [drm] UVD is enabled in VM mode
[ 2.421840] [drm] UVD ENC is enabled in VM mode
[ 2.421845] [drm] VCE enabled in VM mode
[ 2.609475] md/raid1:md127: active with 2 out of 2 mirrors
[ 2.625800] md127: detected capacity change from 0 to 481708474368
[ 2.627770] md/raid1:md126: active with 2 out of 2 mirrors
[ 2.643643] md126: detected capacity change from 0 to 1072693248
[ 2.769520] usb 1-4: new high-speed USB device number 4 using xhci_hcd
[ 2.769550] ATOM BIOS: 113-D0150600-103
[ 2.769747] [drm] vm size is 256 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
[ 2.769846] pci 0000:01 : [PE# 00] pseudo-bypass sizes: tracker 32800 bitmap 8192 TCEs 65536
[ 2.769851] pci 0000:01 : [PE# 00] TCE tables configured for pseudo-bypass
[ 2.769903] amdgpu 0000:01:00.0: BAR 2: releasing [mem 0x6000010000000-0x60000101fffff 64bit pref]
[ 2.769907] amdgpu 0000:01:00.0: BAR 0: releasing [mem 0x6000000000000-0x600000fffffff 64bit pref]
[ 2.769939] pci 0000:00:00.0: BAR 15: releasing [mem 0x6000000000000-0x6003fbff0ffff 64bit pref]
[ 2.769956] pci 0000:00:00.0: BAR 15: assigned [mem 0x6000000000000-0x600017fffffff 64bit pref]
[ 2.769961] amdgpu 0000:01:00.0: BAR 0: assigned [mem 0x6000000000000-0x60000ffffffff 64bit pref]
[ 2.769972] amdgpu 0000:01:00.0: BAR 2: assigned [mem 0x6000100000000-0x60001001fffff 64bit pref]
[ 2.770004] pci 0000:00:00.0: PCI bridge to [bus 01]
[ 2.770009] pci 0000:00:00.0: bridge window [mem 0x600c000000000-0x600c07fefffff]
[ 2.770015] pci 0000:00:00.0: bridge window [mem 0x6000000000000-0x6003fbff0ffff 64bit pref]
[ 2.770066] amdgpu 0000:01:00.0: VRAM: 4096M 0x000000F400000000 - 0x000000F4FFFFFFFF (4096M used)
[ 2.770069] amdgpu 0000:01:00.0: GART: 256M 0x0000000000000000 - 0x000000000FFFFFFF
[ 2.770075] [drm] Detected VRAM RAM=4096M, BAR=4096M
[ 2.770077] [drm] RAM width 128bits GDDR5
[ 2.770162] [TTM] Zone kernel: Available graphics memory: 32717248 kiB
[ 2.770165] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 2.770166] [TTM] Initializing pool allocator
[ 2.771771] [drm] amdgpu: 4096M of VRAM memory ready
[ 2.771774] [drm] amdgpu: 4096M of GTT memory ready.
[ 2.771790] [drm] GART: num cpu pages 4096, num gpu pages 65536
[ 2.771839] [drm] PCIE GART of 256M enabled (table at 0x000000F4008D0000).
[ 2.771911] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 2.771913] [drm] Driver supports precise vblank timestamp query.
[ 2.772311] [drm] AMDGPU Display Connectors
[ 2.772313] [drm] Connector 0:
[ 2.772315] [drm] DP-1
[ 2.772316] [drm] HPD5
[ 2.772318] [drm] DDC: 0x4868 0x4868 0x4869 0x4869 0x486a 0x486a 0x486b 0x486b
[ 2.772320] [drm] Encoders:
[ 2.772322] [drm] DFP1: INTERNAL_UNIPHY1
[ 2.772323] [drm] Connector 1:
[ 2.772325] [drm] DP-2
[ 2.772326] [drm] HPD4
[ 2.772328] [drm] DDC: 0x486c 0x486c 0x486d 0x486d 0x486e 0x486e 0x486f 0x486f
[ 2.772330] [drm] Encoders:
[ 2.772332] [drm] DFP2: INTERNAL_UNIPHY1
[ 2.772333] [drm] Connector 2:
[ 2.772335] [drm] DP-3
[ 2.772336] [drm] HPD3
[ 2.772338] [drm] DDC: 0x4870 0x4870 0x4871 0x4871 0x4872 0x4872 0x4873 0x4873
[ 2.772340] [drm] Encoders:
[ 2.772341] [drm] DFP3: INTERNAL_UNIPHY
[ 2.772343] [drm] Connector 3:
[ 2.772345] [drm] DP-4
[ 2.772346] [drm] HPD2
[ 2.772348] [drm] DDC: 0x4874 0x4874 0x4875 0x4875 0x4876 0x4876 0x4877 0x4877
[ 2.772350] [drm] Encoders:
[ 2.772351] [drm] DFP4: INTERNAL_UNIPHY
[ 2.772477] [drm] Chained IB support enabled!
[ 2.773607] [drm] Found UVD firmware Version: 1.130 Family ID: 16
[ 2.775588] [drm] Found VCE firmware Version: 53.26 Binary ID: 3
[ 2.780989] amdgpu: [powerplay] dpm has been enabled
[ 2.990665] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[ 2.990695] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -22
[ 2.990698] amdgpu 0000:01:00.0: amdgpu_device_ip_init failed
[ 2.990701] amdgpu 0000:01:00.0: Fatal error during GPU init
[ 2.990703] [drm] amdgpu: finishing device.
[ 3.833155] ------------[ cut here ]------------
[ 3.833157] Memory manager not clean during takedown.
[ 3.833188] WARNING: CPU: 0 PID: 338 at drivers/gpu/drm/drm_mm.c:950 drm_mm_takedown+0x3c/0x60 [drm]
[ 3.833191] Modules linked in: raid1 amdgpu(+) mfd_core chash i2c_algo_bit gpu_sched drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_vpmsum tg3 aacraid drm_panel_orientation_quirks i2c_core
[ 3.833204] CPU: 0 PID: 338 Comm: kworker/0:2 Not tainted 4.19.0-1.fc30.op.1.ppc64le #1
[ 3.833210] Workqueue: events work_for_cpu_fn
[ 3.833213] NIP: c00800000cdfea14 LR: c00800000cdfea10 CTR: c0000000006ff6e0
[ 3.833215] REGS: c0000007f85d74d0 TRAP: 0700 Not tainted (4.19.0-1.fc30.op.1.ppc64le)
[ 3.833217] MSR: 9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 44002222 XER: 20040000
[ 3.833224] CFAR: c000000000119b04 IRQMASK: 0
GPR00: c00800000cdfea10 c0000007f85d7750 c00800000ce6f200 0000000000000029
GPR04: 0000000000000001 0000000000000399 ffffffffffffffff 0000000000000000
GPR08: 0000000000000007 0000000000000007 0000000000000001 0769077207750764
GPR12: 0000000000002000 c000000001820000 c000000000148e68 c0002006d7c997c0
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 fffffffffffffef7 c0000007e02b3060
GPR24: c0000007e02b3080 c0000007e02b3088 c0000007e02b3078 0000000000000000
GPR28: 0000000000000000 c0000007e02b2980 c0000007e02b29a0 c0000007f9f38300
[ 3.833258] NIP [c00800000cdfea14] drm_mm_takedown+0x3c/0x60 [drm]
[ 3.833265] LR [c00800000cdfea10] drm_mm_takedown+0x38/0x60 [drm]
[ 3.833267] Call Trace:
[ 3.833275] [c0000007f85d7750] [c00800000cdfea10] drm_mm_takedown+0x38/0x60 [drm] (unreliable)
[ 3.833307] [c0000007f85d77b0] [c00800000dba9058] amdgpu_vram_mgr_fini+0x40/0xb0 [amdgpu]
[ 3.833313] [c0000007f85d77e0] [c00800000cf15904] ttm_bo_clean_mm+0x10c/0x1a0 [ttm]
[ 3.833341] [c0000007f85d7860] [c00800000db7a35c] amdgpu_ttm_fini+0x94/0x180 [amdgpu]
[ 3.833370] [c0000007f85d78e0] [c00800000db7d1f8] amdgpu_bo_fini+0x20/0x40 [amdgpu]
[ 3.833404] [c0000007f85d7900] [c00800000dc0ce50] gmc_v8_0_sw_fini+0x58/0x98 [amdgpu]
[ 3.833440] [c0000007f85d7930] [c00800000dd55718] amdgpu_device_fini+0x3c4/0x628 [amdgpu]
[ 3.833469] [c0000007f85d79e0] [c00800000db67b04] amdgpu_driver_unload_kms+0x6c/0x100 [amdgpu]
[ 3.833496] [c0000007f85d7a10] [c00800000db67d84] amdgpu_driver_load_kms+0x1ec/0x280 [amdgpu]
[ 3.833504] [c0000007f85d7a90] [c00800000cdfa830] drm_dev_register+0x1a8/0x270 [drm]
[ 3.833533] [c0000007f85d7b30] [c00800000db60708] amdgpu_pci_probe+0x160/0x290 [amdgpu]
[ 3.833537] [c0000007f85d7bc0] [c0000000006d7ddc] local_pci_probe+0x6c/0x140
[ 3.833541] [c0000007f85d7c50] [c00000000013ad48] work_for_cpu_fn+0x38/0x60
[ 3.833543] [c0000007f85d7c80] [c00000000013f880] process_one_work+0x250/0x500
[ 3.833546] [c0000007f85d7d20] [c00000000013fda0] worker_thread+0x270/0x5b0
[ 3.833550] [c0000007f85d7dc0] [c00000000014900c] kthread+0x1ac/0x1c0
[ 3.833553] [c0000007f85d7e30] [c00000000000bdd4] ret_from_kernel_thread+0x5c/0x68
[ 3.833556] Instruction dump:
[ 3.833558] 60000000 e9230038 38630038 7fa34800 4d9e0020 7c0802a6 f8010010 f821ffa1
[ 3.833563] 3c620000 e8638540 4803297d e8410018 <0fe00000> 38210060 e8010010 7c0803a6
[ 3.833569] ---[ end trace 6009d10b516b7f29 ]---
[ 3.833576] [TTM] Finalizing pool allocator
[ 3.833612] [TTM] Zone kernel: Used memory at exit: 6 kiB
[ 3.833617] [TTM] Zone dma32: Used memory at exit: 6 kiB
[ 3.833620] [drm] amdgpu: ttm finalized
[ 3.833902] amdgpu: probe of 0000:01:00.0 failed with error -22
...
Both the 4.19-rc8 and 4.19 kernels use the same firmware, from the linux-firmware-20180815-86.gitf1b95fe5.fc28 package.
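
When comparing the failing 4.19 boot against a working 4.19-rc8 boot, it may help to pull only the DRM `*ERROR*` lines out of each dmesg capture. Below is a minimal Python sketch, assuming the capture has one kernel message per line. The names `ERROR_LINE`, `drm_errors` and `sample` are hypothetical and not part of any kernel tooling. The sample text is copied from the log above.

```python
import re

# Matches DRM error lines of the form seen in this report, e.g.
# [ 2.990695] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -22
# Assumption: one kernel message per line, with a "[ seconds.micros]" timestamp.
ERROR_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"              # timestamp
    r"\[drm:(?P<func>\w+)(?: \[\w+\])?\]\s+"    # reporting function, optional module
    r"\*ERROR\*\s+(?P<msg>.*)$"                 # error text
)


def drm_errors(log_text):
    """Return (timestamp, function, message) for each DRM *ERROR* line."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            found.append(
                (float(match.group("ts")), match.group("func"), match.group("msg"))
            )
    return found


# Three lines taken from the dmesg output in this report.
sample = """\
[ 2.780989] amdgpu: [powerplay] dpm has been enabled
[ 2.990665] [drm:gfx_v8_0_ring_test_ring [amdgpu]] *ERROR* amdgpu: ring 0 test failed (scratch(0xC040)=0xCAFEDEAD)
[ 2.990695] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v8_0> failed -22
"""

if __name__ == "__main__":
    for ts, func, msg in drm_errors(sample):
        print(f"{ts:.6f} {func}: {msg}")
```

Running the same function over the full dmesg of both kernels shows whether the ring test in `gfx_v8_0_ring_test_ring` is the first error on 4.19 and whether any error appears at all on 4.19-rc8.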