https://bugs.freedesktop.org/show_bug.cgi?id=108710
Bug ID: 108710
Summary: Since 4.20 kernel Vega 56 hangs when I surf pages in steam client
Product: DRI
Version: XOrg git
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: mikhail.v.gavrilov@gmail.com
Created attachment 142434 --> https://bugs.freedesktop.org/attachment.cgi?id=142434&action=edit dmesg
$ inxi -bM
System:    Host: localhost.localdomain Kernel: 4.20.0-0.rc1.git4.1.fc30.x86_64 x86_64 bits: 64 Desktop: Gnome 3.30.1 Distro: Fedora release 30 (Rawhide)
Machine:   Type: Desktop Mobo: ASUSTeK model: ROG STRIX X470-I GAMING v: Rev 1.xx serial: <root required> UEFI: American Megatrends v: 0901 date: 07/23/2018
CPU:       8-Core: AMD Ryzen 7 2700X type: MT MCP speed: 3427 MHz min/max: 2200/4000 MHz
Graphics:  Device-1: Advanced Micro Devices [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] driver: amdgpu v: kernel
           Display: wayland server: Fedora Project X.org 1.20.3 driver: amdgpu resolution: 3840x2160~60Hz
           OpenGL: renderer: Radeon RX Vega (VEGA10 DRM 3.27.0 4.20.0-0.rc1.git4.1.fc30.x86_64 LLVM 7.0.0) v: 4.5 Mesa 18.2.4
Network:   Device-1: Intel I211 Gigabit Network driver: igb
           Device-2: Realtek RTL8822BE 802.11a/b/g/n/ac WiFi adapter driver: r8822be
Drives:    Local Storage: total: 11.36 TiB used: 5.93 TiB (52.2%)
Info:      Processes: 455 Uptime: 16m Memory: 31.30 GiB used: 15.99 GiB (51.1%) Shell: bash inxi: 3.0.27
[ 3852.511166] gmc_v9_0_process_interrupt: 56 callbacks suppressed
[ 3852.511182] amdgpu 0000:0b:00.0: [mmhub] VMC page fault (src_id:0 ring:169 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 3852.511184] amdgpu 0000:0b:00.0: in page starting at address 0x000000401080c000 from 18
[ 3852.511186] amdgpu 0000:0b:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00040152
[ 3862.673344] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma1 timeout, signaled seq=72072, emitted seq=72074
[ 3862.673356] [drm] GPU recovery disabled.
[ 4044.170764] sysrq: SysRq : Show Blocked State
[ 4044.170959] task PC stack pid father
[ 4044.171026] kworker/u32:5 D10872 253 2 0x80000000
[ 4044.171060] Workqueue: events_unbound commit_work [drm_kms_helper]
[ 4044.171063] Call Trace:
[ 4044.171073] ? __schedule+0x2f3/0xb90
[ 4044.171077] ? __lock_acquire+0x279/0x1650
[ 4044.171085] ? dma_fence_default_wait+0x242/0x330
[ 4044.171089] schedule+0x2f/0x90
[ 4044.171092] schedule_timeout+0x31c/0x4f0
[ 4044.171096] ? find_held_lock+0x34/0xa0
[ 4044.171099] ? find_held_lock+0x34/0xa0
[ 4044.171104] ? mark_held_locks+0x57/0x80
[ 4044.171134] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[ 4044.171140] ? dma_fence_default_wait+0x242/0x330
[ 4044.171143] dma_fence_default_wait+0x26e/0x330
[ 4044.171147] ? dma_fence_release+0x120/0x120
[ 4044.171153] dma_fence_wait_timeout+0x182/0x200
[ 4044.171160] reservation_object_wait_timeout_rcu+0x236/0x4e0
[ 4044.171263] amdgpu_dm_do_flip+0x112/0x380 [amdgpu]
[ 4044.171378] amdgpu_dm_atomic_commit_tail+0x6d0/0xd30 [amdgpu]
[ 4044.171386] ? _raw_spin_unlock_irq+0x29/0x40
[ 4044.171391] ? wait_for_completion_timeout+0x73/0x1a0
[ 4044.171408] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 4044.171413] process_one_work+0x27d/0x600
[ 4044.171423] worker_thread+0x3c/0x390
[ 4044.171428] ? drain_workqueue+0x180/0x180
[ 4044.171433] kthread+0x120/0x140
[ 4044.171437] ? kthread_park+0x80/0x80
[ 4044.171442] ret_from_fork+0x27/0x50
[ 4044.172479] (time-dir) D13944 15221 1 0x00000000
[ 4044.172487] Call Trace:
[ 4044.172496] ? __schedule+0x2f3/0xb90
[ 4044.172501] ? prepare_to_wait_event+0xd2/0x180
[ 4044.172508] schedule+0x2f/0x90
[ 4044.172514] drm_sched_entity_flush+0x1df/0x1f0 [gpu_sched]
[ 4044.172518] ? finish_wait+0x80/0x80
[ 4044.172580] amdgpu_ctx_mgr_entity_flush+0x7c/0xc0 [amdgpu]
[ 4044.172637] amdgpu_flush+0x1f/0x30 [amdgpu]
[ 4044.172640] filp_close+0x34/0x70
[ 4044.172645] __x64_sys_close+0x1e/0x50
[ 4044.172649] do_syscall_64+0x60/0x1f0
[ 4044.172653] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 4044.172656] RIP: 0033:0x7f5a96622ec7
[ 4044.172662] Code: Bad RIP value.
[ 4044.172665] RSP: 002b:00007ffcce3d00e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[ 4044.172668] RAX: ffffffffffffffda RBX: 000000000000007c RCX: 00007f5a96622ec7
[ 4044.172671] RDX: 0000000000000000 RSI: 00007ffcce3d0180 RDI: 000000000000007c
[ 4044.172673] RBP: 000055d29a73aa60 R08: 000055d29a73b676 R09: 0000000000000000
[ 4044.172675] R10: 00007f5a965bbae0 R11: 0000000000000293 R12: 00007f5a95939750
[ 4044.172677] R13: 0000000000000000 R14: 0000000000000001 R15: 00007ffcce3d0180
[ 4057.229953] INFO: task kworker/u32:5:253 blocked for more than 120 seconds.
[ 4057.229957] Tainted: G WC 4.20.0-0.rc1.git4.1.fc30.x86_64 #1
[ 4057.229959] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4057.229962] kworker/u32:5 D10872 253 2 0x80000000
[ 4057.229979] Workqueue: events_unbound commit_work [drm_kms_helper]
[ 4057.229982] Call Trace:
[ 4057.229994] ? __schedule+0x2f3/0xb90
[ 4057.229998] ? __lock_acquire+0x279/0x1650
[ 4057.230006] ? dma_fence_default_wait+0x242/0x330
[ 4057.230010] schedule+0x2f/0x90
[ 4057.230013] schedule_timeout+0x31c/0x4f0
[ 4057.230017] ? find_held_lock+0x34/0xa0
[ 4057.230020] ? find_held_lock+0x34/0xa0
[ 4057.230025] ? mark_held_locks+0x57/0x80
[ 4057.230028] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[ 4057.230034] ? dma_fence_default_wait+0x242/0x330
[ 4057.230037] dma_fence_default_wait+0x26e/0x330
[ 4057.230041] ? dma_fence_release+0x120/0x120
[ 4057.230047] dma_fence_wait_timeout+0x182/0x200
[ 4057.230052] reservation_object_wait_timeout_rcu+0x236/0x4e0
[ 4057.230134] amdgpu_dm_do_flip+0x112/0x380 [amdgpu]
[ 4057.230221] amdgpu_dm_atomic_commit_tail+0x6d0/0xd30 [amdgpu]
[ 4057.230228] ? _raw_spin_unlock_irq+0x29/0x40
[ 4057.230232] ? wait_for_completion_timeout+0x73/0x1a0
[ 4057.230249] commit_tail+0x3d/0x70 [drm_kms_helper]
[ 4057.230254] process_one_work+0x27d/0x600
[ 4057.230263] worker_thread+0x3c/0x390
[ 4057.230269] ? drain_workqueue+0x180/0x180
[ 4057.230272] kthread+0x120/0x140
[ 4057.230276] ? kthread_park+0x80/0x80
[ 4057.230281] ret_from_fork+0x27/0x50
[ 4057.230571] Showing all locks held in the system:
[ 4057.230581] 1 lock held by khungtaskd/94:
[ 4057.230583] #0: 00000000a1fc4e6f (rcu_read_lock){....}, at: debug_show_all_locks+0x15/0x183
[ 4057.230596] 3 locks held by kworker/u32:5/253:
[ 4057.230597] #0: 00000000156505f1 ((wq_completion)"events_unbound"){+.+.}, at: process_one_work+0x1f3/0x600
[ 4057.230603] #1: 000000000d248f14 ((work_completion)(&state->commit_work)){+.+.}, at: process_one_work+0x1f3/0x600
[ 4057.230608] #2: 000000003df03870 (reservation_ww_class_mutex){+.+.}, at: amdgpu_dm_do_flip+0xd6/0x380 [amdgpu]
[ 4057.230700] 2 locks held by gnome-shell/2152:
[ 4057.230702] #0: 00000000a2cb2cbf (crtc_ww_class_acquire){+.+.}, at: drm_mode_cursor_common+0x95/0x220 [drm]
[ 4057.230721] #1: 00000000e86bda0d (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x101/0x120 [drm]
[ 4057.230746] 5 locks held by Xwayland/2222:
[ 4057.230784] 1 lock held by htop/3225:
[ 4057.230848] 1 lock held by CPU 0/KVM/4333:
[ 4057.230989] 1 lock held by (time-dir)/15221:
[ 4057.230991] #0: 000000006ef8a6af (&mgr->lock){+.+.}, at: amdgpu_ctx_mgr_entity_flush+0x3c/0xc0 [amdgpu]
[ 4057.231068] =============================================