https://bugzilla.kernel.org/show_bug.cgi?id=204181
--- Comment #34 from Sergey Kondakov (virtuousfox@gmail.com) ---
(In reply to Nicholas Kazlauskas from comment #33)
> (In reply to Sergey Kondakov from comment #26)
> > Created attachment 284083 [details]
> > dmesg_2019-08-02-amdgpu_fail_on_patched_5.2.5
> >
> > (In reply to Nicholas Kazlauskas from comment #24)
> > > This should be fixed with the series linked below:
> > >
> > > https://patchwork.freedesktop.org/series/64505/
> > >
> > > But it still needs review and backporting to older kernels.
> >
> > Celebration might have been premature. Hours later I got another freeze
> > with a different error in amdgpu. Only this time, the mouse cursor was
> > movable over the frozen frame right until I tried switching VT. Here's
> > the trace:
> >
> > BUG: unable to handle page fault for address: 0000000800000184
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x0000) - not-present page
> > PGD 0 P4D 0
> > Oops: 0000 [#1] PREEMPT SMP NOPTI
> > CPU: 2 PID: 21044 Comm: kworker/u16:0 Tainted: G W IO 5.2.5-1396.g79b6a9c-HSF #1 openSUSE Tumbleweed (unreleased)
> > Hardware name: Gigabyte Technology Co., Ltd. GA-990XA-UD3/GA-990XA-UD3, BIOS F14e 09/09/2014
> > Workqueue: events_unbound commit_work
> > RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2e6/0xd60 [amdgpu]
>
> Are you able to consistently reproduce this issue? Is it the same setup and
> same conditions as before? I haven't been able to see it in my testing at
> least.

Yes, just having PageFlip enabled in amdgpu guarantees it. Changing anything other than PageFlip doesn't seem to affect it. Forcing TearFree on with PageFlip disabled may also trigger it, I think. You could try my previously linked kernel build in your testing, but I doubt there is anything in it specific to this problem.
It may not be reproducible with the modesetting X driver, because that driver fails to engage page flipping on init and logs a bunch of errors about it in Xorg.0.log. For some reason I'm unable to use the modesetting X driver at all: even with page flipping disabled, it draws only the mouse cursor on a black background instead of the sddm login screen. So I have to use amdgpu with PageFlip and TearFree explicitly disabled. But then another, rarer dereference at 0010:amdgpu_vm_update_directories+0xe7/0x260 may still happen (which, unlike the first one, I suspect is connected to the vm_update_mode option).
By the way, is there any disadvantage to forcing TearFree always on when it works? An additional frame of latency, or something like that?