https://bugzilla.kernel.org/show_bug.cgi?id=210123
Bug ID: 210123
Summary: drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] - flip_done time out with vmwgfx
Product: Drivers
Version: 2.5
Kernel Version: 5.3.18-24.9.1
Hardware: x86-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri@kernel-bugs.osdl.org
Reporter: stefan+kernel@mayr-stefan.de
Regression: No
Since we upgraded SUSE Linux Enterprise Server 15 from kernel 4.12.14-197.45.1 (SLES15 SP1) to 5.3.18-24.9.1 (SLES15 SP2) or later, we see the following error messages on some virtual machines:
[102215.857602] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:38:crtc-0] flip_done timed out
[102226.097847] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:34:plane-0] flip_done timed out
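For anyone who wants to scan their own logs for the same symptom, here is a minimal sketch that extracts these messages from dmesg-style text. The regular expression is written against the two lines quoted above only; the names `FLIP_DONE_RE` and `find_flip_done_timeouts` are made up for this illustration, and other kernels may format the message differently.

```python
import re

# Pattern modelled on the two log lines quoted in this report (an assumption:
# other kernel versions may print the message in a different shape).
FLIP_DONE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[drm:drm_atomic_helper_wait_for_dependencies \[drm_kms_helper\]\]\s+"
    r"\*ERROR\*\s+\[(?P<kind>CRTC|PLANE):(?P<id>\d+):(?P<name>[^\]]+)\]\s+"
    r"flip_done timed out"
)


def find_flip_done_timeouts(log_text):
    """Return (timestamp, kind, object_id, name) for each matching log line."""
    return [
        (float(m.group("ts")), m.group("kind"), int(m.group("id")), m.group("name"))
        for m in FLIP_DONE_RE.finditer(log_text)
    ]


# The two lines from the report, used as sample input.
sample = (
    "[102215.857602] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] "
    "*ERROR* [CRTC:38:crtc-0] flip_done timed out\n"
    "[102226.097847] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] "
    "*ERROR* [PLANE:34:plane-0] flip_done timed out\n"
)
events = find_flip_done_timeouts(sample)
```

On the sample input this yields one CRTC event and one PLANE event, roughly 10.24 seconds apart according to the timestamps.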
We were also provided some more current kernels from SUSE support:
- 5.3.18 with updated modules
- 5.8.15
- 5.9.1
The issue stays the same with all of them. All affected machines are running in runlevel 3. The only graphical "thing" is the boot screen, and when this error appears in the logs this screen is sitting at an empty login prompt.
All virtual machines are running in a VMware environment on ESXi hosts with versions between 6.0.x and 6.7.x. We could not tie it to specific ESXi versions, or to load on the ESXi host or the virtual machine: it happens on different versions, and on loaded as well as idle hosts and virtual machines.
The issue goes away when we add vmwgfx to the grub module_blacklist.
I know our kernel versions are somewhat SUSE-specific. But what changed between 4.12 and 5.3 and later that may cause this message between drm and vmwgfx?
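To confirm that the workaround mentioned above is actually in effect on a guest, one can check the kernel command line for a `module_blacklist=` entry naming vmwgfx. The sketch below only parses a command-line string; the helper names and the example command line are hypothetical, and the "last occurrence wins" handling is an assumption about how repeated parameters are treated.

```python
def blacklisted_modules(cmdline):
    """Return the module names listed in module_blacklist= on a kernel command line."""
    modules = []
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "module_blacklist" and sep:
            # Assumption: a later occurrence replaces an earlier one.
            modules = [name for name in value.split(",") if name]
    return modules


def is_blacklisted(cmdline, module="vmwgfx"):
    """True if the given module appears in the module_blacklist= list."""
    return module in blacklisted_modules(cmdline)


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda2 quiet module_blacklist=vmwgfx"
workaround_active = is_blacklisted(example)
```

On a running system, the string to check can be read from `/proc/cmdline`.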
--- Comment #1 from Stefan Mayr (stefan+kernel@mayr-stefan.de) ---
#208373 seems similar, but for us it started with an older kernel version.
Stefan Mayr (stefan+kernel@mayr-stefan.de) changed:
What     |Removed |Added
----------------------------------------------------------------------------
See Also |        |https://bugzilla.kernel.org/show_bug.cgi?id=208373
--- Comment #2 from Michel Dänzer (michel@daenzer.net) ---
(In reply to Stefan Mayr from comment #1)
> #208373 seems similar but for us it started with an older kernel version
It's about amdgpu, unlikely to be directly related.
--- Comment #3 from Stefan Mayr (stefan+kernel@mayr-stefan.de) ---
Did some more tests with kernel versions provided by SUSE:
Kernel 5.0.13 - 6 days without issues
Kernel 5.2.14 - 2 days until we got the error message
Today I installed 5.1.16, and we are waiting to see whether this version shows the error or not.
--- Comment #4 from Stefan Mayr (stefan+kernel@mayr-stefan.de) ---
I had another look at bug 208373: the initial reporter also uses vmwgfx, and amdgpu is only mentioned in bug 208373, comment 2.
Zack Rusin (zackr@vmware.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |zackr@vmware.com
--- Comment #5 from Zack Rusin (zackr@vmware.com) ---
Do you know if there are any errors in the vmware.log? It sounds like it's a guest-isolated bug in vmwgfx, but it'd be great to be able to take a look at the vmware.log from one of the sessions with errors.
--- Comment #6 from Stefan Mayr (stefan+kernel@mayr-stefan.de) ---
Kernel 5.1.16 also showed the error message after 2 days.
So this seems to be triggered by changes between 5.0.13 and 5.1.16. I'm out of SUSE Kernels to narrow it down even more.
I also checked for a vmware.log on this host but I couldn't find one. Installed are the open-vm-tools that are part of SLES15. Other logfiles of vmtoolsd don't show any error messages.
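The narrowing described in this comment can be written down as a small sketch: take the newest version where the error was not seen and the oldest later version where it was. The results table below is assembled from the versions mentioned in this thread so far, with SUSE release suffixes dropped; "good" only means that no error was observed during the test period, and since these are SUSE builds the window is approximate. The function names are made up for this illustration.

```python
def version_key(version):
    """Turn '5.1.16' into (5, 1, 16) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(results):
    """results maps version -> True if the error was seen, False otherwise.

    Returns (newest version without the error, oldest later version with it).
    Raises ValueError if either side is missing.
    """
    good = [v for v, seen in results.items() if not seen]
    bad = [v for v, seen in results.items() if seen]
    newest_good = max(good, key=version_key)
    later_bad = [v for v in bad if version_key(v) > version_key(newest_good)]
    oldest_bad = min(later_bad, key=version_key)
    return newest_good, oldest_bad


# Observations reported in this thread (True = flip_done timeout seen).
results = {
    "4.12.14": False,
    "5.0.13": False,  # 6 days without issues
    "5.1.16": True,
    "5.2.14": True,
    "5.3.18": True,
    "5.8.15": True,
    "5.9.1": True,
}
window = regression_window(results)
```

With these observations the window comes out as 5.0.13 to 5.1.16, matching the conclusion above.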
--- Comment #7 from Stefan Mayr (stefan+kernel@mayr-stefan.de) ---
Same issue with Kernel 5.10.9
dri-devel@lists.freedesktop.org