https://bugs.freedesktop.org/show_bug.cgi?id=100745
Bug ID: 100745
Summary: amdgpu fails to wake up DisplayPort DELL monitors with 'clock recovery failed'
Product: DRI
Version: unspecified
Hardware: Other
OS: All
Status: NEW
Severity: normal
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel@lists.freedesktop.org
Reporter: mr.nuke.me@gmail.com
On a Fedora 25 system running kernel 4.10.9, I have an RX480 with three Dell P2715Q monitors connected via DisplayPort.
1. The machine is left alone until the monitors go into sleep mode.
2. The mouse is moved until the monitors show signs of waking up.
It is expected that all monitors come up cleanly and an unlock screen is presented.
What actually happens is that not all monitors come up. Some monitors indicate that no signal is coming. Which monitor or monitors fail to come up is non-deterministic.
Every time this happens, dmesg shows exactly three entries of the form:
[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
[drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
No matter how many of the three monitors come up, dmesg always shows this message three times.
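For anyone comparing counts across wake-ups, a small script can tally the link-training errors in a captured log. This is only a sketch: the helper name `count_link_train_errors` is hypothetical, the pattern assumes the message format quoted in this report, and the sample lines are the ones quoted in this thread.

```python
import re
from collections import Counter

# Matches amdgpu DP link-training errors in the format quoted in this report.
LINK_TRAIN_ERROR = re.compile(
    r"\[drm:amdgpu_atombios_dp_link_train \[amdgpu\]\] \*ERROR\* (?P<msg>.+)$"
)


def count_link_train_errors(lines):
    """Return a Counter mapping each link-training error message to its count."""
    counts = Counter()
    for line in lines:
        match = LINK_TRAIN_ERROR.search(line.rstrip())
        if match:
            counts[match.group("msg")] += 1
    return counts


# Sample lines copied from the messages quoted in this thread.
sample = [
    "kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed",
    "kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed",
    "kernel: some unrelated line",
]
counts = count_link_train_errors(sample)
print(counts)
```

In practice the lines would come from a saved dmesg or journal capture read with `open(path).read().splitlines()`.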
I've modified the failure point to print the return value of drm_dp_dpcd_read_link_status(), and it comes back as -5. I believe that is -EIO.
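As a quick cross-check of the errno value, Python's standard `errno` module can translate the number into its symbolic name. Kernel functions return negated errno values, so -5 is looked up as 5. This is just a sanity check from user space, not something taken from the driver.

```python
import errno
import os

ret = -5  # return value reported for drm_dp_dpcd_read_link_status() in this bug

# Kernel return codes are negated errno values, so flip the sign before the lookup.
name = errno.errorcode[-ret]
description = os.strerror(-ret)
print(f"{ret} is -{name} ({description})")
```

On Linux the symbolic name for 5 is EIO, the generic I/O error.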
Also, switching to VT2 via Ctrl-Alt-F2 brings up all the monitors with a 100% success rate. Switching back to VT1 may:
* present a working unlock screen (20% of the time)
* present an unlock screen with Xorg locked up in a poll() call (50% of the time)
* completely crash Xorg (20% of the time)
* lock up the machine (10% of the time)
This procedure crashes Wayland 100% of the time.
--- Comment #1 from Edward O'Callaghan funfunctor@folklore1984.net --- OK, I had a short look into this.
It seems that amdgpu_atombios_dp_aux_transfer() calls amdgpu_atombios_dp_process_aux_chan(), which gets a ucReplyStatus of either 2 or 3 back from atombios.
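To illustrate why a reply status of 2 or 3 would surface as -5, here is a rough model of the status-to-errno mapping. It is an assumption based on a reading of the kernel's atombios DP AUX code, not a copy of it, and should be verified against atombios_dp.c. The helper `aux_reply_to_retval` is hypothetical.

```python
import errno

# Assumed mapping, modelled on the kernel's atombios DP AUX handling.
# Verify against amdgpu's atombios_dp.c before relying on it.
REPLY_STATUS_TO_ERRNO = {
    1: errno.ETIMEDOUT,  # AUX channel timeout
    2: errno.EIO,        # "flags not zero"
    3: errno.EIO,        # generic AUX error
}


def aux_reply_to_retval(reply_status):
    """Return a negative errno for a failing reply status, or 0 otherwise (hypothetical helper)."""
    return -REPLY_STATUS_TO_ERRNO.get(reply_status, 0)


for status in (0, 1, 2, 3):
    print(status, aux_reply_to_retval(status))
```

Under this assumed mapping, statuses 2 and 3 both collapse into the same -EIO, so the return value alone cannot tell them apart. That is why the debug log is needed.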
If you could please attach dmesg logs after running
# echo 0xf > /sys/module/drm/parameters/debug
and waiting for the situation to reoccur that would be most useful.
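The value written to the debug parameter is a bitmask of DRM message categories. The sketch below decodes 0xf using the category bits as I understand them from the kernel's DRM_UT_* flags. Treat the table as an assumption and check it against the drm module documentation for the kernel in use. The helper `decode_drm_debug` is hypothetical.

```python
# Assumed drm.debug category bits, named after the kernel's DRM_UT_* flags.
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled by mask (hypothetical helper)."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


enabled = decode_drm_debug(0xF)
print(enabled)
```

With this table, 0xf turns on the four lowest categories, which is why the resulting log is so verbose.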
--- Comment #2 from mr.nuke.me@gmail.com --- Created attachment 130957 --> https://bugs.freedesktop.org/attachment.cgi?id=130957&action=edit log around the time the problem happens (with excessive debug info)
--- Comment #3 from Edward O'Callaghan funfunctor@folklore1984.net --- (In reply to mr.nuke.me from comment #2)
> Created attachment 130957 [details] log around the time the problem happens (with excessive debug info)
Yes, OK, so we are indeed hitting 'ucReplyStatus == 2' from atombios. Someone from AMD will have to determine what the problem is, because atombios is a closed component.
--- Comment #4 from Harry Wentland harry.wentland@amd.com --- Are the monitors set to DP input or to auto-select? If they are on auto-select, does setting the input to DP help?
I've seen auto-select mode have problems with DP many times, especially with scenarios like coming back from DPMS or S3 resume.
--- Comment #5 from mr.nuke.me@gmail.com --- The P2715Q does not have an auto-select mode. The monitors are always listening on the same input.
Eddie Ringle eddie@ringle.io changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eddie@ringle.io
--- Comment #6 from Eddie Ringle eddie@ringle.io --- Wanted to add that I'm seeing this issue now under a similar setup. I've seen it in the past, but the last few kernel releases have been pretty smooth. Once I upgraded to GNOME 3.26, however, both 4.13 and now 4.14-rc3 are displaying this issue.
I'm on Arch (using Wayland primarily), with a Fury X and three Dell P2415Q monitors, also connected via DisplayPort. I have MST disabled on all three, since this model has issues hitting 4K@60Hz with it enabled (even Dell has documented this).
The same "displayport link status failed" and "clock recovery failed" messages appear for me, also three times in a row. More often than not this leads to gnome-shell crashing. I see it most often when I try to wake the computer up after putting it to sleep. Other times, when putting it to sleep, one monitor stays powered and shows a backlit blank screen.
Benjamin Bellec b.bellec@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |b.bellec@gmail.com
--- Comment #7 from Benjamin Bellec b.bellec@gmail.com --- Created attachment 135668 --> https://bugs.freedesktop.org/attachment.cgi?id=135668&action=edit dmesg log error
--- Comment #8 from Benjamin Bellec b.bellec@gmail.com --- I hit the same problem today after enabling amdgpu.dc=1. The screen doesn't light up at all if I boot the kernel with amdgpu.dc=1.
Config is:
Fedora 27 + kernel 4.15.0-0.rc0.git7.1.fc28.x86_64
Radeon R9 380X
Dell U2414H
dmesg error is:
kernel: [drm:dm_logger_write [amdgpu]] *ERROR* perform_clock_recovery_sequence: Link Training Error, could not get CR after 100 tries.
Michel Dänzer michel@daenzer.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #130957|text/x-log                  |text/plain
          mime type|                            |
--- Comment #9 from Michel Dänzer michel@daenzer.net --- (In reply to Benjamin Bellec from comment #8)
> I hit the same problem today after enabling amdgpu.dc=1
> The screen doesn't light up at all if I boot the kernel with amdgpu.dc=1
AFAICT this report is about the non-DC code; please file your own report about the issue with DC.
--- Comment #10 from Dimitrios Liappis dimitrios.liappis@gmail.com --- This is a real problem for me as well, for some time now, with amdgpu (Radeon RX560), Fedora-27, gnome-shell and Dell P2715Q monitor. It happens both on Xorg and Wayland.
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* displayport link status failed
kernel: [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* clock recovery failed
The nuisance here is this almost always crashes gnome-shell. Attached coredump excerpt.
I used to be able to avoid the gnome-shell crash by disabling DPMS ("xset -dpms" and/or "xset dpms force off"), but this no longer seems to help.
--- Comment #11 from Dimitrios Liappis dimitrios.liappis@gmail.com --- Created attachment 137017 --> https://bugs.freedesktop.org/attachment.cgi?id=137017&action=edit gnome-shell coredump after amdgpu displayport link status failed
--- Comment #12 from Michel Dänzer michel@daenzer.net --- (In reply to Dimitrios Liappis from comment #10)
> The nuisance here is this almost always crashes gnome-shell. Attached coredump excerpt.
FWIW, that's most likely a gnome-shell/mutter bug.
--- Comment #13 from Dimitrios Liappis dimitrios.liappis@gmail.com --- (In reply to Michel Dänzer from comment #12)
> FWIW, that's most likely a gnome-shell/mutter bug.
Thank you, this is indeed a mutter bug. I tracked it down to https://bugzilla.gnome.org/show_bug.cgi?id=789501, which describes a specific patch for a monitor-manager/kms bug that fixes it.
--- Comment #14 from Kimmo sleijeri@gmail.com --- Confirming that a similar problem still occurs (screen stays black when trying to resume from suspend): 2x Dell U2415H + DisplayPort daisy chain + RX480 (amdgpu 18.0.1-2)
--- Comment #15 from Kimmo sleijeri@gmail.com --- (In reply to Kimmo from comment #14)
> Confirming still similar problem (Screen stays black while trying resume from suspend) 2x DELL u2415h + display port daisy chain + RX480 (amdgpu 18.0.1-2)
Actually, I need to correct myself. The suspend problem seems to be fixed and has been working OK for me so far. The problem seems to occur when the Dell monitor is allowed to shut down by itself due to inactivity, but I'm not sure whether that has anything to do with amdgpu. Using KDE Plasma desktop 5.13.3. Sorry for the inconvenience.
Martin Peres martin.peres@free.fr changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |MOVED
--- Comment #16 from Martin Peres martin.peres@free.fr --- -- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/158.