https://bugs.freedesktop.org/show_bug.cgi?id=97635
Bug ID: 97635
Summary: radeon fails to initialize some DisplayPort monitors
Product: DRI
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: DRM/Radeon
Assignee: dri-devel@lists.freedesktop.org
Reporter: nybbles2bytes@gmail.com
Created attachment 126301
  --> https://bugs.freedesktop.org/attachment.cgi?id=126301&action=edit
Logs to compare all screens properly booted to some not
It took a mistake or two, but I have been told that this is the place to report this issue. I believe I am in a unique position to help with DisplayPort issues (and want to do so) because I have been able to generate both working and non-working logs and because my system has a large number of DisplayPorts, 6 in total. I have also put together a wealth of information (gathered by script for completeness and consistency) that should help the development team nail down the cause of this issue.
Here's everything I have been able to determine, but first the hardware setup: my graphics card is an "HD 5870 Eyefinity 6", which has 6 DisplayPorts. I have the displays set up in a grid of 3 across by 2 down. Each display runs at a resolution of 2560x1440, creating a total work area of 7680x2880 in a Xinerama setup running on the KDE4 desktop.
I currently have 3 kernels in my grub list:

kernel-3.16.7
kernel-4.7.0
kernel-4.7.2

All of these are running under openSUSE Tumbleweed, although kernel-3.16.7 originally came with openSUSE 13.2.
I have no evidence that my problem is related to having so many DisplayPort screens, but the setup does let me see more variations of the problem than most people do, which hopefully helps pinpoint the real cause.
Focusing on kernel-4.7.2: the kernel only turns on the first two displays. That happens during boot, long before Xorg gets loaded.
In Xorg the behavior is a little strange when the kernel hands over DisplayPorts that are off. Xorg acknowledges all 6 displays but is not able to turn on any that were off while the kernel was handling them, e.g. the last 4 monitors in the case of the 4.x kernels.
The upshot is that when I go to the multidisplay setup part of KDE all 6 displays are showing as active even though only the first two are turned on in reality. If I disable and re-enable the displays turned off, they don't turn on. If I use xrandr to turn them on, no dice. That is, if they are off when the kernel was handling them they are off for good, nothing in Xorg or KDE can change it that I have found.
That said, adding radeon.audio=0 to the boot command line makes things better but doesn't fix the issue completely. With that setting I sometimes get all 6 displays to come up; more often 5 out of 6 come up and one does not. Usually the last one (DisplayPort 5) is the one that fails, but not always.
I went to the trouble to write a script to gather information and I think I got enough to show where things are going wrong. At least enough to show a difference between a good and bad boot and I will help with more information as needed. I really want to get this problem solved and I'll do whatever I can to help.
In the tar file, to see what's different between a good and a bad boot, all you have to do is diff these two files in ./logs/timing-stripped/filtered-drm/:
screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
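The comparison can be sketched as below. This is only an illustration using standard diff; the function name compare_boot_logs is made up for this example and is not part of the attachment.

```shell
#!/bin/sh
# Minimal sketch: unified diff between a good-boot log and a bad-boot log.
# compare_boot_logs is a hypothetical helper name, not something in the tar file.
compare_boot_logs() {
    # $1 = log from the good boot, $2 = log from the bad boot
    # diff exits with status 1 when the files differ; that is expected here.
    diff -u "$1" "$2"
}

# Example (run from ./logs/timing-stripped/filtered-drm/):
# compare_boot_logs \
#   "screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt" \
#   "screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt"
```

Lines prefixed with - appear only in the good boot and lines prefixed with + only in the bad boot.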
Anyone who also wants to gather comprehensive information for the developers can take the file ./gather-info-for-diagnostics.sh from the tar file and modify it as needed for their own system.
The rest of this report explains in detail what is in the compressed tar file.
Directory structure
===================
.
+-- logs
    +-- filtered-drm
    +-- timing-stripped
        +-- filtered-drm

This structure is as follows:

.
=
The script that creates the log files and the script that turns on any screens that are off during boot (more on this one later).
./logs
======
The raw log files the script gathered, which include:

dmsg.txt - from dmesg
proc-cmdline.txt - from /proc/cmdline
module-kernel-parameters.txt - from /sys/module/kernel/parameters/*
module-processor-parameters.txt - from /sys/module/processor/parameters/*
sys-module-radeon-parameters.txt - from /sys/module/radeon/parameters/*
Xorg.0.log.txt - from /var/log/Xorg.0.log
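For anyone adapting this to another system, the parameter files can be collected roughly as follows. This is a minimal sketch and not the contents of ./gather-info-for-diagnostics.sh; the helper name dump_parameters and the name=value output format are assumptions made for illustration.

```shell
#!/bin/sh
# Minimal sketch: print every readable file in a parameters directory
# as "name=value", one per line.
# dump_parameters is a hypothetical helper, not taken from the attached script.
dump_parameters() {
    # $1 = directory such as /sys/module/radeon/parameters
    for f in "$1"/*; do
        [ -f "$f" ] || continue   # skips the literal pattern if the directory is empty
        [ -r "$f" ] || continue   # some parameters are not readable without root
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Example:
# dump_parameters /sys/module/radeon/parameters > sys-module-radeon-parameters.txt
# dmesg > dmsg.txt
# cat /proc/cmdline > proc-cmdline.txt
```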
./logs/filtered-drm
===================
Some of the above raw log files with lines that do not contain radeon information removed, which makes it easier to see what's relevant. If you want to know exactly how the lines were filtered you can look at the script ./gather-info-for-diagnostics.sh.
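As a rough idea of this kind of filtering, the sketch below keeps only lines that mention radeon or drm. The exact pattern used by ./gather-info-for-diagnostics.sh may differ; the pattern and the function name filter_radeon are assumptions for illustration.

```shell
#!/bin/sh
# Minimal sketch: keep only log lines mentioning radeon or drm (case-insensitive).
# The pattern is an assumption; the attached script defines the real filter.
filter_radeon() {
    grep -i -E 'radeon|drm'
}

# Example:
# filter_radeon < dmsg.txt > filtered-dmsg.txt
```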
./logs/timing-stripped
======================
The above raw log files with the timing at the beginning of each line removed. This makes using diff programs easier (I use meld on Linux). If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.
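Stripping the timing can be done along these lines. This is a sketch that assumes the usual bracketed prefix such as "[    3.141593] " at the start of each line; it is not copied from the attached script, and the function name strip_timing is hypothetical.

```shell
#!/bin/sh
# Minimal sketch: remove a leading "[  seconds.fraction] " timestamp from each line.
# Assumes the bracketed timestamp format; lines without one pass through unchanged.
strip_timing() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}

# Example:
# strip_timing < dmsg.txt > timing-stripped-dmsg.txt
```

Without the timestamps, two boots that log the same messages at slightly different times no longer show up as different on every line.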
./logs/timing-stripped/filtered-drm
===================================
Some of the above raw log files with the timing at the beginning of each line removed and lines that do not contain radeon information removed. Again, this makes it easier to see what's relevant. If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.
Scripts
=======

./gather-info-for-diagnostics.sh
--------------------------------
Does all the heavy lifting in gathering the info.

./display-on.sh
---------------
This was a curious discovery and may make fixing the issue easier. I found that when the script looked like this:
xrandr --output DisplayPort-${1} --mode 1920x1080
xrandr --output DisplayPort-${1} --mode 2560x1440
it would sometimes turn the display on and other times turn it off. To consistently turn the display on I had to change it to this:
xrandr --output DisplayPort-${1} --mode 1920x1080
sleep 5
xrandr --output DisplayPort-${1} --mode 2560x1440
suggesting there might be a timing problem that needs to be addressed. Even though running this script can turn on a display that was erroneously off during boot, the display turns itself back off after a few seconds, so it is not a usable workaround. My guess is that some status flag set in the kernel during boot can't be changed or overridden and eventually reasserts itself.
Update: It may not be that the 5 second delay solved the issue. It may be that just running the script again was the solution. Perhaps the first run cleared some cache; I'm not really sure, and this needs some experimenting.
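One way to tell the two explanations apart is to make the delay a parameter, as in the sketch below. It uses the same two xrandr calls as ./display-on.sh; the function name cycle_display and the XRANDR override variable are additions for this example only.

```shell
#!/bin/sh
# Minimal sketch: the two mode switches from display-on.sh with a configurable delay.
# XRANDR can be overridden (e.g. for a dry run); it defaults to the real xrandr.
XRANDR=${XRANDR:-xrandr}

cycle_display() {
    # $1 = DisplayPort index (0-5), $2 = delay in seconds between the mode switches
    "$XRANDR" --output "DisplayPort-$1" --mode 1920x1080
    sleep "$2"
    "$XRANDR" --output "DisplayPort-$1" --mode 2560x1440
}

# Experiment A: no delay, run twice in a row
# cycle_display 5 0; cycle_display 5 0
# Experiment B: 5 second delay, run once
# cycle_display 5 5
```

If experiment A turns the display on as reliably as experiment B, the delay is probably not what matters.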
File Names
==========
File names take the form of:

<what happened to the screens at boot>_<partial command line when booting the kernel>_<the file name>.txt

E.g. the file:
screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
can be broken down to:

screens-0-4-good-5-bad = The first 5 of the 6 screens came on as they should during boot but the 6th one (number 5) did not.
kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects = Shows most of the boot command line
dmsg = A key indicating the file contents, from dmesg in this case
.txt = That this is a text file

If the file name starts off with something like this:

screens-0-5-good-after-5-fixed-with_display-on.sh

it means that after booting and logging in I ran the script ./display-on.sh to turn on the display and then gathered all the log information. I will have gathered the log information prior to running the script as well, so in such a case you will also see files prefixed with just screens-0-5-good.
Let me know what else I can do to help.