On 09/30/16 10:28, Hans de Goede wrote:
Hi,
On 30-09-16 05:09, Laszlo Ersek wrote:
Hello Daniel,
On 06/21/16 14:08, daniel.vetter at ffwll.ch (Daniel Vetter) wrote:
We already have a fallback in place to fill out the unique from dev->unique, which is set to something reasonable in drm_dev_alloc.
Which means we only need to have a special set_busid for pci devices, to be able to carry the backwards compat code for drm 1.1 around, which libdrm still needs.
While developing and testing this patch things blew up in really interesting ways, and the code is rather confusing in naming things between the kernel code, ioctl #defines and libdrm. For the next brave dragon slayer, document all this madness properly in the userspace interface section of gpu.tmpl.
v2: Make drm_dev_set_unique static and update kerneldoc.
v3: Entire rewrite, plus document what's going on for posterity in the gpu docbook uapi section.
v4: Drop accidental amdgpu hunk (Emil).
v5: Drop accidental omapdrm vblank counter change (Emil).
Cc: Gustavo Padovan <gustavo.padovan at collabora.co.uk>
Cc: Emil Velikov <emil.l.velikov at gmail.com>
Tested-by: Gustavo Padovan <gustavo.padovan at collabora.co.uk> (virt_gpu)
Reviewed-by: Emil Velikov <emil.l.velikov at gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
 Documentation/DocBook/gpu.tmpl                  |  4 ++
 drivers/gpu/drm/armada/armada_drv.c             |  1 -
 drivers/gpu/drm/drm_ioctl.c                     | 58 +++++++++++++++++++++++++
 drivers/gpu/drm/drm_platform.c                  | 18 --------
 drivers/gpu/drm/etnaviv/etnaviv_drv.c           |  1 -
 drivers/gpu/drm/exynos/exynos_drm_drv.c         |  1 -
 drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c |  1 -
 drivers/gpu/drm/imx/imx-drm-core.c              |  1 -
 drivers/gpu/drm/msm/msm_drv.c                   |  1 -
 drivers/gpu/drm/nouveau/nouveau_drm.c           |  1 -
 drivers/gpu/drm/omapdrm/omap_drv.c              |  1 -
 drivers/gpu/drm/shmobile/shmob_drm_drv.c        |  1 -
 drivers/gpu/drm/tilcdc/tilcdc_drv.c             |  1 -
 drivers/gpu/drm/virtio/virtgpu_drm_bus.c        | 10 -----
 drivers/gpu/drm/virtio/virtgpu_drv.c            |  1 -
 drivers/gpu/drm/virtio/virtgpu_drv.h            |  1 -
 include/drm/drmP.h                              |  1 -
 17 files changed, 62 insertions(+), 41 deletions(-)
This patch (commit a325725633c2) regresses X.org on QEMU's virtio-vga device. Please see
https://bugzilla.redhat.com/show_bug.cgi?id=1366842
complete with a bisection log under
drivers/gpu/drm/virtio/
(comment 20).
Copying Thorsten so he can include this report in his next v4.8-rc8 regression report, if he so chooses. (Commit a325725633c2 is part of v4.8-rc1, but we only managed to identify it now.) The last such report I know of is archived e.g. at http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1239220.html.
Reported-by: Joachim Frieben <jfrieben@hotmail.com>
First of all Joachim thanks for bisecting this.
(Small correction: while Joachim reported the BZ, the bisection was done by yours truly. The bisection was painful enough that I'd want to take "credit" for it -- using the stock Fedora kernel config for the bisection, I think my laptop must have burned through enough electricity to power a small town from Christmas to New Year's Eve. I *literally* took naps between the test cycles. (And I mean literally literally.) I know about "localmodconfig" but it has broken on me before, so I opted for the Fedora config.)
I was thinking about this bug / issue, while doing my laps in the swimming pool.
If you do that, it's easy to lose count of your laps ;)
I wanted to add a comment to the bug to tell you that this is likely an Xorg xserver issue and not a kernel issue, and that there is no need to bisect, but it is too late for that now.
Ouch. :/
When running without an xorg.conf, Xorg searches for what it considers a "primary" gpu / video-card. Basically, it attempts to bring up the right card in setups where there are multiple cards, and if it does not find one it exits with an error.
The xserver has a 2 step process for finding the primary card:
- It searches for a card which has a vga-bios mapped. As we've already determined in the mentioned Red Hat bug, that works for the classic qemu emulated video-cards, but not for qemu's virtio-vga.
- If that does not work, Xorg will fall back to any video class device on pci-bus 1.
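The two steps above can be sketched roughly as follows. This is only an illustration in Python, not the xserver's actual code: the `PciDevice` type, its fields and the `find_primary` name are hypothetical stand-ins. The one fixed fact used is that 0x03 is the PCI base class code for display controllers.

```python
from dataclasses import dataclass
from typing import List, Optional

PCI_BASE_CLASS_DISPLAY = 0x03  # PCI base class code for display controllers


@dataclass
class PciDevice:
    """Hypothetical stand-in for a probed PCI device; not an xserver type."""
    bus: int
    base_class: int
    has_vga_bios: bool  # whether a VGA BIOS is mapped for this device


def find_primary(devices: List[PciDevice]) -> Optional[PciDevice]:
    # Step 1: prefer a card that has a VGA BIOS mapped.
    for dev in devices:
        if dev.has_vga_bios:
            return dev
    # Step 2: otherwise accept any video class device on PCI bus 1.
    for dev in devices:
        if dev.bus == 1 and dev.base_class == PCI_BASE_CLASS_DISPLAY:
            return dev
    # Neither step matched; this is the case where the xserver gives up.
    return None
```

In this sketch a display device with no VGA BIOS mapped that does not sit on bus 1 makes `find_primary` return `None`, which corresponds to the "exits with an error" case described above.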
This fallback actually has been broken in the Xorg xserver for quite a while now and only 2 days ago a patch from Laszlo was merged to fix this.
Only for things to break again due to this kernel patch.
Since the whole step 2) fallback is very much tied to x86 machines, where pci-bus 0 used to be the main bus and pci-bus 1 the agp bus, which is a rather obsolete assumption nowadays, and since relying on bus numbers / enumeration order is a bad idea in general, I'm not entirely sure if this counts as a regression.
I've discussed the problem of the xserver exiting with an error when no primary device can be found with some people (ajax) at XDC last week since there are other use-cases where the pci-bus 1 fallback does not work.
As such I've been working on an xserver patch-set to make the xserver try harder (pick the first available device) when both steps described above fail to find one, which should make things work even with the newest (broken / regressed) kernels.
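The "try harder" idea can be sketched like this, assuming the existing primary detection is available as a callable. Both `pick_device` and the `find_primary` parameter are hypothetical names for illustration; the real patch-set may be structured quite differently.

```python
from typing import Callable, Optional, Sequence, TypeVar

Dev = TypeVar("Dev")


def pick_device(
    devices: Sequence[Dev],
    find_primary: Callable[[Sequence[Dev]], Optional[Dev]],
) -> Optional[Dev]:
    # Run the existing primary detection first (hypothetical callable).
    primary = find_primary(devices)
    if primary is not None:
        return primary
    # Extra fallback: take the first available device instead of failing.
    return devices[0] if devices else None
```

With this shape the server would only fail when there are no devices at all, rather than whenever the primary detection comes up empty.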
Given this mail thread, I guess I'm working after all today (I had planned a day off)
Apologies...
and I'll try to wrap up this patch-set and reply to this mail with the server patches attached for Joachim and/or Laszlo to test.
Thank you Hans, that's very kind of you. (And I also greatly appreciate your description of the primary card selection logic.)
p.s.
It would be interesting to run lspci on both a working and a non-working kernel to see what exactly is going on here.
I'll upload the outputs to the RHBZ soon.
Thanks! Laszlo