Hi Laszlo,
On 6 April 2018 at 13:15, Laszlo Ersek lersek@redhat.com wrote:
On 04/04/18 19:29, Laszlo Ersek wrote:
Hi Emil,
On 04/03/18 19:13, Emil Velikov wrote:
On 29 March 2018 at 12:17, Laszlo Ersek lersek@redhat.com wrote:
On 03/28/18 16:35, Emil Velikov wrote:
On 28 March 2018 at 11:27, Laszlo Ersek lersek@redhat.com wrote:
On 03/28/18 03:24, Emil Velikov wrote:
> Gents, can someone double-check this please? Just in case.
I guess I should test whether this series regresses the use case described in c2cbc38b97; is that correct?
Precisely.
[3] https://github.com/evelikov/linux/commits/ioctl-cleanups
Unfortunately, this series seems to reintroduce the regression fixed / described earlier in commit c2cbc38b97.
Thank you very much for testing.
Believe I've tracked it down to a broken commit from 2014 ;-) Please try the following branch [1] - it's untested, but I'm 99% sure it will work like a charm.
Alas, it does not work.
I've done some more digging. Let me quote the commit message on the proposed patch again:
Ealier commit a325725633c26aa66ab940f762a6b0778edf76c0 did not attribute that virtio can be either PCI or a platform device and removed the .set_busid hook. Whereas only the "platform" instance should have been removed.
Since then, two things have happened:
- the driver manually calls drm_dev_set_unique, addressing the Xserver
regression - see commit 9785b4321b0bd701f8d21d3d3c676a7739a5cf22
- core itself calls drm_pci_set_busid, on drm_set_busid IOCTL setting
the busid, so we don't need to fallback to dev->unique - see commit 5c484cee7ef9c4fd29fa0ba09640d55960977145
With that in place we can remove the local workaround.
This write-up of events is not precise enough. Instead, I think the timeline is as follows:
(1) Commit a325725633c2 ("drm: Lobotomize set_busid nonsense for !pci drivers", 2016-06-21) introduced the regression. By removing drm_virtio_set_busid(), commit a325725633c2 changed the behavior of the following function:
static int drm_set_busid(struct drm_device *dev, struct drm_file *file_priv) { struct drm_master *master = file_priv->master; int ret;
if (master->unique != NULL) drm_unset_busid(dev, master); if (dev->driver->set_busid) { ret = dev->driver->set_busid(dev, master); if (ret) { drm_unset_busid(dev, master); return ret; } } else { WARN_ON(!dev->unique); master->unique = kstrdup(dev->unique, GFP_KERNEL); if (master->unique) master->unique_len = strlen(dev->unique); } return 0;
}
When a325725633c2 removed drm_virtio_set_busid(), drm_set_busid() started taking the second branch, which wasn't doing the right thing for virtio-vga at the time.
(2) There were two ways to fix the regression: either (a) return drm_set_busid() to taking the first branch, or (b) tweak the virtio-vga driver so that the second branch in drm_set_busid() start to behave right.
My commit c2cbc38b9715 ("drm: virtio: reinstate drm_virtio_set_busid()", 2016-10-04) implemented (a), by reverting a few chunks of a325725633c2.
(3) Gerd thought that approach (b) was superior (and I totally defer to him on this, now that I'm learning of his patches in the first place :) ). Namely, in commit 9785b4321b0b ("drm/virtio: fix busid regression", 2016-11-15), he implemented approach (b), and right after, in commit 1775db074a32 ("Revert "drm: virtio: reinstate drm_virtio_set_busid()"", 2016-11-15), he undid my commit c2cbc38b9715.
In other words, Gerd replaced approach (a) with approach (b).
(4) Subsequently, commit 5c484cee7ef9 ("drm: Remove drm_driver->set_busid hook", 2017-06-20), changed drm_set_busid() to the following:
static int drm_set_busid(struct drm_device *dev, struct drm_file *file_priv) { struct drm_master *master = file_priv->master; int ret;
if (master->unique != NULL) drm_unset_busid(dev, master); if (dev_is_pci(dev->dev)) { ret = drm_pci_set_busid(dev, master); if (ret) { drm_unset_busid(dev, master); return ret; } } else { WARN_ON(!dev->unique); master->unique = kstrdup(dev->unique, GFP_KERNEL); if (master->unique) master->unique_len = strlen(dev->unique); } return 0;
}
Perhaps surprisingly, this change did not affect (or "help") virtio-vga at all, despite the fact that drm_virtio_set_busid() also used to call drm_pci_set_busid(). The reason for commit 5c484cee7ef9 not affecting virtio-vga is that the first branch would not be taken just the same, because dev_is_pci() returns false for virtio-vga. (So, the difference with drm_virtio_set_busid() is that drm_virtio_set_busid() would call drm_pci_set_busid() without checking dev_is_pci() first.) Here's the definition of the dev_is_pci() macro, from "include/linux/pci.h":
#define dev_is_pci(d) ((d)->bus == &pci_bus_type)
However, the bus type for virtio-vga is "virtio_bus", not "pci_bus_type". Namely, virtio_pci_probe() [drivers/virtio/virtio_pci_common.c] calls register_virtio_device() [drivers/virtio/virtio.c], and there we have
int register_virtio_device(struct virtio_device *dev) { int err;
dev->dev.bus = &virtio_bus;
This means that post-5c484cee7ef9, we remain reliant on the second branch in drm_set_busid(), and therefore Gerd's commit 9785b4321b0b, from point (3), should not be backed out.
What I could see as justified is a loud comment in drm_virtio_init(), just above the call to drm_dev_set_unique(), explaining why it is necessary to set "unique" manually. The reason is that virtio-vga technically has "virtio_bus", not "pci_bus_type", for bus type, and so the generic PCI BusID-setting will not cover it.
I've reached to roughly the same conclusion yesterday. Or to put it in slightly differently:
Unlike the other virtual GPU drivers (vmxgfx, qxl, bosh...) virtio abstracts the underlying bus type by using struct virtio_device. Hence the dev_is_pci() check will fail and the unique returned will be the virtio_device' "virtio0", while the "pci:..." one is required.
Apart from a beefy comment I've also considered: - Extending the dev_is_pci() case [in drm_set_busid] to consider virtio. It seems like a bigger hack that what we have currently. - point drm_device::dev to the parent of the virtio_device The biggest hack imaginable, if even possible.
Above said, consider patches 3+4 dropped and I'll follow-up with a patch adding inline documentation about this.
Thank you very much for the help. Emil P.S. On the plus side, I am working on a X/modesetting series that removes all the mess which requires this workaround ;-)