Hi All,
Fedora has received a bug report here:
https://bugzilla.redhat.com/show_bug.cgi?id=2072556
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Regards,
Hans
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Hi All,
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
z
Hi Zack,
On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Hi All,
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works. For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
At a minimum it would have been good if you had tried to at least reproduce this bug by installing Fedora rawhide inside an actual vmware VM. I've just spent a couple of hours debugging this and the bug definitely impacts vmware VMs too; and thus very likely also reproduces there.
I've a patch fixing this, which I will send out right after this email.
Regards,
Hans
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote:
Hi Zack,
On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Hi All,
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works. For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features, vmwgfx has been put on denylists before due to bugs in VirtualBox implementation of it, we just didn’t feel like playing games like having the driver query the hypervisor “are you really from VMware?” and refuse to load.
In this case it’s their lack of mksStats interfaces that’s the issue. We can’t stop development of vmwgfx because our competitor was trying to reuse our work and didn’t implement the features we have. vmwgfx patches are now months ahead on drm-misc-next which should give anyone working on that device in VirtualBox plenty of time to fix it. I’m happy to spend my spare time reviewing patches that would make it work but it’s just not reasonable to expect anyone to spend their time in the office working on a directly competing product.
At a minimum it would have been good if you had tried to at least reproduce this bug by installing Fedora rawhide inside an actual vmware VM. I've just spent a couple of hours debugging this and the bug definitely impacts vmware VMs too; and thus very likely also reproduces there.
We’re always running Fedora, it should always just work on vmwgfx.
I've a patch fixing this, which I will send out right after this email.
That looks like a back porting issue. drm-misc/drm-misc-next is continuously tested on Fedora with vmwgfx so any breaks should never last more than a day. I’ll back port some patches tomorrow when drm-misc-next-fixes opens (because it’s after rc6). I’m sorry you had to deal with this, just send me an email next time, I should always have a pretty good handle on any issues with Fedora with latest vmwgfx.
z
Hi, this is your Linux kernel regression tracker.
On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox
hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
vmwgfx has been put on denylists
/me wonders what exactly is meant by "denylists" here in the upstream context(¹), but whatever, doesn't matter much now afaics.
(¹) Did the users that reported the issue do anything unusual (like telling the driver to load with a PCI ID that it normally doesn't support) to enable vmwgfx for this hardware?
before due to bugs in VirtualBox implementation of it, we just didn’t feel like playing games like having the driver query the hypervisor “are you really from VMware?” and refuse to load.
In this case it’s their lack of mksStats interfaces that’s the issue. We can’t stop development of vmwgfx because our competitor was trying to reuse our work and didn’t implement the features we have. vmwgfx patches are now months ahead on drm-misc-next which should give anyone working on that device in VirtualBox plenty of time to fix it.
As Hans said: 'this is not how the kernel's "no regressions" policy works.' For details see these documents, esp. the quotes from Linus.
https://www.kernel.org/doc/html/latest/admin-guide/reporting-regressions.htm... https://www.kernel.org/doc/html/latest/process/handling-regressions.html
I’m happy to spend my spare time reviewing patches that would make it work but it’s just not reasonable to expect anyone to spend their time in the office working on a directly competing product.
No, but maintaining the driver in the kernel also means that you can't break a directly competing product, otherwise Linus might revert the commits that cause this, unless someone fixes the breakage.
At a minimum it would have been good if you had tried to at least reproduce this bug by installing Fedora rawhide inside an actual vmware VM. I've just spent a couple of hours debugging this and the bug definitely impacts vmware VMs too; and thus very likely also reproduces there.
We’re always running Fedora, it should always just work on vmwgfx.
I've a patch fixing this, which I will send out right after this email.
Many thx for taking care of this, Hans!
That looks like a back porting issue. drm-misc/drm-misc-next is continuously tested on Fedora with vmwgfx so any breaks should never last more than a day. I’ll back port some patches tomorrow when drm-misc-next-fixes opens (because it’s after rc6). I’m sorry you had to deal with this, just send me an email next time, I should always have a pretty good handle on any issues with Fedora with latest vmwgfx.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply, it's in everyone's interest to set the public record straight.
On May 10, 2022, at 7:06 AM, Thorsten Leemhuis regressions@leemhuis.info wrote:
Hi, this is your Linux kernel regression tracker.
On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox
hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
It never did. vmwgfx is just a driver for VMware's SVGA device, it never supported anything else.
z
On 10.05.22 14:26, Zack Rusin wrote:
On May 10, 2022, at 7:06 AM, Thorsten Leemhuis regressions@leemhuis.info wrote: On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox
hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
It never did. vmwgfx is just a driver for VMware's SVGA device, it never supported anything else.
Now I'm curious and would like to understand the issue properly, if you have a minute. :-D
I didn't mean "supported" as in "officially supported", I meant as in "it ran (as in automatically bound) on VirtualBox in one of the modes one could configure in VirtualBox for virtual GPU". And the latter is the case here afaics, or isn't it?
Ciao, Thorsten
On Tue, 2022-05-10 at 14:44 +0200, Thorsten Leemhuis wrote:
On 10.05.22 14:26, Zack Rusin wrote:
On May 10, 2022, at 7:06 AM, Thorsten Leemhuis regressions@leemhuis.info wrote: On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
It never did. vmwgfx is just a driver for VMware's SVGA device, it never supported anything else.
Now I'm curious and would like to understand the issue properly, if you have a minute. :-D
I didn't mean "supported" as in "officially supported", I meant as in "it ran (as in automatically bound) on VirtualBox in one of the modes one could configure in VirtualBox for virtual GPU". And the latter is the case here afaics, or isn't it?
I wouldn't know that. But if the claim is that anyone lying about the type of device they are can hijack development then we'll need Linus to clarify that, i.e. if I create a PCI device that identifies itself as a random AMD GPU and crashes as soon as you try to do any register access is AMD gpu driver development done now? Clearly addition of any AMD gpu driver would regress my device.
z
On 10.05.22 15:30, Zack Rusin wrote:
On Tue, 2022-05-10 at 14:44 +0200, Thorsten Leemhuis wrote:
On 10.05.22 14:26, Zack Rusin wrote:
On May 10, 2022, at 7:06 AM, Thorsten Leemhuis regressions@leemhuis.info wrote: On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
It never did. vmwgfx is just a driver for VMware's SVGA device, it never supported anything else.
Now I'm curious and would like to understand the issue properly, if you have a minute. :-D
I didn't mean "supported" as in "officially supported", I meant as in "it ran (as in automatically bound) on VirtualBox in one of the modes one could configure in VirtualBox for virtual GPU". And the latter is the case here afaics, or isn't it?
I wouldn't know that. But if the claim is that anyone lying about the type of device they are can hijack development then we'll need Linus to clarify that,
Feel free to ask, I doubt that will work out, but yes, in the end it's Linus's decision.
i.e. if I create a PCI device that identifies itself as a random AMD GPU
That's not the case and thus a misleading example afaics. Here someone created a virtual PCI device that seems to be compatible with the original (just like someone created a virtual cirrus device with Qemu that worked with the original cirrus drivers -- with the difference in this case that both original and compatible devices are virtual). And it seemed like that compatible virtual device worked fine for people with the driver for the original device. Then by kernel standards you are not allowed to break this setup with any changes you make to the driver, even if the driver was only meant for the original device. Not sure, maybe that is not even too hard by using quirks or something if the compatible GPU can be detected (in this case the one from VirtualBox)?
and crashes as soon as you try to do any register access is AMD gpu driver development done now? Clearly addition of any AMD gpu driver would regress my device.
Ciao, Thorsten
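The quirk idea raised in the message above can be pictured roughly as follows. This is a minimal Python sketch, not vmwgfx code: the `Device` class, the `has_mksstats` flag, the `"modesetting"` base feature and the `enabled_features` function are all hypothetical names made up for illustration, and the thread does not establish how (or whether) the VirtualBox-provided device can actually be detected.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    """Hypothetical description of a probed SVGA-style device."""
    name: str
    has_mksstats: bool  # assumption: presence of the optional interface is detectable


def enabled_features(dev: Device) -> List[str]:
    """Return the features a driver would turn on for this device.

    The base feature is always enabled. The optional one is enabled only
    when the device is known to provide it, so an implementation lacking
    it keeps working the way it did before the feature was introduced.
    """
    features = ["modesetting"]  # hypothetical base feature name
    if dev.has_mksstats:
        features.append("mksstats")
    return features


original = Device(name="original device", has_mksstats=True)
compatible = Device(name="compatible device", has_mksstats=False)

print(enabled_features(original))
print(enabled_features(compatible))
```

The point of the sketch is only that an optional interface is gated on detection instead of being assumed; whether that is practical for the real driver is exactly what the thread leaves open.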
On Tue, 2022-05-10 at 15:49 +0200, Thorsten Leemhuis wrote:
On 10.05.22 15:30, Zack Rusin wrote:
On Tue, 2022-05-10 at 14:44 +0200, Thorsten Leemhuis wrote:
On 10.05.22 14:26, Zack Rusin wrote:
On May 10, 2022, at 7:06 AM, Thorsten Leemhuis regressions@leemhuis.info wrote: On 10.05.22 02:12, Zack Rusin wrote:
On May 9, 2022, at 6:57 AM, Hans de Goede hdegoede@redhat.com wrote: On 4/11/22 16:24, Zack Rusin wrote:
On Mon, 2022-04-11 at 10:52 +0200, Hans de Goede wrote:
Fedora has received a bug report here:
https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.r...
That Fedora rawhide VMs no longer boot under the VirtualBox hypervisor after the VM has been updated to a 5.18-rc# kernel.
Switching the emulated GPU from vmwaregfx to VirtualBoxSVGA fixes this, so this seems to be a vmwgfx driver regression.
Note I've not investigated/reproduced this myself due to -ENOTIME.
Thanks for letting us know. Unfortunately we do not support vmwgfx on VirtualBox. I'd be happy to review patches related to this, but it's very unlikely we'd have the time to look at this ourselves.
I somewhat understand where you are coming from, but this is not how the kernel's "no regressions" policy works.
Hans, many thx for writing your mail, I once intended to write something similar, but then forgot about it. :-/
For the end user a regression is a regression and as maintainers we are supposed to make sure any regressions noticed are fixed before a new kernel hits end user's systems.
I think there’s a misunderstanding here - the vmwgfx driver never supported VirtualBox. VirtualBox implementation of the svga device lacks a bunch of features,
Which from the kernel's point of view is irrelevant. If the Linux kernel's vmwgfx driver ever supported the VirtualBox implementation then things shouldn't regress with later versions.
It never did. vmwgfx is just a driver for VMware's SVGA device, it never supported anything else.
Now I'm curious and would like to understand the issue properly, if you have a minute. :-D
I didn't mean "supported" as in "officially supported", I meant as in "it ran (as in automatically bound) on VirtualBox in one of the modes one could configure in VirtualBox for virtual GPU". And the latter is the case here afaics, or isn't it?
I wouldn't know that. But if the claim is that anyone lying about the type of device they are can hijack development then we'll need Linus to clarify that,
Feel free to ask, I doubt that will work out, but yes, in the end it's Linus's decision.
i.e. if I create a PCI device that identifies itself as a random AMD GPU
That's not the case and thus a misleading example afaics.
No, that's exactly the case. VirtualBox lies in its PCI ID and claims that it's a VMware SVGA when it clearly isn't.
z
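The disagreement above turns on the fact that a PCI driver is matched to a device by the IDs the device reports, not by who implemented the device. The following Python sketch is only an illustration of that matching rule, not kernel code: `PciDevice`, `SUPPORTED_IDS` and `id_table_matches` are hypothetical names, and the "unrelated" IDs are arbitrary. The values 0x15ad and 0x0405 are VMware's PCI vendor ID and its SVGA II adapter device ID.

```python
from typing import NamedTuple

VMWARE_VENDOR_ID = 0x15AD  # VMware's PCI vendor ID
SVGA2_DEVICE_ID = 0x0405   # VMware SVGA II adapter


class PciDevice(NamedTuple):
    """Hypothetical record of what a device reports in PCI config space."""
    implementer: str  # who built the device; never visible to the matcher
    vendor_id: int
    device_id: int


# Hypothetical stand-in for a driver's table of supported (vendor, device) pairs.
SUPPORTED_IDS = frozenset({(VMWARE_VENDOR_ID, SVGA2_DEVICE_ID)})


def id_table_matches(dev: PciDevice) -> bool:
    """Decide whether the driver would be offered this device.

    Only the reported IDs are consulted, so any device presenting the
    same IDs is treated identically, whoever implemented it.
    """
    return (dev.vendor_id, dev.device_id) in SUPPORTED_IDS


vmware_dev = PciDevice("VMware", VMWARE_VENDOR_ID, SVGA2_DEVICE_ID)
lookalike_dev = PciDevice("another hypervisor", VMWARE_VENDOR_ID, SVGA2_DEVICE_ID)
unrelated_dev = PciDevice("unrelated", 0x1234, 0x5678)  # arbitrary IDs

for dev in (vmware_dev, lookalike_dev, unrelated_dev):
    print(dev.implementer, id_table_matches(dev))
```

Under this rule the first two devices are indistinguishable to the matcher, which is why telling them apart would need some additional check beyond the ID table.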
dri-devel@lists.freedesktop.org