On Tue, 21 Jun 2022 13:01:08 +0200 Thomas Zimmermann tzimmermann@suse.de wrote:
Hi
Am 17.06.22 um 16:12 schrieb Alex Williamson:
On Fri, 17 Jun 2022 14:41:01 +0200 Thomas Zimmermann tzimmermann@suse.de wrote:
Hi
Am 17.06.22 um 14:29 schrieb Javier Martinez Canillas:
[adding Zack and Alex to Cc list]
Hello Thomas,
Thanks a lot for tracking this down and figuring out the root cause!
On 6/17/22 14:10, Thomas Zimmermann wrote:
Always run fbdev removal first to remove simpledrm via sysfb_disable(). This clears the internal state. The later call to drm_aperture_detach_drivers() then does nothing. Otherwise, with drm_aperture_detach_drivers() running first, the call to sysfb_disable() uses inconsistent state.
Example backtrace show below:
[ 11.663422] ================================================================== [ 11.663426] BUG: KASAN: use-after-free in device_del+0x79/0x5f0 [ 11.663435] Read of size 8 at addr ffff888108185050 by task systemd-udevd/311 [ 11.663440] CPU: 0 PID: 311 Comm: systemd-udevd Tainted: G E 5 .19.0-rc2-1-default+ #1689 [ 11.663445] Hardware name: HP ProLiant DL120 G7, BIOS J01 04/21/2011 [ 11.663447] Call Trace: [ 11.663449] <TASK> [ 11.663451] ? device_del+0x79/0x5f0 [ 11.663456] dump_stack_lvl+0x5b/0x73 [ 11.663462] print_address_description.constprop.0+0x1f/0x1b0 [ 11.663468] ? device_del+0x79/0x5f0 [ 11.663471] ? device_del+0x79/0x5f0 [ 11.663475] print_report.cold+0x3c/0x21c [ 11.663481] ? lock_acquired+0x87/0x1e0 [ 11.663484] ? lock_acquired+0x87/0x1e0 [ 11.663489] ? device_del+0x79/0x5f0 [ 11.663492] kasan_report+0xbf/0xf0 [ 11.663498] ? device_del+0x79/0x5f0 [ 11.663503] device_del+0x79/0x5f0 [ 11.663509] ? device_remove_attrs+0x170/0x170 [ 11.663514] ? lock_is_held_type+0xe8/0x140 [ 11.663523] platform_device_del.part.0+0x19/0xe0 [ 11.663530] platform_device_unregister+0x1c/0x30 [ 11.663535] sysfb_disable+0x2d/0x70 [ 11.663540] remove_conflicting_framebuffers+0x1c/0xf0 [ 11.663546] remove_conflicting_pci_framebuffers+0x130/0x1a0 [ 11.663554] drm_aperture_remove_conflicting_pci_framebuffers+0x86/0xb0 [ 11.663561] ? mgag200_pci_remove+0x30/0x30 [mgag200] [ 11.663578] mgag200_pci_probe+0x2d/0x140 [mgag200]
Maybe include a Reported-by: Zack Rusin zackr@vmware.com ? since this seems to be the exact same issue that he reported yesterday.
I'll do.
Patch looks good to me and this seems to be the correct fix IMO. That way we won't re-introduce the issue that was fixed by the sysfb_unregister() function, that is to remove a pdev even if wasn't bound to a driver to prevent a late simpledrm registration to match.
Reviewed-by: Javier Martinez Canillas javierm@redhat.com
Thanks!
I wonder what's the best way to coordinate with Alex to merge this fix and your patch that moves the aperture code out of DRM, since as far as I can tell both should target the v5.20 release.
If nothing else comes in, I'll merge this patch on Monday and send Alex an updated version of the vfio patch.
Please also publish a topic branch for the base of that patch if you're still looking for the non-drm aperture + vfio series to go in through my vfio tree. Thanks,
I have merge the aperture fix, but the vfio thing is getting complicated. Can we merge your vfio patches through drm-misc-next? As the vfio-side of the change already got an r-b from Javier, it should show up in v5.20 then.
Sure, if you'd like to take 165541193265.1955826.8778757616438743090.stgit@omen via the drm tree, feel free, it's obviously the more trivial change of the series. Thanks,
Alex