So I've got a reproducable oops with udl sharing from i915,
start X, connect UDL, randr it into position, rip out udl device, kill X,
we get an oops when dma_unmap_sg in i915_gem_unmap_dma_buf gets called, attachment->dev is pointing to a freed structure, now the drm + udl driver points dev->dev at the USB interface device, however when the device is unplugged, the USB interface device disappears in a poof of smoke, and we just keep a fake shell of the drm device around to keep userspace happy.
So I'm wondering if should be using a different device to create dma-buf objects or whether dma-buf objects need to be keeping a reference on the interface device,
Dave.
[ 147.162846] [drm:i915_gem_unmap_dma_buf] *ERROR* attachment is ffff88002a9f2ab8 ffff880030b1e238 ffff88002a90cf00 [ 147.164228] [drm:i915_gem_unmap_dma_buf] *ERROR* obj is ffff88002a90cf00 [ 147.165641] [drm:i915_gem_unmap_dma_buf] *ERROR* sg is ffff88002a9ef2d0 ffff88002da0ca40 33 [ 147.167089] [drm:i915_gem_unmap_dma_buf] *ERROR* ops is ffff880030829c08 6b6b6b6b6b6b6b6b [ 147.168988] general protection fault: 0000 [#1] SMP [ 147.169379] Modules linked in: udl syscopyarea sysfillrect sysimgblt drm_usb rfcomm fuse ip6table_filter ip6_tables iptable_mangle bnep arc4 iwldvm mac80211 btusb iwlwifi bluetooth snd_hda_codec_conexant snd_hda_codec_generic cfg80211 snd_hda_intel snd_hda_codec snd_hwdep snd_seq e1000e snd_seq_device sdhci_pci sdhci coretemp kvm_intel mmc_core snd_pcm kvm microcode 6lowpan_iphc i2c_i801 thinkpad_acpi snd_timer snd lpc_ich mfd_core rfkill soundcore ptp pps_core wmi firewire_ohci firewire_core crc_itu_t yenta_socket i915 i2c_algo_bit drm_kms_helper drm i2c_core video [ 147.169379] CPU: 0 PID: 481 Comm: Xorg Not tainted 3.14.0-rc7+ #28 [ 147.169379] Hardware name: LENOVO 406334M/406334M, BIOS 6FET79WW (3.09 ) 10/02/2009 [ 147.169379] task: ffff880070ba29d0 ti: ffff88005df74000 task.ti: ffff88005df74000 [ 147.169379] RIP: 0010:[<ffffffffa00ff27e>] [<ffffffffa00ff27e>] i915_gem_unmap_dma_buf+0x10e/0x1a0 [i915] [ 147.169379] RSP: 0018:ffff88005df75c58 EFLAGS: 00010246 [ 147.169379] RAX: 0000000000000000 RBX: ffff88002a9ef2d0 RCX: 00000000831a8319 [ 147.169379] RDX: 00000000000000a0 RSI: 0000000000000001 RDI: ffff88002da0cf40 [ 147.169379] RBP: ffff88005df75c90 R08: 0000000000000000 R09: 0000000000000000 [ 147.169379] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880030829c08 [ 147.169379] R13: ffff88002a90cf00 R14: 0000000000000000 R15: 6b6b6b6b6b6b6b6b [ 147.169379] FS: 00007f5e18ef89c0(0000) GS:ffff880078c00000(0000) knlGS:0000000000000000 [ 147.169379] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 147.169379] CR2: 000000000042fa60 CR3: 000000003f5e7000 CR4: 00000000000007f0 [ 147.169379] Stack: [ 147.169379] 000000212a9f2ab8 ffff88002da0ca40 ffff88002a9f2ab8 ffff88002a9ef2d0 [ 147.169379] 0000000000000000 ffff88003fd413b0 ffff88005df75d70 ffff88005df75cb8 [ 147.169379] ffffffff81487851 ffff88002a9f2ab8 ffff88004446e818 ffff88004446e7b0 [ 147.169379] Call Trace: [ 147.169379] [<ffffffff81487851>] dma_buf_unmap_attachment+0x51/0x80 [ 147.169379] [<ffffffffa00392b2>] drm_prime_gem_destroy+0x22/0x40 [drm] [ 147.169379] [<ffffffffa061e8f5>] udl_gem_free_object+0x35/0x60 [udl] [ 147.169379] [<ffffffffa002241a>] drm_gem_object_free+0x2a/0x30 [drm] [ 147.169379] [<ffffffffa0022a74>] drm_gem_object_handle_unreference_unlocked+0xe4/0x120 [drm] [ 147.169379] [<ffffffffa0022cfc>] drm_gem_object_release_handle+0x5c/0x80 [drm] [ 147.169379] [<ffffffff8136fcc4>] idr_for_each+0x104/0x1a0 [ 147.169379] [<ffffffffa0022ca0>] ? drm_gem_dumb_destroy+0x20/0x20 [drm] [ 147.169379] [<ffffffff8172b5fe>] ? mutex_unlock+0xe/0x10 [ 147.169379] [<ffffffffa0023750>] drm_gem_release+0x20/0x30 [drm] [ 147.169379] [<ffffffffa00222a3>] drm_release+0x533/0x5c0 [drm] [ 147.169379] [<ffffffff811eb2f5>] __fput+0xf5/0x2c0 [ 147.169379] [<ffffffff811eb50e>] ____fput+0xe/0x10 [ 147.169379] [<ffffffff81099874>] task_work_run+0xb4/0xe0 [ 147.169379] [<ffffffff81072d4c>] do_exit+0x2ec/0xcd0
On Tue, Mar 25, 2014 at 4:53 AM, Dave Airlie airlied@gmail.com wrote:
So I've got a reproducable oops with udl sharing from i915,
start X, connect UDL, randr it into position, rip out udl device, kill X,
we get an oops when dma_unmap_sg in i915_gem_unmap_dma_buf gets called, attachment->dev is pointing to a freed structure, now the drm
- udl driver points dev->dev at the USB interface device, however when
the device is unplugged, the USB interface device disappears in a poof of smoke, and we just keep a fake shell of the drm device around to keep userspace happy.
So I'm wondering if should be using a different device to create dma-buf objects or whether dma-buf objects need to be keeping a reference on the interface device,
Don't do any mapping/attaching then since you never need to do dma anyway. And I guess if we ever need to do dma with usb devices for real we just need to teach the usb core about dma-bufs, so that the *hci can do the attaching and dma-mapping.
On a quick look through udl_gem.c vmap will keep on working since it's already forwarded to i915. That leaves mmap - either your userspace doesn't need this for prime buffers or you'd need to implement that one too. Rob Clark had patches floating around for forwarding mmaping through dma-bufs between gem drivers a long time ago, Rob Bradford is working on dma-buf mmap support for i915.
Besides the issue at hand though I think drivers need to make sure that the device they use for attaching does outlive the dma-buf. Which for real hotpluggin probably means that drivers need to drop all attachment on unplug (the dma mapping is useless anyway) and just keep all their imported gem objects alive with just a reference to the dma-buf object itself.
Cheers, Daniel
On 03/25/2014 09:01 AM, Daniel Vetter wrote:
On Tue, Mar 25, 2014 at 4:53 AM, Dave Airlie airlied@gmail.com wrote:
So I've got a reproducable oops with udl sharing from i915,
start X, connect UDL, randr it into position, rip out udl device, kill X,
we get an oops when dma_unmap_sg in i915_gem_unmap_dma_buf gets called, attachment->dev is pointing to a freed structure, now the drm
- udl driver points dev->dev at the USB interface device, however when
the device is unplugged, the USB interface device disappears in a poof of smoke, and we just keep a fake shell of the drm device around to keep userspace happy.
So I'm wondering if should be using a different device to create dma-buf objects or whether dma-buf objects need to be keeping a reference on the interface device,
Don't do any mapping/attaching then since you never need to do dma anyway. And I guess if we ever need to do dma with usb devices for real we just need to teach the usb core about dma-bufs, so that the *hci can do the attaching and dma-mapping.
On a quick look through udl_gem.c vmap will keep on working since it's already forwarded to i915. That leaves mmap - either your userspace doesn't need this for prime buffers or you'd need to implement that one too. Rob Clark had patches floating around for forwarding mmaping through dma-bufs between gem drivers a long time ago, Rob Bradford is working on dma-buf mmap support for i915.
Ouch, here we go!
I though the idea was *not* to implement dma-buf mmap() on the major drivers :( ???
/Thomas
Hi
On Tue, Mar 25, 2014 at 9:01 AM, Daniel Vetter daniel@ffwll.ch wrote:
Besides the issue at hand though I think drivers need to make sure that the device they use for attaching does outlive the dma-buf. Which for real hotpluggin probably means that drivers need to drop all attachment on unplug (the dma mapping is useless anyway) and just keep all their imported gem objects alive with just a reference to the dma-buf object itself.
Drivers should never touch other drivers or even look at them. There is no reason i915 is responsible of keeping udl alive. That gets really messy and may introduce circular dependencies.
I'd like to see exported dma-bufs reference their drm-device owner. This way, the drm-device stays around until the dma-buf is removed. To avoid lazy device destruction, a driver can (during unplug) simply detach all dma-bufs if, and only if, they first made the dma-buf somehow stand-alone.
For example: udl can just move the allocated pages into the dma-buf, mark it as dead and detach it. The udl-device can get destructed and whenever i915 releases the dma-buf, the udl-dma-buf ops see it's dead and just deref it / release dma-buf resources. But this is all not needed if the exported dma-bufs just reference "drm_device", which is imho the easiest fix.
David
On Tue, Mar 25, 2014 at 7:40 PM, David Herrmann dh.herrmann@gmail.com wrote:
Hi
On Tue, Mar 25, 2014 at 9:01 AM, Daniel Vetter daniel@ffwll.ch wrote:
Besides the issue at hand though I think drivers need to make sure that the device they use for attaching does outlive the dma-buf. Which for real hotpluggin probably means that drivers need to drop all attachment on unplug (the dma mapping is useless anyway) and just keep all their imported gem objects alive with just a reference to the dma-buf object itself.
Drivers should never touch other drivers or even look at them. There is no reason i915 is responsible of keeping udl alive. That gets really messy and may introduce circular dependencies.
I'd like to see exported dma-bufs reference their drm-device owner. This way, the drm-device stays around until the dma-buf is removed. To avoid lazy device destruction, a driver can (during unplug) simply detach all dma-bufs if, and only if, they first made the dma-buf somehow stand-alone.
For example: udl can just move the allocated pages into the dma-buf, mark it as dead and detach it. The udl-device can get destructed and whenever i915 releases the dma-buf, the udl-dma-buf ops see it's dead and just deref it / release dma-buf resources. But this is all not needed if the exported dma-bufs just reference "drm_device", which is imho the easiest fix.
I sent a patch to fix the udl case to just keep a reference on the usb_interface device it seems to work, we probably do need to do this in theory for pci devices as well in a hotplug/sysfs plug situation I suppose,
Dave.
On Tue, Mar 25, 2014 at 10:40:45AM +0100, David Herrmann wrote:
Hi
On Tue, Mar 25, 2014 at 9:01 AM, Daniel Vetter daniel@ffwll.ch wrote:
Besides the issue at hand though I think drivers need to make sure that the device they use for attaching does outlive the dma-buf. Which for real hotpluggin probably means that drivers need to drop all attachment on unplug (the dma mapping is useless anyway) and just keep all their imported gem objects alive with just a reference to the dma-buf object itself.
Drivers should never touch other drivers or even look at them. There is no reason i915 is responsible of keeping udl alive. That gets really messy and may introduce circular dependencies.
I'd like to see exported dma-bufs reference their drm-device owner. This way, the drm-device stays around until the dma-buf is removed. To avoid lazy device destruction, a driver can (during unplug) simply detach all dma-bufs if, and only if, they first made the dma-buf somehow stand-alone.
For example: udl can just move the allocated pages into the dma-buf, mark it as dead and detach it. The udl-device can get destructed and whenever i915 releases the dma-buf, the udl-dma-buf ops see it's dead and just deref it / release dma-buf resources. But this is all not needed if the exported dma-bufs just reference "drm_device", which is imho the easiest fix.
The issue at hand is the other way round, i.e. i915 exporting, udl importing. The i915/dma-buf falls over because the device struct referenced in the attachment disappeared. Dave fixed it by grabbing a reference, I think the better fix would be to simply not attach (since udl doesn't need it anyway). Either works. -Daniel
dri-devel@lists.freedesktop.org