On Thu, Aug 22, 2019 at 8:42 AM Thomas Hellström (VMware) thomas_os@shipmail.org wrote:
On 8/21/19 9:51 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 08:27:59PM +0200, Thomas Hellström (VMware) wrote:
On 8/21/19 8:11 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 7:06 PM Thomas Hellström (VMware) thomas_os@shipmail.org wrote:
On 8/21/19 6:34 PM, Daniel Vetter wrote:
On Wed, Aug 21, 2019 at 05:54:27PM +0200, Thomas Hellström (VMware) wrote:
> On 8/20/19 4:53 PM, Daniel Vetter wrote:
>> Full audit of everyone:
>>
>> - i915, radeon, amdgpu should be clean per their maintainers.
>>
>> - vram helpers should be fine, they don't do command submission, so
>>   really no business holding struct_mutex while doing copy_*_user. But
>>   I haven't checked them all.
>>
>> - panfrost seems to dma_resv_lock only in panfrost_job_push, which
>>   looks clean.
>>
>> - v3d holds dma_resv locks in the tail of its v3d_submit_cl_ioctl(),
>>   copying from/to userspace happens all in v3d_lookup_bos which is
>>   outside of the critical section.
>>
>> - vmwgfx has a bunch of ioctls that do their own copy_*_user:
>>   - vmw_execbuf_process: First this does some copies in
>>     vmw_execbuf_cmdbuf() and also in the vmw_execbuf_process() itself.
>>     Then comes the usual ttm reserve/validate sequence, then actual
>>     submission/fencing, then unreserving, and finally some more
>>     copy_to_user in vmw_execbuf_copy_fence_user. Glossing over tons of
>>     details, but looks all safe.
>>   - vmw_fence_event_ioctl: No ttm_reserve/dma_resv_lock anywhere to be
>>     seen, seems to only create a fence and copy it out.
>>   - a pile of smaller ioctl in vmwgfx_ioctl.c, no reservations to be
>>     found there.
>>   Summary: vmwgfx seems to be fine too.
>>
>> - virtio: There's virtio_gpu_execbuffer_ioctl, which does all the
>>   copying from userspace before even looking up objects through their
>>   handles, so safe. Plus the getparam/getcaps ioctl, also both safe.
>>
>> - qxl only has qxl_execbuffer_ioctl, which calls into
>>   qxl_process_single_command. There's a lovely comment before the
>>   __copy_from_user_inatomic that the slowpath should be copied from
>>   i915, but I guess that never happened. Try not to be unlucky and get
>>   your CS data evicted between when it's written and the kernel tries
>>   to read it. The only other copy_from_user is for relocs, but those
>>   are done before qxl_release_reserve_list(), which seems to be the
>>   only thing reserving buffers (in the ttm/dma_resv sense) in that
>>   code. So looks safe.
>>
>> - A debugfs file in nouveau_debugfs_pstate_set() and the usif ioctl in
>>   usif_ioctl() look safe. nouveau_gem_ioctl_pushbuf() otoh breaks this
>>   everywhere and needs to be fixed up.
>>
>> Cc: Alex Deucher alexander.deucher@amd.com
>> Cc: Christian König christian.koenig@amd.com
>> Cc: Chris Wilson chris@chris-wilson.co.uk
>> Cc: Thomas Zimmermann tzimmermann@suse.de
>> Cc: Rob Herring robh@kernel.org
>> Cc: Tomeu Vizoso tomeu.vizoso@collabora.com
>> Cc: Eric Anholt eric@anholt.net
>> Cc: Dave Airlie airlied@redhat.com
>> Cc: Gerd Hoffmann kraxel@redhat.com
>> Cc: Ben Skeggs bskeggs@redhat.com
>> Cc: "VMware Graphics" linux-graphics-maintainer@vmware.com
>> Cc: Thomas Hellstrom thellstrom@vmware.com
>> Signed-off-by: Daniel Vetter daniel.vetter@intel.com
>> ---
>>  drivers/dma-buf/dma-resv.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
>> index 42a8f3f11681..3edca10d3faf 100644
>> --- a/drivers/dma-buf/dma-resv.c
>> +++ b/drivers/dma-buf/dma-resv.c
>> @@ -34,6 +34,7 @@
>>  #include <linux/dma-resv.h>
>>  #include <linux/export.h>
>> +#include <linux/sched/mm.h>
>>  /**
>>   * DOC: Reservation Object Overview
>> @@ -107,6 +108,17 @@ void dma_resv_init(struct dma_resv *obj)
>>  			   &reservation_seqcount_class);
>>  	RCU_INIT_POINTER(obj->fence, NULL);
>>  	RCU_INIT_POINTER(obj->fence_excl, NULL);
>> +
>> +	if (IS_ENABLED(CONFIG_LOCKDEP)) {
>> +		if (current->mm)
>> +			down_read(&current->mm->mmap_sem);
>> +		ww_mutex_lock(&obj->lock, NULL);
>> +		fs_reclaim_acquire(GFP_KERNEL);
>> +		fs_reclaim_release(GFP_KERNEL);
>> +		ww_mutex_unlock(&obj->lock);
>> +		if (current->mm)
>> +			up_read(&current->mm->mmap_sem);
>> +	}
>>  }
>>  EXPORT_SYMBOL(dma_resv_init);
> I assume if this would have been easily done and maintainable using only
> lockdep annotation instead of actually acquiring the locks, that would have
> been done?

There's might_lock(), plus a pile of macros, but they don't map obviously,
so pretty good chances I accidentally end up with the wrong type of
annotation. Easier to just take the locks quickly, and stuff that all into
a lockdep-only section to avoid overhead.
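To make the failure mode concrete, here is a minimal, made-up driver
snippet of the bug class this priming is meant to flag (the function and
parameter names are invented for illustration, not taken from any driver):

    /*
     * Illustrative only: copy_from_user() can fault and take mmap_sem,
     * so doing it with the reservation lock held creates a
     * dma_resv -> mmap_sem dependency, the reverse of the
     * mmap_sem -> dma_resv order primed in dma_resv_init() above.
     */
    static int example_submit(struct dma_resv *resv, void __user *uptr,
                              void *cmds, size_t size)
    {
            int ret;

            ret = dma_resv_lock(resv, NULL);
            if (ret)
                    return ret;

            /* BAD: may fault (and thus grab mmap_sem) under the resv lock */
            if (copy_from_user(cmds, uptr, size))
                    ret = -EFAULT;

            dma_resv_unlock(resv);
            return ret;
    }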
> Otherwise LGTM.
>
> Reviewed-by: Thomas Hellström thellstrom@vmware.com
>
> Will test this and let you know if it trips on vmwgfx, but it really
> shouldn't.

Thanks, Daniel
One thing that strikes me is that this puts restrictions on where you can actually initialize a dma_resv, even if locking orders are otherwise obeyed. But that might not be a big problem.
Hm yeah ... the trouble is I need a non-kthread thread so that I have a current->mm. Otherwise I'd have put it into some init section with a temp dma_buf. And I kinda don't want to create a fake ->mm just for lockdep priming. I don't expect this to be a real problem in practice, since before you've called dma_resv_init the reservation lock doesn't exist, so you can't hold it. And you've probably just allocated it, so fs_reclaim is going to be fine. And if you allocate dma_resv objects from your fault handlers I have questions anyway :-)
Coming to think of it, I think vmwgfx sometimes creates bos with another bo's reservation lock held. I guess that would trip both the mmap_sem check and the ww_mutex check?
If you do that, yes we're busted. Do you do that?
Yes, we do, in a couple of places it seems, and it also appears that TTM is doing it, according to Christian.
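For reference, a minimal made-up sketch of that pattern: dma_resv_init() (and hence the priming sequence) runs while another object's reservation is already held, so lockdep sees mmap_sem taken under a dma_resv lock and reports an inversion even though nothing can actually deadlock at init time. Names are invented for illustration:

    /*
     * Hypothetical illustration only: initializing a new reservation
     * object while an existing one is locked.  The priming in
     * dma_resv_init() takes mmap_sem (and the new object's ww_mutex)
     * with 'existing' held, which conflicts with the primed
     * mmap_sem -> dma_resv ordering.
     */
    static int example_create_nested(struct dma_resv *existing,
                                     struct dma_resv *child)
    {
            int ret;

            ret = dma_resv_lock(existing, NULL);
            if (ret)
                    return ret;

            dma_resv_init(child);   /* priming runs under 'existing' */

            dma_resv_unlock(existing);
            return 0;
    }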
I guess this needs a new idea for where to put this ... while making sure everyone gets it. So some evil trick like putting it in drm_open() won't work, since I also want to make sure everyone else using dma-buf follows these rules.
IMO it should be sufficient to establish this locking order once, but I guess dma-buf module init time won't work because we might not have an mm structure?
mm_alloc() is a thing as Chris pointed out, and it works. v3 on its way.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
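For what it's worth, a rough, untested sketch of what priming the ordering once at init time with an allocated mm could look like (details may well differ in the actual v3):

    #if IS_ENABLED(CONFIG_LOCKDEP)
    static int __init dma_resv_lockdep(void)
    {
            struct mm_struct *mm = mm_alloc();
            struct dma_resv obj;

            if (!mm)
                    return -ENOMEM;

            dma_resv_init(&obj);

            /* Prime the mmap_sem -> dma_resv -> fs_reclaim ordering once. */
            down_read(&mm->mmap_sem);
            ww_mutex_lock(&obj.lock, NULL);
            fs_reclaim_acquire(GFP_KERNEL);
            fs_reclaim_release(GFP_KERNEL);
            ww_mutex_unlock(&obj.lock);
            up_read(&mm->mmap_sem);

            mmput(mm);

            return 0;
    }
    subsys_initcall(dma_resv_lockdep);
    #endif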