On Wed, Jul 17, 2013 at 06:38:35PM +0200, David Herrmann wrote:
Hi
On Tue, Jul 16, 2013 at 9:12 AM, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
This is the 2nd attempt; I've always been a bit dissatisfied with the tricky nature of the first one:
http://lists.freedesktop.org/archives/dri-devel/2012-July/025451.html
The issue is that the flink ioctl can race with calling gem_close on the last gem handle. In that case we'll end up with a zero handle count, but an flink name (and its corresponding reference), which results in a neat space leak.
In my first attempt I solved this by rechecking the handle count. But fundamentally the issue is that ->handle_count isn't your usual refcount - it can be resurrected from 0, among other things.
For those special beasts atomic_t often suggests way more ordering than it actually guarantees. To avoid being tricked by those hairy semantics, take the easy way out and simply protect the handle count with the existing dev->object_name_lock.
With that change implemented it's dead easy to fix the flink vs. gem close race: When we try to create the name we simply have to check whether there's still officially a gem handle around and, if not, refuse to create the flink name. Since the handle count decrement and flink name destruction are now also protected by that lock, the race is gone and we can't ever leak the flink reference again.
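A minimal userspace sketch of that scheme, not the kernel code: struct model_obj and the model_* functions are hypothetical stand-ins, refcount is a plain int instead of a kref, and each function body represents what would run inside one dev->object_name_lock critical section. It only illustrates that once the "any handles left?" check and the last-handle cleanup sit under the same lock, a name cannot outlive the last handle.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct drm_gem_object (illustration only). */
struct model_obj {
	unsigned handle_count;	/* would be protected by dev->object_name_lock */
	int name;		/* 0 means no flink name */
	int refcount;		/* plain int here; the kernel uses a kref */
};

/* Each function body below models one object_name_lock critical section. */

static void model_handle_create(struct model_obj *obj)
{
	obj->handle_count++;
	obj->refcount++;		/* each handle holds a reference */
}

static void model_handle_close(struct model_obj *obj)
{
	assert(obj->handle_count > 0);
	if (--obj->handle_count == 0 && obj->name) {
		obj->name = 0;
		obj->refcount--;	/* drop the name's reference */
	}
	obj->refcount--;		/* drop the handle's reference */
}

static int model_flink(struct model_obj *obj, int new_name)
{
	if (obj->handle_count == 0)
		return -ENOENT;		/* last handle already gone: refuse */
	if (!obj->name) {
		obj->name = new_name;
		obj->refcount++;	/* the name holds a reference */
	}
	return 0;
}
```

With one handle open, model_flink() succeeds and takes a reference; after model_handle_close() drops that handle, the name and its reference are gone and a later model_flink() gets -ENOENT instead of recreating them.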
Outside of the drm core only the exynos driver looks at the handle count, and tbh I have no idea why (it's just for debug dmesg output luckily).
I've considered inlining the drm_gem_object_handle_free, but I plan to add more name-like things (like the exported dma_buf) to this scheme, so it's clearer to leave the handle freeing in its own function.
v2: Fix up the error path handling in handle_create and make it more robust by simply calling object_handle_unreference.
v3: Fix up the handle_unreference logic bug - atomic_dec_and_test returns 1 when the counter hits 0. Oops.
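For reference, the semantics behind the v3 note can be sketched in userspace. This uses C11 atomics as an assumed stand-in for the kernel's atomic_t; the two helper names are made up for illustration. Both helpers return true only when the decrement brings the counter to 0, which is what the open-coded "--obj->handle_count == 0" has to preserve.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace approximation of the kernel's atomic_dec_and_test(). */
static bool model_atomic_dec_and_test(atomic_int *v)
{
	/* atomic_fetch_sub() returns the old value, so old == 1 means new == 0 */
	return atomic_fetch_sub(v, 1) == 1;
}

/* The same test on a plain counter that is protected by a lock. */
static bool model_plain_dec_and_test(unsigned *count)
{
	assert(*count > 0);	/* mirrors the WARN_ON in the patch */
	return --*count == 0;
}
```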
Cc: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
 drivers/gpu/drm/drm_gem.c               | 34 +++++++++++++++++++++-------------
 drivers/gpu/drm/drm_info.c              |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  2 +-
 include/drm/drmP.h                      | 12 ++++++++++--
 4 files changed, 33 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index b07519e..14c70b5 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -140,7 +140,7 @@ int drm_gem_object_init(struct drm_device *dev,
 		return PTR_ERR(obj->filp);
 
 	kref_init(&obj->refcount);
-	atomic_set(&obj->handle_count, 0);
+	obj->handle_count = 0;
 	obj->size = size;
 
 	return 0;
@@ -161,7 +161,7 @@ int drm_gem_private_object_init(struct drm_device *dev,
 	obj->filp = NULL;
 
 	kref_init(&obj->refcount);
-	atomic_set(&obj->handle_count, 0);
+	obj->handle_count = 0;
 	obj->size = size;
 
 	return 0;
@@ -227,11 +227,9 @@ static void drm_gem_object_handle_free(struct drm_gem_object *obj)
 	struct drm_device *dev = obj->dev;
 
 	/* Remove any name for this object */
-	spin_lock(&dev->object_name_lock);
 	if (obj->name) {
 		idr_remove(&dev->object_name_idr, obj->name);
 		obj->name = 0;
-		spin_unlock(&dev->object_name_lock);
 		/*
 		 * The object name held a reference to this object, drop
 		 * that now.
@@ -239,15 +237,13 @@ static void drm_gem_object_handle_free(struct drm_gem_object *obj)
 		 * This cannot be the last reference, since the handle holds one too.
 		 */
 		kref_put(&obj->refcount, drm_gem_object_ref_bug);
-	} else
-		spin_unlock(&dev->object_name_lock);
+	}
 }
 void
 drm_gem_object_handle_unreference_unlocked(struct drm_gem_object *obj)
 {
-	if (WARN_ON(atomic_read(&obj->handle_count) == 0))
+	if (WARN_ON(obj->handle_count == 0))
 		return;
 
 	/*
@@ -256,8 +252,11 @@ drm_gem_object_handle_unreference_unlocked(struct drm_gem_object *obj)
 	* checked for a name
 	*/
 
-	if (atomic_dec_and_test(&obj->handle_count))
+	spin_lock(&obj->dev->object_name_lock);
+	if (--obj->handle_count == 0)
 		drm_gem_object_handle_free(obj);
If you inline this here, you can actually drop the huge comment for "caller still holds reference" as you call "object_unreference()" below, anyway. And with this patch, gem_object_handle_free() is pretty small, anyway.
I don't actually understand what you try to say in the commit message about new name-like stuff, but if you reuse it, it's fine.
Later patches will add a 2nd function call here to clean up dma-buf references (that's the name-like stuff), so I figured inlining would actually reduce code readability in the end. Hence why I don't do this here.
+	spin_unlock(&obj->dev->object_name_lock);
 	drm_gem_object_unreference_unlocked(obj);
 }
@@ -321,18 +320,21 @@ drm_gem_handle_create(struct drm_file *file_priv,
 	 * allocation under our spinlock.
 	 */
 	idr_preload(GFP_KERNEL);
+	spin_lock(&dev->object_name_lock);
 	spin_lock(&file_priv->table_lock);
 
 	ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
+	drm_gem_object_reference(obj);
+	obj->handle_count++;
 	spin_unlock(&file_priv->table_lock);
+	spin_unlock(&dev->object_name_lock);
 	idr_preload_end();
-	if (ret < 0)
+	if (ret < 0) {
+		drm_gem_object_handle_unreference_unlocked(obj);
 		return ret;
+	}
The locking order isn't really documented. What's wrong with:
	idr_preload(GFP_KERNEL);
	spin_lock(&file_priv->table_lock);
	ret = idr_alloc(&file_priv->object_idr, obj, 1, 0, GFP_NOWAIT);
	spin_unlock(&file_priv->table_lock);
	idr_preload_end();
At this spot a 2nd thread could sneak in with a gem_close (handle names are easily guessable) which drops the handle reference and removes the handle before we've fully set things up. End result is that we leak a reference (since we decrement from 0 to -1 so won't treat it as the last unref), and the handle_count++ later on here restores it to 0. But the reference won't ever get cleaned up again.
Hence we need to protect the entire section from concurrent gem_close calls.
	if (ret < 0)
		return ret;

	spin_lock(&dev->object_name_lock);
	obj->handle_count++;
	spin_unlock(&dev->object_name_lock);
	drm_gem_object_reference(obj);
This is safe against flink() as we don't care whether flink() fails if user-space isn't even aware of the handle, yet. And if user-space already has a handle, then "handle_count" is >0, anyway.
And gem_object_handle_unreference can only be called if another handle exists (thus, handle_count > 0).
Or am I missing something?
See above, userspace can sneak in before we've actually incremented handle_count.
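The interleaving described above can be replayed deterministically in a small userspace model. This is only a sketch under assumptions: struct race_model and its helpers are hypothetical, the count is a signed int so the intermediate -1 is visible, and "last_unref_runs" merely counts how often the last-handle cleanup (the part that would drop the flink name and its reference) gets to run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the count is signed so the -1 state is visible. */
struct race_model {
	int handle_count;
	bool handle_visible;	/* handle is in the per-file idr */
	int last_unref_runs;	/* how often the last-handle cleanup ran */
};

static void publish_handle(struct race_model *m) { m->handle_visible = true; }
static void count_handle(struct race_model *m) { m->handle_count++; }

/* What a concurrent gem_close does once it can look up the handle. */
static void close_handle(struct race_model *m)
{
	if (!m->handle_visible)
		return;
	m->handle_visible = false;
	if (--m->handle_count == 0)
		m->last_unref_runs++;
}

/* Split ordering: the close lands between publish and count. */
static struct race_model replay_split(void)
{
	struct race_model m = { 0, false, 0 };

	publish_handle(&m);
	close_handle(&m);		/* 0 -> -1, not treated as the last unref */
	assert(m.handle_count == -1);
	count_handle(&m);		/* -1 -> 0, nobody is left to clean up */
	return m;
}

/* Ordering from the patch: publish and count are one critical section. */
static struct race_model replay_locked(void)
{
	struct race_model m = { 0, false, 0 };

	publish_handle(&m);
	count_handle(&m);
	close_handle(&m);		/* 1 -> 0, cleanup runs */
	return m;
}
```

Both replays end with a handle count of 0 and no visible handle, but only replay_locked() ever runs the last-handle cleanup; in replay_split() it is skipped for good.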
 	*handlep = ret;
 
-	drm_gem_object_reference(obj);
-	atomic_inc(&obj->handle_count);
 
 	if (dev->driver->gem_open_object) {
 		ret = dev->driver->gem_open_object(obj, file_priv);
@@ -499,6 +501,12 @@ drm_gem_flink_ioctl(struct drm_device *dev, void *data,
 
 	idr_preload(GFP_KERNEL);
 	spin_lock(&dev->object_name_lock);
+	/* prevent races with concurrent gem_close. */
+	if (obj->handle_count == 0) {
+		ret = -ENOENT;
+		goto err;
spin_unlock(&dev->object_name_lock); ?
Oops, will fix.
Aside from the spin_unlock(), it's all a matter of taste, so:
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>

Cheers
David
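One way to keep the early exit from leaving the lock held is to route every exit through a single unlock label. The sketch below is a reduced userspace model, not the actual fix: toy_lock only records whether it is held, and struct flink_model and flink_model_create() are hypothetical names. The asserts in the toy lock would trip on a double acquire, so a leaked lock shows up on the next call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy lock that only records whether it is held (not a real spinlock). */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
static void toy_release(struct toy_lock *l) { assert(l->held); l->held = false; }

/* Hypothetical reduced model of the flink path. */
struct flink_model {
	struct toy_lock name_lock;
	unsigned handle_count;
	int name;
};

static int flink_model_create(struct flink_model *m, int new_name)
{
	int ret = 0;

	toy_acquire(&m->name_lock);
	if (m->handle_count == 0) {
		ret = -ENOENT;
		goto err;		/* must still pass through the unlock */
	}
	if (!m->name)
		m->name = new_name;
err:
	toy_release(&m->name_lock);
	return ret;
}
```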
+	}
 	if (!obj->name) {
 		ret = idr_alloc(&dev->object_name_idr, obj, 1, 0, GFP_NOWAIT);
 		if (ret < 0)
diff --git a/drivers/gpu/drm/drm_info.c b/drivers/gpu/drm/drm_info.c
index d4b20ce..f4b348c 100644
--- a/drivers/gpu/drm/drm_info.c
+++ b/drivers/gpu/drm/drm_info.c
@@ -207,7 +207,7 @@ static int drm_gem_one_name_info(int id, void *ptr, void *data)
 
 	seq_printf(m, "%6d %8zd %7d %8d\n",
 		   obj->name, obj->size,
-		   atomic_read(&obj->handle_count),
+		   obj->handle_count,
 		   atomic_read(&obj->refcount.refcount));
 	return 0;
 }
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 24c22a8..16963ca 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -135,7 +135,7 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 	obj = &exynos_gem_obj->base;
 	buf = exynos_gem_obj->buffer;
 
-	DRM_DEBUG_KMS("handle count = %d\n", atomic_read(&obj->handle_count));
+	DRM_DEBUG_KMS("handle count = %d\n", obj->handle_count);
 
 	/*
 	 * do not release memory region from exporter.
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 2fb83b4..25da8e0 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -634,8 +634,16 @@ struct drm_gem_object {
 	/** Reference count of this object */
 	struct kref refcount;
 
-	/** Handle count of this object. Each handle also holds a reference */
-	atomic_t handle_count; /* number of handles on this object */
+	/**
+	 * handle_count - gem file_priv handle count of this object
+	 *
+	 * Each handle also holds a reference. Note that when the handle_count
+	 * drops to 0 any global names (e.g. the id in the flink namespace) will
+	 * be cleared.
+	 *
+	 * Protected by dev->object_name_lock.
+	 * */
+	unsigned handle_count;
 
 	/** Related drm device */
 	struct drm_device *dev;
-- 
1.8.3.2