Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
+ if (dev->driver->load) { + if (!drm_core_check_feature(dev, DRIVER_LEGACY)) + DRM_INFO("drm driver %s is using deprecated ->load callback\n", + dev->driver->name); + } + ret = drm_minor_register(dev, DRM_MINOR_RENDER); if (ret) goto err_minors; diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 77685ed7aa65..77bc63de0a91 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -173,6 +173,9 @@ struct drm_driver { * * This is deprecated, do not use! * + * FIXME: A few non-DRIVER_LEGACY drivers still use this, and should be + * converted. + * * Returns: * * Zero on success, non-zero value on failure.
Instead check for master status, in case we've raced.
This is the last exception to the general rule that we restore fbcon only when there's no master active. Compositors are supposed to drop their master status before they switch to a different console back to text mode (or just switch to text mode directly, without a vt switch).
This is known to break some subtests of kms_fbcon_fbt in igt, but they're just wrong - it does a graphics/text mode switch for the vt without updating the master status.
Also add a comment to the drm_client->restore hook that this is expected going forward from all clients (there's currently just one).
v2: Also drop the force in pan_display
Cc: Noralf Trønnes noralf@tronnes.org Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/gpu/drm/drm_fb_helper.c | 14 ++------------ include/drm/drm_client.h | 5 +++++ 2 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 4c7cbce7bae7..926187a82255 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -250,17 +250,7 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) return 0;
mutex_lock(&fb_helper->lock); - /* - * TODO: - * We should bail out here if there is a master by dropping _force. - * Currently these igt tests fail if we do that: - * - kms_fbcon_fbt@psr - * - kms_fbcon_fbt@psr-suspend - * - * So first these tests need to be fixed so they drop master or don't - * have an fd open. - */ - ret = drm_client_modeset_commit_force(&fb_helper->client); + ret = drm_client_modeset_commit(&fb_helper->client);
do_delayed = fb_helper->delayed_hotplug; if (do_delayed) @@ -1357,7 +1347,7 @@ static int pan_display_atomic(struct fb_var_screeninfo *var,
pan_set(fb_helper, var->xoffset, var->yoffset);
- ret = drm_client_modeset_commit_force(&fb_helper->client); + ret = drm_client_modeset_commit(&fb_helper->client); if (!ret) { info->var.xoffset = var->xoffset; info->var.yoffset = var->yoffset; diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h index 5cf2c5dd8b1e..d01d311023ac 100644 --- a/include/drm/drm_client.h +++ b/include/drm/drm_client.h @@ -44,6 +44,11 @@ struct drm_client_funcs { * returns zero gets the privilege to restore and no more clients are * called. This callback is not called after @unregister has been called. * + * Note that the core does not guarantee exclusion against concurrent + * drm_open(). Clients need to ensure this themselves, for example by + * using drm_master_internal_acquire() and + * drm_master_internal_release(). + * * This callback is optional. */ int (*restore)(struct drm_client_dev *client);
Den 28.01.2020 11.45, skrev Daniel Vetter:
Instead check for master status, in case we've raced.
This is the last exception to the general rule that we restore fbcon only when there's no master active. Compositors are supposed to drop their master status before they switch to a different console back to text mode (or just switch to text mode directly, without a vt switch).
This is known to break some subtests of kms_fbcon_fbt in igt, but they're just wrong - it does a graphics/text mode switch for the vt without updating the master status.
Also add a comment to the drm_client->restore hook that this is expected going forward from all clients (there's currently just one).
v2: Also drop the force in pan_display
Cc: Noralf Trønnes noralf@tronnes.org Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_fb_helper.c | 14 ++------------ include/drm/drm_client.h | 5 +++++ 2 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 4c7cbce7bae7..926187a82255 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -250,17 +250,7 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) return 0;
mutex_lock(&fb_helper->lock);
- /*
* TODO:
* We should bail out here if there is a master by dropping _force.
* Currently these igt tests fail if we do that:
* - kms_fbcon_fbt@psr
* - kms_fbcon_fbt@psr-suspend
*
* So first these tests need to be fixed so they drop master or don't
* have an fd open.
*/
- ret = drm_client_modeset_commit_force(&fb_helper->client);
ret = drm_client_modeset_commit(&fb_helper->client);
do_delayed = fb_helper->delayed_hotplug; if (do_delayed)
@@ -1357,7 +1347,7 @@ static int pan_display_atomic(struct fb_var_screeninfo *var,
pan_set(fb_helper, var->xoffset, var->yoffset);
- ret = drm_client_modeset_commit_force(&fb_helper->client);
- ret = drm_client_modeset_commit(&fb_helper->client);
This needs _force because drm_fb_helper_pan_display() already holds the locks.
With that fixed: Reviewed-by: Noralf Trønnes noralf@tronnes.org
Maybe a better and more descriptive name would have been drm_client_modeset_commit_locked().
Noralf.
if (!ret) { info->var.xoffset = var->xoffset; info->var.yoffset = var->yoffset; diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h index 5cf2c5dd8b1e..d01d311023ac 100644 --- a/include/drm/drm_client.h +++ b/include/drm/drm_client.h @@ -44,6 +44,11 @@ struct drm_client_funcs { * returns zero gets the privilege to restore and no more clients are * called. This callback is not called after @unregister has been called. *
* Note that the core does not guarantee exclusion against concurrent
* drm_open(). Clients need to ensure this themselves, for example by
* using drm_master_internal_acquire() and
* drm_master_internal_release().
*
*/ int (*restore)(struct drm_client_dev *client);
- This callback is optional.
On Tue, Jan 28, 2020 at 12:52:05PM +0100, Noralf Trønnes wrote:
Den 28.01.2020 11.45, skrev Daniel Vetter:
Instead check for master status, in case we've raced.
This is the last exception to the general rule that we restore fbcon only when there's no master active. Compositors are supposed to drop their master status before they switch to a different console back to text mode (or just switch to text mode directly, without a vt switch).
This is known to break some subtests of kms_fbcon_fbt in igt, but they're just wrong - it does a graphics/text mode switch for the vt without updating the master status.
Also add a comment to the drm_client->restore hook that this is expected going forward from all clients (there's currently just one).
v2: Also drop the force in pan_display
Cc: Noralf Trønnes noralf@tronnes.org Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_fb_helper.c | 14 ++------------ include/drm/drm_client.h | 5 +++++ 2 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c index 4c7cbce7bae7..926187a82255 100644 --- a/drivers/gpu/drm/drm_fb_helper.c +++ b/drivers/gpu/drm/drm_fb_helper.c @@ -250,17 +250,7 @@ int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper) return 0;
mutex_lock(&fb_helper->lock);
- /*
* TODO:
* We should bail out here if there is a master by dropping _force.
* Currently these igt tests fail if we do that:
* - kms_fbcon_fbt@psr
* - kms_fbcon_fbt@psr-suspend
*
* So first these tests need to be fixed so they drop master or don't
* have an fd open.
*/
- ret = drm_client_modeset_commit_force(&fb_helper->client);
ret = drm_client_modeset_commit(&fb_helper->client);
do_delayed = fb_helper->delayed_hotplug; if (do_delayed)
@@ -1357,7 +1347,7 @@ static int pan_display_atomic(struct fb_var_screeninfo *var,
pan_set(fb_helper, var->xoffset, var->yoffset);
- ret = drm_client_modeset_commit_force(&fb_helper->client);
- ret = drm_client_modeset_commit(&fb_helper->client);
This needs _force because drm_fb_helper_pan_display() already holds the locks.
Geez, now I remember again why I did _not_ include this in v1 :-/
With that fixed: Reviewed-by: Noralf Trønnes noralf@tronnes.org
Maybe a better and more descriptive name would have been drm_client_modeset_commit_locked().
This sounds like a good idea, I'll do a patch for this. I'll need to resend the series anyway to be able to co-test it with the igt side fix.
Thanks for your review. -Daniel
Noralf.
if (!ret) { info->var.xoffset = var->xoffset; info->var.yoffset = var->yoffset; diff --git a/include/drm/drm_client.h b/include/drm/drm_client.h index 5cf2c5dd8b1e..d01d311023ac 100644 --- a/include/drm/drm_client.h +++ b/include/drm/drm_client.h @@ -44,6 +44,11 @@ struct drm_client_funcs { * returns zero gets the privilege to restore and no more clients are * called. This callback is not called after @unregister has been called. *
* Note that the core does not guarantee exclusion against concurrent
* drm_open(). Clients need to ensure this themselves, for example by
* using drm_master_internal_acquire() and
* drm_master_internal_release().
*
*/ int (*restore)(struct drm_client_dev *client);
- This callback is optional.
We want to only take the BKL on crap drivers, but to know whether we have a crap driver we first need to look it up. Split this shuffle out from the main BKL-disabling patch, for more clarity.
Since the minors are refcounted drm_minor_acquire is purely internal and this does not have a driver visible effect.
v2: Push the locking even further into drm_open(), suggested by Chris. This gives us more symmetry with drm_release(), and maybe a futuer avenue where we make drm_globale_mutex locking (partially) opt-in like with drm_release_noglobal().
Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/gpu/drm/drm_drv.c | 14 +++++--------- drivers/gpu/drm/drm_file.c | 6 ++++++ 2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 8deff75b484c..05bdf0b9d2b3 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -1085,17 +1085,14 @@ static int drm_stub_open(struct inode *inode, struct file *filp)
DRM_DEBUG("\n");
- mutex_lock(&drm_global_mutex); minor = drm_minor_acquire(iminor(inode)); - if (IS_ERR(minor)) { - err = PTR_ERR(minor); - goto out_unlock; - } + if (IS_ERR(minor)) + return PTR_ERR(minor);
new_fops = fops_get(minor->dev->driver->fops); if (!new_fops) { err = -ENODEV; - goto out_release; + goto out; }
replace_fops(filp, new_fops); @@ -1104,10 +1101,9 @@ static int drm_stub_open(struct inode *inode, struct file *filp) else err = 0;
-out_release: +out: drm_minor_release(minor); -out_unlock: - mutex_unlock(&drm_global_mutex); + return err; }
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 1075b3a8b5b1..d36cb74ebe0c 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -378,6 +378,8 @@ int drm_open(struct inode *inode, struct file *filp) if (IS_ERR(minor)) return PTR_ERR(minor);
+ mutex_unlock(&drm_global_mutex); + dev = minor->dev; if (!atomic_fetch_inc(&dev->open_count)) need_setup = 1; @@ -395,10 +397,14 @@ int drm_open(struct inode *inode, struct file *filp) goto err_undo; } } + + mutex_unlock(&drm_global_mutex); + return 0;
err_undo: atomic_dec(&dev->open_count); + mutex_unlock(&drm_global_mutex); drm_minor_release(minor); return retcode; }
Quoting Daniel Vetter (2020-01-28 10:46:00)
We want to only take the BKL on crap drivers, but to know whether we have a crap driver we first need to look it up. Split this shuffle out from the main BKL-disabling patch, for more clarity.
Since the minors are refcounted drm_minor_acquire is purely internal and this does not have a driver visible effect.
v2: Push the locking even further into drm_open(), suggested by Chris. This gives us more symmetry with drm_release(), and maybe a futuer avenue where we make drm_globale_mutex locking (partially) opt-in like with drm_release_noglobal().
Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 14 +++++--------- drivers/gpu/drm/drm_file.c | 6 ++++++ 2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 8deff75b484c..05bdf0b9d2b3 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -1085,17 +1085,14 @@ static int drm_stub_open(struct inode *inode, struct file *filp)
DRM_DEBUG("\n");
mutex_lock(&drm_global_mutex); minor = drm_minor_acquire(iminor(inode));
if (IS_ERR(minor)) {
err = PTR_ERR(minor);
goto out_unlock;
}
if (IS_ERR(minor))
return PTR_ERR(minor); new_fops = fops_get(minor->dev->driver->fops); if (!new_fops) { err = -ENODEV;
goto out_release;
goto out; } replace_fops(filp, new_fops);
@@ -1104,10 +1101,9 @@ static int drm_stub_open(struct inode *inode, struct file *filp) else err = 0;
-out_release: +out: drm_minor_release(minor); -out_unlock:
mutex_unlock(&drm_global_mutex);
return err;
}
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 1075b3a8b5b1..d36cb74ebe0c 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -378,6 +378,8 @@ int drm_open(struct inode *inode, struct file *filp) if (IS_ERR(minor)) return PTR_ERR(minor);
mutex_unlock(&drm_global_mutex);
dev = minor->dev; if (!atomic_fetch_inc(&dev->open_count)) need_setup = 1;
@@ -395,10 +397,14 @@ int drm_open(struct inode *inode, struct file *filp) goto err_undo; } }
mutex_unlock(&drm_global_mutex);
The only reason why I could think it was in drm_stub_open() not drm_open() was for the possibility of some driver using a different callback. Such a driver would not be partaking in the drm_global_mutex so... Reviewed-by: Chris Wilson chris@chris-wilson.co.uk -Chris
This catches the majority of drivers (unfortunately not if we take users into account, because all the big drivers have at least a lastclose hook).
With the prep patches out of the way all drm state is fully protected and either prevents or can deal with the races from dropping the BKL around open/close. The only thing left to audit are the various driver hooks - by keeping the BKL around if any of them are set we have a very simple cop-out!
Note that one of the biggest prep pieces to get here was making dev->open_count atomic, which was done in
commit 7e13ad896484a0165a68197a2e64091ea28c9602 Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jan 24 13:01:07 2020 +0000
drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter daniel.vetter@intel.com --- drivers/gpu/drm/drm_drv.c | 6 +++-- drivers/gpu/drm/drm_file.c | 46 ++++++++++++++++++++++++++++++---- drivers/gpu/drm/drm_internal.h | 1 + 3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 05bdf0b9d2b3..9fcd6ab3c154 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) struct drm_driver *driver = dev->driver; int ret;
- mutex_lock(&drm_global_mutex); + if (drm_dev_needs_global_mutex(dev)) + mutex_lock(&drm_global_mutex);
if (dev->driver->load) { if (!drm_core_check_feature(dev, DRIVER_LEGACY)) @@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) drm_minor_unregister(dev, DRM_MINOR_PRIMARY); drm_minor_unregister(dev, DRM_MINOR_RENDER); out_unlock: - mutex_unlock(&drm_global_mutex); + if (drm_dev_needs_global_mutex(dev)) + mutex_unlock(&drm_global_mutex); return ret; } EXPORT_SYMBOL(drm_dev_register); diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index d36cb74ebe0c..efd6fe0b6b4f 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -51,6 +51,37 @@ /* from BKL pushdown */ DEFINE_MUTEX(drm_global_mutex);
+bool drm_dev_needs_global_mutex(struct drm_device *dev) +{ + /* + * Legacy drivers rely on all kinds of BKL locking semantics, don't + * bother. They also still need BKL locking for their ioctls, so better + * safe than sorry. + */ + if (drm_core_check_feature(dev, DRIVER_LEGACY)) + return true; + + /* + * The deprecated ->load callback must be called after the driver is + * already registered. This means such drivers rely on the BKL to make + * sure an open can't proceed until the driver is actually fully set up. + * Similar hilarity holds for the unload callback. + */ + if (dev->driver->load || dev->driver->unload) + return true; + + /* + * Drivers with the lastclose callback assume that it's synchronized + * against concurrent opens, which again needs the BKL. The proper fix + * is to use the drm_client infrastructure with proper locking for each + * client. + */ + if (dev->driver->lastclose) + return true; + + return false; +} + /** * DOC: file operations * @@ -378,9 +409,10 @@ int drm_open(struct inode *inode, struct file *filp) if (IS_ERR(minor)) return PTR_ERR(minor);
- mutex_unlock(&drm_global_mutex); - dev = minor->dev; + if (drm_dev_needs_global_mutex(dev)) + mutex_unlock(&drm_global_mutex); + if (!atomic_fetch_inc(&dev->open_count)) need_setup = 1;
@@ -398,13 +430,15 @@ int drm_open(struct inode *inode, struct file *filp) } }
- mutex_unlock(&drm_global_mutex); + if (drm_dev_needs_global_mutex(dev)) + mutex_unlock(&drm_global_mutex);
return 0;
err_undo: atomic_dec(&dev->open_count); - mutex_unlock(&drm_global_mutex); + if (drm_dev_needs_global_mutex(dev)) + mutex_unlock(&drm_global_mutex); drm_minor_release(minor); return retcode; } @@ -444,6 +478,7 @@ int drm_release(struct inode *inode, struct file *filp) struct drm_minor *minor = file_priv->minor; struct drm_device *dev = minor->dev;
+ if (drm_dev_needs_global_mutex(dev)) mutex_lock(&drm_global_mutex);
DRM_DEBUG("open_count = %d\n", atomic_read(&dev->open_count)); @@ -453,7 +488,8 @@ int drm_release(struct inode *inode, struct file *filp) if (atomic_dec_and_test(&dev->open_count)) drm_lastclose(dev);
- mutex_unlock(&drm_global_mutex); + if (drm_dev_needs_global_mutex(dev)) + mutex_unlock(&drm_global_mutex);
drm_minor_release(minor);
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h index 6937bf923f05..aeec2e68d772 100644 --- a/drivers/gpu/drm/drm_internal.h +++ b/drivers/gpu/drm/drm_internal.h @@ -41,6 +41,7 @@ struct drm_printer;
/* drm_file.c */ extern struct mutex drm_global_mutex; +bool drm_dev_needs_global_mutex(struct drm_device *dev); struct drm_file *drm_file_alloc(struct drm_minor *minor); void drm_file_free(struct drm_file *file); void drm_lastclose(struct drm_device *dev);
Quoting Daniel Vetter (2020-01-28 10:46:01)
This catches the majority of drivers (unfortunately not if we take users into account, because all the big drivers have at least a lastclose hook).
Yeah, that lastclose for switching back to fbcon, and ensuring that switch is started before the next client takes over the console, does nerf the ability of avoiding the global_mutex.
With the prep patches out of the way all drm state is fully protected and either prevents or can deal with the races from dropping the BKL around open/close. The only thing left to audit are the various driver hooks - by keeping the BKL around if any of them are set we have a very simple cop-out!
Note that one of the biggest prep pieces to get here was making dev->open_count atomic, which was done in
commit 7e13ad896484a0165a68197a2e64091ea28c9602 Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jan 24 13:01:07 2020 +0000
drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 +++-- drivers/gpu/drm/drm_file.c | 46 ++++++++++++++++++++++++++++++---- drivers/gpu/drm/drm_internal.h | 1 + 3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 05bdf0b9d2b3..9fcd6ab3c154 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) struct drm_driver *driver = dev->driver; int ret;
mutex_lock(&drm_global_mutex);
if (drm_dev_needs_global_mutex(dev))
mutex_lock(&drm_global_mutex); if (dev->driver->load) { if (!drm_core_check_feature(dev, DRIVER_LEGACY))
@@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) drm_minor_unregister(dev, DRM_MINOR_PRIMARY); drm_minor_unregister(dev, DRM_MINOR_RENDER); out_unlock:
mutex_unlock(&drm_global_mutex);
if (drm_dev_needs_global_mutex(dev))
mutex_unlock(&drm_global_mutex); return ret;
} EXPORT_SYMBOL(drm_dev_register); diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index d36cb74ebe0c..efd6fe0b6b4f 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -51,6 +51,37 @@ /* from BKL pushdown */ DEFINE_MUTEX(drm_global_mutex);
+bool drm_dev_needs_global_mutex(struct drm_device *dev) +{
/*
* Legacy drivers rely on all kinds of BKL locking semantics, don't
* bother. They also still need BKL locking for their ioctls, so better
* safe than sorry.
*/
if (drm_core_check_feature(dev, DRIVER_LEGACY))
return true;
/*
* The deprecated ->load callback must be called after the driver is
* already registered. This means such drivers rely on the BKL to make
* sure an open can't proceed until the driver is actually fully set up.
* Similar hilarity holds for the unload callback.
*/
if (dev->driver->load || dev->driver->unload)
return true;
/*
* Drivers with the lastclose callback assume that it's synchronized
* against concurrent opens, which again needs the BKL. The proper fix
* is to use the drm_client infrastructure with proper locking for each
* client.
*/
if (dev->driver->lastclose)
return true;
return false;
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
I'm not particularly fussed by this patch, maybe a bit too obviously midlayer.
I was wondering if we should have a *dev->global_mutex which drivers can set to be any level they need (with a bit of care on our part to make sure it is not destroyed across dev->driver->lastclose). -Chris
On Tue, Jan 28, 2020 at 11:00:32AM +0000, Chris Wilson wrote:
Quoting Daniel Vetter (2020-01-28 10:46:01)
This catches the majority of drivers (unfortunately not if we take users into account, because all the big drivers have at least a lastclose hook).
Yeah, that lastclose for switching back to fbcon, and ensuring that switch is started before the next client takes over the console, does nerf the ability of avoiding the global_mutex.
With the prep patches out of the way all drm state is fully protected and either prevents or can deal with the races from dropping the BKL around open/close. The only thing left to audit are the various driver hooks - by keeping the BKL around if any of them are set we have a very simple cop-out!
Note that one of the biggest prep pieces to get here was making dev->open_count atomic, which was done in
commit 7e13ad896484a0165a68197a2e64091ea28c9602 Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jan 24 13:01:07 2020 +0000
drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count
Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 +++-- drivers/gpu/drm/drm_file.c | 46 ++++++++++++++++++++++++++++++---- drivers/gpu/drm/drm_internal.h | 1 + 3 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 05bdf0b9d2b3..9fcd6ab3c154 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -946,7 +946,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) struct drm_driver *driver = dev->driver; int ret;
mutex_lock(&drm_global_mutex);
if (drm_dev_needs_global_mutex(dev))
mutex_lock(&drm_global_mutex); if (dev->driver->load) { if (!drm_core_check_feature(dev, DRIVER_LEGACY))
@@ -992,7 +993,8 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags) drm_minor_unregister(dev, DRM_MINOR_PRIMARY); drm_minor_unregister(dev, DRM_MINOR_RENDER); out_unlock:
mutex_unlock(&drm_global_mutex);
if (drm_dev_needs_global_mutex(dev))
mutex_unlock(&drm_global_mutex); return ret;
} EXPORT_SYMBOL(drm_dev_register); diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index d36cb74ebe0c..efd6fe0b6b4f 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -51,6 +51,37 @@ /* from BKL pushdown */ DEFINE_MUTEX(drm_global_mutex);
+bool drm_dev_needs_global_mutex(struct drm_device *dev) +{
/*
* Legacy drivers rely on all kinds of BKL locking semantics, don't
* bother. They also still need BKL locking for their ioctls, so better
* safe than sorry.
*/
if (drm_core_check_feature(dev, DRIVER_LEGACY))
return true;
/*
* The deprecated ->load callback must be called after the driver is
* already registered. This means such drivers rely on the BKL to make
* sure an open can't proceed until the driver is actually fully set up.
* Similar hilarity holds for the unload callback.
*/
if (dev->driver->load || dev->driver->unload)
return true;
/*
* Drivers with the lastclose callback assume that it's synchronized
* against concurrent opens, which again needs the BKL. The proper fix
* is to use the drm_client infrastructure with proper locking for each
* client.
*/
if (dev->driver->lastclose)
return true;
return false;
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
I'm not particularly fussed by this patch, maybe a bit too obviously midlayer.
Yeah it's not great. But I think long-term we might have a chance to limit this to DRIVER_LEGACY: - load/unload is on the way out - generic fbdev won't need lastclose - I have an idea for fixing the vgaswitcheroo locking, now that I starred at this wall a bit more. Should be a good undependent cleanup. - we could perhaps do a ->lastclose_unlocked for reducing that midlayer smell a tad. Same way we slowly managed to get rid of the dev->struct_mutex locking midlayer.
I was wondering if we should have a *dev->global_mutex which drivers can set to be any level they need (with a bit of care on our part to make sure it is not destroyed across dev->driver->lastclose).
I feel like outside of legacy driver the drm_global_mutex protection is limited enough that there's no need to give driver writers ideas :-) -Daniel
Quoting Daniel Vetter (2020-01-28 10:45:58)
Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
if (dev->driver->load) {
if (!drm_core_check_feature(dev, DRIVER_LEGACY))
DRM_INFO("drm driver %s is using deprecated ->load callback\n",
dev->driver->name);
DRM_WARN() if the plan is to remove it?
That should encourage people to complain louder. -Chris
Quoting Chris Wilson (2020-01-28 10:47:59)
Quoting Daniel Vetter (2020-01-28 10:45:58)
Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
if (dev->driver->load) {
if (!drm_core_check_feature(dev, DRIVER_LEGACY))
DRM_INFO("drm driver %s is using deprecated ->load callback\n",
dev->driver->name);
DRM_WARN() if the plan is to remove it?
Either way though, Reviewed-by: Chris Wilson chris@chris-wilson.co.uk -Chris
On Tue, Jan 28, 2020 at 10:47:59AM +0000, Chris Wilson wrote:
Quoting Daniel Vetter (2020-01-28 10:45:58)
Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
if (dev->driver->load) {
if (!drm_core_check_feature(dev, DRIVER_LEGACY))
DRM_INFO("drm driver %s is using deprecated ->load callback\n",
dev->driver->name);
DRM_WARN() if the plan is to remove it?
Consensus from the security check work that Kees Cook is doing seems to be: - Put new thing in place, convert lots - Start to do opt-in/informational stuff - Once you're sure it's all gone, put in the big splat that kills the box/driver. Apparently radeon/amdgpu are the hold-outs, once those are done I think I'll just outright disable ->load/unload for !DRIVER_LEGACY.
Cheers, Daniel
That should encourage people to complain louder. -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi
Am 28.01.20 um 11:45 schrieb Daniel Vetter:
Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
- if (dev->driver->load) {
if (!drm_core_check_feature(dev, DRIVER_LEGACY))
DRM_INFO("drm driver %s is using deprecated ->load callback\n",
dev->driver->name);
- }
- ret = drm_minor_register(dev, DRM_MINOR_RENDER); if (ret) goto err_minors;
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 77685ed7aa65..77bc63de0a91 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -173,6 +173,9 @@ struct drm_driver { * * This is deprecated, do not use! *
* FIXME: A few non-DRIVER_LEGACY drivers still use this, and should be
* converted.
*
I recently converted several of them. The status here is that only radeon and amdgpu still use load. I've only not been able to convert them because they do some debugfs registering that relies on the device being registered early. I've not had time to convert the code.
On the patch:
Reviewed-by: Thomas Zimmermann tzimmermann@suse.de
* Returns: * * Zero on success, non-zero value on failure.
On Tue, Jan 28, 2020 at 02:41:06PM +0100, Thomas Zimmermann wrote:
Hi
Am 28.01.20 um 11:45 schrieb Daniel Vetter:
Kinda time to get this sorted. The locking around this really is not nice.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
drivers/gpu/drm/drm_drv.c | 6 ++++++ include/drm/drm_drv.h | 3 +++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 7c18a980cd4b..8deff75b484c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -948,6 +948,12 @@ int drm_dev_register(struct drm_device *dev, unsigned long flags)
mutex_lock(&drm_global_mutex);
- if (dev->driver->load) {
if (!drm_core_check_feature(dev, DRIVER_LEGACY))
DRM_INFO("drm driver %s is using deprecated ->load callback\n",
dev->driver->name);
- }
- ret = drm_minor_register(dev, DRM_MINOR_RENDER); if (ret) goto err_minors;
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h index 77685ed7aa65..77bc63de0a91 100644 --- a/include/drm/drm_drv.h +++ b/include/drm/drm_drv.h @@ -173,6 +173,9 @@ struct drm_driver { * * This is deprecated, do not use! *
* FIXME: A few non-DRIVER_LEGACY drivers still use this, and should be
* converted.
*
I recently converted several of them. The status here is that only radeon and amdgpu still use load. I've only not been able to convert them because they do some debugfs registering that relies on the device being registered early. I've not had time to convert the code.
Oh nice, this sounds great. We have an outreachy project to helpt simplify the debugfs stuff, so might be worth waiting for that. I guess I'll keep this patch meanwhile (expect if Alex or Harry ack it, they'll see the fallout if we merge this early). -Daniel
On the patch:
Reviewed-by: Thomas Zimmermann tzimmermann@suse.de
* Returns: * * Zero on success, non-zero value on failure.
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
dri-devel@lists.freedesktop.org