Hi,
In the previous thread on this series we decided to remove a patch that was violating a lockdep requirement in drm_lease. In addition to this change, I took a closer look at the CI logs for the Basic Acceptance Tests and noticed that another regression was introduced. The new patch 2 is a response to this.
Overall, this series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique(): https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f80...
The series is broken up into five patches:
1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
2. Move a call to drm_is_current_master() out from the RCU read-side critical section in drm_clients_info().
3. Implement a locked version of the drm_is_current_master() function for use within drm_auth.c.
4. Serialize drm_file.master by introducing a new spinlock that's held whenever the value of drm_file.master changes.
5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
v7 -> v8:
- Remove the patch that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find. This patch violated an existing lockdep requirement as reported by the intel-gfx CI.
- Added a new patch that moves a call to drm_is_current_master out from the RCU critical section in drm_clients_info. This was reported by the intel-gfx CI.

v6 -> v7:
- Modify code alignment as suggested by the intel-gfx CI.
- Add a new patch to the series that adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
- Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.

v5 -> v6:
- Add a new patch to the series that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
- Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
- Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
- Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.

v4 -> v5:
- Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
- Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.

v3 -> v4:
- Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here: https://patchwork.freedesktop.org/patch/440406/
- Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a NULL pointer if fpriv->master is not set.
- Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
- Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c, as suggested by Daniel Vetter.

v2 -> v3:
- Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked, as suggested by Daniel Vetter.
- Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed, as suggested by Daniel Vetter.

v1 -> v2:
- Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
Desmond Cheong Zhi Xi (5):
  drm: avoid circular locks in drm_mode_getconnector
  drm: avoid blocking in drm_clients_info's rcu section
  drm: add a locked version of drm_is_current_master
  drm: serialize drm_file.master with a new spinlock
  drm: protect drm_master pointers in drm_lease.c

 drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
 drivers/gpu/drm/drm_connector.c |  5 +-
 drivers/gpu/drm/drm_debugfs.c   |  3 +-
 drivers/gpu/drm/drm_file.c      |  1 +
 drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
 include/drm/drm_auth.h          |  1 +
 include/drm/drm_file.h          | 18 +++++--
 7 files changed, 152 insertions(+), 50 deletions(-)
In preparation for a future patch to take a lock on drm_device.master_mutex inside drm_is_current_master(), we first move the call to drm_is_current_master() in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex. This avoids creating a circular lock dependency.
Failing to avoid this lock dependency produces the following lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.13.0-rc7-CI-CI_DRM_10254+ #1 Not tainted
------------------------------------------------------
kms_frontbuffer/1087 is trying to acquire lock:
ffff88810dcd01a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1b/0x40

but task is already holding lock:
ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&dev->mode_config.mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_probe+0x22e/0xca0
       __drm_fb_helper_initial_config_and_unlock+0x42/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30

-> #1 (&client->modeset_mutex){+.+.}-{3:3}:
       __mutex_lock+0xab/0x970
       drm_client_modeset_commit_locked+0x1c/0x180
       drm_client_modeset_commit+0x1c/0x40
       __drm_fb_helper_restore_fbdev_mode_unlocked+0x88/0xb0
       drm_fb_helper_set_par+0x34/0x40
       intel_fbdev_set_par+0x11/0x40 [i915]
       fbcon_init+0x270/0x4f0
       visual_init+0xc6/0x130
       do_bind_con_driver+0x1e5/0x2d0
       do_take_over_console+0x10e/0x180
       do_fbcon_takeover+0x53/0xb0
       register_framebuffer+0x22d/0x310
       __drm_fb_helper_initial_config_and_unlock+0x36c/0x540
       intel_fbdev_initial_config+0xf/0x20 [i915]
       async_run_entry_fn+0x28/0x130
       process_one_work+0x26d/0x5c0
       worker_thread+0x37/0x380
       kthread+0x144/0x170
       ret_from_fork+0x1f/0x30

-> #0 (&dev->master_mutex){+.+.}-{3:3}:
       __lock_acquire+0x151e/0x2590
       lock_acquire+0xd1/0x3d0
       __mutex_lock+0xab/0x970
       drm_is_current_master+0x1b/0x40
       drm_mode_getconnector+0x37e/0x4a0
       drm_ioctl_kernel+0xa8/0xf0
       drm_ioctl+0x1e8/0x390
       __x64_sys_ioctl+0x6a/0xa0
       do_syscall_64+0x39/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  &dev->master_mutex --> &client->modeset_mutex --> &dev->mode_config.mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->mode_config.mutex);
                               lock(&client->modeset_mutex);
                               lock(&dev->mode_config.mutex);
  lock(&dev->master_mutex);

 *** DEADLOCK ***

1 lock held by kms_frontbuffer/1087:
 #0: ffff88810dcd0488 (&dev->mode_config.mutex){+.+.}-{3:3}, at: drm_mode_getconnector+0x1c6/0x4a0

stack backtrace:
CPU: 7 PID: 1087 Comm: kms_frontbuffer Not tainted 5.13.0-rc7-CI-CI_DRM_10254+ #1
Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3234.A01.1906141750 06/14/2019
Call Trace:
 dump_stack+0x7f/0xad
 check_noncircular+0x12e/0x150
 __lock_acquire+0x151e/0x2590
 lock_acquire+0xd1/0x3d0
 __mutex_lock+0xab/0x970
 drm_is_current_master+0x1b/0x40
 drm_mode_getconnector+0x37e/0x4a0
 drm_ioctl_kernel+0xa8/0xf0
 drm_ioctl+0x1e8/0x390
 __x64_sys_ioctl+0x6a/0xa0
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
---
 drivers/gpu/drm/drm_connector.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index da39e7ff6965..2ba257b1ae20 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2414,6 +2414,7 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
 	struct drm_mode_modeinfo u_mode;
 	struct drm_mode_modeinfo __user *mode_ptr;
 	uint32_t __user *encoder_ptr;
+	bool is_current_master;
 
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
@@ -2444,9 +2445,11 @@ int drm_mode_getconnector(struct drm_device *dev, void *data,
 	out_resp->connector_type = connector->connector_type;
 	out_resp->connector_type_id = connector->connector_type_id;
 
+	is_current_master = drm_is_current_master(file_priv);
+
 	mutex_lock(&dev->mode_config.mutex);
 	if (out_resp->count_modes == 0) {
-		if (drm_is_current_master(file_priv))
+		if (is_current_master)
 			connector->funcs->fill_modes(connector,
 						     dev->mode_config.max_width,
 						     dev->mode_config.max_height);
Inside drm_clients_info, rcu_read_lock is held to protect pid_task()->comm. Within this protected section, a call to drm_is_current_master is made, and a future patch makes that function take a mutex. This is illegal because taking a mutex might block, and blocking is not allowed inside an RCU read-side critical section.
Since drm_is_current_master isn't protected by rcu_read_lock, we avoid this by moving it out of the RCU critical section.
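For readers less familiar with the pattern, here is a rough userspace sketch of the reordering. It is only an analogy under stated assumptions: struct client, nosleep_begin()/nosleep_end() and client_format() are hypothetical names, a plain counter stands in for the RCU read-side critical section, and a pthread mutex stands in for master_mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration only; none of this is DRM code. */
struct client {
	pthread_mutex_t master_mutex;	/* sleeping lock, like master_mutex */
	bool is_master;
	const char *comm;
};

/* Depth of the "must not sleep" section, standing in for rcu_read_lock(). */
static int nosleep_depth;

static void nosleep_begin(void) { nosleep_depth++; }
static void nosleep_end(void) { nosleep_depth--; }

/* Takes a sleeping lock, so it must not run inside a no-sleep section. */
static bool client_is_master(struct client *c)
{
	bool ret;

	assert(nosleep_depth == 0);	/* rough analogue of might_sleep() */
	pthread_mutex_lock(&c->master_mutex);
	ret = c->is_master;
	pthread_mutex_unlock(&c->master_mutex);
	return ret;
}

/* Query the master status first, then enter the no-sleep section. */
static int client_format(struct client *c, char *buf, size_t len)
{
	bool is_master = client_is_master(c);
	int n;

	nosleep_begin();
	n = snprintf(buf, len, "%s %c", c->comm, is_master ? 'y' : 'n');
	nosleep_end();
	return n;
}
```

Calling client_is_master() between nosleep_begin() and nosleep_end() would trip the assertion, which is roughly what the "Invalid wait context" check does for the real code.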
The following report came from the intel-gfx CI's igt@debugfs_test@read_all_entries testcase:
=============================
[ BUG: Invalid wait context ]
5.13.0-CI-Patchwork_20515+ #1 Tainted: G        W
-----------------------------
debugfs_test/1101 is trying to lock:
ffff888132d901a8 (&dev->master_mutex){+.+.}-{3:3}, at: drm_is_current_master+0x1e/0x50
other info that might help us debug this:
context-{4:4}
3 locks held by debugfs_test/1101:
 #0: ffff88810fdffc90 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0x53/0x3b0
 #1: ffff888132d90240 (&dev->filelist_mutex){+.+.}-{3:3}, at: drm_clients_info+0x63/0x2a0
 #2: ffffffff82734220 (rcu_read_lock){....}-{1:2}, at: drm_clients_info+0x1b1/0x2a0
stack backtrace:
CPU: 8 PID: 1101 Comm: debugfs_test Tainted: G        W         5.13.0-CI-Patchwork_20515+ #1
Hardware name: Intel Corporation CometLake Client Platform/CometLake S UDIMM (ERB/CRB), BIOS CMLSFWR1.R00.1263.D00.1906260926 06/26/2019
Call Trace:
 dump_stack+0x7f/0xad
 __lock_acquire.cold.78+0x2af/0x2ca
 lock_acquire+0xd3/0x300
 ? drm_is_current_master+0x1e/0x50
 ? __mutex_lock+0x76/0x970
 ? lockdep_hardirqs_on+0xbf/0x130
 __mutex_lock+0xab/0x970
 ? drm_is_current_master+0x1e/0x50
 ? drm_is_current_master+0x1e/0x50
 ? drm_is_current_master+0x1e/0x50
 drm_is_current_master+0x1e/0x50
 drm_clients_info+0x107/0x2a0
 seq_read_iter+0x178/0x3b0
 seq_read+0x104/0x150
 full_proxy_read+0x4e/0x80
 vfs_read+0xa5/0x1b0
 ksys_read+0x5a/0xd0
 do_syscall_64+0x39/0xb0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
---
 drivers/gpu/drm/drm_debugfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 3d7182001004..b0a826489488 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -91,6 +91,7 @@ static int drm_clients_info(struct seq_file *m, void *data)
 	mutex_lock(&dev->filelist_mutex);
 	list_for_each_entry_reverse(priv, &dev->filelist, lhead) {
 		struct task_struct *task;
+		bool is_current_master = drm_is_current_master(priv);
 
 		rcu_read_lock(); /* locks pid_task()->comm */
 		task = pid_task(priv->pid, PIDTYPE_PID);
@@ -99,7 +100,7 @@ static int drm_clients_info(struct seq_file *m, void *data)
 			   task ? task->comm : "<unknown>",
 			   pid_vnr(priv->pid),
 			   priv->minor->index,
-			   drm_is_current_master(priv) ? 'y' : 'n',
+			   is_current_master ? 'y' : 'n',
 			   priv->authenticated ? 'y' : 'n',
 			   from_kuid_munged(seq_user_ns(m), uid),
 			   priv->magic);
While checking the master status of the DRM file in drm_is_current_master(), the device's master mutex should be held. Without the mutex, the pointer fpriv->master may be freed concurrently by another process calling drm_setmaster_ioctl(). This could lead to use-after-free errors when the pointer is subsequently dereferenced in drm_lease_owner().
The callers of drm_is_current_master() from drm_auth.c hold the device's master mutex, but external callers do not. Hence, we implement drm_is_current_master_locked() to be used within drm_auth.c, and modify drm_is_current_master() to grab the device's master mutex before checking the master status.
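The split into a _locked helper plus a locking wrapper can be sketched in userspace as follows. This is an illustration under assumptions, not the patch itself: struct device, struct file, device_lock() and device_unlock() are hypothetical names, integer ids replace the master pointers, and a boolean flag is a crude substitute for lockdep_assert_held_once().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; the names are not the DRM API. */
struct device {
	pthread_mutex_t master_mutex;
	bool master_mutex_held;	/* crude substitute for lockdep tracking */
	int current_master;	/* id of the device's current master */
};

struct file {
	struct device *dev;
	bool is_master;
	int master;		/* id of the master this file is attached to */
};

static void device_lock(struct device *dev)
{
	pthread_mutex_lock(&dev->master_mutex);
	dev->master_mutex_held = true;
}

static void device_unlock(struct device *dev)
{
	dev->master_mutex_held = false;
	pthread_mutex_unlock(&dev->master_mutex);
}

/* For callers that already hold dev->master_mutex. */
static bool file_is_current_master_locked(struct file *f)
{
	assert(f->dev->master_mutex_held);
	return f->is_master && f->master == f->dev->current_master;
}

/* For everyone else: take the mutex around the check. */
static bool file_is_current_master(struct file *f)
{
	bool ret;

	device_lock(f->dev);
	ret = file_is_current_master_locked(f);
	device_unlock(f->dev);
	return ret;
}
```

A caller that already holds the mutex must use the _locked variant; calling the wrapper there would self-deadlock on a non-recursive mutex, which is why the callers inside drm_auth.c are switched over.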
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
---
 drivers/gpu/drm/drm_auth.c | 51 ++++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index f00e5abdbbf4..ab1863c5a5a0 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -61,6 +61,35 @@
  * trusted clients.
  */
 
+static bool drm_is_current_master_locked(struct drm_file *fpriv)
+{
+	lockdep_assert_held_once(&fpriv->minor->dev->master_mutex);
+
+	return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
+}
+
+/**
+ * drm_is_current_master - checks whether @priv is the current master
+ * @fpriv: DRM file private
+ *
+ * Checks whether @fpriv is current master on its device. This decides whether a
+ * client is allowed to run DRM_MASTER IOCTLs.
+ *
+ * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
+ * - the current master is assumed to own the non-shareable display hardware.
+ */
+bool drm_is_current_master(struct drm_file *fpriv)
+{
+	bool ret;
+
+	mutex_lock(&fpriv->minor->dev->master_mutex);
+	ret = drm_is_current_master_locked(fpriv);
+	mutex_unlock(&fpriv->minor->dev->master_mutex);
+
+	return ret;
+}
+EXPORT_SYMBOL(drm_is_current_master);
+
 int drm_getmagic(struct drm_device *dev, void *data, struct drm_file *file_priv)
 {
 	struct drm_auth *auth = data;
@@ -223,7 +252,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out_unlock;
 
-	if (drm_is_current_master(file_priv))
+	if (drm_is_current_master_locked(file_priv))
 		goto out_unlock;
 
 	if (dev->master) {
@@ -272,7 +301,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out_unlock;
 
-	if (!drm_is_current_master(file_priv)) {
+	if (!drm_is_current_master_locked(file_priv)) {
 		ret = -EINVAL;
 		goto out_unlock;
 	}
@@ -321,7 +350,7 @@ void drm_master_release(struct drm_file *file_priv)
 	if (file_priv->magic)
 		idr_remove(&file_priv->master->magic_map, file_priv->magic);
 
-	if (!drm_is_current_master(file_priv))
+	if (!drm_is_current_master_locked(file_priv))
 		goto out;
 
 	drm_legacy_lock_master_cleanup(dev, master);
@@ -342,22 +371,6 @@ void drm_master_release(struct drm_file *file_priv)
 	mutex_unlock(&dev->master_mutex);
 }
 
-/**
- * drm_is_current_master - checks whether @priv is the current master
- * @fpriv: DRM file private
- *
- * Checks whether @fpriv is current master on its device. This decides whether a
- * client is allowed to run DRM_MASTER IOCTLs.
- *
- * Most of the modern IOCTL which require DRM_MASTER are for kernel modesetting
- * - the current master is assumed to own the non-shareable display hardware.
- */
-bool drm_is_current_master(struct drm_file *fpriv)
-{
-	return fpriv->is_master && drm_lease_owner(fpriv->master) == fpriv->minor->dev->master;
-}
-EXPORT_SYMBOL(drm_is_current_master);
-
 /**
  * drm_master_get - reference a master pointer
  * @master: &struct drm_master
Currently, drm_file.master pointers should be protected by drm_device.master_mutex when being dereferenced. This is because drm_file.master is not invariant for the lifetime of drm_file. If drm_file is not the creator of master, then drm_file.is_master is false, and a call to drm_setmaster_ioctl will invoke drm_new_set_master, which then allocates a new master for drm_file and puts the old master.
Thus, without holding drm_device.master_mutex, the old value of drm_file.master could be freed while it is being used by another concurrent process.
However, it is not always possible to lock drm_device.master_mutex to dereference drm_file.master. Through the fbdev emulation code, this might occur in a deep nest of other locks. But drm_device.master_mutex is also the outermost lock in the nesting hierarchy, so this leads to potential deadlocks.
To address this, we introduce a new spin lock at the bottom of the lock hierarchy that only serializes drm_file.master. With this change, the value of drm_file.master changes only when both drm_device.master_mutex and drm_file.master_lookup_lock are held. Hence, any process holding either of those locks can ensure that the value of drm_file.master will not change concurrently.
No other lock is taken while drm_file.master_lookup_lock is held, so it can be acquired at any depth without creating a new lock dependency. Hence, where drm_file.master needs to be dereferenced but drm_device.master_mutex cannot be taken, we can safely protect the master pointer with drm_file.master_lookup_lock instead.
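The locking rule above (writers hold both locks, readers hold either one) can be sketched in userspace like this. It is an analogy under assumptions: all names are hypothetical, and a pthread mutex stands in for the spinlock. The sketch only shows that the pointer value is stable for a reader; keeping the pointed-to object alive after the lock is dropped still needs a reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the DRM structures. */
struct master { int id; };
struct device { pthread_mutex_t master_mutex; };	/* outermost lock */
struct file {
	struct device *dev;
	pthread_mutex_t lookup_lock;	/* innermost; stands in for the spinlock */
	struct master *master;
};

/* Writer: f->master changes only with BOTH locks held, outer lock first. */
static void file_set_master(struct file *f, struct master *m)
{
	pthread_mutex_lock(&f->dev->master_mutex);
	pthread_mutex_lock(&f->lookup_lock);
	f->master = m;
	pthread_mutex_unlock(&f->lookup_lock);
	pthread_mutex_unlock(&f->dev->master_mutex);
}

/* Reader that can afford the outer mutex: the pointer cannot change. */
static int file_master_id_outer(struct file *f)
{
	int id;

	pthread_mutex_lock(&f->dev->master_mutex);
	id = f->master ? f->master->id : -1;
	pthread_mutex_unlock(&f->dev->master_mutex);
	return id;
}

/* Reader deep inside other locks: the inner lock alone is also enough. */
static int file_master_id_inner(struct file *f)
{
	int id;

	pthread_mutex_lock(&f->lookup_lock);
	id = f->master ? f->master->id : -1;
	pthread_mutex_unlock(&f->lookup_lock);
	return id;
}
```

Because a writer needs both locks, a reader holding either one is enough to exclude every writer.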
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
---
 drivers/gpu/drm/drm_auth.c | 17 +++++++++++------
 drivers/gpu/drm/drm_file.c |  1 +
 include/drm/drm_file.h     | 12 +++++++++---
 3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index ab1863c5a5a0..30a239901b36 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -164,16 +164,18 @@ static void drm_set_master(struct drm_device *dev, struct drm_file *fpriv,
 static int drm_new_set_master(struct drm_device *dev, struct drm_file *fpriv)
 {
 	struct drm_master *old_master;
+	struct drm_master *new_master;
 
 	lockdep_assert_held_once(&dev->master_mutex);
 
 	WARN_ON(fpriv->is_master);
 	old_master = fpriv->master;
-	fpriv->master = drm_master_create(dev);
-	if (!fpriv->master) {
-		fpriv->master = old_master;
+	new_master = drm_master_create(dev);
+	if (!new_master)
 		return -ENOMEM;
-	}
+	spin_lock(&fpriv->master_lookup_lock);
+	fpriv->master = new_master;
+	spin_unlock(&fpriv->master_lookup_lock);
 
 	fpriv->is_master = 1;
 	fpriv->authenticated = 1;
@@ -332,10 +334,13 @@ int drm_master_open(struct drm_file *file_priv)
 	 * any master object for render clients
 	 */
 	mutex_lock(&dev->master_mutex);
-	if (!dev->master)
+	if (!dev->master) {
 		ret = drm_new_set_master(dev, file_priv);
-	else
+	} else {
+		spin_lock(&file_priv->master_lookup_lock);
 		file_priv->master = drm_master_get(dev->master);
+		spin_unlock(&file_priv->master_lookup_lock);
+	}
 	mutex_unlock(&dev->master_mutex);
 
 	return ret;
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index d4f0bac6f8f8..ceb1a9723855 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -176,6 +176,7 @@ struct drm_file *drm_file_alloc(struct drm_minor *minor)
 	init_waitqueue_head(&file->event_wait);
 	file->event_space = 4096; /* set aside 4k for event buffer */
 
+	spin_lock_init(&file->master_lookup_lock);
 	mutex_init(&file->event_read_lock);
 
 	if (drm_core_check_feature(dev, DRIVER_GEM))
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index b81b3bfb08c8..9b82988e3427 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -226,15 +226,21 @@ struct drm_file {
 	/**
 	 * @master:
 	 *
-	 * Master this node is currently associated with. Only relevant if
-	 * drm_is_primary_client() returns true. Note that this only
-	 * matches &drm_device.master if the master is the currently active one.
+	 * Master this node is currently associated with. Protected by struct
+	 * &drm_device.master_mutex, and serialized by @master_lookup_lock.
+	 *
+	 * Only relevant if drm_is_primary_client() returns true. Note that
+	 * this only matches &drm_device.master if the master is the currently
+	 * active one.
 	 *
 	 * See also @authentication and @is_master and the :ref:`section on
 	 * primary nodes and authentication <drm_primary_node>`.
 	 */
 	struct drm_master *master;
 
+	/** @master_lookup_lock: Serializes @master. */
+	spinlock_t master_lookup_lock;
+
 	/** @pid: Process that opened this file. */
 	struct pid *pid;
 
drm_file->master pointers should be protected by drm_device.master_mutex or drm_file.master_lookup_lock when being dereferenced.
However, in drm_lease.c, there are multiple instances where drm_file->master is accessed and dereferenced while neither lock is held. This makes drm_lease.c vulnerable to use-after-free bugs.
We address this issue in two ways:
1. Add a new drm_file_get_master() function that calls drm_master_get on drm_file->master while holding on to drm_file.master_lookup_lock. Since drm_master_get increments the reference count of master, this prevents master from being freed until we unreference it with drm_master_put.
2. In each case where drm_file->master is directly accessed and eventually dereferenced in drm_lease.c, we wrap the access in a call to the new drm_file_get_master function, then unreference the master pointer once we are done using it.
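The get/put discipline from the two points above can be sketched in userspace as follows. This is a sketch under assumptions, not the kernel implementation: every name is hypothetical, a pthread mutex stands in for master_lookup_lock, a C11 atomic counter stands in for the kref, and masters_freed exists only so that a caller can observe when the last reference is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the DRM structures. */
struct master {
	atomic_int refcount;
	int id;
};

struct file {
	pthread_mutex_t lookup_lock;	/* stands in for master_lookup_lock */
	struct master *master;
};

static int masters_freed;	/* bookkeeping for the sketch only */

static struct master *master_create(int id)
{
	struct master *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcount, 1);
	m->id = id;
	return m;
}

static struct master *master_get(struct master *m)
{
	atomic_fetch_add(&m->refcount, 1);
	return m;
}

/* Drops one reference, frees on the last one, and clears the pointer. */
static void master_put(struct master **m)
{
	if (atomic_fetch_sub(&(*m)->refcount, 1) == 1) {
		free(*m);
		masters_freed++;
	}
	*m = NULL;
}

/* Takes the reference while the lock pins the pointer value. */
static struct master *file_get_master(struct file *f)
{
	struct master *m = NULL;

	pthread_mutex_lock(&f->lookup_lock);
	if (f->master)
		m = master_get(f->master);
	pthread_mutex_unlock(&f->lookup_lock);
	return m;
}
```

A caller pairs every successful file_get_master() with one master_put() once it has finished using the pointer; a NULL return means there is nothing to put.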
Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
---
 drivers/gpu/drm/drm_auth.c  | 25 ++++++++++++
 drivers/gpu/drm/drm_lease.c | 81 ++++++++++++++++++++++++++++---------
 include/drm/drm_auth.h      |  1 +
 include/drm/drm_file.h      |  6 +++
 4 files changed, 93 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
index 30a239901b36..f00354bec3fb 100644
--- a/drivers/gpu/drm/drm_auth.c
+++ b/drivers/gpu/drm/drm_auth.c
@@ -389,6 +389,31 @@ struct drm_master *drm_master_get(struct drm_master *master)
 }
 EXPORT_SYMBOL(drm_master_get);
 
+/**
+ * drm_file_get_master - reference &drm_file.master of @file_priv
+ * @file_priv: DRM file private
+ *
+ * Increments the reference count of @file_priv's &drm_file.master and returns
+ * the &drm_file.master. If @file_priv has no &drm_file.master, returns NULL.
+ *
+ * Master pointers returned from this function should be unreferenced using
+ * drm_master_put().
+ */
+struct drm_master *drm_file_get_master(struct drm_file *file_priv)
+{
+	struct drm_master *master = NULL;
+
+	spin_lock(&file_priv->master_lookup_lock);
+	if (!file_priv->master)
+		goto unlock;
+	master = drm_master_get(file_priv->master);
+
+unlock:
+	spin_unlock(&file_priv->master_lookup_lock);
+	return master;
+}
+EXPORT_SYMBOL(drm_file_get_master);
+
 static void drm_master_destroy(struct kref *kref)
 {
 	struct drm_master *master = container_of(kref, struct drm_master, refcount);
diff --git a/drivers/gpu/drm/drm_lease.c b/drivers/gpu/drm/drm_lease.c
index 00fb433bcef1..92eac73d9001 100644
--- a/drivers/gpu/drm/drm_lease.c
+++ b/drivers/gpu/drm/drm_lease.c
@@ -106,10 +106,19 @@ static bool _drm_has_leased(struct drm_master *master, int id)
  */
 bool _drm_lease_held(struct drm_file *file_priv, int id)
 {
-	if (!file_priv || !file_priv->master)
+	bool ret;
+	struct drm_master *master;
+
+	if (!file_priv)
 		return true;
 
-	return _drm_lease_held_master(file_priv->master, id);
+	master = drm_file_get_master(file_priv);
+	if (!master)
+		return true;
+	ret = _drm_lease_held_master(master, id);
+	drm_master_put(&master);
+
+	return ret;
 }
 
 /**
@@ -128,13 +137,22 @@ bool drm_lease_held(struct drm_file *file_priv, int id)
 	struct drm_master *master;
 	bool ret;
 
-	if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+	if (!file_priv)
 		return true;
 
-	master = file_priv->master;
+	master = drm_file_get_master(file_priv);
+	if (!master)
+		return true;
+	if (!master->lessor) {
+		ret = true;
+		goto out;
+	}
 	mutex_lock(&master->dev->mode_config.idr_mutex);
 	ret = _drm_lease_held_master(master, id);
 	mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+	drm_master_put(&master);
 	return ret;
 }
 
@@ -154,10 +172,16 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
 	int count_in, count_out;
 	uint32_t crtcs_out = 0;
 
-	if (!file_priv || !file_priv->master || !file_priv->master->lessor)
+	if (!file_priv)
 		return crtcs_in;
 
-	master = file_priv->master;
+	master = drm_file_get_master(file_priv);
+	if (!master)
+		return crtcs_in;
+	if (!master->lessor) {
+		crtcs_out = crtcs_in;
+		goto out;
+	}
 	dev = master->dev;
 
 	count_in = count_out = 0;
@@ -176,6 +200,9 @@ uint32_t drm_lease_filter_crtcs(struct drm_file *file_priv, uint32_t crtcs_in)
 		count_in++;
 	}
 	mutex_unlock(&master->dev->mode_config.idr_mutex);
+
+out:
+	drm_master_put(&master);
 	return crtcs_out;
 }
 
@@ -489,7 +516,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 	size_t object_count;
 	int ret = 0;
 	struct idr leases;
-	struct drm_master *lessor = lessor_priv->master;
+	struct drm_master *lessor;
 	struct drm_master *lessee = NULL;
 	struct file *lessee_file = NULL;
 	struct file *lessor_file = lessor_priv->filp;
@@ -501,12 +528,6 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
 
-	/* Do not allow sub-leases */
-	if (lessor->lessor) {
-		DRM_DEBUG_LEASE("recursive leasing not allowed\n");
-		return -EINVAL;
-	}
-
 	/* need some objects */
 	if (cl->object_count == 0) {
 		DRM_DEBUG_LEASE("no objects in lease\n");
@@ -518,12 +539,22 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 		return -EINVAL;
 	}
 
+	lessor = drm_file_get_master(lessor_priv);
+	/* Do not allow sub-leases */
+	if (lessor->lessor) {
+		DRM_DEBUG_LEASE("recursive leasing not allowed\n");
+		ret = -EINVAL;
+		goto out_lessor;
+	}
+
 	object_count = cl->object_count;
 
 	object_ids = memdup_user(u64_to_user_ptr(cl->object_ids),
 			array_size(object_count, sizeof(__u32)));
-	if (IS_ERR(object_ids))
-		return PTR_ERR(object_ids);
+	if (IS_ERR(object_ids)) {
+		ret = PTR_ERR(object_ids);
+		goto out_lessor;
+	}
 
 	idr_init(&leases);
 
@@ -534,14 +565,15 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 	if (ret) {
 		DRM_DEBUG_LEASE("lease object lookup failed: %i\n", ret);
 		idr_destroy(&leases);
-		return ret;
+		goto out_lessor;
 	}
 
 	/* Allocate a file descriptor for the lease */
 	fd = get_unused_fd_flags(cl->flags & (O_CLOEXEC | O_NONBLOCK));
 	if (fd < 0) {
 		idr_destroy(&leases);
-		return fd;
+		ret = fd;
+		goto out_lessor;
 	}
 
 	DRM_DEBUG_LEASE("Creating lease\n");
@@ -577,6 +609,7 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 	/* Hook up the fd */
 	fd_install(fd, lessee_file);
 
+	drm_master_put(&lessor);
 	DRM_DEBUG_LEASE("drm_mode_create_lease_ioctl succeeded\n");
 	return 0;
 
@@ -586,6 +619,8 @@ int drm_mode_create_lease_ioctl(struct drm_device *dev,
 out_leases:
 	put_unused_fd(fd);
 
+out_lessor:
+	drm_master_put(&lessor);
 	DRM_DEBUG_LEASE("drm_mode_create_lease_ioctl failed: %d\n", ret);
 	return ret;
 }
@@ -608,7 +643,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
 	struct drm_mode_list_lessees *arg = data;
 	__u32 __user *lessee_ids = (__u32 __user *) (uintptr_t) (arg->lessees_ptr);
 	__u32 count_lessees = arg->count_lessees;
-	struct drm_master *lessor = lessor_priv->master, *lessee;
+	struct drm_master *lessor, *lessee;
 	int count;
 	int ret = 0;
 
@@ -619,6 +654,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
 
+	lessor = drm_file_get_master(lessor_priv);
 	DRM_DEBUG_LEASE("List lessees for %d\n", lessor->lessee_id);
 
 	mutex_lock(&dev->mode_config.idr_mutex);
@@ -642,6 +678,7 @@ int drm_mode_list_lessees_ioctl(struct drm_device *dev,
 		arg->count_lessees = count;
 
 	mutex_unlock(&dev->mode_config.idr_mutex);
+	drm_master_put(&lessor);
 
 	return ret;
 }
@@ -661,7 +698,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
 	struct drm_mode_get_lease *arg = data;
 	__u32 __user *object_ids = (__u32 __user *) (uintptr_t) (arg->objects_ptr);
 	__u32 count_objects = arg->count_objects;
-	struct drm_master *lessee = lessee_priv->master;
+	struct drm_master *lessee;
 	struct idr *object_idr;
 	int count;
 	void *entry;
@@ -675,6 +712,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
 
+	lessee = drm_file_get_master(lessee_priv);
 	DRM_DEBUG_LEASE("get lease for %d\n", lessee->lessee_id);
 
 	mutex_lock(&dev->mode_config.idr_mutex);
@@ -702,6 +740,7 @@ int drm_mode_get_lease_ioctl(struct drm_device *dev,
 		arg->count_objects = count;
 
 	mutex_unlock(&dev->mode_config.idr_mutex);
+	drm_master_put(&lessee);
 
 	return ret;
 }
@@ -720,7 +759,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
 				void *data, struct drm_file *lessor_priv)
 {
 	struct drm_mode_revoke_lease *arg = data;
-	struct drm_master *lessor = lessor_priv->master;
+	struct drm_master *lessor;
 	struct drm_master *lessee;
 	int ret = 0;
 
@@ -730,6 +769,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EOPNOTSUPP;
 
+	lessor = drm_file_get_master(lessor_priv);
 	mutex_lock(&dev->mode_config.idr_mutex);
 
 	lessee = _drm_find_lessee(lessor, arg->lessee_id);
@@ -750,6 +790,7 @@ int drm_mode_revoke_lease_ioctl(struct drm_device *dev,
 
 fail:
 	mutex_unlock(&dev->mode_config.idr_mutex);
+	drm_master_put(&lessor);
 
 	return ret;
 }
diff --git a/include/drm/drm_auth.h b/include/drm/drm_auth.h
index 6bf8b2b78991..f99d3417f304 100644
--- a/include/drm/drm_auth.h
+++ b/include/drm/drm_auth.h
@@ -107,6 +107,7 @@ struct drm_master {
 };
 
 struct drm_master *drm_master_get(struct drm_master *master);
+struct drm_master *drm_file_get_master(struct drm_file *file_priv);
 void drm_master_put(struct drm_master **master);
 bool drm_is_current_master(struct drm_file *fpriv);
 
diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h
index 9b82988e3427..726cfe0ff5f5 100644
--- a/include/drm/drm_file.h
+++ b/include/drm/drm_file.h
@@ -233,6 +233,12 @@ struct drm_file {
 	 * this only matches &drm_device.master if the master is the currently
 	 * active one.
 	 *
+	 * When dereferencing this pointer, either hold struct
+	 * &drm_device.master_mutex for the duration of the pointer's use, or
+	 * use drm_file_get_master() if struct &drm_device.master_mutex is not
+	 * currently held and there is no other need to hold it. This prevents
+	 * @master from being freed during use.
+	 *
 	 * See also @authentication and @is_master and the :ref:`section on
 	 * primary nodes and authentication <drm_primary_node>`.
 	 */
On Mon, Jul 12, 2021 at 12:35:03PM +0800, Desmond Cheong Zhi Xi wrote:
Hi,
In the previous thread on this series we decided to remove a patch that was violating a lockdep requirement in drm_lease. In addition to this change, I took a closer look at the CI logs for the Basic Acceptance Tests and noticed that another regression was introduced. The new patch 2 is a response to this.
Overall, this series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique(): https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f80...
The series is broken up into five patches:
1. Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
2. Move a call to drm_is_current_master() out from the RCU read-side critical section in drm_clients_info().
3. Implement a locked version of the drm_is_current_master() function for use within drm_auth.c.
4. Serialize drm_file.master by introducing a new spinlock that's held whenever the value of drm_file.master changes.
5. Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
v7 -> v8:
- Remove the patch that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find. This patch violated an existing lockdep requirement as reported by the intel-gfx CI.
- Added a new patch that moves a call to drm_is_current_master out from the RCU critical section in drm_clients_info. This was reported by the intel-gfx CI.
v6 -> v7:
- Modify code alignment as suggested by the intel-gfx CI.
- Add a new patch to the series that adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
- Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
v5 -> v6:
- Add a new patch to the series that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
- Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
- Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
- Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.
v4 -> v5:
- Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
- Additionally, added a missing semicolon to the patch, caught by the intel-gfx CI.
v3 -> v4:
- Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here: https://patchwork.freedesktop.org/patch/440406/
- Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
- Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
- Additionally, add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c. As suggested by Daniel Vetter.
v2 -> v3:
- Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked. As suggested by Daniel Vetter.
- Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed. As suggested by Daniel Vetter.
v1 -> v2:
- Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
Apologies for the delay, I missed your series. Maybe just ping next time there's silence.
Looks all great, merged to drm-misc-next. Given how complex this was, I'm wary of just pushing this to -fixes without some solid testing.
One thing I noticed is that drm_is_current_master could just use the spinlock, since it's only doing a read access. Care to type up that patch?
Also, do you plan to look into that idea we've discussed to flush pending access when we revoke a master or a lease? I think that would be a really nice improvement here. -Daniel
Desmond Cheong Zhi Xi (5):
  drm: avoid circular locks in drm_mode_getconnector
  drm: avoid blocking in drm_clients_info's rcu section
  drm: add a locked version of drm_is_current_master
  drm: serialize drm_file.master with a new spinlock
  drm: protect drm_master pointers in drm_lease.c

 drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
 drivers/gpu/drm/drm_connector.c |  5 +-
 drivers/gpu/drm/drm_debugfs.c   |  3 +-
 drivers/gpu/drm/drm_file.c      |  1 +
 drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
 include/drm/drm_auth.h          |  1 +
 include/drm/drm_file.h          | 18 +++++--
 7 files changed, 152 insertions(+), 50 deletions(-)
-- 2.25.1
On 21/7/21 2:24 am, Daniel Vetter wrote:
Looks all great, merged to drm-misc-next. Given how complex this was, I'm wary of just pushing this to -fixes without some solid testing.
Hi Daniel,
Thanks for merging, more testing definitely sounds good to me.
One thing I noticed is that drm_is_current_master could just use the spinlock, since it's only doing a read access. Care to type up that patch?
I thought about this too, but I'm not sure if that's the best solution.
drm_is_current_master calls drm_lease_owner which then walks up the tree of master lessors. The spinlock protects the master of the current drm file, but subsequent lessors aren't protected without holding the device's master mutex.
Also, do you plan to look into that idea we've discussed to flush pending access when we revoke a master or a lease? I think that would be a really nice improvement here. -Daniel
Yup, now that the potential UAFs are addressed (hopefully), I'll take a closer look and propose a patch for this.
Best wishes, Desmond
On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi desmondcheongzx@gmail.com wrote:
drm_is_current_master calls drm_lease_owner which then walks up the tree of master lessors. The spinlock protects the master of the current drm file, but subsequent lessors aren't protected without holding the device's master mutex.
But this isn't an fpriv->master pointer, it's a master->lessor pointer, which should never ever be able to change (we'd have tons of UAF bugs around drm_lease_owner otherwise). So I don't think there's anything that dev->master_mutex protects here that fpriv->master_lookup_lock doesn't protect already?
Or am I missing something?
The comment in the struct drm_master says it's protected by mode_config.idr_mutex, but that only applies to the idrs and lists I think.
Also, do you plan to look into that idea we've discussed to flush pending access when we revoke a master or a lease? I think that would be a really nice improvement here. -Daniel
Yup, now that the potential UAFs are addressed (hopefully), I'll take a closer look and propose a patch for this.
Thanks a lot. -Daniel
On 21/7/21 6:29 pm, Daniel Vetter wrote:
But this isn't an fpriv->master pointer, it's a master->lessor pointer, which should never ever be able to change (we'd have tons of UAF bugs around drm_lease_owner otherwise). So I don't think there's anything that dev->master_mutex protects here that fpriv->master_lookup_lock doesn't protect already?
Or am I missing something?
The comment in the struct drm_master says it's protected by mode_config.idr_mutex, but that only applies to the idrs and lists I think.
Ah you're right, I also completely forgot that lessees hold a reference to their lessor so nothing will be freed as long as the spinlock is held. I'll prepare that patch then, thanks for pointing it out.
Best wishes, Desmond
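The point agreed on above, that lessees hold a reference to their lessor and so the whole chain stays alive while the leaf is pinned, can be sketched in user space as follows. This is only an illustration under stated assumptions: struct sketch_master and the sketch_* helpers are hypothetical, the refcount is a plain int because the example is single-threaded, and sketch_live_masters exists purely so the example can observe when objects are freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a master that may be a lessee of another master. */
struct sketch_master {
	int refcount;			/* plain int: single-threaded sketch */
	struct sketch_master *lessor;	/* set once at creation, never changed */
};

/* Number of objects not yet freed; only here so the demo can observe frees. */
static int sketch_live_masters;

static struct sketch_master *sketch_master_create(struct sketch_master *lessor)
{
	struct sketch_master *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->refcount = 1;
	m->lessor = lessor;
	if (lessor)
		lessor->refcount++;	/* the lessee pins its lessor */
	sketch_live_masters++;
	return m;
}

static void sketch_master_put(struct sketch_master *m)
{
	/* Freeing a lessee drops its lessor reference, which may cascade upward. */
	while (m && --m->refcount == 0) {
		struct sketch_master *lessor = m->lessor;

		free(m);
		sketch_live_masters--;
		m = lessor;
	}
}

/* Walk up the lessor chain to the owner at the root. */
static struct sketch_master *sketch_lease_owner(struct sketch_master *m)
{
	while (m->lessor)
		m = m->lessor;
	return m;
}
```

As long as a caller holds one reference to the leaf, every lessor above it has a nonzero refcount, so the walk in sketch_lease_owner() never touches freed memory even after all other holders have dropped their references.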
On Wed, Jul 21, 2021 at 2:44 PM Desmond Cheong Zhi Xi desmondcheongzx@gmail.com wrote:
Ah you're right, I also completely forgot that lessees hold a reference to their lessor so nothing will be freed as long as the spinlock is held. I'll prepare that patch then, thanks for pointing it out.
Btw, since we've now looked at all this in detail, can you perhaps do a patch to update the kerneldoc for all the lease fields in struct drm_master? I think moving them to the inline style and then adding comments for each field on how the locking/lifetime rules work would be really good, since right now it's all fresh for us. -Daniel
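For readers unfamiliar with the inline kerneldoc style mentioned above, here is a rough sketch of what it looks like. The struct is a hypothetical stand-in, not the real struct drm_master: the member names are borrowed loosely for illustration, and the locking/lifetime notes in the comments are assumptions based on this discussion rather than the documented rules.

```c
#include <assert.h>
#include <stddef.h>

/**
 * struct demo_lease_master - hypothetical stand-in with two lease fields
 *
 * Members are documented inline, next to their declarations, instead of in
 * one long list in this header comment.
 */
struct demo_lease_master {
	/**
	 * @lessor:
	 *
	 * Lease grantor, NULL for an owner. Assumed here to be set once at
	 * creation and never changed afterwards; each lessee is assumed to
	 * hold a reference to its lessor for its whole lifetime.
	 */
	struct demo_lease_master *lessor;

	/**
	 * @lessee_id:
	 *
	 * ID of this lessee. Assumed here to be constant after creation, so
	 * reading it needs no lock in this sketch.
	 */
	int lessee_id;
};
```

The benefit of this layout is that the locking and lifetime note for each field sits directly above the field it describes, so it is harder for the comment and the code to drift apart.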
On 21/7/21 9:23 pm, Daniel Vetter wrote:
On Wed, Jul 21, 2021 at 2:44 PM Desmond Cheong Zhi Xi desmondcheongzx@gmail.com wrote:
On 21/7/21 6:29 pm, Daniel Vetter wrote:
On Wed, Jul 21, 2021 at 6:12 AM Desmond Cheong Zhi Xi desmondcheongzx@gmail.com wrote:
On 21/7/21 2:24 am, Daniel Vetter wrote:
On Mon, Jul 12, 2021 at 12:35:03PM +0800, Desmond Cheong Zhi Xi wrote:
Hi,
In the previous thread on this series we decided to remove a patch that was violating a lockdep requirement in drm_lease. In addition to this change, I took a closer look at the CI logs for the Basic Acceptance Tests and noticed that another regression was introduced. The new patch 2 is a response to this.
Overall, this series addresses potential use-after-free errors when dereferencing pointers to struct drm_master. These were identified after one such bug was caught by Syzbot in drm_getunique(): https://syzkaller.appspot.com/bug?id=148d2f1dfac64af52ffd27b661981a540724f80...
The series is broken up into five patches:
Move a call to drm_is_current_master() out from a section locked by &dev->mode_config.mutex in drm_mode_getconnector(). This patch does not apply to stable.
Move a call to drm_is_current_master() out from the RCU read-side critical section in drm_clients_info().
Implement a locked version of drm_is_current_master() function that's used within drm_auth.c.
Serialize drm_file.master by introducing a new spinlock that's held whenever the value of drm_file.master changes.
Identify areas in drm_lease.c where pointers to struct drm_master are dereferenced, and ensure that the master pointers are not freed during use.
v7 -> v8:
- Remove the patch that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find. This patch violated an existing lockdep requirement as reported by the intel-gfx CI.
- Added a new patch that moves a call to drm_is_current_master out from the RCU critical section in drm_clients_info. This was reported by the intel-gfx CI.
v6 -> v7:
- Modify code alignment as suggested by the intel-gfx CI.
- Add a new patch to the series that adds a new lock to serialize drm_file.master, in response to the lockdep splat by the intel-gfx CI.
- Update drm_file_get_master to use the new drm_file.master_lock instead of drm_device.master_mutex, in response to the lockdep splat by the intel-gfx CI.
v5 -> v6:
- Add a new patch to the series that moves the call to _drm_lease_held out from the section locked by &dev->mode_config.idr_mutex in __drm_mode_object_find.
- Clarify the kerneldoc for dereferencing drm_file.master, as suggested by Daniel Vetter.
- Refactor error paths with goto labels so that each function only has a single drm_master_put(), as suggested by Emil Velikov.
- Modify comparisons to NULL into "!master", as suggested by the intel-gfx CI.
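As an illustration of the single-put error-path refactor mentioned in the v5 -> v6 notes, a small sketch follows. The function sketch_list_lessees(), its parameters, the struct layouts, and the chosen error codes are all hypothetical and are not taken from drm_lease.c; only the shape of the control flow is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-ins for struct drm_master / drm_file. */
struct sketch_master {
	int refcount;
	int lessee_count;
};

struct sketch_file {
	struct sketch_master *master; /* may be NULL */
};

/* Returns the master with an extra reference, or NULL if none is set. */
static struct sketch_master *sketch_file_get_master(struct sketch_file *f)
{
	if (f->master)
		f->master->refcount++;
	return f->master;
}

static void sketch_master_put(struct sketch_master *m)
{
	assert(m->refcount > 0);
	m->refcount--;
}

/*
 * Every exit taken after the reference has been acquired funnels
 * through the "out" label, so there is exactly one put in the function.
 */
static int sketch_list_lessees(struct sketch_file *f, int capacity,
			       int *count_out)
{
	struct sketch_master *master;
	int ret = 0;

	master = sketch_file_get_master(f);
	if (!master)
		return -ENOENT; /* nothing acquired, nothing to release */

	if (!count_out) {
		ret = -EINVAL;
		goto out;
	}

	if (capacity < master->lessee_count) {
		ret = -ENOSPC;
		goto out;
	}

	*count_out = master->lessee_count;

out:
	sketch_master_put(master);
	return ret;
}
```

Whichever branch is taken, the reference count is back at its starting value when the function returns, which is what makes this shape easy to audit.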
v4 -> v5:
- Add a new patch to the series that moves the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex.
- Add a missing semicolon to the patch, caught by the intel-gfx CI.
v3 -> v4:
- Move the call to drm_is_current_master in drm_mode_getconnector out from the section locked by &dev->mode_config.mutex, as suggested by Daniel Vetter. This avoids a circular lock dependency as reported here: https://patchwork.freedesktop.org/patch/440406/
- Inside drm_is_current_master, instead of grabbing &fpriv->master->dev->master_mutex, we grab &fpriv->minor->dev->master_mutex to avoid dereferencing a null ptr if fpriv->master is not set.
- Modify kerneldoc formatting for drm_file.master, as suggested by Daniel Vetter.
- Add a file_priv->master NULL check inside drm_file_get_master, and handle the NULL result accordingly in drm_lease.c, as suggested by Daniel Vetter.
v2 -> v3:
- Move the definition of drm_is_current_master and the _locked version higher up in drm_auth.c to avoid needing a forward declaration of drm_is_current_master_locked, as suggested by Daniel Vetter.
- Instead of leaking drm_device.master_mutex into drm_lease.c to protect drm_master pointers, add a new drm_file_get_master() function that returns drm_file->master while increasing its reference count, to prevent drm_file->master from being freed, as suggested by Daniel Vetter.
v1 -> v2:
- Move the lock and assignment before the DRM_DEBUG_LEASE in drm_mode_get_lease_ioctl, as suggested by Emil Velikov.
Apologies for the delay, I missed your series. Maybe just ping next time there's silence.
Looks all great, merged to drm-misc-next. Given how complex this was, I'm wary of just pushing this to -fixes without some solid testing.
Hi Daniel,
Thanks for merging, more testing definitely sounds good to me.
One thing I noticed is that drm_is_current_master could just use the spinlock, since it's only doing a read access. Care to type up that patch?
I thought about this too, but I'm not sure if that's the best solution.
drm_is_current_master calls drm_lease_owner which then walks up the tree of master lessors. The spinlock protects the master of the current drm file, but subsequent lessors aren't protected without holding the device's master mutex.
But this isn't a fpriv->master pointer, but a master->lessor pointer. Which should never ever be able to change (we'd have tons of uaf bugs around drm_lease_owner otherwise). So I don't think there's anything that dev->master_lock protects here that fpriv->master_lookup_lock doesn't protect already?
Or am I missing something?
The comment in the struct drm_master says it's protected by mode_config.idr_mutex, but that only applies to the idrs and lists I think.
Ah you're right, I also completely forgot that lessees hold a reference to their lessor so nothing will be freed as long as the spinlock is held. I'll prepare that patch then, thanks for pointing it out.
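The lifetime argument above can be sketched in userspace as follows. This is only a model under stated assumptions: the sketch_* names and the struct layout are hypothetical, plain integers stand in for the kernel's reference counting, and there is no locking. It shows why walking lessor links, as drm_lease_owner does, is safe while the starting master is kept alive: each lessee pins its lessor, and the link is set once at creation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, pared-down stand-in for struct drm_master. */
struct sketch_master {
	int refcount;
	struct sketch_master *lessor; /* NULL for an owner; never reassigned */
};

/*
 * Create a master with one reference owned by the caller. A lessee also
 * takes a reference on its lessor and holds it for its whole lifetime.
 */
static struct sketch_master *sketch_master_create(struct sketch_master *lessor)
{
	struct sketch_master *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->refcount = 1;
	if (lessor) {
		lessor->refcount++;
		m->lessor = lessor;
	}
	return m;
}

/*
 * Drop one reference. When a master is freed it releases the reference
 * it held on its lessor, which may in turn free the lessor, and so on
 * up the chain.
 */
static void sketch_master_put(struct sketch_master *m)
{
	while (m) {
		struct sketch_master *lessor;

		assert(m->refcount > 0);
		if (--m->refcount > 0)
			return;
		lessor = m->lessor;
		free(m);
		m = lessor;
	}
}

/* Follow lessor links up to the root of the lease tree. */
static struct sketch_master *sketch_lease_owner(struct sketch_master *m)
{
	while (m->lessor)
		m = m->lessor;
	return m;
}
```

In this model, as long as the caller holds a reference to the master it starts from, every master reachable through lessor links is still allocated, so the walk needs no additional protection.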
btw since we now looked at all this in detail, can you perhaps do a patch to update the kerneldoc for all the lease fields in struct drm_master? I think moving them to the inline style and then adding comments for each field on how the locking/lifetime rules work would be really good, since right now it's all fresh for us. -Daniel
Sure thing. Just sent out the suggested changes in the same series, along with a relevant fix for drm/vmwgfx that I just noticed.
Also, do you plan to look into that idea we've discussed to flush pending access when we revoke a master or a lease? I think that would be really nice improvement here. -Daniel
Yup, now that the potential UAFs are addressed (hopefully), I'll take a closer look and propose a patch for this.
Thanks a lot. -Daniel
Best wishes, Desmond
Desmond Cheong Zhi Xi (5):
  drm: avoid circular locks in drm_mode_getconnector
  drm: avoid blocking in drm_clients_info's rcu section
  drm: add a locked version of drm_is_current_master
  drm: serialize drm_file.master with a new spinlock
  drm: protect drm_master pointers in drm_lease.c

 drivers/gpu/drm/drm_auth.c      | 93 ++++++++++++++++++++++++---------
 drivers/gpu/drm/drm_connector.c |  5 +-
 drivers/gpu/drm/drm_debugfs.c   |  3 +-
 drivers/gpu/drm/drm_file.c      |  1 +
 drivers/gpu/drm/drm_lease.c     | 81 +++++++++++++++++++++-------
 include/drm/drm_auth.h          |  1 +
 include/drm/drm_file.h          | 18 +++++--
 7 files changed, 152 insertions(+), 50 deletions(-)

--
2.25.1
dri-devel@lists.freedesktop.org