Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 drivers/gpu/drm/drm_fops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index ed7bc68f7e87..a82dc28d54f3 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -525,6 +525,7 @@ ssize_t drm_read(struct file *filp, char __user *buffer,
 		if (copy_to_user(buffer + total,
 				 e->event, e->event->length)) {
 			total = -EFAULT;
+			e->destroy(e);
 			break;
 		}
drm_read() always blocks the read until an event comes up, even if the file is opened in nonblocking mode. Also, there is a potential race between the wait_event_interruptible() call and the actual event retrieval in drm_dequeue_event(), and it may result in a zero-event read even in blocking mode.
This patch fixes these issues. The fix is twofold:
- Do drm_dequeue_event() first, then call wait_event_interruptible() in a loop only if no event is pending. It's safe to call drm_dequeue_event() first since it returns immediately if no event is found, and the function is properly protected by event_lock.
- Add an f_flags check so that drm_read() returns -EAGAIN immediately in nonblocking mode when drm_dequeue_event() fails. This is evaluated only before the first event is read.
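The dequeue-first loop described above can be sketched in userspace (a hedged analogue, not the kernel code: a pthread mutex/condvar stands in for event_lock and event_wait, and the names enqueue_event/dequeue_event/read_events are made up for the example; -1 stands in for -EAGAIN):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct event { struct event *next; size_t length; };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static struct event *head;

/* Returns immediately when nothing is queued, like drm_dequeue_event(). */
static struct event *dequeue_event(void)
{
	struct event *e;

	pthread_mutex_lock(&lock);
	e = head;
	if (e)
		head = e->next;
	pthread_mutex_unlock(&lock);
	return e;
}

static void enqueue_event(struct event *e)
{
	pthread_mutex_lock(&lock);
	e->next = head;
	head = e;
	pthread_mutex_unlock(&lock);
	pthread_cond_signal(&cond);
}

/* Dequeue first; wait only when nothing is pending, and only in blocking mode. */
static long read_events(bool nonblock)
{
	long total = 0;

	for (;;) {
		struct event *e = dequeue_event();

		if (!e) {
			if (total)
				break;		/* report what we already have */
			if (nonblock)
				return -1;	/* stands in for -EAGAIN */
			pthread_mutex_lock(&lock);
			while (!head)		/* the wait_event_interruptible() step */
				pthread_cond_wait(&cond, &lock);
			pthread_mutex_unlock(&lock);
			continue;
		}
		total += e->length;		/* copy_to_user() would go here */
	}
	return total;
}
```

Because the dequeue attempt comes first and the wait only happens when it found nothing, a nonblocking reader can never stall, and a blocking reader re-checks the queue after every wakeup.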
Note: originally this issue came up while debugging a hang of X when switching VTs quickly. For example, running the loop below under KDE results in a stall of X:
  for i in $(seq 1 100); do chvt 1; chvt 7; done
This was reproduced on various Intel machines. I took a stack trace of the hanging process, and it points to the blockage in drm_read().
This patch "fixes" such a bug. A drawback is that it now becomes more difficult to find such buggy code on the caller side.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
---
 drivers/gpu/drm/drm_fops.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index a82dc28d54f3..04833eaf51b3 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -515,13 +515,20 @@ ssize_t drm_read(struct file *filp, char __user *buffer,
 	size_t total;
 	ssize_t ret;
 
-	ret = wait_event_interruptible(file_priv->event_wait,
-				       !list_empty(&file_priv->event_list));
-	if (ret < 0)
-		return ret;
-
 	total = 0;
-	while (drm_dequeue_event(file_priv, total, count, &e)) {
+	for (;;) {
+		if (!drm_dequeue_event(file_priv, total, count, &e)) {
+			if (total)
+				break;
+			if (filp->f_flags & O_NONBLOCK)
+				return -EAGAIN;
+			ret = wait_event_interruptible(file_priv->event_wait,
+					!list_empty(&file_priv->event_list));
+			if (ret < 0)
+				return ret;
+			continue;
+		}
+
 		if (copy_to_user(buffer + total,
 				 e->event, e->event->length)) {
 			total = -EFAULT;
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
Signed-off-by: Takashi Iwai tiwai@suse.de
Applied to my drm-misc branch. The second patch seems to be a dupe of
commit bd008e5b2953186fc0c6633a885ade95e7043800
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Oct 7 14:13:51 2014 +0100
drm: Implement O_NONBLOCK support on /dev/dri/cardN
so maybe we should backport that one once it's landed in 3.19? -Daniel
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
At Thu, 4 Dec 2014 12:46:16 +0100, Daniel Vetter wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
Signed-off-by: Takashi Iwai tiwai@suse.de
Applied to my drm-misc branch. The second patch seems to be a dupe of
commit bd008e5b2953186fc0c6633a885ade95e7043800
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Oct 7 14:13:51 2014 +0100
drm: Implement O_NONBLOCK support on /dev/dri/cardN
so maybe we should backport that one once it's landed in 3.19?
Ah, thanks. I misread Chris's comment as if it were about libdrm; now I understand it correctly.
Takashi
-Daniel
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
At Thu, 04 Dec 2014 12:50:04 +0100, Takashi Iwai wrote:
At Thu, 4 Dec 2014 12:46:16 +0100, Daniel Vetter wrote:
so maybe we should backport that one once it's landed in 3.19?
Speaking of this: yes, it's worthwhile, IMO, as this (unexpectedly) really seems to fix the issue with the current driver.
thanks,
Takashi
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
Signed-off-by: Takashi Iwai tiwai@suse.de
 drivers/gpu/drm/drm_fops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index ed7bc68f7e87..a82dc28d54f3 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -525,6 +525,7 @@ ssize_t drm_read(struct file *filp, char __user *buffer,
 		if (copy_to_user(buffer + total,
 				 e->event, e->event->length)) {
 			total = -EFAULT;
+			e->destroy(e);
 			break;
 		}
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly. -Chris
At Thu, 4 Dec 2014 11:51:14 +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
Yeah, I thought of it while writing this, but as a starter I tried the simpler one. (And I didn't realize your comment was referring to the already existing fix in the kernel, sorry!)
The problem with holding the spinlock for the whole operation is that you can't hold it across copy_to_user(). So it'd be a bit tricky.
And, why not using event_wait.lock instead of the extra event_lock? Then we can use wait_event_lock_*() variant that covers more race between dequeuing and wait_event.
Takashi
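For what it's worth, the wait_event_lock_*() idea maps onto the classic condition-variable pattern: the emptiness check and the sleep share one lock, and the wait primitive drops that lock atomically while sleeping, so a wakeup cannot slip in between. A hedged userspace sketch with POSIX threads (not the kernel API; producer/consume are illustrative names):

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int pending;	/* stands in for !list_empty(&file_priv->event_list) */

static void *producer(void *arg)
{
	(void)arg;
	usleep(10000);			/* give the consumer time to block */
	pthread_mutex_lock(&lock);
	pending = 1;
	pthread_cond_signal(&cond);	/* wake the waiter */
	pthread_mutex_unlock(&lock);
	return NULL;
}

/*
 * The check and the sleep happen under the same lock;
 * pthread_cond_wait() releases it atomically while asleep,
 * so the producer's signal cannot be lost between the
 * empty-check and the wait.
 */
static int consume(void)
{
	int got;

	pthread_mutex_lock(&lock);
	while (!pending)
		pthread_cond_wait(&cond, &lock);
	got = pending;
	pending = 0;
	pthread_mutex_unlock(&lock);
	return got;
}
```

With two separate locks (one for the list, one for the wait queue), that atomicity is lost, which is exactly the dequeue-versus-wait race being discussed.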
On Thu, Dec 04, 2014 at 12:56:38PM +0100, Takashi Iwai wrote:
At Thu, 4 Dec 2014 11:51:14 +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
Yeah, I thought of it while writing this, but as a starter I tried the simpler one. (And I didn't realize your comment was referring to the already existing fix in the kernel, sorry!)
The problem with holding the spinlock for the whole operation is that you can't hold it across copy_to_user(). So it'd be a bit tricky.
We can use access_ok(VERIFY_WRITE) and then copy_to_user_inatomic(). Doable with a bit of code rearrangement.
And, why not using event_wait.lock instead of the extra event_lock? Then we can use wait_event_lock_*() variant that covers more race between dequeuing and wait_event.
The tricky part would appear to be then protecting dev->vblank_event_list. That looks like it could be an RCU list instead? -Chris
At Thu, 4 Dec 2014 12:13:46 +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 12:56:38PM +0100, Takashi Iwai wrote:
At Thu, 4 Dec 2014 11:51:14 +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
Yeah, I thought of it while writing this, but as a starter I tried the simpler one. (And I didn't realize your comment was referring to the already existing fix in the kernel, sorry!)
The problem with holding the spinlock for the whole operation is that you can't hold it across copy_to_user(). So it'd be a bit tricky.
We can use access_ok(VERIFY_WRITE) and then copy_to_user_inatomic(). Doable with a bit of code rearrangement.
Yes, but this really doesn't have to be atomic. Alternatively, we can simply put the event back at the head of the queue in the error case.
And, why not using event_wait.lock instead of the extra event_lock? Then we can use wait_event_lock_*() variant that covers more race between dequeuing and wait_event.
The tricky part would appear to be then protecting dev->vblank_event_list. That looks like it could be an RCU list instead?
Sounds interesting.
Takashi
On Thu, Dec 04, 2014 at 11:51:14AM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
I've read the manpage for read and it explicitly states that when you get an error it's undefined what happens to the read position. Since -EFAULT is really just a userspace bug I think we can happily drop the event on the floor, no reason to bend over in the kernel.
I'll add a note to the commit message. -Daniel
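Daniel's reading that -EFAULT is purely a userspace bug is easy to see from userspace with an ordinary read(2) into an unmapped address (a standalone sketch, nothing DRM-specific; the helper name is made up for the example):

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns the errno produced by read()ing into an invalid buffer. */
static int read_into_bad_buffer(void)
{
	int fd = open("/dev/zero", O_RDONLY);
	int err = 0;

	if (fd < 0)
		return -1;
	/* Address 1 is never mapped, so copying to it fails with EFAULT. */
	if (read(fd, (void *)1, 8) == -1)
		err = errno;
	close(fd);
	return err;
}
```

Per the read(2) manpage, whether any data was consumed before the error is unspecified, which is why dropping the event on the floor here is defensible.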
On Thu, Dec 04, 2014 at 01:28:39PM +0100, Daniel Vetter wrote:
On Thu, Dec 04, 2014 at 11:51:14AM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
I've read the manpage for read and it explicitly states that when you get an error it's undefined what happens to the read position. Since -EFAULT is really just a userspace bug I think we can happily drop the event on the floor, no reason to bend over in the kernel.
Hmm. Actually the code is buggy if the provided buffer is too short for the first event in O_NONBLOCK mode. -Chris
On Thu, Dec 04, 2014 at 04:31:15PM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 01:28:39PM +0100, Daniel Vetter wrote:
On Thu, Dec 04, 2014 at 11:51:14AM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
I've read the manpage for read and it explicitly states that when you get an error it's undefined what happens to the read position. Since -EFAULT is really just a userspace bug I think we can happily drop the event on the floor, no reason to bend over in the kernel.
Hmm. Actually the code is buggy if the provided buffer is too short for the first event in O_NONBLOCK mode.
Just inlining drm_dequeue_event is simple enough to fix the issue:
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 91e1105..7af6676 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -478,43 +478,17 @@ int drm_release(struct inode *inode, struct file *filp)
 }
 EXPORT_SYMBOL(drm_release);
 
-static bool
-drm_dequeue_event(struct drm_file *file_priv,
-		  size_t total, size_t max, struct drm_pending_event **out)
-{
-	struct drm_device *dev = file_priv->minor->dev;
-	struct drm_pending_event *e;
-	unsigned long flags;
-	bool ret = false;
-
-	spin_lock_irqsave(&dev->event_lock, flags);
-
-	*out = NULL;
-	if (list_empty(&file_priv->event_list))
-		goto out;
-	e = list_first_entry(&file_priv->event_list,
-			     struct drm_pending_event, link);
-	if (e->event->length + total > max)
-		goto out;
-
-	file_priv->event_space += e->event->length;
-	list_del(&e->link);
-	*out = e;
-	ret = true;
-
-out:
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-	return ret;
-}
-
 ssize_t drm_read(struct file *filp, char __user *buffer,
 		 size_t count, loff_t *offset)
 {
 	struct drm_file *file_priv = filp->private_data;
-	struct drm_pending_event *e;
-	size_t total;
+	struct drm_device *dev = file_priv->minor->dev;
+	unsigned long flags;
 	ssize_t ret;
 
+	if (!access_ok(VERIFY_WRITE, buffer, count))
+		return -EFAULT;
+
 	if ((filp->f_flags & O_NONBLOCK) == 0) {
 		ret = wait_event_interruptible(file_priv->event_wait,
 					       !list_empty(&file_priv->event_list));
@@ -522,19 +496,35 @@ ssize_t drm_read(struct file *filp, char __user *buffer,
 			return ret;
 	}
 
-	total = 0;
-	while (drm_dequeue_event(file_priv, total, count, &e)) {
-		if (copy_to_user(buffer + total,
-				 e->event, e->event->length)) {
-			total = -EFAULT;
-			break;
-		}
+	spin_lock_irqsave(&dev->event_lock, flags);
+	if (list_empty(&file_priv->event_list)) {
+		ret = -EAGAIN;
+	} else {
+		ret = 0;
+		do {
+			struct drm_pending_event *e;
+
+			e = list_first_entry(&file_priv->event_list,
+					     struct drm_pending_event, link);
+			if (e->event->length + ret > count)
+				break;
+
+			if (__copy_to_user_inatomic(buffer + ret,
+						    e->event, e->event->length)) {
+				if (ret == 0)
+					ret = -EFAULT;
+				break;
+			}
 
-		total += e->event->length;
-		e->destroy(e);
+			file_priv->event_space += e->event->length;
+			ret += e->event->length;
+			list_del(&e->link);
+			e->destroy(e);
+		} while (!list_empty(&file_priv->event_list));
 	}
+	spin_unlock_irqrestore(&dev->event_lock, flags);
 
-	return total ?: -EAGAIN;
+	return ret;
 }
 EXPORT_SYMBOL(drm_read);
which looks similar to the previous attempt to get O_NONBLOCK fixed. -Chris
At Thu, 4 Dec 2014 16:38:57 +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 04:31:15PM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 01:28:39PM +0100, Daniel Vetter wrote:
On Thu, Dec 04, 2014 at 11:51:14AM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
I've read the manpage for read and it explicitly states that when you get an error it's undefined what happens to the read position. Since -EFAULT is really just a userspace bug I think we can happily drop the event on the floor, no reason to bend over in the kernel.
Hmm. Actually the code is buggy if the provided buffer is too short for the first event in O_NONBLOCK mode.
Just inlining drm_dequeue_event is simple enough to fix the issue:
Right, this avoids dequeuing the event on error.
But there is still a race between wait_event_interruptible() and the dequeuing. So the more comprehensive code should look like:
	if (!access_ok(VERIFY_WRITE, buffer, count))
		return -EFAULT;

	spin_lock_irq(&dev->event_lock);
	for (;;) {
		if (list_empty(&file_priv->event_list)) {
			if (ret)
				break;
			if (filp->f_flags & O_NONBLOCK) {
				ret = -EAGAIN;
				break;
			}
			spin_unlock_irq(&dev->event_lock);
			ret = wait_event_interruptible(....);
			spin_lock_irq(&dev->event_lock);
			if (ret < 0)
				break;
			ret = 0;
			continue;
		}

		e = list_first_entry(&file_priv->event_list,
				     struct drm_pending_event, link);
		if (e->event->length + ret > count)
			break;

		if (__copy_to_user_inatomic(buffer + ret,
					    e->event, e->event->length)) {
			if (!ret)
				ret = -EFAULT;
			break;
		}

		file_priv->event_space += e->event->length;
		ret += e->event->length;
		list_del(&e->link);
		e->destroy(e);
	}
	spin_unlock_irq(&dev->event_lock);

	return ret;
thanks,
Takashi
-- Chris Wilson, Intel Open Source Technology Centre
The current implementation of drm_read() faces a number of issues:
1. Upon an error, it consumes the event, which may lead to the client blocking.
2. Upon an error, it forgets about events already copied.
3. If it fails to copy a single event with O_NONBLOCK, it falls into an infinite loop of reporting EAGAIN.
4. There is a race between multiple waiters and blocking reads of the events list.
Here, we inline drm_dequeue_event() into drm_read() so that we can take the spinlock around the list walking and event copying, and importantly reorder the error handling to avoid the issues above.
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/drm_fops.c | 90 ++++++++++++++++++++++------------------------
 1 file changed, 43 insertions(+), 47 deletions(-)
diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index 91e1105f2800..076dd606b580 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -478,63 +478,59 @@ int drm_release(struct inode *inode, struct file *filp)
 }
 EXPORT_SYMBOL(drm_release);
 
-static bool
-drm_dequeue_event(struct drm_file *file_priv,
-		  size_t total, size_t max, struct drm_pending_event **out)
+ssize_t drm_read(struct file *filp, char __user *buffer,
+		 size_t count, loff_t *offset)
 {
+	struct drm_file *file_priv = filp->private_data;
 	struct drm_device *dev = file_priv->minor->dev;
-	struct drm_pending_event *e;
-	unsigned long flags;
-	bool ret = false;
-
-	spin_lock_irqsave(&dev->event_lock, flags);
+	ssize_t ret = 0;
 
-	*out = NULL;
-	if (list_empty(&file_priv->event_list))
-		goto out;
-	e = list_first_entry(&file_priv->event_list,
-			     struct drm_pending_event, link);
-	if (e->event->length + total > max)
-		goto out;
+	if (!access_ok(VERIFY_WRITE, buffer, count))
+		return -EFAULT;
 
-	file_priv->event_space += e->event->length;
-	list_del(&e->link);
-	*out = e;
-	ret = true;
+	spin_lock_irq(&dev->event_lock);
+	for (;;) {
+		if (list_empty(&file_priv->event_list)) {
+			if (ret)
+				break;
 
-out:
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-	return ret;
-}
-
-ssize_t drm_read(struct file *filp, char __user *buffer,
-		 size_t count, loff_t *offset)
-{
-	struct drm_file *file_priv = filp->private_data;
-	struct drm_pending_event *e;
-	size_t total;
-	ssize_t ret;
+			if (filp->f_flags & O_NONBLOCK) {
+				ret = -EAGAIN;
+				break;
+			}
 
-	if ((filp->f_flags & O_NONBLOCK) == 0) {
-		ret = wait_event_interruptible(file_priv->event_wait,
-					       !list_empty(&file_priv->event_list));
-		if (ret < 0)
-			return ret;
-	}
+			spin_unlock_irq(&dev->event_lock);
+			ret = wait_event_interruptible(file_priv->event_wait,
+						       !list_empty(&file_priv->event_list));
+			spin_lock_irq(&dev->event_lock);
+			if (ret < 0)
+				break;
+
+			ret = 0;
+		} else {
+			struct drm_pending_event *e;
+
+			e = list_first_entry(&file_priv->event_list,
+					     struct drm_pending_event, link);
+			if (e->event->length + ret > count)
+				break;
+
+			if (__copy_to_user_inatomic(buffer + ret,
+						    e->event, e->event->length)) {
+				if (ret == 0)
+					ret = -EFAULT;
+				break;
+			}
 
-	total = 0;
-	while (drm_dequeue_event(file_priv, total, count, &e)) {
-		if (copy_to_user(buffer + total,
-				 e->event, e->event->length)) {
-			total = -EFAULT;
-			break;
+			file_priv->event_space += e->event->length;
+			ret += e->event->length;
+			list_del(&e->link);
+			e->destroy(e);
 		}
-
-		total += e->event->length;
-		e->destroy(e);
 	}
+	spin_unlock_irq(&dev->event_lock);
 
-	return total ?: -EAGAIN;
+	return ret;
 }
 EXPORT_SYMBOL(drm_read);
At Thu, 4 Dec 2014 21:03:25 +0000, Chris Wilson wrote:
The current implementation of drm_read() faces a number of issues:
1. Upon an error, it consumes the event, which may lead to the client blocking.
2. Upon an error, it forgets about events already copied.
3. If it fails to copy a single event with O_NONBLOCK, it falls into an infinite loop of reporting EAGAIN.
4. There is a race between multiple waiters and blocking reads of the events list.
Here, we inline drm_dequeue_event() into drm_read() so that we can take the spinlock around the list walking and event copying, and importantly reorder the error handling to avoid the issues above.
Cc: Takashi Iwai tiwai@suse.de
Reviewed-by: Takashi Iwai tiwai@suse.de
Takashi
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
drivers/gpu/drm/drm_fops.c | 90 ++++++++++++++++++++++------------------------ 1 file changed, 43 insertions(+), 47 deletions(-)
On Fri, Dec 05, 2014 at 09:42:35AM +0100, Takashi Iwai wrote:
At Thu, 4 Dec 2014 21:03:25 +0000, Chris Wilson wrote:
The current implementation of drm_read() faces a number of issues:
1. Upon an error, it consumes the event, which may lead to the client blocking.
2. Upon an error, it forgets about events already copied.
3. If it fails to copy a single event with O_NONBLOCK, it falls into an infinite loop of reporting EAGAIN.
4. There is a race between multiple waiters and blocking reads of the events list.
Here, we inline drm_dequeue_event() into drm_read() so that we can take the spinlock around the list walking and event copying, and importantly reorder the error handling to avoid the issues above.
Cc: Takashi Iwai tiwai@suse.de
Reviewed-by: Takashi Iwai tiwai@suse.de
Merged to drm-misc with the tag for the drm_read testcase added. Thanks for the patch & review. -Daniel
On Thu, Dec 04, 2014 at 09:03:25PM +0000, Chris Wilson wrote:
The current implementation of drm_read() faces a number of issues:
1. Upon an error, it consumes the event, which may lead to the client blocking.
2. Upon an error, it forgets about events already copied.
3. If it fails to copy a single event with O_NONBLOCK, it falls into an infinite loop of reporting EAGAIN.
4. There is a race between multiple waiters and blocking reads of the events list.
Here, we inline drm_dequeue_event() into drm_read() so that we can take the spinlock around the list walking and event copying, and importantly reorder the error handling to avoid the issues above.
Cc: Takashi Iwai tiwai@suse.de Signed-off-by: Chris Wilson chris@chris-wilson.co.uk
IMO if you go through all the trouble of fixing the corner cases, then we should also have a testcase to exercise them. Otherwise I expect this to fall apart again. So a little igt would be great. -Daniel
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Thu, Dec 04, 2014 at 04:31:15PM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 01:28:39PM +0100, Daniel Vetter wrote:
On Thu, Dec 04, 2014 at 11:51:14AM +0000, Chris Wilson wrote:
On Thu, Dec 04, 2014 at 11:56:42AM +0100, Takashi Iwai wrote:
Signed-off-by: Takashi Iwai tiwai@suse.de
---
 drivers/gpu/drm/drm_fops.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/drm_fops.c b/drivers/gpu/drm/drm_fops.c
index ed7bc68f7e87..a82dc28d54f3 100644
--- a/drivers/gpu/drm/drm_fops.c
+++ b/drivers/gpu/drm/drm_fops.c
@@ -525,6 +525,7 @@ ssize_t drm_read(struct file *filp, char __user *buffer,
 		if (copy_to_user(buffer + total,
 				 e->event, e->event->length)) {
 			total = -EFAULT;
+			e->destroy(e);
 			break;
 		}
We shouldn't just be throwing away the event here, but put the event back at the front of the queue. Poses an interesting race issue. Seems like we want to hold the spinlock until the copy is complete so that we can fix up the failure correctly.
I've read the manpage for read and it explicitly states that when you get an error it's undefined what happens to the read position. Since -EFAULT is really just a userspace bug I think we can happily drop the event on the floor, no reason to bend over in the kernel.
Hmm. Actually the code is buggy if the provided buffer is too short for the first event in O_NONBLOCK mode.
Well we essentially send out datagrams instead of a bytestream. If we look at recvmsg and friends then discarding the additional bytes is something that's already being done. So I'm not terribly concerned about that either. It's a bit non-pretty that we use read and not recvmsg, but since we use read already it's not something we can ever fix. -Daniel
On Thu, Dec 04, 2014 at 06:25:08PM +0100, Daniel Vetter wrote:
> Well we essentially send out datagrams instead of a bytestream. If we
> look at recvmsg and friends then discarding the additional bytes is
> something that's already being done. So I'm not terribly concerned
> about that either.
In all of this consider what the impact of dropping an event is: system lockup. -Chris
dri-devel@lists.freedesktop.org