Hi,
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
The bug is easily reproducible (at least by me) by playing SuperTuxKart for a minute. It presents on latest -next and 5.17-rc7, I haven't checked older kernel versions.
I'm not an expert of the non-blocking code paths in DRM, so asking for suggestions about where the root of the problem could be.
Here is the KASAN report:
================================================================== BUG: KASAN: use-after-free in drm_atomic_helper_wait_for_vblanks.part.0+0x10b/0x4b0 Read of size 1 at addr ffff888110354809 by task kworker/u8:5/97
CPU: 1 PID: 97 Comm: kworker/u8:5 Not tainted 5.17.0-rc7-next-20220309+ #158 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound commit_work Call Trace: <TASK> dump_stack_lvl+0x49/0x5e print_report.cold+0x9c/0x562 ? drm_atomic_helper_wait_for_vblanks.part.0+0x10b/0x4b0 kasan_report+0xb9/0xf0 ? drm_atomic_helper_wait_for_vblanks.part.0+0x10b/0x4b0 __asan_load1+0x4d/0x50 drm_atomic_helper_wait_for_vblanks.part.0+0x10b/0x4b0 ? page_flip_common+0x150/0x150 ? complete_all+0x41/0x50 ? drm_atomic_helper_commit_hw_done+0x1a2/0x220 drm_atomic_helper_commit_tail+0x8c/0xa0 commit_tail+0x15c/0x1d0 commit_work+0x12/0x20 process_one_work+0x50e/0x9d0 ? pwq_dec_nr_in_flight+0x120/0x120 ? do_raw_spin_lock+0x10a/0x190 worker_thread+0x2ba/0x630 ? process_one_work+0x9d0/0x9d0 kthread+0x15d/0x190 ? kthread_complete_and_exit+0x30/0x30 ret_from_fork+0x1f/0x30 </TASK>
Allocated by task 325: kasan_save_stack+0x26/0x50 __kasan_kmalloc+0x88/0xa0 kmem_cache_alloc_trace+0x1fa/0x380 drm_atomic_helper_crtc_duplicate_state+0x4a/0x80 drm_atomic_get_crtc_state+0xbf/0x1d0 page_flip_common+0x46/0x150 drm_atomic_helper_page_flip+0x7a/0xe0 drm_mode_page_flip_ioctl+0x9c6/0xa20 drm_ioctl_kernel+0x145/0x220 drm_ioctl+0x34e/0x5f0 __x64_sys_ioctl+0xbe/0xf0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae
Freed by task 230: kasan_save_stack+0x26/0x50 kasan_set_track+0x25/0x30 kasan_set_free_info+0x24/0x40 __kasan_slab_free+0x100/0x140 kfree+0xaf/0x310 drm_atomic_helper_crtc_destroy_state+0x1e/0x30 drm_atomic_state_default_clear+0x20e/0x5d0 __drm_atomic_state_free+0xbf/0x130 commit_tail+0x166/0x1d0 commit_work+0x12/0x20 process_one_work+0x50e/0x9d0 worker_thread+0x2ba/0x630 kthread+0x15d/0x190 ret_from_fork+0x1f/0x30
The buggy address belongs to the object at ffff888110354800 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 9 bytes inside of 512-byte region [ffff888110354800, ffff888110354a00)
The buggy address belongs to the physical page: page:0000000010a164bd refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x110354 head:0000000010a164bd order:2 compound_mapcount:0 compound_pincount:0 flags: 0x8000000000010200(slab|head|zone=2) raw: 8000000000010200 0000000000000000 dead000000000001 ffff888100042c80 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888110354700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888110354780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888110354800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff888110354880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff888110354900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== ------------[ cut here ]------------
Hi Dmitry,
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
The bug is easily reproducible (at least by me) by playing SuperTuxKart for a minute. It presents on latest -next and 5.17-rc7, I haven't checked older kernel versions.
I'm not an expert of the non-blocking code paths in DRM, so asking for suggestions about where the root of the problem could be.
Does it occur with other platforms? Can you easily test on something else?
Thanks, Maxime
On 3/11/22 17:22, Maxime Ripard wrote:
Hi Dmitry,
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
Yes, the pointers are the same. I'd assume that the newer atomic state should duplicate CRTC state, but apparently it doesn't happen.
The bug is easily reproducible (at least by me) by playing SuperTuxKart for a minute. It presents on latest -next and 5.17-rc7, I haven't checked older kernel versions.
I'm not an expert of the non-blocking code paths in DRM, so asking for suggestions about where the root of the problem could be.
Does it occur with other platforms? Can you easily test on something else?
Shouldn't be easy to replicate this on other platforms, but I'll try.
On Tue, Mar 15, 2022 at 12:53:30AM +0300, Dmitry Osipenko wrote:
On 3/11/22 17:22, Maxime Ripard wrote:
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
Yes, the pointers are the same. I'd assume that the newer atomic state should duplicate CRTC state, but apparently it doesn't happen.
Yeah, I don't think this is right either
The bug is easily reproducible (at least by me) by playing SuperTuxKart for a minute. It presents on latest -next and 5.17-rc7, I haven't checked older kernel versions.
I'm not an expert of the non-blocking code paths in DRM, so asking for suggestions about where the root of the problem could be.
Does it occur with other platforms? Can you easily test on something else?
Shouldn't be easy to replicate this on other platforms, but I'll try.
By replicating I meant running SuperTuxKart on a platform with a different KMS driver than virtio-gpu. So any ARM SBC with a GPU will do for example.
That will allow us to see if it's a bug in virtio-gpu or in the helpers/core.
Maxime
On Tue, Mar 15, 2022 at 12:53:30AM +0300, Dmitry Osipenko wrote:
On 3/11/22 17:22, Maxime Ripard wrote:
Hi Dmitry,
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
Yes, the pointers are the same. I'd assume that the newer atomic state should duplicate CRTC state, but apparently it doesn't happen.
The legacy cursor hack stuff does this, and it pretty fundamentally breaks everything. Might be good to retest with that disabled:
https://lore.kernel.org/dri-devel/20201023123925.2374863-1-daniel.vetter@ffw...
The problem is a bit that this might cause some regressions, for drivers which don't yet have the fancy new cursor fastpath for plane updates. -Daniel
The bug is easily reproducible (at least by me) by playing SuperTuxKart for a minute. It presents on latest -next and 5.17-rc7, I haven't checked older kernel versions.
I'm not an expert of the non-blocking code paths in DRM, so asking for suggestions about where the root of the problem could be.
Does it occur with other platforms? Can you easily test on something else?
Shouldn't be easy to replicate this on other platforms, but I'll try.
Hi Daniel,
I'm using this thread as the occasion to discuss the legacy cursor stuff a bit further.
I've been trying to address the issue detailed here: https://lore.kernel.org/all/20220221134155.125447-1-maxime@cerno.tech/
And the last patch in particular: https://lore.kernel.org/all/20220221134155.125447-9-maxime@cerno.tech/
On Wed, Mar 30, 2022 at 11:45:17AM +0200, Daniel Vetter wrote:
On Tue, Mar 15, 2022 at 12:53:30AM +0300, Dmitry Osipenko wrote:
On 3/11/22 17:22, Maxime Ripard wrote:
Hi Dmitry,
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
Yes, the pointers are the same. I'd assume that the newer atomic state should duplicate CRTC state, but apparently it doesn't happen.
The legacy cursor hack stuff does this, and it pretty fundamentally breaks everything. Might be good to retest with that disabled:
https://lore.kernel.org/dri-devel/20201023123925.2374863-1-daniel.vetter@ffw...
The problem is a bit that this might cause some regressions, for drivers which don't yet have the fancy new cursor fastpath for plane updates.
I've been trying to force it to be disabled in either atomic_check or atomic_setup, and it was either slow (check), or ineffective (setup).
It's not really clear to me what this hack is about, and what we could do to make sure the cursor plane gets updated without waiting for vblank. vc4 has async plane update support, so I'm sure it's just some plumbing to do but I'm not sure where and why
Thanks! Maxime
On 3/30/22 12:45, Daniel Vetter wrote:
On Tue, Mar 15, 2022 at 12:53:30AM +0300, Dmitry Osipenko wrote:
On 3/11/22 17:22, Maxime Ripard wrote:
Hi Dmitry,
On Thu, Mar 10, 2022 at 03:33:07AM +0300, Dmitry Osipenko wrote:
I was playing/testing SuperTuxKart using VirtIO-GPU driver and spotted a UAF bug in drm_atomic_helper_wait_for_vblanks().
SuperTuxKart can use DRM directly, i.e. you can run game in VT without Xorg or Wayland, this is where bugs happens. SuperTuxKart uses a non-blocking atomic page flips and UAF happens when a new atomic state is committed while there is a previous page flip still in-fly.
What happens is that the new and old atomic states refer to the same CRTC state somehow. Once the older atomic state is destroyed, the CRTC state is freed and the newer atomic state continues to use the freed CRTC state.
I'm not sure what you mean by "the new and old atomic states refer to the same CRTC state", are those the same pointers?
Yes, the pointers are the same. I'd assume that the newer atomic state should duplicate CRTC state, but apparently it doesn't happen.
The legacy cursor hack stuff does this, and it pretty fundamentally breaks everything. Might be good to retest with that disabled:
https://lore.kernel.org/dri-devel/20201023123925.2374863-1-daniel.vetter@ffw...
The problem is a bit that this might cause some regressions, for drivers which don't yet have the fancy new cursor fastpath for plane updates. -Daniel
Thank you, I tested yours patch and unfortunately it doesn't fix my problem. Should be a separate bug.
Those async update code paths aren't trivial, will take some time for me to debug it.
dri-devel@lists.freedesktop.org