Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE

12 Sep 2013


      On 09/12/2013 05:58 PM, Daniel Vetter wrote:
...
On Thu, Sep 12, 2013 at 5:43 PM, Peter Zijlstra peterz@infradead.org wrote:
...
...
The one in ttm is just bonghits to shut up lockdep: ttm can recurse
into it's own pagefault handler and then deadlock, the trylock just
keeps lockdep quiet.
Could you describe how it could recurse into it's own pagefault handler?
IIRC the VM flags of the TTM VMAs makes get_user_pages() refrain from 
touching these VMAs,
hence I don't think this code can deadlock, but admittedly it's far from 
the optimal solution.
Never mind, more on the set_need_resched() below.
...
...
...
We've had that bug arise in drm/i915 due to some
fun userspace did and now have testcases for them. The right solution
to fix this is to use copy_to|from_user_atomic in ttm everywhere it
holds locks and have slowpaths which drops locks, copies stuff into a
temp allocation and then continues. At least that's how we've fixed
all those inversions in i915-gem. I'm not volunteering to fix this ;-)
Yikes.. so how common is it? If I simply rip the set_need_resched() out
it will 'spin' on the fault a little longer until a 'natural' preemption
point -- if such a thing is every going to happen.
A typical case is if a process is throwing out a buffer from the GPU or 
system memory while another
process pagefaults while writing to it. It's not a common situation, and 
it's by no means a fastpath situation.
For correctness purposes, I think set_need_resched() can be safely removed.
...
It's a case of "our userspace doesn't do this", so as long as you're
not evil and frob the drm device nodes of ttm drivers directly the
deadlock will never happen. No idea how much contention actually
happens on e.g. shared buffer objects - in i915 we have just one lock
and so suffer quite a bit more from contention. So no idea how much
removing the yield would hurt.
-Daniel
/Thomas

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE