On Tue, Aug 20, 2019 at 10:24:40PM +0200, Daniel Vetter wrote:
On Tue, Aug 20, 2019 at 10:19:01AM +0200, Daniel Vetter wrote:
In some special cases we must not block, but there's not a spinlock, preempt-off, irqs-off or similar critical section already that arms the might_sleep() debug checks. Add a non_block_start/end() pair to annotate these.
This will be used in the oom paths of mmu-notifiers, where blocking is not allowed to make sure there's forward progress. Quoting Michal:
"The notifier is called from quite a restricted context - oom_reaper - which shouldn't depend on any locks or sleepable conditionals. The code should be swift as well but we mostly do care about it to make a forward progress. Checking for sleepable context is the best thing we could come up with that would describe these demands at least partially."
Peter also asked whether we want to catch spinlocks on top, but Michal said those are less of a problem because spinlocks can't have an indirect dependency upon the page allocator and hence close the loop with the oom reaper.
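To make the intended semantics concrete, here is a rough userspace sketch of the annotation, not the patch itself. It assumes the debug state is a simple per-task nesting counter that the sleep checks consult. The names task_model, might_sleep_model(), schedule_model() and demo_violations() are made up for illustration, and the real kernel helpers take no task argument.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-task debug state (assumption). */
struct task_model {
	int non_block_count;	/* nesting depth of non-blocking sections */
	int violations;		/* blocking attempts seen inside a section */
};

/* Enter a section in which blocking is not allowed. */
void non_block_start(struct task_model *t)
{
	t->non_block_count++;
}

/* Leave the section; an unbalanced end is a caller bug. */
void non_block_end(struct task_model *t)
{
	assert(t->non_block_count > 0);
	t->non_block_count--;
}

/* Model of the debug check: complain when called inside a section. */
void might_sleep_model(struct task_model *t, const char *where)
{
	if (t->non_block_count > 0) {
		t->violations++;
		fprintf(stderr, "BUG: blocking call in non-block section: %s\n",
			where);
	}
}

/* Blocking primitives end up in the scheduler, so check there as well. */
void schedule_model(struct task_model *t)
{
	might_sleep_model(t, "schedule");
}

/* Walk through an annotated path and return the number of violations. */
int demo_violations(void)
{
	struct task_model t = { 0, 0 };

	might_sleep_model(&t, "before section");	/* fine */
	non_block_start(&t);
	might_sleep_model(&t, "mutex_lock");		/* flagged */
	schedule_model(&t);				/* flagged */
	non_block_end(&t);
	might_sleep_model(&t, "after section");		/* fine */

	return t.violations;
}
```

In this model, calling demo_violations() reports the two blocking attempts made between non_block_start() and non_block_end(), and neither of the calls outside the section.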
Suggested by Michal Hocko.
v2:
- Improve commit message (Michal)
- Also check in schedule, not just might_sleep (Peter)
v3: It works better when I actually squash in the fixup I had lying around :-/
v4: Pick the suggestion from Andrew Morton to give non_block_start/end some good kerneldoc comments. I added that other blocking calls like wait_event pose similar issues, since that's the other example we discussed.
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Wei Wang <wvw@google.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-kernel@vger.kernel.org
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Hi Peter,
IIRC you've been involved at least somewhat in discussing this. -mm folks are a bit undecided whether these new non_block semantics are a good idea. Michal Hocko is still in support, but Andrew Morton and Jason Gunthorpe are less enthusiastic. Jason said he's ok with merging the hmm side of this if scheduler folks ack. If not, then I'll respin with preempt_disable/enable instead, like in v1.
So ack/nack for this from the scheduler side?
Right, I had memories of seeing this before, and I just found a fairly long discussion on this elsewhere in the vacation inbox (*groan*).
Yeah, this is something I can live with,
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>