On Fri, Aug 24, 2018 at 02:33:41PM +0200, Michal Hocko wrote:
On Fri 24-08-18 14:18:44, Christian König wrote:
Am 24.08.2018 um 14:03 schrieb Michal Hocko:
On Fri 24-08-18 13:57:52, Christian König wrote:
Am 24.08.2018 um 13:52 schrieb Michal Hocko:
On Fri 24-08-18 13:43:16, Christian König wrote:
[...]
That won't work like this there might be multiple invalidate_range_start()/invalidate_range_end() pairs open at the same time. E.g. the lock might be taken recursively and that is illegal for a rw_semaphore.
I am not sure I follow. Are you saying that one invalidate_range might trigger another one from the same path?
No, but what can happen is:
invalidate_range_start(A,B); invalidate_range_start(C,D); ... invalidate_range_end(C,D); invalidate_range_end(A,B);
Grabbing the read lock twice would be illegal in this case.
I am sorry but I still do not follow. What is the context the two are called from?
I don't have the slightest idea.
Can you give me an example. I simply do not see it in the code, mostly because I am not familiar with it.
I'm neither.
We stumbled over that by pure observation and after discussing the problem with Jerome came up with this solution.
No idea where exactly that case comes from, but I can confirm that it indeed happens.
Thiking about it some more, I can imagine that a notifier callback which performs an allocation might trigger a memory reclaim and that in turn might trigger a notifier to be invoked and recurse. But notifier shouldn't really allocate memory. They are called from deep MM code paths and this would be extremely deadlock prone. Maybe Jerome can come up some more realistic scenario. If not then I would propose to simplify the locking here. We have lockdep to catch self deadlocks and it is always better to handle a specific issue rather than having a code without a clear indication how it can recurse.
Multiple concurrent mmu notifier, for overlapping range or not, is common (each concurrent threads can trigger some). So you might have multiple invalidate_range_start() in flight for same mm and thus might complete in different order (invalidate_range_end()). IIRC this is what this lock was trying to protect against.
I can't think of a reason for recursive mmu notifier call right now. I will ponder see if i remember something about it.
Cheers, Jérôme