On Tue, Aug 20, 2019 at 05:18:10PM +0200, Daniel Vetter wrote:
On Tue, Aug 20, 2019 at 10:34:18AM -0300, Jason Gunthorpe wrote:
On Tue, Aug 20, 2019 at 10:19:02AM +0200, Daniel Vetter wrote:
We need to make sure implementations don't cheat and don't have a possible schedule/blocking point deeply buried where review can't catch it.
I'm not sure whether this is the best way to make sure all the might_sleep() callsites trigger, and it's a bit ugly in the code flow. But it gets the job done.
Inspired by an i915 patch series which did exactly that, because the rules haven't been entirely clear to us.
v2: Use the shiny new non_block_start/end annotations instead of abusing preempt_disable/enable.
v3: Rebase on top of Glisse's arg rework.
v4: Rebase on top of more Glisse rework.
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: linux-mm@kvack.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 mm/mmu_notifier.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 538d3bb87f9b..856636d06ee0 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -181,7 +181,13 @@ int __mmu_notifier_invalidate_range_start(struct mmu_notifier_range *range)
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_start) {
-			int _ret = mn->ops->invalidate_range_start(mn, range);
+			int _ret;
+
+			if (!mmu_notifier_range_blockable(range))
+				non_block_start();
+			_ret = mn->ops->invalidate_range_start(mn, range);
+			if (!mmu_notifier_range_blockable(range))
+				non_block_end();
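For reference, the sched side this depends on (earlier patch in this series; sketched here rather than quoted verbatim, exact hunks may differ from what lands) is just a per-task counter that ___might_sleep() consults, so the debug splat fires on every run through the annotated section, not only on runs where the callback actually hits a contended lock:

	/* Sketch, under CONFIG_DEBUG_ATOMIC_SLEEP: task_struct grows an
	 * int non_block_count, and the annotations bump/drop it.
	 */
	#define non_block_start()	(current->non_block_count++)
	#define non_block_end()		(WARN_ON(current->non_block_count-- == 0))

	/* ___might_sleep() then treats current->non_block_count != 0 like
	 * an atomic context and warns, which is what makes the check
	 * deterministic instead of depending on actual sleeping.
	 */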
If someone Acks all the sched changes then I can pick this for hmm.git, but I still think the existing pre-emption debugging is fine for this use case.
Ok, I'll ping Peter Z. for an ack, iirc he was involved.
Also, same comment as for the lockdep map, this needs to apply to the non-blocking range_end also.
Hm, I thought the page table locks we're holding there already prevent any sleeping, so this would be redundant? But reading through the code I think that's not guaranteed, so yeah, it makes sense to add it for invalidate_range_end too. I'll respin once I have the ack/nack from the scheduler people.
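Concretely the respin would presumably grow a matching hunk in __mmu_notifier_invalidate_range_end(), something like this (a sketch only, eliding the ->invalidate_range handling in that loop):

	hlist_for_each_entry_rcu(mn, &range->mm->mmu_notifier_mm->list, hlist) {
		if (mn->ops->invalidate_range_end) {
			if (!mmu_notifier_range_blockable(range))
				non_block_start();
			mn->ops->invalidate_range_end(mn, range);
			if (!mmu_notifier_range_blockable(range))
				non_block_end();
		}
	}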
So I started to look into this, and I'm a bit confused. There's no _nonblock version of this, so does this mean blocking is never allowed, or always allowed?
From a quick look through the implementations I've only seen spinlocks, and one up_read. So I guess I should wrap this callback in some unconditional non_block_start/end, but I'm not sure.
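I.e. if blocking is never allowed in that path, the unconditional variant would look something like this (assuming the callback in question is ->invalidate_range_end, per the comment above):

	if (mn->ops->invalidate_range_end) {
		non_block_start();
		mn->ops->invalidate_range_end(mn, range);
		non_block_end();
	}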
Thanks, Daniel
Anyhow, since this series has conflicts with hmm.git it would be best to flow the whole thing through that tree. If there are no remarks on the first two patches I'll grab them in a few days.
Thanks, Jason
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch