On Wed, Oct 7, 2020 at 11:13 PM John Hubbard jhubbard@nvidia.com wrote:
On 10/7/20 9:44 AM, Daniel Vetter wrote:
This is used by media/videbuf2 for persistent dma mappings, not just for a single dma operation and then freed again, so needs FOLL_LONGTERM.
Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to locking issues. Rework the code to pull the pup path out from the mmap_sem critical section as suggested by Jason.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com Cc: Jason Gunthorpe jgg@ziepe.ca Cc: Pawel Osciak pawel@osciak.com Cc: Marek Szyprowski m.szyprowski@samsung.com Cc: Kyungmin Park kyungmin.park@samsung.com Cc: Tomasz Figa tfiga@chromium.org Cc: Mauro Carvalho Chehab mchehab@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: John Hubbard jhubbard@nvidia.com Cc: Jérôme Glisse jglisse@redhat.com Cc: Jan Kara jack@suse.cz Cc: Dan Williams dan.j.williams@intel.com Cc: linux-mm@kvack.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: linux-media@vger.kernel.org
mm/frame_vector.c | 36 +++++++++++------------------------- 1 file changed, 11 insertions(+), 25 deletions(-)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c index 10f82d5643b6..39db520a51dc 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, struct vm_area_struct *vma; int ret = 0; int err;
int locked; if (nr_frames == 0) return 0;
@@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
start = untagged_addr(start);
ret = pin_user_pages_fast(start, nr_frames,
FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
(struct page **)(vec->ptrs));
if (ret > 0) {
vec->got_ref = true;
vec->is_pfns = false;
goto out_unlocked;
}
This part looks good, and changing to _fast is a potential performance improvement, too.
mmap_read_lock(mm);
locked = 1; vma = find_vma_intersection(mm, start, start + 1); if (!vma) { ret = -EFAULT; goto out; }
/*
* While get_vaddr_frames() could be used for transient (kernel
* controlled lifetime) pinning of memory pages all current
* users establish long term (userspace controlled lifetime)
* page pinning. Treat get_vaddr_frames() like
* get_user_pages_longterm() and disallow it for filesystem-dax
* mappings.
*/
if (vma_is_fsdax(vma)) {
ret = -EOPNOTSUPP;
goto out;
}
Are you sure we don't need to check vma_is_fsdax() anymore?
Since FOLL_LONGTERM checks for this and can only return struct page backed memory, and explicitly excludes VM_IO | VM_PFNMAP, was assuming this is not needed for follow_pfn. And the get_user_pages_locked this used back then didn't have the same check, hence why it was added (and FOLL_LONGTERM still doesn't work for the _locked versions, as you pointed out on the last round of this discussion).
But now that you're asking, I have no idea whether fsdax vma can also be of VM_IO | VM_PFNMAP type. I'm not seeing that set anywhere in fs/dax.c, but that says nothing :-)
Dan, you added this check originally, do we need it for VM_SPECIAL vmas too?
Thanks, Daniel
if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
vec->got_ref = true;
vec->is_pfns = false;
ret = pin_user_pages_locked(start, nr_frames,
gup_flags, (struct page **)(vec->ptrs), &locked);
goto out;
}
vec->got_ref = false; vec->is_pfns = true; do {
@@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, vma = find_vma_intersection(mm, start, start + 1); } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); out:
if (locked)
mmap_read_unlock(mm);
mmap_read_unlock(mm);
+out_unlocked: if (!ret) ret = -EFAULT; if (ret > 0)
All of the error handling still looks accurate there.
thanks,
John Hubbard NVIDIA