On Wed, Dec 08, 2021 at 10:31:58PM +1100, Alistair Popple wrote:
On Tuesday, 7 December 2021 5:52:43 AM AEDT Alex Sierra wrote:
Avoid long term pinning for Coherent device type pages. This could interfere with their own device memory manager. If caller tries to get user device coherent pages with PIN_LONGTERM flag set, those pages will be migrated back to system memory.
Signed-off-by: Alex Sierra alex.sierra@amd.com mm/gup.c | 32 ++++++++++++++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-)
diff --git a/mm/gup.c b/mm/gup.c index 886d6148d3d0..1572eacf07f4 100644 +++ b/mm/gup.c @@ -1689,17 +1689,37 @@ struct page *get_dump_page(unsigned long addr) #endif /* CONFIG_ELF_CORE */
#ifdef CONFIG_MIGRATION +static int migrate_device_page(unsigned long address,
struct page *page)
+{
- struct vm_area_struct *vma = find_vma(current->mm, address);
- struct vm_fault vmf = {
.vma = vma,
.address = address & PAGE_MASK,
.flags = FAULT_FLAG_USER,
.pgoff = linear_page_index(vma, address),
.gfp_mask = GFP_KERNEL,
.page = page,
- };
- if (page->pgmap && page->pgmap->ops->migrate_to_ram)
return page->pgmap->ops->migrate_to_ram(&vmf);
How does this synchronise against pgmap being released? As I understand things at this point we're not holding a reference on either the page or pgmap, so the page and therefore the pgmap may have been freed.
For sure, this can't keep touching the pages[] array after it unpinned them:
if (gup_flags & FOLL_PIN) { unpin_user_pages(pages, nr_pages);
^^^^^^^^^^^^^^^^^^^
} else { for (i = 0; i < nr_pages; i++) put_page(pages[i]); }
- if (is_device_page(head))
return migrate_device_page(start + page_index * PAGE_SIZE, head);
It was safe before this patch as isolate_lru_page(head) has a get_page() inside.
Also, please try hard not to turn this function into goto spaghetti
I think a similar problem exists for device private fault handling as well and it has been on my list of things to fix for a while. I think the solution is to call try_get_page(), except it doesn't work with device pages due to the whole refcount thing. That issue is blocking a fair bit of work now so I've started looking into it.
Where is this?
Jason