On 2/11/22 18:51, Alistair Popple wrote: ...
@@ -1888,15 +1942,40 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
continue;
prev_head = head;
/*
* If we get a movable page, since we are going to be pinning
* these entries, try to move them out if possible.
* Device coherent pages are managed by a driver and should not
* be pinned indefinitely as it prevents the driver moving the
* page. So when trying to pin with FOLL_LONGTERM instead try
* migrating page out of device memory.
*/
if (is_dev_private_or_coherent_page(head)) {
/*
* device private pages will get faulted in during gup
* so it shouldn't be possible to see one here.
*/
WARN_ON_ONCE(is_device_private_page(head));
ret = -EFAULT;
goto unpin_pages;
WARN_ON_ONCE(PageCompound(head));
/*
* migration will fail if the page is pinned, so convert
* the pin on the source page to a normal reference.
*/
if (gup_flags & FOLL_PIN) {
get_page(head);
unpin_user_page(head);
OK...but now gup_flags can no longer be used as a guide for how to release these pages, right? In other words, up until this point, FOLL_PIN meant "call unpin_user_page() in order to release". However, now this page must be released via put_page().
This is the source page (head). We are unpinning it because we can't migrate a pinned page; however, we still need a reference on it for migrate_vma, hence the get_page() followed by the unpin. In the non-FOLL_PIN case we already have a reference from gup.
See below...
}
pages[i] = migrate_device_page(head, gup_flags);
migrate_device_page() will return a new page that has been correctly pinned with gup_flags by try_grab_page(). Therefore this page can still be released with unpin_user_page() or put_page() as appropriate for the given gup_flags.
The reference we had on the source page (head) always gets dropped in migrate_vma_finalize().
OK. Good.
The above would be good to have in a comment, right around here, imho. We have this marvelous mix of references here, some taken for migration (get_page()) and some for other purposes, and it's a bit hard to see that it's all correct without a hint or two.
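To make the accounting above easier to follow, here is a minimal userspace sketch of the hand-off. It is a toy model, not kernel code: every toy_* name and TOY_* constant is hypothetical, the counter tracks only the caller's references (a real page has others), and the 1024 bias only mirrors the idea behind GUP_PIN_COUNTING_BIAS. It shows that once the pin on the source page is converted to a normal reference and the destination page is grabbed according to gup_flags, the caller can still release what it holds based on gup_flags alone.

```c
#include <assert.h>

/* Toy userspace model; none of these are the kernel's real definitions. */
#define TOY_FOLL_GET 0x1u
#define TOY_FOLL_PIN 0x2u
#define TOY_PIN_BIAS 1024 /* a pin is modelled as a large refcount bias */

struct toy_page {
    int refcount; /* only the references taken by this toy caller */
};

static void toy_get_page(struct toy_page *p)   { p->refcount += 1; }
static void toy_put_page(struct toy_page *p)   { p->refcount -= 1; }
static void toy_pin_page(struct toy_page *p)   { p->refcount += TOY_PIN_BIAS; }
static void toy_unpin_page(struct toy_page *p) { p->refcount -= TOY_PIN_BIAS; }

/* Take the kind of reference that gup_flags asks for. */
static void toy_grab(struct toy_page *p, unsigned int gup_flags)
{
    if (gup_flags & TOY_FOLL_PIN)
        toy_pin_page(p);
    else
        toy_get_page(p);
}

/* Release the kind of reference that gup_flags asked for. */
static void toy_release(struct toy_page *p, unsigned int gup_flags)
{
    if (gup_flags & TOY_FOLL_PIN)
        toy_unpin_page(p);
    else
        toy_put_page(p);
}

/*
 * Model of the sequence discussed in the thread:
 *  1. convert a pin on the source page into a normal reference,
 *  2. grab the destination page according to gup_flags,
 *  3. drop the normal reference on the source page (the thread says
 *     this happens in the finalize step of the migration).
 */
static void toy_migrate(struct toy_page *src, struct toy_page *dst,
                        unsigned int gup_flags)
{
    if (gup_flags & TOY_FOLL_PIN) {
        toy_get_page(src);
        toy_unpin_page(src);
    }
    toy_grab(dst, gup_flags);
    toy_put_page(src);
}

/* Walk through the FOLL_PIN case and check the counts at each step. */
static void toy_demo(void)
{
    struct toy_page src = { 0 }, dst = { 0 };

    toy_grab(&src, TOY_FOLL_PIN);           /* what gup handed us */
    assert(src.refcount == TOY_PIN_BIAS);

    toy_migrate(&src, &dst, TOY_FOLL_PIN);
    assert(src.refcount == 0);              /* nothing left on the source */
    assert(dst.refcount == TOY_PIN_BIAS);   /* destination is pinned */

    toy_release(&dst, TOY_FOLL_PIN);        /* release guided by gup_flags */
    assert(dst.refcount == 0);
}
```

The same walk-through with TOY_FOLL_GET skips the conversion step, since in that case the caller already holds a normal reference on the source page.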
...
Which unless I've missed something is still the correct thing to do.
This reminds me: out of the many things to monitor, the FOLL_PIN counts in /proc/vmstat are especially helpful, whenever making changes to code that deals with this:
nr_foll_pin_acquired
nr_foll_pin_released
...and those should normally be equal to each other when "at rest".
I hope this is/was run, just to be sure?
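For anyone who wants to automate that check, a rough sketch in C follows. It assumes /proc/vmstat keeps its usual "name value" per-line layout; the two function names are made up for illustration and are not part of any existing tool.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan a vmstat-style stream ("name value" per line) for the two
 * FOLL_PIN counters. Returns 0 if both were found, -1 otherwise.
 * Hypothetical helper, written for this example only.
 */
static int read_foll_pin_counters(FILE *f, unsigned long long *acquired,
                                  unsigned long long *released)
{
    char name[128];
    unsigned long long value;
    int found = 0;

    assert(f && acquired && released);

    while (fscanf(f, "%127s %llu", name, &value) == 2) {
        if (strcmp(name, "nr_foll_pin_acquired") == 0) {
            *acquired = value;
            found |= 1;
        } else if (strcmp(name, "nr_foll_pin_released") == 0) {
            *released = value;
            found |= 2;
        }
    }
    return found == 3 ? 0 : -1;
}

/*
 * Open the given file, print both counters, and report whether they
 * match. Returns 0 if balanced, 1 if not, -1 if the file could not be
 * read or the counters are missing.
 */
static int check_foll_pin_balance(const char *path)
{
    unsigned long long acquired = 0, released = 0;
    FILE *f = fopen(path, "r");
    int ret;

    if (!f)
        return -1;
    ret = read_foll_pin_counters(f, &acquired, &released);
    fclose(f);
    if (ret)
        return -1;

    printf("nr_foll_pin_acquired %llu\n", acquired);
    printf("nr_foll_pin_released %llu\n", released);
    return acquired == released ? 0 : 1;
}
```

Calling check_foll_pin_balance("/proc/vmstat") before and after a test run, with the system otherwise idle, gives a quick indication of whether any pins were leaked. The counters are system-wide, so other activity that pins pages can make them differ temporarily.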
thanks,