On 11.02.22 17:56, Jason Gunthorpe wrote:
On Fri, Feb 11, 2022 at 05:49:08PM +0100, David Hildenbrand wrote:
On 11.02.22 17:45, Jason Gunthorpe wrote:
On Fri, Feb 11, 2022 at 05:15:25PM +0100, David Hildenbrand wrote:
... I'm pretty sure we cannot FOLL_PIN DEVICE_PRIVATE pages
Currently the only way to get a DEVICE_PRIVATE page out of the page tables is via hmm_range_fault() and that doesn't manipulate any ref counts.
Thanks for clarifying Jason! ... and AFAIU, device exclusive entries are essentially just pointers at ordinary PageAnon() pages. So with DEVICE COHERENT we'll have the first PageAnon() ZONE_DEVICE pages mapped as present in the page tables where GUP could FOLL_PIN them.
This is my understanding
Though you probably understand what PageAnon means a lot better than I do. I wonder if it really makes sense to talk about that together with ZONE_DEVICE, which has a lot in common with filesystem-originated pages too.
For me, PageAnon() means that modifications are visible only to the modifying process. On actual CoW, the underlying page will get replaced -- in the world of DEVICE_COHERENT that would mean that once you write to a DEVICE_COHERENT page you could suddenly have a !DEVICE_COHERENT page.
PageAnon() pages don't have a page cache mapping, thus they can only be found in MAP_ANON VMAs or in file-backed VMAs mapped with MAP_PRIVATE. They can only be found via a page table, and not looked up via the page cache (excluding the swap cache).
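To make the "visible only to the modifying process" point concrete, here is a minimal userspace sketch. It is an illustration only, not kernel code and not anything from the patch series, and it assumes Linux with fork() and anonymous mmap() available. The helper name parent_view_after_child_write() is made up for this example. It maps one anonymous page, lets a forked child write to it, and reports what the parent reads afterwards.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Maps one anonymous page with the given sharing flag (MAP_PRIVATE or
 * MAP_SHARED), stores 'P' from the parent, forks, and lets the child
 * store 'C'. Returns the byte the parent reads once the child has
 * exited, or -1 on error.
 */
int parent_view_after_child_write(int share_flag)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   share_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 'P';                 /* parent's value before fork() */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(p, len);
        return -1;
    }
    if (pid == 0) {
        p[0] = 'C';             /* with MAP_PRIVATE this write triggers CoW */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(p, len);
        return -1;
    }

    int seen = p[0];            /* what the parent observes afterwards */
    munmap(p, len);
    return seen;
}
```

Called from a small main(), parent_view_after_child_write(MAP_PRIVATE) should return 'P', because the child's write lands on its own copy of the page, while parent_view_after_child_write(MAP_SHARED) should return 'C'. Note that the MAP_SHARED | MAP_ANONYMOUS case is backed by shmem rather than by PageAnon() pages, so it only serves as the contrast here.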
So if we have PageAnon() pages on ZONE_DEVICE, they generally have the exact same semantics as !ZONE_DEVICE pages, but the way they "appear" in the page tables and the allocation/freeing path differ -- I guess :)
... and as we want pinning semantics to be different we have to touch GUP.
I'm not sure what AMD's plan is here. Is there an expectation that a GPU driver will somehow stuff these pages into an existing anonymous memory VMA, or do they always come from a driver-originated VMA?
My understanding is that a driver can just decide to replace "ordinary" PageAnon() pages, e.g., in a MAP_ANON VMA, with these pages. Hopefully AMD can clarify.