On 2019-01-29 4:47 p.m., Jerome Glisse wrote:
The whole point is to allow device memory to be used for a range of virtual addresses of a process when it makes sense to use device memory for that range. There are multiple cases where it does make sense:

[1] - Only the device is accessing the range and there is no CPU access. For instance the program is executing/running a big function on the GPU and there is no concurrent CPU access; this is very common in all the existing GPGPU code. In fact AFAICT it is the most common pattern. So here you can use HMM private or public memory.

[2] - Both device and CPU access a common range of virtual addresses concurrently. In that case, if you are on a platform with a cache-coherent interconnect like OpenCAPI or CCIX, then you can use HMM public device memory and have both access the same memory. You can not use HMM private memory.
So far on x86 we only have PCIe, and thus only private HMM device memory, which is not accessible by the CPU in any way.
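To make the two cases concrete, here is a minimal sketch in plain C of the residency rules described above. It is a conceptual summary only, not kernel code; the enum and helper names are made up for this illustration, and the real plumbing lives in the kernel's HMM and migration machinery.

/*
 * Conceptual summary (not kernel code) of where a range of a process's
 * virtual addresses can live and who may touch it.  All names here are
 * illustrative.
 */
#include <stdbool.h>
#include <stdio.h>

enum hmm_residency {
	SYSTEM_RAM,   /* regular pages: CPU and device (via DMA)           */
	HMM_PRIVATE,  /* case [1]: device only; any CPU access must fault  */
	HMM_PUBLIC,   /* case [2]: CPU and device, needs a coherent link   */
};

static bool cpu_can_access(enum hmm_residency r)
{
	return r != HMM_PRIVATE;
}

static bool needs_coherent_interconnect(enum hmm_residency r)
{
	/* Public device memory requires OpenCAPI/CCIX-style coherence;
	 * on PCIe-only platforms (x86 today) only HMM_PRIVATE exists. */
	return r == HMM_PUBLIC;
}

int main(void)
{
	static const char * const name[] = {
		"system RAM", "HMM private", "HMM public",
	};

	for (enum hmm_residency r = SYSTEM_RAM; r <= HMM_PUBLIC; r++)
		printf("%-12s cpu_access=%d coherent_link_required=%d\n",
		       name[r], cpu_can_access(r),
		       needs_coherent_interconnect(r));
	return 0;
}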
I feel like you're just pulling the rug out from under us... Before, you said to ignore HMM, and I was asking about the use case that wasn't using HMM and how it works without HMM. In response, you just gave me *way* too much information describing HMM. And still, as best as I can see, managing DMA mappings (which is different from the userspace mappings) for GPU P2P should be handled by HMM, and the userspace mappings should *just* link VMAs to HMM pages using the standard infrastructure we already have.
And what struct pages are actually going to be backing these VMAs if it's not using HMM?
When a range of virtual addresses has been migrated to HMM private memory, the CPU PTEs are special swap entries and behave just as if the memory had been swapped to disk. So a CPU access to any of them will fault and trigger a migration back to main memory.
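As a toy illustration of that mechanism (plain user-space C, not kernel code; names like toy_pte and handle_cpu_fault are invented for this sketch): the page-table entry for such a range is a non-present entry that points at device memory, and the CPU fault path "swaps in" from the device instead of from disk.

/*
 * Toy model of a CPU PTE that is either a normal mapping or a special
 * swap entry whose backing store is device memory.  Illustrative only.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PTE_PRESENT  0x1u   /* normal mapping, CPU can load/store       */
#define PTE_DEVICE   0x2u   /* special swap entry: data lives on device */

struct toy_pte {
	uint32_t flags;
	uint64_t pfn;        /* system page frame, valid when PRESENT   */
	uint64_t dev_offset; /* device memory offset, valid when DEVICE */
};

static bool pte_is_device_private(const struct toy_pte *pte)
{
	return !(pte->flags & PTE_PRESENT) && (pte->flags & PTE_DEVICE);
}

/* Stand-in for the driver copying the data back from device memory. */
static uint64_t migrate_back_to_ram(uint64_t dev_offset)
{
	printf("migrating device offset %#llx back to main memory\n",
	       (unsigned long long)dev_offset);
	return 42; /* pretend pfn of the freshly allocated system page */
}

/* Stand-in for the CPU fault path: like a swap-in, except the backing
 * store is device memory rather than disk. */
static void handle_cpu_fault(struct toy_pte *pte)
{
	if (pte_is_device_private(pte)) {
		pte->pfn = migrate_back_to_ram(pte->dev_offset);
		pte->flags = PTE_PRESENT;
	}
}

int main(void)
{
	struct toy_pte pte = { .flags = PTE_DEVICE, .dev_offset = 0x1000 };

	handle_cpu_fault(&pte); /* first CPU touch faults and migrates */
	printf("pte now present, pfn=%llu\n", (unsigned long long)pte.pfn);
	return 0;
}

The real kernel encodes this with swap-entry bits in the page table rather than a separate structure, but the observable behaviour is the same as in the sketch: the first CPU access faults, the data migrates back to main memory, and a regular PTE is installed.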
This isn't answering my question at all... I specifically asked what is backing the VMA when we are *not* using HMM.
Logan