On Mon, Mar 07, 2022 at 10:11:19PM +0000, David Laight wrote:
From: Christoph Hellwig
Sent: 07 March 2022 15:57
On Mon, Mar 07, 2022 at 03:29:35PM +0200, Jarkko Sakkinen wrote:
So what would you suggest to sort out the issue? I'm happy to go with ioctl if nothing else is acceptable.
Plenty of drivers treat all mmaps as if MAP_POPULATE was specified, typically by using (io_)remap_pfn_range. Is there any reason to only optionally have the pre-fault semantics for sgx? If not, this should be really simple. And if we have a real need for it to be optional, we'll just need to find a sane way to pass that information to ->mmap.
Is there any space in vma->vm_flags?
That would be better than an extra argument or function.
It's very dense, but I'll give the callback route a shot based on Dave's comments in this thread, i.e. use it as a filter inside __mm_populate() and populate_vma_page_range().
For Enarx, which we are implementing, being able to use MAP_POPULATE and get the full range EAUG'd would be the best way to optimize the performance of the wasm JIT (Enarx is a wasm run-time capable of running inside an SGX enclave, an AMD SEV-SNP VM, etc.). It would help more than any of the predictors (ra_state, madvise, etc.) inside the #PF handler that have been suggested in this thread.
After some research on how we implement user space, I'd rather keep the #PF handler working on a single page (EAUG a single page) and have either an ioctl or MAP_POPULATE do the batch fill.
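To illustrate what the user space side of the MAP_POPULATE option looks like, here is a minimal sketch. It uses an anonymous mapping as a stand-in, since the enclave fd and its mmap semantics are exactly what is under discussion; map_populated() and count_resident() are hypothetical helper names, not part of any existing API.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes and ask the kernel to pre-fault the whole range.
 * Assumption: anonymous memory stands in for the enclave mapping here.
 * Returns NULL on failure.
 */
static void *map_populated(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Count how many pages of [addr, addr + len) are resident, using
 * mincore(). Returns -1 on error.
 */
static long count_resident(void *addr, size_t len)
{
	size_t psz = (size_t)sysconf(_SC_PAGESIZE);
	size_t npages = (len + psz - 1) / psz;
	unsigned char *vec = malloc(npages);
	long n = 0;

	if (!vec)
		return -1;
	if (mincore(addr, len, vec) != 0) {
		free(vec);
		return -1;
	}
	for (size_t i = 0; i < npages; i++)
		n += vec[i] & 1;	/* bit 0 set means the page is resident */
	free(vec);
	return n;
}
```

The caller does not get a guarantee that every page was populated (mmap() does not fail if the populate step falls short), so count_resident() is only a way to observe how far the batch fill got.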
We can still "not trust the user space", i.e. the populate does not have to guarantee that it covers the full length, since the #PF handler will then fill the holes. This was one concern in this thread, but it is not hard to address.
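The split between a best-effort batch fill and a single-page fault path can be modelled roughly as below. This is a toy user space model, not kernel code: struct region, populate_range() and touch_page() are made-up names, and the "budget" parameter is just an assumed way of representing a populate that stops early.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

struct region {
	bool present[NPAGES];	/* true once a page has been added */
	size_t faults;		/* pages filled one at a time by the fault path */
};

/*
 * Batch path: best effort. Adds at most budget missing pages in
 * [start, start + count) and returns how many it added.
 */
static size_t populate_range(struct region *r, size_t start, size_t count,
			     size_t budget)
{
	size_t added = 0;

	for (size_t i = start; i < start + count && i < NPAGES; i++) {
		if (added == budget)
			break;
		if (!r->present[i]) {
			r->present[i] = true;
			added++;
		}
	}
	return added;
}

/* Fault path: fills exactly one page, and only if it is missing. */
static void touch_page(struct region *r, size_t idx)
{
	if (idx < NPAGES && !r->present[idx]) {
		r->present[idx] = true;
		r->faults++;
	}
}
```

If the batch fill stops after 5 of 8 pages, touching every page afterwards takes 3 faults and leaves the whole range present, so correctness never depends on the populate step completing.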
BR, Jarkko