On Mon, May 10, 2021 at 3:50 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
On Sat, May 08, 2021 at 09:46:41AM -0700, Linus Torvalds wrote:
I think follow_pfn() is ok for the actual "this is not a 'struct page' backed area", and disabling that case is wrong even going forward.
Every place we've audited that uses follow_pfn() has turned out to have some use-after-free bug like the one Daniel describes, and a missing permissions check too.
All the other follow_pfn() users were moved to follow_pte() to fix the permissions check. That shifts the use-after-free bug out of the MM API and into the caller, which can still misuse the API by, say, extracting the PFN and using it outside the pte lock.
eg look at how VFIO wrongly uses follow_pte():
static int follow_fault_pfn()
	ret = follow_pte(vma->vm_mm, vaddr, &ptep, &ptl);
	*pfn = pte_pfn(*ptep);
	pte_unmap_unlock(ptep, ptl);

	// no protection that pte_pfn() is still valid!
	use_pfn(*pfn)
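To make the lock-scope problem concrete, here is a minimal userspace model in C. It is not kernel code: a pthread mutex stands in for the pte lock, and `struct fake_pte`, `fake_zap()`, `lookup_pfn_unlocked()`, `with_pfn_locked()` and `record_pfn()` are hypothetical names made up for this sketch. It only illustrates the difference between copying the PFN out and dropping the lock, versus using the PFN (and checking write permission) while the lock is still held.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a page table entry guarded by its pte lock. */
struct fake_pte {
	pthread_mutex_t lock;	/* models the pte lock (ptl) */
	bool present;		/* false once the mapping has been zapped */
	bool writable;		/* models the write permission bit */
	unsigned long pfn;	/* only meaningful while present is true */
};

/* Models an unmap racing with the caller: clears the entry under the lock. */
void fake_zap(struct fake_pte *pte)
{
	pthread_mutex_lock(&pte->lock);
	pte->present = false;
	pte->pfn = 0;
	pthread_mutex_unlock(&pte->lock);
}

/*
 * Broken pattern (what the thread criticises): copy the PFN out, drop the
 * lock, and let the caller use the copy later. Nothing keeps *pfn valid
 * once the lock is released.
 */
int lookup_pfn_unlocked(struct fake_pte *pte, unsigned long *pfn)
{
	int ret = -1;

	pthread_mutex_lock(&pte->lock);
	if (pte->present) {
		*pfn = pte->pfn;
		ret = 0;
	}
	pthread_mutex_unlock(&pte->lock);
	return ret;
}

typedef void (*pfn_fn)(unsigned long pfn, void *ctx);

/*
 * Safer pattern in this model: check presence and permission, then run the
 * caller's work on the PFN before the lock is dropped.
 */
int with_pfn_locked(struct fake_pte *pte, bool want_write,
		    pfn_fn fn, void *ctx)
{
	int ret = -1;

	pthread_mutex_lock(&pte->lock);
	if (pte->present && (!want_write || pte->writable)) {
		fn(pte->pfn, ctx);
		ret = 0;
	}
	pthread_mutex_unlock(&pte->lock);
	return ret;
}

/* Example callback: records the PFN it was handed. */
void record_pfn(unsigned long pfn, void *ctx)
{
	*(unsigned long *)ctx = pfn;
}
```

In this model, a caller that used `lookup_pfn_unlocked()` keeps holding its old PFN value after `fake_zap()` runs, while `with_pfn_locked()` simply fails once the entry is gone. Holding the lock across the use only works when the use is short; long-lived users need some form of invalidation notification instead.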
v4l is the only user that still has the missing permissions check security bug too - so there is no outcome that should keep follow_pfn() in the tree.
At worst v4l should change to follow_pte() and use it wrongly like VFIO. At best we should delete all the v4l stuff.
Yeah, vfio is still broken for the case I care about. I think there are also still some open questions about whether kvm really uses mmu_notifier correctly in all cases, but iirc the one exception was s390, which didn't have pci mmap, and that's how it gets away with that specific problem.
Daniel, I suppose we missed this relation to follow_pte(), so I agree that keeping an unsafe_follow_pfn() around is not good.
Tbh I never really got the additional issue with the missing write checks. The bug I'm seeing is that users of follow_pfn() (or, well, follow_pte() plus immediately dropping the lock, like vfio) don't subscribe to pte updates in general. That v4l also glosses over the read/write access stuff is kinda just the icing on the cake :-) It's pretty well broken even if it did check that.
-Daniel
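As a rough illustration of what "subscribing to pte updates" buys a long-lived user, here is a small userspace model in C of a sequence-count retry scheme. It is loosely modeled on the kernel's mmu_interval_read_begin() / mmu_interval_read_retry() pattern, but it is not that API: `struct fake_mapping`, `fake_invalidate()`, `fake_read_begin()` and `fake_commit()` are hypothetical names for this sketch, and a single pthread mutex stands in for whatever lock the invalidation callback and the user share.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a mapping whose PFN can change underneath a user. */
struct fake_mapping {
	pthread_mutex_t lock;	/* shared by invalidation and commit */
	unsigned long seq;	/* bumped on every invalidation */
	unsigned long pfn;	/* current translation */
};

/* Models the invalidation callback a subscriber would receive. */
void fake_invalidate(struct fake_mapping *m, unsigned long new_pfn)
{
	pthread_mutex_lock(&m->lock);
	m->seq++;
	m->pfn = new_pfn;
	pthread_mutex_unlock(&m->lock);
}

/* Snapshot the PFN together with the sequence number it was read at. */
unsigned long fake_read_begin(struct fake_mapping *m, unsigned long *pfn)
{
	unsigned long seq;

	pthread_mutex_lock(&m->lock);
	seq = m->seq;
	*pfn = m->pfn;
	pthread_mutex_unlock(&m->lock);
	return seq;
}

/*
 * Install the snapshot into the user's own state (dev_pfn) only if no
 * invalidation has happened since fake_read_begin(). The check and the
 * install happen under the same lock the invalidation takes, so a stale
 * PFN is never committed. Returns false when the caller must retry.
 */
bool fake_commit(struct fake_mapping *m, unsigned long seq,
		 unsigned long pfn, unsigned long *dev_pfn)
{
	bool ok;

	pthread_mutex_lock(&m->lock);
	ok = (m->seq == seq);
	if (ok)
		*dev_pfn = pfn;
	pthread_mutex_unlock(&m->lock);
	return ok;
}
```

A user of this model loops: call `fake_read_begin()`, do any slow work with the snapshot, then call `fake_commit()` and start over if it returns false. A real subscriber would also have to tear down what it had already committed when an invalidation arrives later, which this sketch leaves out.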