Re: [PATCH 11/13] mm: add unsafe_follow_pfn

7 Oct 2020

      On Wed, Oct 7, 2020 at 9:00 PM Jason Gunthorpe jgg@ziepe.ca wrote:
...
On Wed, Oct 07, 2020 at 08:10:34PM +0200, Daniel Vetter wrote:
...
On Wed, Oct 7, 2020 at 7:36 PM Jason Gunthorpe jgg@ziepe.ca wrote:
...
On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote:
...
Way back it was a reasonable assumptions that iomem mappings never
change the pfn range they point at. But this has changed:

gpu drivers dynamically manage their memory nowadays, invalidating

ptes with unmap_mapping_range when buffers get moved

contiguous dma allocations have moved from dedicated carvetouts to

cma regions. This means if we miss the unmap the pfn might contain
pagecache or anon memory (well anything allocated with GFP_MOVEABLE)

even /dev/mem now invalidates mappings when the kernel requests that

iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
("/dev/mem: Revoke mappings when a driver claims the region")
Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea.
Unfortunately there's some users where this is not fixable (like v4l
userptr of iomem mappings) or involves a pile of work (vfio type1
iommu). For now annotate these as unsafe and splat appropriately.
This patch adds an unsafe_follow_pfn, which later patches will then
roll out to all appropriate places.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Cc: Jason Gunthorpe jgg@ziepe.ca
Cc: Kees Cook keescook@chromium.org
Cc: Dan Williams dan.j.williams@intel.com
Cc: Andrew Morton akpm@linux-foundation.org
Cc: John Hubbard jhubbard@nvidia.com
Cc: Jérôme Glisse jglisse@redhat.com
Cc: Jan Kara jack@suse.cz
Cc: Dan Williams dan.j.williams@intel.com
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: kvm@vger.kernel.org
 include/linux/mm.h |  2 ++
 mm/memory.c        | 32 +++++++++++++++++++++++++++++++-
 mm/nommu.c         | 17 +++++++++++++++++
 security/Kconfig   | 13 +++++++++++++
 4 files changed, 63 insertions(+), 1 deletion(-)
Makes sense to me.
I wonder if we could change the original follow_pfn to require the
ptep and then lockdep_assert_held() it against the page table lock?
The safe variant with the pagetable lock is follow_pte_pmd. The only
way to make follow_pfn safe is if you have an mmu notifier and
corresponding retry logic. That is not covered by lockdep (it would
splat if we annotate the retry side), so I'm not sure how you'd check
for that?
Right OK.
...
Checking for ptep lock doesn't work here, since the one leftover safe
user of this (kvm) doesn't need that at all, because it has the mmu
notifier.
Ah, so a better name and/or function kdoc for follow_pfn is probably a
good iead in this patch as well.
I did change that already to mention that you need an mmu notifier,
and that follow_pte_pmd respectively unsafe_follow_pfn are the
alternatives. Do you want more or something else here?
Note that I left the kerneldoc for the nommu.c case unchanged, since
without an mmu all bets are off anyway.
...
...
So I think we're as good as it gets, since I really have no idea how
to make sure follow_pfn callers do have an mmu notifier registered.
Yah, can't be done. Most mmu notifier users should be using
hmm_range_fault anyhow, kvm is really very special here.
We could pass an mmu notifier to follow_pfn and check that it has a
registration for vma->vm_mm, but that feels like overkill when kvm is
the only legit user for this.
...
...
I've followed the few other CONFIG_STRICT_FOO I've seen, which are all
explicit enables and default to "do not break uapi, damn the
(security) bugs". Which is I think how this should be done. It is in
the security section though, so hopefully competent distros will
enable this all.
I thought the strict ones were more general and less clear security
worries, not bugs like this.
This is "allow a user triggerable use after free bug to exist in the
kernel"
Since at most you get at GFP_MOVEABLE stuff I'm not sure you can use
this to pull the kernel over the table. Maybe best way is if you get a
gpu pagetable somehow into your pfn and then use that to access
abitrary stuff, but there's still an iommu. I think leveraging this is
going to be very tricky, and pretty much has to be device or driver
specific somehow.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH 11/13] mm: add unsafe_follow_pfn