Re: [PATCH 07/13] mm: close race in generic_access_phys

7 Oct 2020

      On Wed, Oct 07, 2020 at 06:44:20PM +0200, Daniel Vetter wrote:
...
Way back it was a reasonable assumptions that iomem mappings never
change the pfn range they point at. But this has changed:

gpu drivers dynamically manage their memory nowadays, invalidating
ptes with unmap_mapping_range when buffers get moved

contiguous dma allocations have moved from dedicated carvetouts to
cma regions. This means if we miss the unmap the pfn might contain
pagecache or anon memory (well anything allocated with GFP_MOVEABLE)

even /dev/mem now invalidates mappings when the kernel requests that
iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.
Since ioremap might need to manipulate pagetables too we need to drop
the pt lock and have a retry loop if we raced.
While at it, also add kerneldoc and improve the comment for the
vma_ops->access function. It's for accessing, not for moving the
memory from iomem to system memory, as the old comment seemed to
suggest.
References: 28b2ee20c7cb ("access_process_vm device memory infrastructure")
Cc: Jason Gunthorpe jgg@ziepe.ca
Cc: Dan Williams dan.j.williams@intel.com
Cc: Kees Cook keescook@chromium.org
Cc: Rik van Riel riel@redhat.com
Cc: Benjamin Herrensmidt benh@kernel.crashing.org
Cc: Dave Airlie airlied@linux.ie
Cc: Hugh Dickins hugh@veritas.com
Cc: Andrew Morton akpm@linux-foundation.org
Cc: John Hubbard jhubbard@nvidia.com
Cc: Jérôme Glisse jglisse@redhat.com
Cc: Jan Kara jack@suse.cz
Cc: Dan Williams dan.j.williams@intel.com
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-samsung-soc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Signed-off-by: Daniel Vetter daniel.vetter@intel.com

include/linux/mm.h |  3 ++-
 mm/memory.c        | 44 ++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 44 insertions(+), 3 deletions(-)
This does seem to solve the race with revoke_devmem(), but it is really ugly.
It would be much nicer to wrap a rwsem around this access and the unmap.
Any place using it has a nice linear translation from vm_off to pfn,
so I don't think there is a such a good reason to use follow_pte in
the first place.
ie why not the helper be this:
int generic_access_phys(unsigned long pfn, unsigned long pgprot,
      void *buf, size_t len, bool write)
Then something like dev/mem would compute pfn and obtain the lock:
dev_access(struct vm_area_struct *vma, unsigned long addr, void *buf, int len, int write)
{
     cpu_addr = vma->vm_pgoff*PAGE_SIZE + (addr - vma->vm_start));
/* FIXME: Has to be over each page of len */
     if (!devmem_is_allowed_access(PHYS_PFN(cpu_addr/4096)))
           return -EPERM;
down_read(&mem_sem);
     generic_access_phys(cpu_addr/4096, pgprot_val(vma->vm_page_prot),
                         buf, len, write);
     up_read(&mem_sem);
}
The other cases looked simpler because they don't revoke, here the
mmap_sem alone should be enough protection, they would just need to
provide the linear translation to pfn.
What do you think?
Jason

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH 07/13] mm: close race in generic_access_phys