=== Overview
arm64 has a feature called Top Byte Ignore, which allows pointer tags to be embedded into the top byte of each pointer. Userspace programs (such as HWASan, a memory debugging tool [1]) might use this feature and pass tagged user pointers to the kernel through syscalls or other interfaces.
Right now the kernel is already able to handle user faults with tagged pointers, due to these patches:
1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged pointers")
This patchset extends tagged pointer support to syscall arguments.
As per the proposed ABI change [3], tagged pointers are only allowed to be passed to syscalls when they point to memory ranges obtained by anonymous mmap() or sbrk() (see the patchset [3] for more details).
For non-memory syscalls this is done by untagging user pointers when the kernel performs pointer checking to find out whether the pointer comes from userspace (most notably in access_ok). The untagging is done only while the pointer is being checked; the tag is preserved as the pointer makes its way through the kernel, and the pointer stays tagged when the kernel dereferences it to perform user memory accesses.
Memory syscalls (mmap, mprotect, etc.) don't do user memory accesses but rather deal with memory ranges, and untagged pointers are better suited to describe memory ranges internally. Thus for memory syscalls we untag pointers completely when they enter the kernel.
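As an illustration of the two strategies described above, here is a minimal userspace sketch. It is not the kernel implementation: untag(), check_user_range(), mem_syscall_prologue(), TAG_MASK and USER_LIMIT are hypothetical names, and the sketch assumes that the tag occupies bits 63:56 and that user addresses fit in 48 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the tag occupies bits 63:56 of a user pointer. */
#define TAG_MASK   (UINT64_C(0xff) << 56)
/* Hypothetical user address limit (48-bit VA), for illustration only. */
#define USER_LIMIT (UINT64_C(1) << 48)

/* Simplified stand-in for untagged_addr(): clear the top byte. */
static inline uint64_t untag(uint64_t addr)
{
	return addr & ~TAG_MASK;
}

/*
 * Non-memory syscalls: untag a copy for the validity check only.
 * The caller keeps using the original, still tagged, pointer.
 */
static inline int check_user_range(uint64_t addr, uint64_t size)
{
	uint64_t a = untag(addr);

	return size <= USER_LIMIT && a <= USER_LIMIT - size;
}

/*
 * Memory syscalls: strip the tag once in the prologue, so that all
 * later range arithmetic works on a plain address.
 */
static inline uint64_t mem_syscall_prologue(uint64_t start)
{
	return untag(start);
}

/* Small self-check showing both paths on one tagged address. */
static inline void demo_untag_strategies(void)
{
	uint64_t tagged = UINT64_C(0x2a00007fffff1000);

	assert(check_user_range(tagged, 4096));
	assert(mem_syscall_prologue(tagged) == UINT64_C(0x0000007fffff1000));
}
```

In the first case the tagged value is what continues through the kernel; in the second case only the untagged value is used from the prologue onwards.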
=== Other approaches
One of the alternative approaches to untagging that was considered is to completely strip the pointer tag as the pointer enters the kernel with some kind of a syscall wrapper, but that won't work with the countless number of different ioctl calls. With this approach we would need a custom wrapper for each ioctl variation, which doesn't seem practical.
An alternative approach to untagging pointers in memory syscall prologues is to instead allow tagged pointers to be passed to find_vma() (and other vma related functions) and untag them there. Unfortunately, a lot of find_vma() callers then compare the pointer that was being searched against the returned vma's start and end fields, or subtract one from the other. Thus this approach would still require changing all find_vma() callers.
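The problem with untagging inside the lookup can be shown with a small userspace sketch. The names here (struct range, lookup(), offset_in_range()) are hypothetical stand-ins for a vma, find_vma() and a typical caller; they are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vm_start/vm_end fields of a vma. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Simplified untagging: clear the top byte (assumes a user address). */
static inline uint64_t untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/* Lookup that untags internally, as the alternative approach would. */
static inline const struct range *lookup(const struct range *r, uint64_t addr)
{
	uint64_t a = untag(addr);

	return (a >= r->start && a < r->end) ? r : NULL;
}

/* Typical caller pattern: offset of the searched address in the range. */
static inline uint64_t offset_in_range(const struct range *r, uint64_t addr)
{
	return addr - r->start;
}

/* The lookup succeeds, but the caller's arithmetic is still wrong. */
static inline void demo_lookup_problem(void)
{
	struct range r = { 0x10000, 0x20000 };
	uint64_t tagged = UINT64_C(0x2a00000000010800);

	assert(lookup(&r, tagged) == &r);
	assert(offset_in_range(&r, tagged) != 0x800);
	assert(offset_in_range(&r, untag(tagged)) == 0x800);
}
```

Even though the lookup itself tolerates the tag, every caller that does arithmetic with the original address would still need to untag it first.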
=== Testing
The following testing approaches have been taken to find potential issues with user pointer untagging:
1. Static testing (with sparse [2] and separately with a custom static analyzer based on Clang) to track casts of __user pointers to integer types to find places where untagging needs to be done.
2. Static testing with grep to find parts of the kernel that call find_vma() (and other similar functions) or directly compare against vm_start/vm_end fields of vma.
3. Static testing with grep to find parts of the kernel that compare user pointers with TASK_SIZE or other similar consts and macros.
4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running a modified syzkaller version that passes tagged pointers to the kernel.
Based on the results of the testing, the required patches have been added to the patchset.
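The has_tag() helper used in the dynamic testing step is not defined in this cover letter. A plausible minimal form, assuming the tag is the top byte of a 64-bit user address, could look like the sketch below (a hypothetical definition, not taken from the patchset):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: non-zero if any bit of the top byte is set. */
static inline int has_tag(uint64_t addr)
{
	return (addr >> 56) != 0;
}

/* Self-check: one tagged and one untagged user address. */
static inline void demo_has_tag(void)
{
	assert(has_tag(UINT64_C(0x0100007fff001000)));
	assert(!has_tag(UINT64_C(0x0000007fff001000)));
}
```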
=== Notes
This patchset is meant to be merged together with "arm64 relaxed ABI" [3].
This patchset is a prerequisite for ARM's memory tagging hardware feature support [4].
This patchset has been merged into the Pixel 2 kernel tree and is now being used to enable testing of Pixel 2 phones with HWASan.
Thanks!
[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e...
[3] https://lkml.org/lkml/2019/3/18/819
[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture...
Changes in v13:
- Simplified untagging in tcp_zerocopy_receive().
- Looked at find_vma() callers in drivers/, which helped identify a few other places where untagging is needed.
- Added patch "mm, arm64: untag user pointers in get_vaddr_frames".
- Added patch "drm/amdgpu, arm64: untag user pointers in amdgpu_ttm_tt_get_user_pages".
- Added patch "drm/radeon, arm64: untag user pointers in radeon_ttm_tt_pin_userptr".
- Added patch "IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr".
- Added patch "media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get".
- Added patch "tee/optee, arm64: untag user pointers in check_mem_type".
- Added patch "vfio/type1, arm64: untag user pointers".

Changes in v12:
- Changed untagging in tcp_zerocopy_receive() to also untag zc->address.
- Fixed untagging in prctl_set_mm* to only untag pointers for vma lookups and validity checks, but leave them as is for actual user space accesses.
- Updated the link to the v2 of the "arm64 relaxed ABI" patchset [3].
- Dropped the documentation patch, as the "arm64 relaxed ABI" patchset [3] handles that.

Changes in v11:
- Added "uprobes, arm64: untag user pointers in find_active_uprobe" patch.
- Added "bpf, arm64: untag user pointers in stack_map_get_build_id_offset" patch.
- Fixed "tracing, arm64: untag user pointers in seq_print_user_ip" to correctly perform subtraction with a tagged addr.
- Moved untagged_addr() from SYSCALL_DEFINE3(mprotect) and SYSCALL_DEFINE4(pkey_mprotect) to do_mprotect_pkey().
- Moved untagged_addr() definition for other arches from include/linux/memory.h to include/linux/mm.h.
- Changed untagging in strn*_user() to perform userspace accesses through tagged pointers.
- Updated the documentation to mention that passing tagged pointers to memory syscalls is allowed.
- Updated the test to use malloc'ed memory instead of stack memory.

Changes in v10:
- Added "mm, arm64: untag user pointers passed to memory syscalls" back.
- New patch "fs, arm64: untag user pointers in fs/userfaultfd.c".
- New patch "net, arm64: untag user pointers in tcp_zerocopy_receive".
- New patch "kernel, arm64: untag user pointers in prctl_set_mm*".
- New patch "tracing, arm64: untag user pointers in seq_print_user_ip".

Changes in v9:
- Rebased onto 4.20-rc6.
- Used u64 instead of __u64 in type casts in the untagged_addr macro for arm64.
- Added braces around (addr) in the untagged_addr macro for other arches.

Changes in v8:
- Rebased onto 65102238 (4.20-rc1).
- Added a note to the cover letter on why syscall wrappers/shims that untag user pointers won't work.
- Added a note to the cover letter that this patchset has been merged into the Pixel 2 kernel tree.
- Documentation fixes, in particular added a list of syscalls that don't support tagged user pointers.

Changes in v7:
- Rebased onto 17b57b18 (4.19-rc6).
- Dropped the "arm64: untag user address in __do_user_fault" patch, since the existing patches already handle user faults properly.
- Dropped the "usb, arm64: untag user addresses in devio" patch, since the passed pointer must come from a vma and therefore be untagged.
- Dropped the "arm64: annotate user pointers casts detected by sparse" patch (see the discussion in the replies to v6 of this patchset).
- Added more context to the cover letter.
- Updated Documentation/arm64/tagged-pointers.txt.
Changes in v6:
- Added annotations for user pointer casts found by sparse.
- Rebased onto 050cdc6c (4.19-rc1+).

Changes in v5:
- Added 3 new patches that add untagging to places found with static analysis.
- Rebased onto 44c929e1 (4.18-rc8).

Changes in v4:
- Added a selftest for checking that passing tagged pointers to the kernel succeeds.
- Rebased onto 81e97f013 (4.18-rc1+).

Changes in v3:
- Rebased onto e5c51f30 (4.17-rc6+).
- Added linux-arch@ to the list of recipients.

Changes in v2:
- Rebased onto 2d618bdf (4.17-rc3+).
- Removed excessive untagging in gup.c.
- Removed untagging pointers returned from __uaccess_mask_ptr.

Changes in v1:
- Rebased onto 4.17-rc1.

Changes in RFC v2:
- Added "#ifndef untagged_addr..." fallback in linux/uaccess.h instead of defining it for each arch individually.
- Updated Documentation/arm64/tagged-pointers.txt.
- Dropped "mm, arm64: untag user addresses in memory syscalls".
- Rebased onto 3eb2ce82 (4.16-rc7).
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>

Andrey Konovalov (20):
  uaccess: add untagged_addr definition for other arches
  arm64: untag user pointers in access_ok and __uaccess_mask_ptr
  lib, arm64: untag user pointers in strn*_user
  mm, arm64: untag user pointers passed to memory syscalls
  mm, arm64: untag user pointers in mm/gup.c
  mm, arm64: untag user pointers in get_vaddr_frames
  fs, arm64: untag user pointers in copy_mount_options
  fs, arm64: untag user pointers in fs/userfaultfd.c
  net, arm64: untag user pointers in tcp_zerocopy_receive
  kernel, arm64: untag user pointers in prctl_set_mm*
  tracing, arm64: untag user pointers in seq_print_user_ip
  uprobes, arm64: untag user pointers in find_active_uprobe
  bpf, arm64: untag user pointers in stack_map_get_build_id_offset
  drm/amdgpu, arm64: untag user pointers in amdgpu_ttm_tt_get_user_pages
  drm/radeon, arm64: untag user pointers in radeon_ttm_tt_pin_userptr
  IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr
  media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get
  tee/optee, arm64: untag user pointers in check_mem_type
  vfio/type1, arm64: untag user pointers in vaddr_get_pfn
  selftests, arm64: add a selftest for passing tagged pointers to kernel
 arch/arm64/include/asm/uaccess.h | 10 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c | 5 ++-
 drivers/infiniband/hw/mlx4/mr.c | 7 +--
 drivers/media/v4l2-core/videobuf-dma-contig.c | 9 ++--
 drivers/tee/optee/call.c | 1 +
 drivers/vfio/vfio_iommu_type1.c | 2 +
 fs/namespace.c | 2 +-
 fs/userfaultfd.c | 5 +++
 include/linux/mm.h | 4 ++
 ipc/shm.c | 2 +
 kernel/bpf/stackmap.c | 6 ++-
 kernel/events/uprobes.c | 2 +
 kernel/sys.c | 44 +++++++++++++------
 kernel/trace/trace_output.c | 5 ++-
 lib/strncpy_from_user.c | 3 +-
 lib/strnlen_user.c | 3 +-
 mm/frame_vector.c | 2 +
 mm/gup.c | 4 ++
 mm/madvise.c | 2 +
 mm/mempolicy.c | 5 +++
 mm/migrate.c | 1 +
 mm/mincore.c | 2 +
 mm/mlock.c | 5 +++
 mm/mmap.c | 7 +++
 mm/mprotect.c | 1 +
 mm/mremap.c | 2 +
 mm/msync.c | 2 +
 net/ipv4/tcp.c | 2 +
 tools/testing/selftests/arm64/.gitignore | 1 +
 tools/testing/selftests/arm64/Makefile | 11 +++++
 .../testing/selftests/arm64/run_tags_test.sh | 12 +++++
 tools/testing/selftests/arm64/tags_test.c | 21 +++++++++
 33 files changed, 159 insertions(+), 36 deletions(-)
 create mode 100644 tools/testing/selftests/arm64/.gitignore
 create mode 100644 tools/testing/selftests/arm64/Makefile
 create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh
 create mode 100644 tools/testing/selftests/arm64/tags_test.c
To allow arm64 syscalls to accept tagged pointers from userspace, we must untag them when they are passed to the kernel. Since untagging is done in generic parts of the kernel, the untagged_addr macro needs to be defined for all architectures.
Define it as a noop for architectures other than arm64.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 include/linux/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 76769749b5a5..4d674518d392 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -99,6 +99,10 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #include <asm/pgtable.h>
 #include <asm/processor.h>
 
+#ifndef untagged_addr
+#define untagged_addr(addr) (addr)
+#endif
+
 #ifndef __pa_symbol
 #define __pa_symbol(x)  __pa(RELOC_HIDE((unsigned long)(x), 0))
 #endif
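The fallback pattern from this patch can be tried out in userspace. In the sketch below, generic_lookup_key() is a hypothetical piece of "generic" code, and the commented-out definition shows, as an assumption, what an architecture with tag support might provide before the fallback is reached.

```c
#include <assert.h>
#include <stdint.h>

/*
 * An architecture with tag support would define the macro before this
 * point, for example (assumed form, top byte cleared):
 *
 *   #define untagged_addr(addr) ((addr) & ~(UINT64_C(0xff) << 56))
 */
#ifndef untagged_addr
#define untagged_addr(addr) (addr)	/* no-op on other architectures */
#endif

/* Hypothetical generic code that can call the macro unconditionally. */
static inline uint64_t generic_lookup_key(uint64_t user_addr)
{
	return untagged_addr(user_addr);
}

/* With only the fallback in effect, the address passes through as is. */
static inline void demo_fallback(void)
{
	assert(generic_lookup_key(UINT64_C(0x7fff1000)) == UINT64_C(0x7fff1000));
}
```

Because the fallback expands to its argument, generic callers pay no cost on architectures without tagged pointers.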
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

copy_from_user (and a few other similar functions) are used to copy data from user memory into kernel memory or vice versa. Since a user can provide a tagged pointer to one of the syscalls that use copy_from_user, we need to correctly handle such pointers.

Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr, before performing access validity checks.

Note that this patch only temporarily untags the pointers to perform the checks, but then passes them as is into the kernel internals.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 arch/arm64/include/asm/uaccess.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index e5d5f31c6d36..9164ecb5feca 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 	return ret;
 }
 
-#define access_ok(addr, size)	__range_ok(addr, size)
+#define access_ok(addr, size)	__range_ok(untagged_addr(addr), size)
 #define user_addr_max			get_fs
 
 #define _ASM_EXTABLE(from, to)						\
@@ -226,7 +226,8 @@ static inline void uaccess_enable_not_uao(void)
 
 /*
  * Sanitise a uaccess pointer such that it becomes NULL if above the
- * current addr_limit.
+ * current addr_limit. In case the pointer is tagged (has the top byte set),
+ * untag the pointer before checking.
  */
 #define uaccess_mask_ptr(ptr) (__typeof__(ptr))__uaccess_mask_ptr(ptr)
 static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
@@ -234,10 +235,11 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
 	void __user *safe_ptr;
 
 	asm volatile(
-	"	bics	xzr, %1, %2\n"
+	"	bics	xzr, %3, %2\n"
 	"	csel	%0, %1, xzr, eq\n"
 	: "=&r" (safe_ptr)
-	: "r" (ptr), "r" (current_thread_info()->addr_limit)
+	: "r" (ptr), "r" (current_thread_info()->addr_limit),
+	  "r" (untagged_addr(ptr))
 	: "cc");
 
 	csdb();
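The effect of the bics/csel change can be modelled in plain C. This is a rough sketch, not the kernel code: untag() models untagging as sign-extension from bit 55 (an assumption about the arm64 definition), addr_limit is assumed to be a mask of the form 2^n - 1, and mask_ptr() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of untagging as sign-extension from bit 55: user addresses
 * lose their tag, while addresses with bit 55 set get an all-ones
 * top byte.
 */
static inline uint64_t untag(uint64_t addr)
{
	const uint64_t top = UINT64_C(0xff) << 56;

	return (addr & (UINT64_C(1) << 55)) ? (addr | top) : (addr & ~top);
}

/*
 * C model of the bics/csel pair: test the untagged value against the
 * limit mask, but hand back the original (possibly tagged) pointer,
 * or 0 (NULL) if the check fails.
 */
static inline uint64_t mask_ptr(uint64_t ptr, uint64_t addr_limit)
{
	return (untag(ptr) & ~addr_limit) == 0 ? ptr : 0;
}

/* A tagged user pointer survives with its tag; a kernel one does not. */
static inline void demo_mask_ptr(void)
{
	uint64_t limit = (UINT64_C(1) << 48) - 1;

	assert(mask_ptr(UINT64_C(0x2a00007fffff1000), limit) ==
	       UINT64_C(0x2a00007fffff1000));
	assert(mask_ptr(UINT64_C(0xffff000000001000), limit) == 0);
}
```

Only the comparison operand changes; the value selected as the result is still the caller's pointer, which matches the note that the tag is preserved past the check.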
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

strncpy_from_user and strnlen_user accept user addresses as arguments, and do not go through the same path as copy_from_user and others, so here we need to handle the case of tagged user addresses separately.

Untag user pointers passed to these functions.

Note that this patch only temporarily untags the pointers to perform validity checks, but then uses them as is to perform user memory accesses.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 lib/strncpy_from_user.c | 3 ++-
 lib/strnlen_user.c      | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..6209bb9507c7 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -6,6 +6,7 @@
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
+#include <linux/mm.h>
 
 #include <asm/byteorder.h>
 #include <asm/word-at-a-time.h>
@@ -107,7 +108,7 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
 		return 0;
 
 	max_addr = user_addr_max();
-	src_addr = (unsigned long)src;
+	src_addr = (unsigned long)untagged_addr(src);
 	if (likely(src_addr < max_addr)) {
 		unsigned long max = max_addr - src_addr;
 		long retval;
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 1c1a1b0e38a5..8ca3d2ac32ec 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/export.h>
 #include <linux/uaccess.h>
+#include <linux/mm.h>
 
 #include <asm/word-at-a-time.h>
 
@@ -109,7 +110,7 @@ long strnlen_user(const char __user *str, long count)
 		return 0;
 
 	max_addr = user_addr_max();
-	src_addr = (unsigned long)str;
+	src_addr = (unsigned long)untagged_addr(str);
 	if (likely(src_addr < max_addr)) {
 		unsigned long max = max_addr - src_addr;
 		long retval;
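To see why the bound computation needs the untagged address, consider this userspace model of the check in both functions. scan_limit() and its untag_first switch are hypothetical; -1 stands in for the error path, and the top-byte masking in untag() is a simplification that assumes a user address.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified untagging: clear the top byte (assumes a user address). */
static inline uint64_t untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/*
 * Returns the number of bytes that may be scanned starting at src,
 * or -1 (standing in for the error path) if src is not below max_addr.
 */
static inline int64_t scan_limit(uint64_t src, uint64_t max_addr,
				 int untag_first)
{
	uint64_t src_addr = untag_first ? untag(src) : src;

	if (src_addr < max_addr)
		return (int64_t)(max_addr - src_addr);
	return -1;
}

/* A tagged address is rejected unless it is untagged for the check. */
static inline void demo_scan_limit(void)
{
	uint64_t max_addr = UINT64_C(1) << 48;	/* hypothetical limit */
	uint64_t tagged = UINT64_C(0x2a00007fffff1000);

	assert(scan_limit(tagged, max_addr, 0) == -1);
	assert(scan_limit(tagged, max_addr, 1) > 0);
}
```

Without untagging, the tagged value compares as larger than any user limit and a valid string would be rejected.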
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
This patch allows tagged pointers to be passed to the following memory syscalls: madvise, mbind, get_mempolicy, mincore, mlock, mlock2, brk, mmap_pgoff, old_mmap, munmap, remap_file_pages, mprotect, pkey_mprotect, mremap, msync and shmdt.
This is done by untagging pointers passed to these syscalls in the prologues of their handlers.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 ipc/shm.c      | 2 ++
 mm/madvise.c   | 2 ++
 mm/mempolicy.c | 5 +++++
 mm/migrate.c   | 1 +
 mm/mincore.c   | 2 ++
 mm/mlock.c     | 5 +++++
 mm/mmap.c      | 7 +++++++
 mm/mprotect.c  | 1 +
 mm/mremap.c    | 2 ++
 mm/msync.c     | 2 ++
 10 files changed, 29 insertions(+)

diff --git a/ipc/shm.c b/ipc/shm.c
index ce1ca9f7c6e9..7af8951e6c41 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1593,6 +1593,7 @@ SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg)
 	unsigned long ret;
 	long err;
 
+	shmaddr = untagged_addr(shmaddr);
 	err = do_shmat(shmid, shmaddr, shmflg, &ret, SHMLBA);
 	if (err)
 		return err;
@@ -1732,6 +1733,7 @@ long ksys_shmdt(char __user *shmaddr)
 
 SYSCALL_DEFINE1(shmdt, char __user *, shmaddr)
 {
+	shmaddr = untagged_addr(shmaddr);
 	return ksys_shmdt(shmaddr);
 }
 
diff --git a/mm/madvise.c b/mm/madvise.c
index 21a7881a2db4..64e6d34a7f9b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -809,6 +809,8 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 	size_t len;
 	struct blk_plug plug;
 
+	start = untagged_addr(start);
+
 	if (!madvise_behavior_valid(behavior))
 		return error;
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index af171ccb56a2..31691737c59c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1334,6 +1334,7 @@ static long kernel_mbind(unsigned long start, unsigned long len,
 	int err;
 	unsigned short mode_flags;
 
+	start = untagged_addr(start);
 	mode_flags = mode & MPOL_MODE_FLAGS;
 	mode &= ~MPOL_MODE_FLAGS;
 	if (mode >= MPOL_MAX)
@@ -1491,6 +1492,8 @@ static int kernel_get_mempolicy(int __user *policy,
 	int uninitialized_var(pval);
 	nodemask_t nodes;
 
+	addr = untagged_addr(addr);
+
 	if (nmask != NULL && maxnode < nr_node_ids)
 		return -EINVAL;
 
@@ -1576,6 +1579,8 @@ COMPAT_SYSCALL_DEFINE6(mbind, compat_ulong_t, start, compat_ulong_t, len,
 	unsigned long nr_bits, alloc_size;
 	nodemask_t bm;
 
+	start = untagged_addr(start);
+
 	nr_bits = min_t(unsigned long, maxnode-1, MAX_NUMNODES);
 	alloc_size = ALIGN(nr_bits, BITS_PER_LONG) / 8;
 
diff --git a/mm/migrate.c b/mm/migrate.c
index ac6f4939bb59..ecc6dcdefb1f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1612,6 +1612,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 		if (get_user(node, nodes + i))
 			goto out_flush;
 		addr = (unsigned long)p;
+		addr = untagged_addr(addr);
 
 		err = -ENODEV;
 		if (node < 0 || node >= MAX_NUMNODES)
diff --git a/mm/mincore.c b/mm/mincore.c
index 218099b5ed31..c4a3f4484b6b 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -228,6 +228,8 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
 	unsigned long pages;
 	unsigned char *tmp;
 
+	start = untagged_addr(start);
+
 	/* Check the start address: needs to be page-aligned.. */
 	if (start & ~PAGE_MASK)
 		return -EINVAL;
diff --git a/mm/mlock.c b/mm/mlock.c
index 080f3b36415b..6934ec92bf39 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -715,6 +715,7 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 
 SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 {
+	start = untagged_addr(start);
 	return do_mlock(start, len, VM_LOCKED);
 }
 
@@ -722,6 +723,8 @@ SYSCALL_DEFINE3(mlock2, unsigned long, start, size_t, len, int, flags)
 {
 	vm_flags_t vm_flags = VM_LOCKED;
 
+	start = untagged_addr(start);
+
 	if (flags & ~MLOCK_ONFAULT)
 		return -EINVAL;
 
@@ -735,6 +738,8 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
 {
 	int ret;
 
+	start = untagged_addr(start);
+
 	len = PAGE_ALIGN(len + (offset_in_page(start)));
 	start &= PAGE_MASK;
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 41eb48d9b527..512c679c7f33 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -199,6 +199,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 	bool downgraded = false;
 	LIST_HEAD(uf);
 
+	brk = untagged_addr(brk);
+
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;
 
@@ -1571,6 +1573,8 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
 	struct file *file = NULL;
 	unsigned long retval;
 
+	addr = untagged_addr(addr);
+
 	if (!(flags & MAP_ANONYMOUS)) {
 		audit_mmap_fd(fd, flags);
 		file = fget(fd);
@@ -2867,6 +2871,7 @@ EXPORT_SYMBOL(vm_munmap);
 
 SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len)
 {
+	addr = untagged_addr(addr);
 	profile_munmap(addr);
 	return __vm_munmap(addr, len, true);
 }
@@ -2885,6 +2890,8 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size,
 	unsigned long ret = -EINVAL;
 	struct file *file;
 
+	start = untagged_addr(start);
+
 	pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.rst.\n",
 		     current->comm, current->pid);
 
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 028c724dcb1a..3c2b11629f89 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -468,6 +468,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	if (grows == (PROT_GROWSDOWN|PROT_GROWSUP)) /* can't be both */
 		return -EINVAL;
 
+	start = untagged_addr(start);
 	if (start & ~PAGE_MASK)
 		return -EINVAL;
 	if (!len)
diff --git a/mm/mremap.c b/mm/mremap.c
index e3edef6b7a12..6422aeee65bb 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -605,6 +605,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	LIST_HEAD(uf_unmap_early);
 	LIST_HEAD(uf_unmap);
 
+	addr = untagged_addr(addr);
+
 	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
 		return ret;
 
diff --git a/mm/msync.c b/mm/msync.c
index ef30a429623a..c3bd3e75f687 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -37,6 +37,8 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags)
 	int unmapped_error = 0;
 	int error = -EINVAL;
 
+	start = untagged_addr(start);
+
 	if (flags & ~(MS_ASYNC | MS_INVALIDATE | MS_SYNC))
 		goto out;
 	if (offset_in_page(start))
On Wed, Mar 20, 2019 at 03:51:18PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
>
> This patch allows tagged pointers to be passed to the following memory syscalls: madvise, mbind, get_mempolicy, mincore, mlock, mlock2, brk, mmap_pgoff, old_mmap, munmap, remap_file_pages, mprotect, pkey_mprotect, mremap, msync and shmdt.
>
> This is done by untagging pointers passed to these syscalls in the prologues of their handlers.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  ipc/shm.c      | 2 ++
>  mm/madvise.c   | 2 ++
>  mm/mempolicy.c | 5 +++++
>  mm/migrate.c   | 1 +
>  mm/mincore.c   | 2 ++
>  mm/mlock.c     | 5 +++++
>  mm/mmap.c      | 7 +++++++
>  mm/mprotect.c  | 1 +
>  mm/mremap.c    | 2 ++
>  mm/msync.c     | 2 ++
>  10 files changed, 29 insertions(+)
I wonder whether it's better to keep these as wrappers in the arm64 code.
On Fri, Mar 22, 2019 at 12:44 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Wed, Mar 20, 2019 at 03:51:18PM +0100, Andrey Konovalov wrote:
> > This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
> >
> > This patch allows tagged pointers to be passed to the following memory syscalls: madvise, mbind, get_mempolicy, mincore, mlock, mlock2, brk, mmap_pgoff, old_mmap, munmap, remap_file_pages, mprotect, pkey_mprotect, mremap, msync and shmdt.
> >
> > This is done by untagging pointers passed to these syscalls in the prologues of their handlers.
> >
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> > ---
> >  ipc/shm.c      | 2 ++
> >  mm/madvise.c   | 2 ++
> >  mm/mempolicy.c | 5 +++++
> >  mm/migrate.c   | 1 +
> >  mm/mincore.c   | 2 ++
> >  mm/mlock.c     | 5 +++++
> >  mm/mmap.c      | 7 +++++++
> >  mm/mprotect.c  | 1 +
> >  mm/mremap.c    | 2 ++
> >  mm/msync.c     | 2 ++
> >  10 files changed, 29 insertions(+)
>
> I wonder whether it's better to keep these as wrappers in the arm64 code.
I don't think I understand what you propose, could you elaborate?
On Thu, 28 Mar 2019 19:10:07 +0100 Andrey Konovalov <andreyknvl@google.com> wrote:

> > > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> > > ---
> > >  ipc/shm.c      | 2 ++
> > >  mm/madvise.c   | 2 ++
> > >  mm/mempolicy.c | 5 +++++
> > >  mm/migrate.c   | 1 +
> > >  mm/mincore.c   | 2 ++
> > >  mm/mlock.c     | 5 +++++
> > >  mm/mmap.c      | 7 +++++++
> > >  mm/mprotect.c  | 1 +
> > >  mm/mremap.c    | 2 ++
> > >  mm/msync.c     | 2 ++
> > >  10 files changed, 29 insertions(+)
> >
> > I wonder whether it's better to keep these as wrappers in the arm64 code.
>
> I don't think I understand what you propose, could you elaborate?
I believe Catalin is saying that instead of placing things like:

  @@ -1593,6 +1593,7 @@ SYSCALL_DEFINE3(shmat, int, shmid, char __user *, shmaddr, int, shmflg)
   	unsigned long ret;
   	long err;

  +	shmaddr = untagged_addr(shmaddr);

in the generic code, shmaddr would be set to untagged_addr(shmaddr) before the system call is called, and the untagged address would be passed to the system call, since the call goes through the arm64 architecture-specific code first.
-- Steve
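A rough userspace sketch of the wrapper idea as described in this reply follows. Everything here is hypothetical: generic_shmdt() stands in for a generic handler that only accepts untagged addresses, arm64_shmdt_wrapper() for an architecture-level wrapper, and -22 for -EINVAL. It is meant only to show the shape of the suggestion, not how arm64 syscall entry actually works.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified untagging: clear the top byte (assumes a user address). */
static inline uint64_t untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/* Hypothetical generic handler that only understands untagged addresses. */
static inline long generic_shmdt(uint64_t shmaddr)
{
	return (shmaddr >> 56) ? -22 /* stands in for -EINVAL */ : 0;
}

/* Hypothetical arch-level wrapper: untag first, then call generic code. */
static inline long arm64_shmdt_wrapper(uint64_t shmaddr)
{
	return generic_shmdt(untag(shmaddr));
}

/* The generic handler stays unmodified; only the wrapper knows about tags. */
static inline void demo_wrapper(void)
{
	uint64_t tagged = UINT64_C(0x2a00007fffff1000);

	assert(generic_shmdt(tagged) == -22);
	assert(arm64_shmdt_wrapper(tagged) == 0);
}
```

The trade-off is between keeping generic code free of untagged_addr() calls and having to maintain one wrapper per affected syscall in the architecture code.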
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

mm/gup.c provides a kernel interface that accepts user addresses and manipulates user pages directly (for example get_user_pages, which is used by the futex syscall). Since a user can provide tagged addresses, we need to handle this case.
Add untagging to gup.c functions that use user addresses for vma lookups.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 mm/gup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index f84e22685aaa..3192741e0b3a 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -686,6 +686,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 	if (!nr_pages)
 		return 0;
 
+	start = untagged_addr(start);
+
 	VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
 
 	/*
@@ -848,6 +850,8 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 	struct vm_area_struct *vma;
 	vm_fault_t ret, major = 0;
 
+	address = untagged_addr(address);
+
 	if (unlocked)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

get_vaddr_frames uses provided user pointers for vma lookups, which can only be done with untagged pointers. Instead of locating and changing all callers of this function, perform untagging in it.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 mm/frame_vector.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index c64dca6e27c2..c431ca81dad5 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
 	if (WARN_ON_ONCE(nr_frames > vec->nr_allocated))
 		nr_frames = vec->nr_allocated;
 
+	start = untagged_addr(start);
+
 	down_read(&mm->mmap_sem);
 	locked = 1;
 	vma = find_vma_intersection(mm, start, start + 1);
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

In copy_mount_options a user address is subtracted from TASK_SIZE. If the address is lower than TASK_SIZE, the size is calculated so that the exact_copy_from_user() call does not cross the TASK_SIZE boundary. However, if the address is tagged, the size will be calculated incorrectly.
Untag the address before subtracting.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 fs/namespace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index c9cab307fa77..c27e5713bf04 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2825,7 +2825,7 @@ void *copy_mount_options(const void __user * data)
 	 * the remainder of the page.
 	 */
 	/* copy_from_user cannot cross TASK_SIZE ! */
-	size = TASK_SIZE - (unsigned long)data;
+	size = TASK_SIZE - (unsigned long)untagged_addr(data);
 	if (size > PAGE_SIZE)
 		size = PAGE_SIZE;
 
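The incorrect size can be reproduced with a small worked example. The sketch below assumes a 48-bit TASK_SIZE and 4 KiB pages (DEMO_TASK_SIZE and DEMO_PAGE_SIZE are hypothetical constants), and copy_size() is a hypothetical userspace model of the size computation, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TASK_SIZE (UINT64_C(1) << 48)	/* hypothetical value */
#define DEMO_PAGE_SIZE UINT64_C(4096)		/* hypothetical value */

/* Simplified untagging: clear the top byte (assumes a user address). */
static inline uint64_t untag(uint64_t addr)
{
	return addr & ~(UINT64_C(0xff) << 56);
}

/* Model of the size computation, with and without untagging. */
static inline uint64_t copy_size(uint64_t data, int untag_first)
{
	uint64_t size = DEMO_TASK_SIZE - (untag_first ? untag(data) : data);

	if (size > DEMO_PAGE_SIZE)
		size = DEMO_PAGE_SIZE;
	return size;
}

/*
 * A pointer 100 bytes below the limit: with the tag left in place the
 * unsigned subtraction wraps around and the size is clamped to a full
 * page instead of 100 bytes.
 */
static inline void demo_copy_size(void)
{
	uint64_t data = (DEMO_TASK_SIZE - 100) | (UINT64_C(0x2a) << 56);

	assert(copy_size(data, 0) == DEMO_PAGE_SIZE);
	assert(copy_size(data, 1) == 100);
}
```

With the tag in place the computed size would allow the copy to run past the boundary the comment in the code says must not be crossed.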
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

userfaultfd_register() and userfaultfd_unregister() use provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in these functions.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 fs/userfaultfd.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 89800fc7dc9d..a3b70e0d9756 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1320,6 +1320,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
 		goto out;
 	}
 
+	uffdio_register.range.start =
+		untagged_addr(uffdio_register.range.start);
+
 	ret = validate_range(mm, uffdio_register.range.start,
 			     uffdio_register.range.len);
 	if (ret)
@@ -1507,6 +1510,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 	if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister)))
 		goto out;
 
+	uffdio_unregister.start = untagged_addr(uffdio_unregister.start);
+
 	ret = validate_range(mm, uffdio_unregister.start,
 			     uffdio_unregister.len);
 	if (ret)
This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

tcp_zerocopy_receive() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 net/ipv4/tcp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 6baa6dc1b13b..855a1f68c1ea 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1761,6 +1761,8 @@ static int tcp_zerocopy_receive(struct sock *sk,
 	if (address & (PAGE_SIZE - 1) || address != zc->address)
 		return -EINVAL;
 
+	address = untagged_addr(address);
+
 	if (sk->sk_state == TCP_LISTEN)
 		return -ENOTCONN;
 
On Wed, Mar 20, 2019 at 03:51:23PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
>
> tcp_zerocopy_receive() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag user pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  net/ipv4/tcp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 6baa6dc1b13b..855a1f68c1ea 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1761,6 +1761,8 @@ static int tcp_zerocopy_receive(struct sock *sk,
>  	if (address & (PAGE_SIZE - 1) || address != zc->address)
>  		return -EINVAL;
>
> +	address = untagged_addr(address);
> +
>  	if (sk->sk_state == TCP_LISTEN)
>  		return -ENOTCONN;
I don't think we need this patch if we stick to Vincenzo's ABI restrictions. Can zc->address be an anonymous mmap()? My understanding of TCP_ZEROCOPY_RECEIVE is that this is an mmap() on a socket, so the user should not tag such a pointer.
We want to allow tagged pointers to work transparently only for heap and stack, hence the restriction to anonymous mmap() and those addresses below sbrk(0).
On 22/03/2019 12:04, Catalin Marinas wrote:
> On Wed, Mar 20, 2019 at 03:51:23PM +0100, Andrey Konovalov wrote:
> > This patch is a part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
> >
> > tcp_zerocopy_receive() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
> >
> > Untag user pointers in this function.
> >
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> > ---
> >  net/ipv4/tcp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 6baa6dc1b13b..855a1f68c1ea 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -1761,6 +1761,8 @@ static int tcp_zerocopy_receive(struct sock *sk,
> >  	if (address & (PAGE_SIZE - 1) || address != zc->address)
> >  		return -EINVAL;
> >
> > +	address = untagged_addr(address);
> > +
> >  	if (sk->sk_state == TCP_LISTEN)
> >  		return -ENOTCONN;
>
> I don't think we need this patch if we stick to Vincenzo's ABI restrictions. Can zc->address be an anonymous mmap()? My understanding of TCP_ZEROCOPY_RECEIVE is that this is an mmap() on a socket, so the user should not tag such a pointer.
Good point, I hadn't looked into the interface properly. The `vma->vm_ops != &tcp_vm_ops` check just below makes sure that the mapping is specifically tied to a TCP socket, so definitely not included in the ABI relaxation.
We want to allow tagged pointers to work transparently only for heap and stack, hence the restriction to anonymous mmap() and those addresses below sbrk(0).
That's not quite true: in the ABI relaxation v2, all private mappings that are either anonymous or backed by a regular file are included. The scope is quite a bit larger than heap and stack, even though this is what we're primarily interested in for now.
Kevin
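The scope Kevin describes can be written down as a small predicate. This is only an illustrative sketch: struct sketch_mapping and its fields are hypothetical and are not the kernel's vm_area_struct. The rule itself (private, and either anonymous or backed by a regular file) is taken from the reply above.

```c
#include <stdbool.h>

/* Hypothetical description of a mapping; not a kernel structure. */
struct sketch_mapping {
	bool is_private;      /* private (not shared) mapping */
	bool is_anonymous;    /* no backing file */
	bool is_regular_file; /* backed by a regular file */
};

/* True if tagged pointers into this mapping fall under the relaxed ABI. */
static bool sketch_tags_allowed(const struct sketch_mapping *m)
{
	return m->is_private && (m->is_anonymous || m->is_regular_file);
}
```

Under this rule a mapping tied to a socket matches neither the anonymous nor the regular-file case, which is consistent with dropping the tcp_zerocopy_receive() change.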
On Mon, Mar 25, 2019 at 2:54 PM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> On 22/03/2019 12:04, Catalin Marinas wrote:
>> On Wed, Mar 20, 2019 at 03:51:23PM +0100, Andrey Konovalov wrote:
>>> [...]
>>
>> I don't think we need this patch if we stick to Vincenzo's ABI restrictions. Can zc->address be an anonymous mmap()? My understanding of TCP_ZEROCOPY_RECEIVE is that this is an mmap() on a socket, so user should not tag such pointer.
>
> Good point, I hadn't looked into the interface properly. The `vma->vm_ops != &tcp_vm_ops` check just below makes sure that the mapping is specifically tied to a TCP socket, so definitely not included in the ABI relaxation.
>
>> We want to allow tagged pointers to work transparently only for heap and stack, hence the restriction to anonymous mmap() and those addresses below sbrk(0).

Right, I'll drop this patch, thanks for noticing!

> That's not quite true: in the ABI relaxation v2, all private mappings that are either anonymous or backed by a regular file are included. The scope is quite a bit larger than heap and stack, even though this is what we're primarily interested in for now.
>
> Kevin
This patch is part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
prctl_set_mm() and prctl_set_mm_map() use provided user pointers for vma lookups and do some pointer comparisons to perform validation, which can only be done with untagged pointers.
Untag user pointers in these functions for vma lookup and validity checks.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 kernel/sys.c | 44 ++++++++++++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/kernel/sys.c b/kernel/sys.c
index 12df0e5434b8..fe26ccf3c9e6 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -1885,11 +1885,12 @@ static int prctl_set_mm_exe_file(struct mm_struct *mm, unsigned int fd)
  * WARNING: we don't require any capability here so be very careful
  * in what is allowed for modification from userspace.
  */
-static int validate_prctl_map(struct prctl_mm_map *prctl_map)
+static int validate_prctl_map(struct prctl_mm_map *tagged_prctl_map)
 {
 	unsigned long mmap_max_addr = TASK_SIZE;
 	struct mm_struct *mm = current->mm;
 	int error = -EINVAL, i;
+	struct prctl_mm_map prctl_map;

 	static const unsigned char offsets[] = {
 		offsetof(struct prctl_mm_map, start_code),
@@ -1905,12 +1906,25 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 		offsetof(struct prctl_mm_map, env_end),
 	};

+	memcpy(&prctl_map, tagged_prctl_map, sizeof(prctl_map));
+	prctl_map.start_code = untagged_addr(prctl_map.start_code);
+	prctl_map.end_code = untagged_addr(prctl_map.end_code);
+	prctl_map.start_data = untagged_addr(prctl_map.start_data);
+	prctl_map.end_data = untagged_addr(prctl_map.end_data);
+	prctl_map.start_brk = untagged_addr(prctl_map.start_brk);
+	prctl_map.brk = untagged_addr(prctl_map.brk);
+	prctl_map.start_stack = untagged_addr(prctl_map.start_stack);
+	prctl_map.arg_start = untagged_addr(prctl_map.arg_start);
+	prctl_map.arg_end = untagged_addr(prctl_map.arg_end);
+	prctl_map.env_start = untagged_addr(prctl_map.env_start);
+	prctl_map.env_end = untagged_addr(prctl_map.env_end);
+
 	/*
 	 * Make sure the members are not somewhere outside
 	 * of allowed address space.
 	 */
 	for (i = 0; i < ARRAY_SIZE(offsets); i++) {
-		u64 val = *(u64 *)((char *)prctl_map + offsets[i]);
+		u64 val = *(u64 *)((char *)&prctl_map + offsets[i]);

 		if ((unsigned long)val >= mmap_max_addr ||
 		    (unsigned long)val < mmap_min_addr)
@@ -1921,8 +1935,8 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 	 * Make sure the pairs are ordered.
 	 */
 #define __prctl_check_order(__m1, __op, __m2) \
-	((unsigned long)prctl_map->__m1 __op \
-	 (unsigned long)prctl_map->__m2) ? 0 : -EINVAL
+	((unsigned long)prctl_map.__m1 __op \
+	 (unsigned long)prctl_map.__m2) ? 0 : -EINVAL
 	error = __prctl_check_order(start_code, <, end_code);
 	error |= __prctl_check_order(start_data, <, end_data);
 	error |= __prctl_check_order(start_brk, <=, brk);
@@ -1937,23 +1951,24 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 	/*
 	 * @brk should be after @end_data in traditional maps.
 	 */
-	if (prctl_map->start_brk <= prctl_map->end_data ||
-	    prctl_map->brk <= prctl_map->end_data)
+	if (prctl_map.start_brk <= prctl_map.end_data ||
+	    prctl_map.brk <= prctl_map.end_data)
 		goto out;

 	/*
 	 * Neither we should allow to override limits if they set.
 	 */
-	if (check_data_rlimit(rlimit(RLIMIT_DATA), prctl_map->brk,
-			      prctl_map->start_brk, prctl_map->end_data,
-			      prctl_map->start_data))
+	if (check_data_rlimit(rlimit(RLIMIT_DATA), prctl_map.brk,
+			      prctl_map.start_brk, prctl_map.end_data,
+			      prctl_map.start_data))
 		goto out;

 	/*
 	 * Someone is trying to cheat the auxv vector.
 	 */
-	if (prctl_map->auxv_size) {
-		if (!prctl_map->auxv || prctl_map->auxv_size > sizeof(mm->saved_auxv))
+	if (prctl_map.auxv_size) {
+		if (!prctl_map.auxv || prctl_map.auxv_size >
+				sizeof(mm->saved_auxv))
 			goto out;
 	}

@@ -1962,7 +1977,7 @@ static int validate_prctl_map(struct prctl_mm_map *prctl_map)
 	 * change /proc/pid/exe link: only local sys admin should
 	 * be allowed to.
 	 */
-	if (prctl_map->exe_fd != (u32)-1) {
+	if (prctl_map.exe_fd != (u32)-1) {
 		if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
 			goto out;
 	}
@@ -2120,13 +2135,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
 	if (opt == PR_SET_MM_AUXV)
 		return prctl_set_auxv(mm, addr, arg4);

-	if (addr >= TASK_SIZE || addr < mmap_min_addr)
+	if (untagged_addr(addr) >= TASK_SIZE ||
+			untagged_addr(addr) < mmap_min_addr)
 		return -EINVAL;

 	error = -EINVAL;

 	down_write(&mm->mmap_sem);
-	vma = find_vma(mm, addr);
+	vma = find_vma(mm, untagged_addr(addr));

 	prctl_map.start_code = mm->start_code;
 	prctl_map.end_code = mm->end_code;
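The reason the bounds and ordering checks in this patch need untagged values can be shown with a small userspace sketch. Everything named sketch_* or SKETCH_* is an assumption made for illustration: the 48-bit SKETCH_TASK_SIZE and the SKETCH_MMAP_MIN_ADDR value are placeholders rather than the kernel's actual limits, and the helpers only mirror the shape of the comparisons in the diff above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TAG_MASK      ((uint64_t)0xff << 56)
#define SKETCH_TASK_SIZE     ((uint64_t)1 << 48) /* assumed 48-bit user VA */
#define SKETCH_MMAP_MIN_ADDR ((uint64_t)0x10000) /* assumed lower bound */

static uint64_t sketch_untag(uint64_t addr)
{
	return addr & ~SKETCH_TAG_MASK;
}

/* Same shape as the bounds check in prctl_set_mm(), applied to the raw value. */
static bool sketch_in_range_raw(uint64_t addr)
{
	return addr < SKETCH_TASK_SIZE && addr >= SKETCH_MMAP_MIN_ADDR;
}

/* The check applied to the untagged value, as the patch does. */
static bool sketch_in_range_untagged(uint64_t addr)
{
	return sketch_in_range_raw(sketch_untag(addr));
}

/* Ordering check on untagged values; a tag would otherwise dominate the compare. */
static bool sketch_ordered(uint64_t start, uint64_t end)
{
	return sketch_untag(start) < sketch_untag(end);
}
```

With a non-zero tag the raw value is far above any plausible task size, so the raw check rejects an address that is valid once untagged.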
On 20/03/2019 14:51, Andrey Konovalov wrote:
> [...]
> prctl_set_mm() and prctl_set_mm_map() use provided user pointers for vma lookups and do some pointer comparisons to perform validation, which can only be done with untagged pointers.
>
> Untag user pointers in these functions for vma lookup and validity checks.
>
> [...]

I think this new version is consistent w.r.t. tagged/untagged pointer usage. However, I also note that a significant change has been introduced: it is now possible to set MM fields to tagged addresses (tags are ignored by validate_prctl_map()). I am not opposed to this as such, but have you considered the implications? Does it make sense to have a tagged value for e.g. prctl_map.arg_start? Is the kernel able to handle tagged values in those fields? I have the feeling that it's safer to discard tags for now, and if necessary allow them to be preserved later on.

Kevin
On Wed, Mar 20, 2019 at 03:51:24PM +0100, Andrey Konovalov wrote:
> @@ -2120,13 +2135,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
>  	if (opt == PR_SET_MM_AUXV)
>  		return prctl_set_auxv(mm, addr, arg4);
>
> -	if (addr >= TASK_SIZE || addr < mmap_min_addr)
> +	if (untagged_addr(addr) >= TASK_SIZE ||
> +			untagged_addr(addr) < mmap_min_addr)
>  		return -EINVAL;
>
>  	error = -EINVAL;
>
>  	down_write(&mm->mmap_sem);
> -	vma = find_vma(mm, addr);
> +	vma = find_vma(mm, untagged_addr(addr));
>
>  	prctl_map.start_code = mm->start_code;
>  	prctl_map.end_code = mm->end_code;

Does this mean that we are left with tagged addresses for the mm->start_code etc. values? I really don't think we should allow this, I'm not sure what the implications are in other parts of the kernel.
Arguably, these are not even pointer values but some address ranges. I know we decided to relax this notion for mmap/mprotect/madvise() since the user function prototypes take pointer as arguments but it feels like we are overdoing it here (struct prctl_mm_map doesn't even have pointers).
What is the use-case for allowing tagged addresses here? Can user space handle untagging?
On Fri, Mar 22, 2019 at 4:41 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Mar 20, 2019 at 03:51:24PM +0100, Andrey Konovalov wrote:
>> [...]
>> -	vma = find_vma(mm, addr);
>> +	vma = find_vma(mm, untagged_addr(addr));
>>
>>  	prctl_map.start_code = mm->start_code;
>>  	prctl_map.end_code = mm->end_code;
>
> Does this mean that we are left with tagged addresses for the mm->start_code etc. values? I really don't think we should allow this, I'm not sure what the implications are in other parts of the kernel.
>
> Arguably, these are not even pointer values but some address ranges. I know we decided to relax this notion for mmap/mprotect/madvise() since the user function prototypes take pointer as arguments but it feels like we are overdoing it here (struct prctl_mm_map doesn't even have pointers).
>
> What is the use-case for allowing tagged addresses here? Can user space handle untagging?

I don't know any use cases for this. I did it because it seems to be covered by the relaxed ABI. I'm not entirely sure what to do here, should I just drop this patch?

> --
> Catalin
On Mon, Apr 1, 2019 at 6:44 PM Andrey Konovalov <andreyknvl@google.com> wrote:
> On Fri, Mar 22, 2019 at 4:41 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>> [...]
>> What is the use-case for allowing tagged addresses here? Can user space handle untagging?
>
> I don't know any use cases for this. I did it because it seems to be covered by the relaxed ABI. I'm not entirely sure what to do here, should I just drop this patch?

ping
On Mon, Apr 01, 2019 at 06:44:34PM +0200, Andrey Konovalov wrote:
> On Fri, Mar 22, 2019 at 4:41 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>> [...]
>> What is the use-case for allowing tagged addresses here? Can user space handle untagging?
>
> I don't know any use cases for this. I did it because it seems to be covered by the relaxed ABI. I'm not entirely sure what to do here, should I just drop this patch?

If we allow tagged addresses to be passed here, we'd have to untag them before they end up in the mm->start_code etc. members.
I know we are trying to relax the ABI here w.r.t. address ranges but mostly because we couldn't figure out a way to document unambiguously the difference between a user pointer that may be dereferenced by the kernel (tags allowed) and an address typically used for managing the address space layout. Suggestions welcomed.
I'd say just drop this patch and capture it in the ABI document.
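Catalin's first point, that tagged values must not end up in the mm->start_code etc. members, could look roughly like the following if tagged addresses were accepted. This is a hypothetical sketch only: the thread's conclusion is to drop the patch instead, struct sketch_mm_map is a reduced stand-in for the real structure, and sketch_untag_map() is an assumed helper name. It reuses the offsets-table style seen in validate_prctl_map().

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG_MASK ((uint64_t)0xff << 56)

/* Hypothetical, reduced map of address-space boundaries. */
struct sketch_mm_map {
	uint64_t start_code;
	uint64_t end_code;
	uint64_t start_brk;
	uint64_t brk;
};

static const size_t sketch_offsets[] = {
	offsetof(struct sketch_mm_map, start_code),
	offsetof(struct sketch_mm_map, end_code),
	offsetof(struct sketch_mm_map, start_brk),
	offsetof(struct sketch_mm_map, brk),
};

/* Strip the tag from every address field so only untagged values get stored. */
static void sketch_untag_map(struct sketch_mm_map *map)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_offsets) / sizeof(sketch_offsets[0]); i++) {
		uint64_t *field = (uint64_t *)((char *)map + sketch_offsets[i]);

		*field &= ~SKETCH_TAG_MASK;
	}
}
```

The alternative chosen in the thread keeps the kernel side unchanged and leaves it to user space to pass untagged addresses for these fields.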
On Fri, Apr 26, 2019 at 4:50 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Apr 01, 2019 at 06:44:34PM +0200, Andrey Konovalov wrote:
>> [...]
>> I don't know any use cases for this. I did it because it seems to be covered by the relaxed ABI. I'm not entirely sure what to do here, should I just drop this patch?
>
> If we allow tagged addresses to be passed here, we'd have to untag them before they end up in the mm->start_code etc. members.
>
> I know we are trying to relax the ABI here w.r.t. address ranges but mostly because we couldn't figure out a way to document unambiguously the difference between a user pointer that may be dereferenced by the kernel (tags allowed) and an address typically used for managing the address space layout. Suggestions welcomed.
>
> I'd say just drop this patch and capture it in the ABI document.

OK, will do in v14.

Vincenzo, could you add a note about this into your patchset?

> --
> Catalin
Hi Andrey,
Sorry for the late reply, I have just come back from holiday and am trying to catch up with my emails.
On 4/29/19 3:23 PM, Andrey Konovalov wrote:
> On Fri, Apr 26, 2019 at 4:50 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
>> [...]
>> I'd say just drop this patch and capture it in the ABI document.
>
> OK, will do in v14.
>
> Vincenzo, could you add a note about this into your patchset?

Ok, I will add a note that covers this case in v3 of my document.
This patch is part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
seq_print_user_ip() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 kernel/trace/trace_output.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 54373d93e251..6376bee93c84 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -370,6 +370,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct mm_struct *mm,
 {
 	struct file *file = NULL;
 	unsigned long vmstart = 0;
+	unsigned long untagged_ip = untagged_addr(ip);
 	int ret = 1;

 	if (s->full)
@@ -379,7 +380,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct mm_struct *mm,
 		const struct vm_area_struct *vma;

 		down_read(&mm->mmap_sem);
-		vma = find_vma(mm, ip);
+		vma = find_vma(mm, untagged_ip);
 		if (vma) {
 			file = vma->vm_file;
 			vmstart = vma->vm_start;
@@ -388,7 +389,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct mm_struct *mm,
 			ret = trace_seq_path(s, &file->f_path);
 			if (ret)
 				trace_seq_printf(s, "[+0x%lx]",
-						ip - vmstart);
+						untagged_ip - vmstart);
 		}
 		up_read(&mm->mmap_sem);
 	}
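To illustrate why a vma lookup needs an untagged address, here is a simplified userspace model. It is a sketch under stated assumptions, not kernel code: struct sketch_vma, sketch_find_vma() and sketch_ip_offset() are hypothetical names, and the linear scan only mimics the documented behaviour of find_vma(), which returns the first area whose end lies above the address.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TAG_MASK ((uint64_t)0xff << 56)

/* Hypothetical, simplified VMA: a half-open range [vm_start, vm_end). */
struct sketch_vma {
	uint64_t vm_start;
	uint64_t vm_end;
};

/*
 * Like find_vma(): return the first area with addr < vm_end, or NULL.
 * The array is assumed to be sorted by address and non-overlapping.
 */
static const struct sketch_vma *sketch_find_vma(const struct sketch_vma *vmas,
						size_t n, uint64_t addr)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (addr < vmas[i].vm_end)
			return &vmas[i];
	return NULL;
}

/* Offset of ip inside its area after untagging, or UINT64_MAX if unmapped. */
static uint64_t sketch_ip_offset(const struct sketch_vma *vmas, size_t n,
				 uint64_t ip)
{
	uint64_t untagged_ip = ip & ~SKETCH_TAG_MASK;
	const struct sketch_vma *vma = sketch_find_vma(vmas, n, untagged_ip);

	if (!vma || untagged_ip < vma->vm_start)
		return UINT64_MAX;
	return untagged_ip - vma->vm_start;
}
```

A tagged address is numerically larger than every vm_end in this model, so the raw lookup finds nothing. Whether the address passed to seq_print_user_ip() can actually carry a tag is a separate question.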
On Wed, Mar 20, 2019 at 03:51:25PM +0100, Andrey Konovalov wrote:
> [...]
> +	unsigned long untagged_ip = untagged_addr(ip);
> [...]
> -		vma = find_vma(mm, ip);
> +		vma = find_vma(mm, untagged_ip);
> [...]
> -						ip - vmstart);
> +						untagged_ip - vmstart);

How would we end up with a tagged address here? Does "ip" here imply instruction pointer, which we wouldn't tag?
On Fri, Mar 22, 2019 at 4:45 PM Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Wed, Mar 20, 2019 at 03:51:25PM +0100, Andrey Konovalov wrote:
>> [...]
>> -		vma = find_vma(mm, ip);
>> +		vma = find_vma(mm, untagged_ip);
>> [...]
>
> How would we end up with a tagged address here? Does "ip" here imply instruction pointer, which we wouldn't tag?

Yes, it's the instruction pointer. I think I got confused and decided that it's OK to have instruction pointer tagged, but I guess it's not a part of this ABI relaxation. I'll drop the patches that untag instruction pointers.
This patch is part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
find_active_uprobe() uses user pointers (obtained via instruction_pointer(regs)) for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 kernel/events/uprobes.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index c5cde87329c7..d3a2716a813a 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1992,6 +1992,8 @@ static struct uprobe *find_active_uprobe(unsigned long bp_vaddr, int *is_swbp)
 	struct uprobe *uprobe = NULL;
 	struct vm_area_struct *vma;

+	bp_vaddr = untagged_addr(bp_vaddr);
+
 	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, bp_vaddr);
 	if (vma && vma->vm_start <= bp_vaddr) {
On Wed, Mar 20, 2019 at 03:51:26PM +0100, Andrey Konovalov wrote:
> [...]
> +	bp_vaddr = untagged_addr(bp_vaddr);
> +
>  	down_read(&mm->mmap_sem);
>  	vma = find_vma(mm, bp_vaddr);
>  	if (vma && vma->vm_start <= bp_vaddr) {

Similarly here, that's a breakpoint address, hence instruction pointer (PC) which is untagged.
This patch is part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
stack_map_get_build_id_offset() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function for doing the lookup and calculating the offset, but save them as is in the bpf_stack_build_id struct.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 kernel/bpf/stackmap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 950ab2f28922..bb89341d3faf 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -320,7 +320,9 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
 	}

 	for (i = 0; i < trace_nr; i++) {
-		vma = find_vma(current->mm, ips[i]);
+		u64 untagged_ip = untagged_addr(ips[i]);
+
+		vma = find_vma(current->mm, untagged_ip);
 		if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
 			/* per entry fall back to ips */
 			id_offs[i].status = BPF_STACK_BUILD_ID_IP;
@@ -328,7 +330,7 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
 			memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
 			continue;
 		}
-		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
+		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + untagged_ip
 			- vma->vm_start;
 		id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
 	}
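The offset expression in the second hunk can be checked in isolation. The following is a userspace sketch with assumed names (sketch_file_offset(), SKETCH_PAGE_SHIFT) and an assumed 4K page size. It mirrors the arithmetic of the patched line and is not the BPF stackmap code itself.

```c
#include <stdint.h>

#define SKETCH_TAG_MASK   ((uint64_t)0xff << 56)
#define SKETCH_PAGE_SHIFT 12 /* assumed 4K pages */

/*
 * File offset of ip: the mapping's page offset into the file, plus the
 * distance of the untagged ip from the start of the mapping.
 */
static uint64_t sketch_file_offset(uint64_t vm_pgoff, uint64_t vm_start,
				   uint64_t ip)
{
	uint64_t untagged_ip = ip & ~SKETCH_TAG_MASK;

	return (vm_pgoff << SKETCH_PAGE_SHIFT) + untagged_ip - vm_start;
}
```

If the tag were left in place, the result would be off by the tag value shifted into the top byte, so the stored offset would no longer identify a location in the file.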
On Wed, Mar 20, 2019 at 03:51:27PM +0100, Andrey Konovalov wrote:
> [...]
>  	for (i = 0; i < trace_nr; i++) {
> -		vma = find_vma(current->mm, ips[i]);
> +		u64 untagged_ip = untagged_addr(ips[i]);
> +
> +		vma = find_vma(current->mm, untagged_ip);
> [...]
> -		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
> +		id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + untagged_ip
>  			- vma->vm_start;
>  		id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
>  	}

Can the ips[*] here ever be tagged?
On Fri, Mar 22, 2019 at 4:52 PM Catalin Marinas catalin.marinas@arm.com wrote:
> Can the ips[*] here ever be tagged?

Those are instruction pointers AFAIU, so no, not within the current ABI. I'll drop this patch. Thanks!

> --
> Catalin
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
amdgpu_ttm_tt_get_user_pages() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 73e71e61dc99..891b027fa33b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -751,10 +751,11 @@ int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
 		 * check that we only use anonymous memory to prevent problems
 		 * with writeback
 		 */
-		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+		unsigned long userptr = untagged_addr(gtt->userptr);
+		unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
 		struct vm_area_struct *vma;

-		vma = find_vma(mm, gtt->userptr);
+		vma = find_vma(mm, userptr);
 		if (!vma || vma->vm_file || vma->vm_end < end) {
 			up_read(&mm->mmap_sem);
 			return -EPERM;
On Wed, Mar 20, 2019 at 03:51:28PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
>
> amdgpu_ttm_tt_get_user_pages() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag user pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 73e71e61dc99..891b027fa33b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -751,10 +751,11 @@ int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
>  		 * check that we only use anonymous memory to prevent problems
>  		 * with writeback
>  		 */
> -		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> +		unsigned long userptr = untagged_addr(gtt->userptr);
> +		unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
>  		struct vm_area_struct *vma;
>
> -		vma = find_vma(mm, gtt->userptr);
> +		vma = find_vma(mm, userptr);
>  		if (!vma || vma->vm_file || vma->vm_end < end) {
>  			up_read(&mm->mmap_sem);
>  			return -EPERM;

I tried to track this down but I failed to see whether a user could provide a tagged pointer here (under the restrictions as per Vincenzo's ABI document).
On 22/03/2019 15:59, Catalin Marinas wrote:
> I tried to track this down but I failed to see whether a user could provide a tagged pointer here (under the restrictions as per Vincenzo's ABI document).
->userptr is set by radeon_ttm_tt_set_userptr(), itself called from radeon_gem_userptr_ioctl(). Any page-aligned value is allowed.
Kevin
On 2019-03-20 10:51 a.m., Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
>
> amdgpu_ttm_tt_get_user_pages() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag user pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 73e71e61dc99..891b027fa33b 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -751,10 +751,11 @@ int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
>  		 * check that we only use anonymous memory to prevent problems
>  		 * with writeback
>  		 */
> -		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> +		unsigned long userptr = untagged_addr(gtt->userptr);
> +		unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
>  		struct vm_area_struct *vma;
>
> -		vma = find_vma(mm, gtt->userptr);
> +		vma = find_vma(mm, userptr);
>  		if (!vma || vma->vm_file || vma->vm_end < end) {
>  			up_read(&mm->mmap_sem);
>  			return -EPERM;

We'll need to be careful that we don't break your change when the following commit gets applied through drm-next for Linux 5.2:
https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip&id=...
Would it make sense to apply the untagging in amdgpu_ttm_tt_set_userptr instead? That would avoid this conflict and I think it would clearly put the untagging into the user mode code path where the tagged pointer originates.
In amdgpu_gem_userptr_ioctl and amdgpu_amdkfd_gpuvm.c (init_user_pages) we also set up an MMU notifier with the (tagged) pointer from user mode. That should probably also use the untagged address so that MMU notifiers for the untagged address get correctly matched up with the right BO. I'd move the untagging further up the call stack to cover that. For the GEM case I think amdgpu_gem_userptr_ioctl would be the right place. For the KFD case, I'd do this in amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu.
Regards,
  Felix
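The idea of untagging once, where the user-supplied address enters the driver, can be illustrated outside the kernel. This is a rough sketch, not amdgpu code: struct sketch_bo, sketch_set_userptr() and sketch_invalidate_hits() are hypothetical names, sketch_untagged_addr() is a stand-in that assumes user addresses with bit 55 clear, and invalidation ranges are assumed to arrive untagged, as the message above implies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for untagged_addr(): clear the top (tag) byte.
 * Only valid for user addresses, where bit 55 is zero. */
static inline uint64_t sketch_untagged_addr(uint64_t addr)
{
	return addr & 0x00ffffffffffffffULL;
}

/* Hypothetical buffer object: only the user address recorded at ioctl time. */
struct sketch_bo {
	uint64_t userptr;
};

/* Untag once at the entry point, so every later consumer of bo->userptr
 * (vma lookup, notifier matching) sees the same untagged value. */
static void sketch_set_userptr(struct sketch_bo *bo, uint64_t addr)
{
	bo->userptr = sketch_untagged_addr(addr);
}

/* Model of an invalidation callback: does [start, end) overlap the BO's
 * range [userptr, userptr + size)? The range is assumed to be untagged. */
static bool sketch_invalidate_hits(const struct sketch_bo *bo, uint64_t size,
				   uint64_t start, uint64_t end)
{
	return bo->userptr < end && start < bo->userptr + size;
}
```

If the tagged value were stored instead, the overlap test would compare a tagged start against untagged range bounds and would not match.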
On Mon, Mar 25, 2019 at 11:21 PM Kuehling, Felix Felix.Kuehling@amd.com wrote:
> We'll need to be careful that we don't break your change when the following commit gets applied through drm-next for Linux 5.2:
>
> https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-5.2-wip&id=...
>
> Would it make sense to apply the untagging in amdgpu_ttm_tt_set_userptr instead? That would avoid this conflict and I think it would clearly put the untagging into the user mode code path where the tagged pointer originates.
>
> In amdgpu_gem_userptr_ioctl and amdgpu_amdkfd_gpuvm.c (init_user_pages) we also set up an MMU notifier with the (tagged) pointer from user mode. That should probably also use the untagged address so that MMU notifiers for the untagged address get correctly matched up with the right BO. I'd move the untagging further up the call stack to cover that. For the GEM case I think amdgpu_gem_userptr_ioctl would be the right place. For the KFD case, I'd do this in amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu.

Will do in v14, thanks a lot for looking at this!
Is this applicable to the radeon driver (drivers/gpu/drm/radeon) as well? It seems to be using a very similar structure.

> Regards,
>   Felix
On 2019-04-02 10:37 a.m., Andrey Konovalov wrote:
> Will do in v14, thanks a lot for looking at this!
>
> Is this applicable to the radeon driver (drivers/gpu/drm/radeon) as well? It seems to be using a very similar structure.

I think so. Radeon doesn't have the KFD bits any more. But the GEM interface and MMU notifier are very similar.
Regards,
  Felix
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
radeon_ttm_tt_pin_userptr() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/gpu/drm/radeon/radeon_ttm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 9920a6fc11bf..872a98796117 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -497,9 +497,10 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
 		/* check that we only pin down anonymous memory
 		   to prevent problems with writeback */
-		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
+		unsigned long userptr = untagged_addr(gtt->userptr);
+		unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
 		struct vm_area_struct *vma;
-		vma = find_vma(gtt->usermm, gtt->userptr);
+		vma = find_vma(gtt->usermm, userptr);
 		if (!vma || vma->vm_file || vma->vm_end < end)
 			return -EPERM;
 	}
On Wed, Mar 20, 2019 at 03:51:29PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
>
> radeon_ttm_tt_pin_userptr() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag user pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  drivers/gpu/drm/radeon/radeon_ttm.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 9920a6fc11bf..872a98796117 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -497,9 +497,10 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>  	if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
>  		/* check that we only pin down anonymous memory
>  		   to prevent problems with writeback */
> -		unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> +		unsigned long userptr = untagged_addr(gtt->userptr);
> +		unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
>  		struct vm_area_struct *vma;
> -		vma = find_vma(gtt->usermm, gtt->userptr);
> +		vma = find_vma(gtt->usermm, userptr);
>  		if (!vma || vma->vm_file || vma->vm_end < end)
>  			return -EPERM;
>  	}

Same comment as on the previous patch.
On Fri, Mar 22, 2019 at 5:01 PM Catalin Marinas catalin.marinas@arm.com wrote:
> Same comment as on the previous patch.

As Kevin wrote in the amd driver related thread, the call trace is: radeon_gem_userptr_ioctl()->radeon_ttm_tt_set_userptr()->...->radeon_ttm_tt_pin_userptr()->find_vma()

> --
> Catalin
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/infiniband/hw/mlx4/mr.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 395379a480cb..9a35ed2c6a6f 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
 	 * again
 	 */
 	if (!ib_access_writable(access_flags)) {
+		unsigned long untagged_start = untagged_addr(start);
 		struct vm_area_struct *vma;

 		down_read(&current->mm->mmap_sem);
@@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
 		 * cover the memory, but for now it requires a single vma to
 		 * entirely cover the MR to support RO mappings.
 		 */
-		vma = find_vma(current->mm, start);
-		if (vma && vma->vm_end >= start + length &&
-		    vma->vm_start <= start) {
+		vma = find_vma(current->mm, untagged_start);
+		if (vma && vma->vm_end >= untagged_start + length &&
+		    vma->vm_start <= untagged_start) {
 			if (vma->vm_flags & VM_WRITE)
 				access_flags |= IB_ACCESS_LOCAL_WRITE;
 		} else {
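The coverage test changed by this hunk can be tried in isolation. The following is a sketch only: sketch_covers_raw() and sketch_covers_untagged() are hypothetical helpers that copy the two comparisons from the diff, and sketch_untagged_addr() is a stand-in that assumes user addresses with bit 55 clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for untagged_addr(): clear the top (tag) byte.
 * Only valid for user addresses, where bit 55 is zero. */
static inline uint64_t sketch_untagged_addr(uint64_t addr)
{
	return addr & 0x00ffffffffffffffULL;
}

/* The comparisons as they were before the patch: the raw start is used,
 * so a non-zero tag makes start look far above any vma. */
static bool sketch_covers_raw(uint64_t vm_start, uint64_t vm_end,
			      uint64_t start, uint64_t length)
{
	return vm_end >= start + length && vm_start <= start;
}

/* The comparisons after the patch: untag once, then compare. */
static bool sketch_covers_untagged(uint64_t vm_start, uint64_t vm_end,
				   uint64_t start, uint64_t length)
{
	uint64_t untagged_start = sketch_untagged_addr(start);

	return vm_end >= untagged_start + length &&
	       vm_start <= untagged_start;
}
```

For a vma spanning [0x10000, 0x20000) and a 0x1000-byte region starting at the tagged address 0x5500000000011000, the raw form reports no coverage while the untagged form reports coverage; for an untagged start both forms agree.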
On Wed, Mar 20, 2019 at 03:51:30PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
>
> mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag user pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  drivers/infiniband/hw/mlx4/mr.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
> index 395379a480cb..9a35ed2c6a6f 100644
> --- a/drivers/infiniband/hw/mlx4/mr.c
> +++ b/drivers/infiniband/hw/mlx4/mr.c
> @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
>  	 * again
>  	 */
>  	if (!ib_access_writable(access_flags)) {
> +		unsigned long untagged_start = untagged_addr(start);
>  		struct vm_area_struct *vma;
>
>  		down_read(&current->mm->mmap_sem);
> @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start,
>  		 * cover the memory, but for now it requires a single vma to
>  		 * entirely cover the MR to support RO mappings.
>  		 */
> -		vma = find_vma(current->mm, start);
> -		if (vma && vma->vm_end >= start + length &&
> -		    vma->vm_start <= start) {
> +		vma = find_vma(current->mm, untagged_start);
> +		if (vma && vma->vm_end >= untagged_start + length &&
> +		    vma->vm_start <= untagged_start) {
>  			if (vma->vm_flags & VM_WRITE)
>  				access_flags |= IB_ACCESS_LOCAL_WRITE;
>  		} else {
--

Thanks,
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Interesting, the follow-up question is why mlx4 is the only driver in IB that needs such code in umem_mr. I'll take a look at it.
Thanks
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
videobuf_dma_contig_user_get() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag the pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
index e1bf50df4c70..8a1ddd146b17 100644
--- a/drivers/media/v4l2-core/videobuf-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
@@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem)
 static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
 					struct videobuf_buffer *vb)
 {
+	unsigned long untagged_baddr = untagged_addr(vb->baddr);
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	unsigned long prev_pfn, this_pfn;
@@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
 	unsigned int offset;
 	int ret;

-	offset = vb->baddr & ~PAGE_MASK;
+	offset = untagged_baddr & ~PAGE_MASK;
 	mem->size = PAGE_ALIGN(vb->size + offset);
 	ret = -EINVAL;

 	down_read(&mm->mmap_sem);

-	vma = find_vma(mm, vb->baddr);
+	vma = find_vma(mm, untagged_baddr);
 	if (!vma)
 		goto out_up;

-	if ((vb->baddr + mem->size) > vma->vm_end)
+	if ((untagged_baddr + mem->size) > vma->vm_end)
 		goto out_up;

 	pages_done = 0;
 	prev_pfn = 0; /* kill warning */
-	user_address = vb->baddr;
+	user_address = untagged_baddr;

 	while (pages_done < (mem->size >> PAGE_SHIFT)) {
 		ret = follow_pfn(vma, user_address, &this_pfn);
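A small userspace sketch shows which of the computations in this function are sensitive to the tag. It rests on stated assumptions: 4 KiB pages, user addresses with bit 55 clear, and hypothetical helper names (sketch_untagged_addr(), sketch_page_offset(), sketch_mapped_size(), sketch_exceeds_vma()) that only imitate the expressions in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PAGE_ALIGN(x) \
	(((x) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/* Hypothetical stand-in for untagged_addr(): clear the top (tag) byte.
 * Only valid for user addresses, where bit 55 is zero. */
static inline uint64_t sketch_untagged_addr(uint64_t addr)
{
	return addr & 0x00ffffffffffffffULL;
}

/* offset = baddr & ~PAGE_MASK: only the low 12 bits survive, so the tag
 * in the top byte never reaches the result. */
static uint64_t sketch_page_offset(uint64_t baddr)
{
	return baddr & ~SKETCH_PAGE_MASK;
}

/* mem->size = PAGE_ALIGN(vb->size + offset) */
static uint64_t sketch_mapped_size(uint64_t baddr, uint64_t size)
{
	return SKETCH_PAGE_ALIGN(size + sketch_page_offset(baddr));
}

/* (baddr + mem->size) > vma->vm_end: here the full address is compared,
 * so a tagged baddr always appears to run past the vma. */
static bool sketch_exceeds_vma(uint64_t baddr, uint64_t size, uint64_t vm_end)
{
	return baddr + sketch_mapped_size(baddr, size) > vm_end;
}
```

The page offset comes out the same for tagged and untagged input, whereas the end-of-range comparison (and the vma lookup itself) only behaves as intended once the address has been untagged.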
On Wed, Mar 20, 2019 at 03:51:31PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
>
> videobuf_dma_contig_user_get() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
>
> Untag the pointers in this function.
>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
> index e1bf50df4c70..8a1ddd146b17 100644
> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> @@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem)
>  static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
>  					struct videobuf_buffer *vb)
>  {
> +	unsigned long untagged_baddr = untagged_addr(vb->baddr);
>  	struct mm_struct *mm = current->mm;
>  	struct vm_area_struct *vma;
>  	unsigned long prev_pfn, this_pfn;
> @@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
>  	unsigned int offset;
>  	int ret;
>
> -	offset = vb->baddr & ~PAGE_MASK;
> +	offset = untagged_baddr & ~PAGE_MASK;
>  	mem->size = PAGE_ALIGN(vb->size + offset);
>  	ret = -EINVAL;
>
>  	down_read(&mm->mmap_sem);
>
> -	vma = find_vma(mm, vb->baddr);
> +	vma = find_vma(mm, untagged_baddr);
>  	if (!vma)
>  		goto out_up;
>
> -	if ((vb->baddr + mem->size) > vma->vm_end)
> +	if ((untagged_baddr + mem->size) > vma->vm_end)
>  		goto out_up;
>
>  	pages_done = 0;
>  	prev_pfn = 0; /* kill warning */
> -	user_address = vb->baddr;
> +	user_address = untagged_baddr;
>
>  	while (pages_done < (mem->size >> PAGE_SHIFT)) {
>  		ret = follow_pfn(vma, user_address, &this_pfn);

I don't think vb->baddr here comes from an anonymous mmap(), but it's worth checking the call paths.
On 22/03/2019 16:07, Catalin Marinas wrote:
> I don't think vb->baddr here comes from an anonymous mmap(), but it's worth checking the call paths.

I spent some time on this and didn't find any restriction on the kind of mapping that's allowed here. The API regarding V4L2_MEMORY_USERPTR doesn't seem to say anything about that either [0] [1]. It's probably best to ask the V4L2 maintainers.
Kevin
[0] https://www.kernel.org/doc/html/latest/media/uapi/v4l/vidioc-qbuf.html
[1] https://www.kernel.org/doc/html/latest/media/uapi/v4l/userp.html
On Mon, Mar 25, 2019 at 3:08 PM Kevin Brodsky kevin.brodsky@arm.com wrote:
> On 22/03/2019 16:07, Catalin Marinas wrote:
>> I don't think vb->baddr here comes from an anonymous mmap(), but it's worth checking the call paths.

The call path is __videobuf_iolock()->videobuf_dma_contig_user_get()->find_vma().

> I spent some time on this and didn't find any restriction on the kind of mapping that's allowed here. The API regarding V4L2_MEMORY_USERPTR doesn't seem to say anything about that either [0] [1]. It's probably best to ask the V4L2 maintainers.

Mauro, could you comment on whether the vb->baddr argument for the V4L2_MEMORY_USERPTR API can come from an anonymous memory mapping?

> Kevin
>
> [0] https://www.kernel.org/doc/html/latest/media/uapi/v4l/vidioc-qbuf.html
> [1] https://www.kernel.org/doc/html/latest/media/uapi/v4l/userp.html
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
check_mem_type() uses provided user pointers for vma lookups (via __check_mem_type()), which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/tee/optee/call.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
index a5afbe6dee68..e3be20264092 100644
--- a/drivers/tee/optee/call.c
+++ b/drivers/tee/optee/call.c
@@ -563,6 +563,7 @@ static int check_mem_type(unsigned long start, size_t num_pages)
 	int rc;

 	down_read(&mm->mmap_sem);
+	start = untagged_addr(start);
 	rc = __check_mem_type(find_vma(mm, start),
 			      start + num_pages * PAGE_SIZE);
 	up_read(&mm->mmap_sem);
On Wed, Mar 20, 2019 at 03:51:32PM +0100, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
check_mem_type() uses provided user pointers for vma lookups (via __check_mem_type()), which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/tee/optee/call.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
index a5afbe6dee68..e3be20264092 100644
--- a/drivers/tee/optee/call.c
+++ b/drivers/tee/optee/call.c
@@ -563,6 +563,7 @@ static int check_mem_type(unsigned long start, size_t num_pages)
 	int rc;

 	down_read(&mm->mmap_sem);
+	start = untagged_addr(start);
 	rc = __check_mem_type(find_vma(mm, start),
 			      start + num_pages * PAGE_SIZE);
 	up_read(&mm->mmap_sem);
I guess we could just untag this in tee_shm_register(). The tag is not relevant to a TEE implementation (firmware) anyway.
On Fri, Mar 22, 2019 at 5:22 PM Catalin Marinas catalin.marinas@arm.com wrote:
I guess we could just untag this in tee_shm_register(). The tag is not relevant to a TEE implementation (firmware) anyway.
Will do in v14, thanks!
-- Catalin
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
vaddr_get_pfn() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 drivers/vfio/vfio_iommu_type1.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 73652e21efec..e556caa64f83 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -376,6 +376,8 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,

 	down_read(&mm->mmap_sem);

+	vaddr = untagged_addr(vaddr);
+
 	vma = find_vma_intersection(mm, vaddr, vaddr + 1);

 	if (vma && vma->vm_flags & VM_PFNMAP) {
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
This patch adds a simple test that calls the uname syscall with a tagged user pointer as an argument. If the kernel does not accept tagged user pointers, the test fails with EFAULT.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 tools/testing/selftests/arm64/.gitignore      |  1 +
 tools/testing/selftests/arm64/Makefile        | 11 ++++++++++
 .../testing/selftests/arm64/run_tags_test.sh  | 12 +++++++++++
 tools/testing/selftests/arm64/tags_test.c     | 21 +++++++++++++++++++
 4 files changed, 45 insertions(+)
 create mode 100644 tools/testing/selftests/arm64/.gitignore
 create mode 100644 tools/testing/selftests/arm64/Makefile
 create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh
 create mode 100644 tools/testing/selftests/arm64/tags_test.c

diff --git a/tools/testing/selftests/arm64/.gitignore b/tools/testing/selftests/arm64/.gitignore
new file mode 100644
index 000000000000..e8fae8d61ed6
--- /dev/null
+++ b/tools/testing/selftests/arm64/.gitignore
@@ -0,0 +1 @@
+tags_test
diff --git a/tools/testing/selftests/arm64/Makefile b/tools/testing/selftests/arm64/Makefile
new file mode 100644
index 000000000000..a61b2e743e99
--- /dev/null
+++ b/tools/testing/selftests/arm64/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+
+# ARCH can be overridden by the user for cross compiling
+ARCH ?= $(shell uname -m 2>/dev/null || echo not)
+
+ifneq (,$(filter $(ARCH),aarch64 arm64))
+TEST_GEN_PROGS := tags_test
+TEST_PROGS := run_tags_test.sh
+endif
+
+include ../lib.mk
diff --git a/tools/testing/selftests/arm64/run_tags_test.sh b/tools/testing/selftests/arm64/run_tags_test.sh
new file mode 100755
index 000000000000..745f11379930
--- /dev/null
+++ b/tools/testing/selftests/arm64/run_tags_test.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+echo "--------------------"
+echo "running tags test"
+echo "--------------------"
+./tags_test
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+else
+	echo "[PASS]"
+fi
diff --git a/tools/testing/selftests/arm64/tags_test.c b/tools/testing/selftests/arm64/tags_test.c
new file mode 100644
index 000000000000..2bd1830a7ebe
--- /dev/null
+++ b/tools/testing/selftests/arm64/tags_test.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <sys/utsname.h>
+
+#define SHIFT_TAG(tag)		((uint64_t)(tag) << 56)
+#define SET_TAG(ptr, tag)	(((uint64_t)(ptr) & ~SHIFT_TAG(0xff)) | \
+					SHIFT_TAG(tag))
+
+int main(void)
+{
+	struct utsname *ptr = (struct utsname *)malloc(sizeof(*ptr));
+	void *tagged_ptr = (void *)SET_TAG(ptr, 0x42);
+	int err = uname(tagged_ptr);
+
+	free(ptr);
+	return err;
+}