Shared Virtual Memory (SVM) allows the programmer to use a single virtual address space shared between threads executing on CPUs and GPUs. It abstracts away from the user the location of the backing memory and hence simplifies the user programming model. SVM supports two virtual memory allocation methods: the runtime allocator, which requires the driver to provide a memory allocation and management interface such as the buffer object (BO) interface, and the system allocator, which makes use of the default OS memory allocation and management support, e.g. malloc().
This patch series adds both SVM system and runtime allocator support to the i915 driver.
The patch series includes:
- SVM support for both system and runtime allocation.
- Plugging in device memory with the Linux kernel.
- User API advertising SVM capability, and per-VM SVM configuration by the user.
- User API to bind an address range or a BO with a device page table.
- User API to migrate an address range to device memory.
- Implicit migration by moving pages or BOs back from device to host memory upon CPU access.
- CPU copy and blitter copy support for migrating the pages/BOs.
- Large page mapping support.
- Page table dump support.
References: https://www.kernel.org/doc/Documentation/vm/hmm.rst (HMM use cases in the Linux kernel).
Niranjana Vishwanathapura (12):
  drm/i915/svm: Add SVM documentation
  drm/i915/svm: Define SVM UAPI
  drm/i915/svm: Runtime (RT) allocator support
  drm/i915/svm: Page table update support for SVM
  drm/i915/svm: Page table mirroring support
  drm/i915/svm: Device memory support
  drm/i915/svm: Implicitly migrate pages upon CPU fault
  drm/i915/svm: Page copy support during migration
  drm/i915/svm: Add functions to blitter copy SVM buffers
  drm/i915/svm: Use blitter copy for migration
  drm/i915/svm: Add support to en/disable SVM
  drm/i915/svm: Add page table dump support
Venkata Sandeep Dhanalakota (1):
  drm/i915/svm: Implicitly migrate BOs upon CPU access
 Documentation/gpu/i915.rst                    |  29 +
 drivers/gpu/drm/i915/Kconfig                  |  23 +
 drivers/gpu/drm/i915/Kconfig.debug            |  14 +
 drivers/gpu/drm/i915/Makefile                 |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  95 ++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |   1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  65 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  10 +
 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  43 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_svm.c       |  51 ++
 drivers/gpu/drm/i915/gem/i915_gem_svm.h       |  22 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |   2 +-
 drivers/gpu/drm/i915/i915_buddy.h             |  12 +
 drivers/gpu/drm/i915/i915_drv.c               |  30 +-
 drivers/gpu/drm/i915/i915_drv.h               |  32 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           | 158 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  41 +
 drivers/gpu/drm/i915/i915_getparam.c          |   3 +
 drivers/gpu/drm/i915/i915_svm.c               | 298 +++++++
 drivers/gpu/drm/i915/i915_svm.h               |  71 ++
 drivers/gpu/drm/i915/i915_svm_copy.c          | 172 ++++
 drivers/gpu/drm/i915/i915_svm_devmem.c        | 772 ++++++++++++++++++
 drivers/gpu/drm/i915/intel_memory_region.c    |   4 -
 drivers/gpu/drm/i915/intel_memory_region.h    |  18 +
 drivers/gpu/drm/i915/intel_region_lmem.c      |  10 +
 include/uapi/drm/i915_drm.h                   |  70 ++
 28 files changed, 2024 insertions(+), 36 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h
 create mode 100644 drivers/gpu/drm/i915/i915_svm.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm.h
 create mode 100644 drivers/gpu/drm/i915/i915_svm_copy.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c
Add Shared Virtual Memory (SVM) support information.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
---
 Documentation/gpu/i915.rst | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index d0947c5c4ab8..592fc26d2ca0 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -415,6 +415,35 @@ Object Tiling IOCTLs
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
    :doc: buffer object tiling
+Shared Virtual Memory (SVM)
+---------------------------
+
+Shared Virtual Memory (SVM) allows the programmer to use a single virtual
+address space which will be shared between threads executing on CPUs and GPUs.
+It abstracts away from the user the location of the backing memory, and hence
+simplifies the user programming model.
+SVM supports two types of virtual memory allocation methods.
+The runtime allocator requires the driver to provide a memory allocation and
+management interface, like the buffer object (BO) interface, whereas the
+system allocator makes use of default OS memory allocation and management
+support like malloc().
+
+The Linux kernel has a Heterogeneous Memory Management (HMM) framework to
+support the SVM system allocator. HMM's address space mirroring support allows
+sharing of the address space by duplicating sections of CPU page tables in the
+device page tables. This enables both the CPU and GPU to access a physical
+memory location using the same virtual address. The Linux kernel also provides
+the ability to plug in device memory with the system (as a special ZONE_DEVICE
+type) and allocates a struct page for each device memory page. It also
+provides a mechanism to migrate pages from host to device memory and vice
+versa. More information on HMM can be found here:
+https://www.kernel.org/doc/Documentation/vm/hmm.rst
+
+i915 supports both the SVM system and runtime allocators. As PCIe is a
+non-coherent bus, i915 plugs in device memory as DEVICE_PRIVATE and no memory
+access across the PCIe link is allowed. Any such access will trigger migration
+of the page(s) or BOs before the memory is accessed.
+
 Microcontrollers
 ================
Define the UAPI for Shared Virtual Memory (SVM) functionality, including SVM capability and configurability. When there is no GPU page fault support available, add ioctls to explicitly bind and migrate virtual address ranges.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
---
 include/uapi/drm/i915_drm.h | 70 +++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index b127a99da1c1..33b3127fc3ae 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -360,6 +360,10 @@ typedef struct _drm_i915_sarea { #define DRM_I915_GEM_VM_CREATE 0x3a #define DRM_I915_GEM_VM_DESTROY 0x3b #define DRM_I915_GEM_OBJECT_SETPARAM DRM_I915_GEM_CONTEXT_SETPARAM +#define DRM_I915_GEM_VM_GETPARAM DRM_I915_GEM_CONTEXT_GETPARAM +#define DRM_I915_GEM_VM_SETPARAM DRM_I915_GEM_CONTEXT_SETPARAM +#define DRM_I915_BIND 0x3c +#define DRM_I915_SVM_MIGRATE 0x3d /* Must be kept compact -- no holes */
#define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t) @@ -423,6 +427,10 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_GEM_VM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control) #define DRM_IOCTL_I915_GEM_VM_DESTROY DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control) #define DRM_IOCTL_I915_GEM_OBJECT_SETPARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_OBJECT_SETPARAM, struct drm_i915_gem_object_param) +#define DRM_IOCTL_I915_GEM_VM_GETPARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_GETPARAM, struct drm_i915_gem_vm_param) +#define DRM_IOCTL_I915_GEM_VM_SETPARAM DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_SETPARAM, struct drm_i915_gem_vm_param) +#define DRM_IOCTL_I915_BIND DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_BIND, struct drm_i915_bind) +#define DRM_IOCTL_I915_SVM_MIGRATE DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_SVM_MIGRATE, struct drm_i915_svm_migrate)
/* Allow drivers to submit batchbuffers directly to hardware, relying * on the security mechanisms provided by hardware. @@ -620,6 +628,9 @@ typedef struct drm_i915_irq_wait { */ #define I915_PARAM_PERF_REVISION 54
+/* Shared Virtual Memory (SVM) support capability */ +#define I915_PARAM_HAS_SVM 55 + /* Must be kept compact -- no holes and well documented */
typedef struct drm_i915_getparam { @@ -1815,6 +1826,17 @@ struct drm_i915_gem_vm_control { __u32 vm_id; };
+struct drm_i915_gem_vm_param { + __u32 vm_id; + __u32 rsvd; + +#define I915_VM_PARAM (2ull << 32) +#define I915_GEM_VM_PARAM_SVM 0x1 + __u64 param; + + __u64 value; +}; + struct drm_i915_reg_read { /* * Register offset. @@ -2268,6 +2290,54 @@ struct drm_i915_query_perf_config { __u8 data[]; };
+/** + * struct drm_i915_bind + * + * Bind an object/buffer in a vm's page table. + */ +struct drm_i915_bind { + /** VA start to bind **/ + __u64 start; + + /** + * VA length to [un]bind + * length only required while binding buffers. + */ + __u64 length; + + /** Type of memory to [un]bind **/ + __u32 type; +#define I915_BIND_SVM_BUFFER 0 +#define I915_BIND_SVM_GEM_OBJ 1 + + /** Object handle to [un]bind for I915_BIND_SVM_GEM_OBJ type **/ + __u32 handle; + + /** vm to [un]bind **/ + __u32 vm_id; + + /** Flags **/ + __u32 flags; +#define I915_BIND_UNBIND (1 << 0) +#define I915_BIND_READONLY (1 << 1) +}; + +/** + * struct drm_i915_svm_migrate + * + * Migrate an address range to a memory region. + */ +struct drm_i915_svm_migrate { + /** VA start to migrate **/ + __u64 start; + + /** VA length to migrate **/ + __u64 length; + + /** Memory region to migrate to **/ + __u32 region; +}; + #if defined(__cplusplus) } #endif
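For reference, a minimal userspace sketch of how these proposed ioctls could fit together. This is only a sketch against the structs above: it assumes an i915 DRM fd, a vm_id from DRM_IOCTL_I915_GEM_VM_CREATE and a device memory region id from the memory region uAPI, the param encoding (I915_VM_PARAM | I915_GEM_VM_PARAM_SVM) is an assumption, and all error handling is omitted.

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

#define SZ (16 * 4096)

/* fd: i915 DRM fd, vm_id: from GEM_VM_CREATE, lmem_region: device memory region id */
static void svm_usage_example(int fd, uint32_t vm_id, uint32_t lmem_region)
{
	/* 1) Enable SVM on the vm (per-vm opt-in) */
	struct drm_i915_gem_vm_param vp = {
		.vm_id = vm_id,
		.param = I915_VM_PARAM | I915_GEM_VM_PARAM_SVM,
		.value = 1,
	};
	ioctl(fd, DRM_IOCTL_I915_GEM_VM_SETPARAM, &vp);

	/* 2) System allocator: bind a plain malloc'd range for GPU use */
	void *buf = malloc(SZ);
	struct drm_i915_bind bind = {
		.start = (uint64_t)(uintptr_t)buf,
		.length = SZ,
		.type = I915_BIND_SVM_BUFFER,	/* no BO backs this VA range */
		.vm_id = vm_id,
	};
	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);

	/* 3) Optionally migrate the backing pages to device memory */
	struct drm_i915_svm_migrate mig = {
		.start = (uint64_t)(uintptr_t)buf,
		.length = SZ,
		.region = lmem_region,
	};
	ioctl(fd, DRM_IOCTL_I915_SVM_MIGRATE, &mig);

	/* 4) Unbind when done */
	bind.flags = I915_BIND_UNBIND;
	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);
	free(buf);
}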
Quoting Niranjana Vishwanathapura (2019-11-22 22:57:23)
Define UAPI for Shared Virtual Memory (SVM) functionality including SVM capability and configurability. When there is no GPU page fault support available, add ioctls to explicitly bind and migrate virtual address ranges.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Sudeep Dutt sudeep.dutt@intel.com Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
Having this as a separate commit ahead of the functionality breaks bisecting.
The uAPI should be exposed in similar sized chunks as it's implemented. If it has to go in as single patch, that should be after all the plumbing is already in place.
That also gives a clear indication to anybody backporting that you need to backport until the uAPI patch.
Regards, Joonas
On Tue, Nov 26, 2019 at 03:44:26PM +0200, Joonas Lahtinen wrote:
Quoting Niranjana Vishwanathapura (2019-11-22 22:57:23)
Define UAPI for Shared Virtual Memory (SVM) functionality including SVM capability and configurability. When there is no GPU page fault support available, add ioctls to explicitly bind and migrate virtual address ranges.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Sudeep Dutt sudeep.dutt@intel.com Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
Having this as a separate commit ahead of the functionality breaks bisecting.
The uAPI should be exposed in similar sized chunks as it's implemented. If it has to go in as single patch, that should be after all the plumbing is already in place.
That also gives a clear indication to anybody backporting that you need to backport until the uAPI patch.
Ok, will include the ioctls in the relevant patches instead of this separate patch.
Thanks, Niranjana
Regards, Joonas
Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
---
 drivers/gpu/drm/i915/Kconfig                  | 11 ++++
 drivers/gpu/drm/i915/Makefile                 |  3 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 58 +++++++++++++++----
 drivers/gpu/drm/i915/gem/i915_gem_svm.c       | 51 ++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_svm.h       | 22 +++++++
 drivers/gpu/drm/i915/i915_drv.c               | 22 +++++++
 drivers/gpu/drm/i915/i915_drv.h               | 22 +++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.h           | 13 +++++
 9 files changed, 192 insertions(+), 11 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_svm.h
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index ba9595960bbe..c2e48710eec8 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -137,6 +137,17 @@ config DRM_I915_GVT_KVMGT Choose this option if you want to enable KVMGT support for Intel GVT-g.
+config DRM_I915_SVM + bool "Enable Shared Virtual Memory support in i915" + depends on STAGING + depends on DRM_I915 + default n + help + Choose this option if you want Shared Virtual Memory (SVM) + support in i915. With SVM support, one can share the virtual + address space between a process and the GPU. SVM is supported + on both integrated and discrete Intel GPUs. + menu "drm/i915 Debugging" depends on DRM_I915 depends on EXPERT diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e0fd10c0cfb8..75fe45633779 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -153,6 +153,9 @@ i915-y += \ intel_region_lmem.o \ intel_wopcm.o
+# SVM code +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o + # general-purpose microcontroller (GuC) support obj-y += gt/uc/ i915-y += gt/uc/intel_uc.o \ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 7a87e8270460..9d43ae6d643a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2864,10 +2864,14 @@ int i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { + struct drm_i915_gem_exec_object2 *exec2_list, *exec2_list_user; struct drm_i915_gem_execbuffer2 *args = data; - struct drm_i915_gem_exec_object2 *exec2_list; - struct drm_syncobj **fences = NULL; const size_t count = args->buffer_count; + struct drm_syncobj **fences = NULL; + unsigned int i = 0, svm_count = 0; + struct i915_address_space *vm; + struct i915_gem_context *ctx; + struct i915_svm_obj *svm_obj; int err;
if (!check_buffer_count(count)) { @@ -2878,15 +2882,46 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (!i915_gem_check_execbuffer(args)) return -EINVAL;
+ ctx = i915_gem_context_lookup(file->driver_priv, args->rsvd1); + if (!ctx || !rcu_access_pointer(ctx->vm)) + return -ENOENT; + + rcu_read_lock(); + vm = i915_vm_get(ctx->vm); + rcu_read_unlock(); + +alloc_again: + svm_count = vm->svm_count; /* Allocate an extra slot for use by the command parser */ - exec2_list = kvmalloc_array(count + 1, eb_element_size(), + exec2_list = kvmalloc_array(count + svm_count + 1, eb_element_size(), __GFP_NOWARN | GFP_KERNEL); if (exec2_list == NULL) { DRM_DEBUG("Failed to allocate exec list for %zd buffers\n", - count); + count + svm_count); return -ENOMEM; } - if (copy_from_user(exec2_list, + mutex_lock(&vm->mutex); + if (svm_count != vm->svm_count) { + mutex_unlock(&vm->mutex); + kvfree(exec2_list); + goto alloc_again; + } + + list_for_each_entry(svm_obj, &vm->svm_list, link) { + memset(&exec2_list[i], 0, sizeof(*exec2_list)); + exec2_list[i].handle = svm_obj->handle; + exec2_list[i].offset = svm_obj->offset; + exec2_list[i].flags = EXEC_OBJECT_PINNED | + EXEC_OBJECT_SUPPORTS_48B_ADDRESS; + i++; + } + exec2_list_user = &exec2_list[i]; + args->buffer_count += svm_count; + mutex_unlock(&vm->mutex); + i915_vm_put(vm); + i915_gem_context_put(ctx); + + if (copy_from_user(exec2_list_user, u64_to_user_ptr(args->buffers_ptr), sizeof(*exec2_list) * count)) { DRM_DEBUG("copy %zd exec entries failed\n", count); @@ -2903,6 +2938,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, }
err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences); + args->buffer_count -= svm_count;
/* * Now that we have begun execution of the batchbuffer, we ignore @@ -2913,7 +2949,6 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, if (args->flags & __EXEC_HAS_RELOC) { struct drm_i915_gem_exec_object2 __user *user_exec_list = u64_to_user_ptr(args->buffers_ptr); - unsigned int i;
/* Copy the new buffer offsets back to the user's exec list. */ /* @@ -2927,13 +2962,14 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data, goto end;
for (i = 0; i < args->buffer_count; i++) { - if (!(exec2_list[i].offset & UPDATE)) + u64 *offset = &exec2_list_user[i].offset; + + if (!(*offset & UPDATE)) continue;
- exec2_list[i].offset = - gen8_canonical_addr(exec2_list[i].offset & PIN_OFFSET_MASK); - unsafe_put_user(exec2_list[i].offset, - &user_exec_list[i].offset, + *offset = gen8_canonical_addr(*offset & + PIN_OFFSET_MASK); + unsafe_put_user(*offset, &user_exec_list[i].offset, end_user); } end_user: diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.c b/drivers/gpu/drm/i915/gem/i915_gem_svm.c new file mode 100644 index 000000000000..973070056726 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.c @@ -0,0 +1,51 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include "i915_drv.h" +#include "i915_gem_gtt.h" + +int i915_gem_svm_bind(struct i915_address_space *vm, + struct drm_i915_bind *args, + struct drm_file *file) +{ + struct i915_svm_obj *svm_obj, *tmp; + struct drm_i915_gem_object *obj; + int ret = 0; + + obj = i915_gem_object_lookup(file, args->handle); + if (!obj) + return -ENOENT; + + /* FIXME: Need to handle case with unending batch buffers */ + if (!(args->flags & I915_BIND_UNBIND)) { + svm_obj = kmalloc(sizeof(*svm_obj), GFP_KERNEL); + if (!svm_obj) { + ret = -ENOMEM; + goto put_obj; + } + svm_obj->handle = args->handle; + svm_obj->offset = args->start; + } + + mutex_lock(&vm->mutex); + if (!(args->flags & I915_BIND_UNBIND)) { + list_add(&svm_obj->link, &vm->svm_list); + vm->svm_count++; + } else { + list_for_each_entry_safe(svm_obj, tmp, &vm->svm_list, link) { + if (svm_obj->handle != args->handle) + continue; + + list_del_init(&svm_obj->link); + vm->svm_count--; + kfree(svm_obj); + break; + } + } + mutex_unlock(&vm->mutex); +put_obj: + i915_gem_object_put(obj); + return ret; +} diff --git a/drivers/gpu/drm/i915/gem/i915_gem_svm.h b/drivers/gpu/drm/i915/gem/i915_gem_svm.h new file mode 100644 index 000000000000..c394542dba75 --- /dev/null +++ b/drivers/gpu/drm/i915/gem/i915_gem_svm.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef __I915_GEM_SVM_H +#define __I915_GEM_SVM_H + +#include "i915_drv.h" + +#if defined(CONFIG_DRM_I915_SVM) +int i915_gem_svm_bind(struct i915_address_space *vm, + struct drm_i915_bind *args, + struct drm_file *file); +#else +static inline int i915_gem_svm_bind(struct i915_address_space *vm, + struct drm_i915_bind *args, + struct drm_file *file) +{ return -ENOTSUPP; } +#endif + +#endif /* __I915_GEM_SVM_H */ diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 4defadb26e7e..9c525d3f694c 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -61,6 +61,7 @@
#include "gem/i915_gem_context.h" #include "gem/i915_gem_ioctls.h" +#include "gem/i915_gem_svm.h" #include "gt/intel_gt.h" #include "gt/intel_gt_pm.h" #include "gt/intel_rc6.h" @@ -2687,6 +2688,26 @@ i915_gem_reject_pin_ioctl(struct drm_device *dev, void *data, return -ENODEV; }
+static int i915_bind_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_bind *args = data; + struct i915_address_space *vm; + int ret = -EINVAL; + + vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id); + if (unlikely(!vm)) + return -ENOENT; + + switch (args->type) { + case I915_BIND_SVM_GEM_OBJ: + ret = i915_gem_svm_bind(vm, args, file); + } + + i915_vm_put(vm); + return ret; +} + static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_INIT, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY), DRM_IOCTL_DEF_DRV(I915_FLUSH, drm_noop, DRM_AUTH), @@ -2746,6 +2767,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW), };
static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bbf4dfdfa8ba..f7051e6df656 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1911,6 +1911,28 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) return ctx; }
+static inline struct i915_address_space * +__i915_gem_address_space_lookup_rcu(struct drm_i915_file_private *file_priv, + u32 id) +{ + return idr_find(&file_priv->vm_idr, id); +} + +static inline struct i915_address_space * +i915_gem_address_space_lookup(struct drm_i915_file_private *file_priv, + u32 id) +{ + struct i915_address_space *vm; + + rcu_read_lock(); + vm = __i915_gem_address_space_lookup_rcu(file_priv, id); + if (vm) + vm = i915_vm_get(vm); + rcu_read_unlock(); + + return vm; +} + /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 6239a9adbf14..44ff4074db12 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -574,6 +574,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass) stash_init(&vm->free_pages);
INIT_LIST_HEAD(&vm->bound_list); + INIT_LIST_HEAD(&vm->svm_list); }
static int __setup_page_dma(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 402283ce2864..d618a5787c61 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -285,6 +285,13 @@ struct pagestash { struct pagevec pvec; };
+struct i915_svm_obj { + /** This obj's place in the SVM object list */ + struct list_head link; + u32 handle; + u64 offset; +}; + struct i915_address_space { struct kref ref; struct rcu_work rcu; @@ -329,6 +336,12 @@ struct i915_address_space { */ struct list_head bound_list;
+ /** + * List of SVM bind objects. + */ + struct list_head svm_list; + unsigned int svm_count; + struct pagestash free_pages;
/* Global GTT */
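For clarity, a rough userspace sketch of the runtime allocator path added here. It assumes a BO handle from GEM_CREATE and a vm_id from GEM_VM_CREATE; error handling is omitted and the helper name is illustrative.

#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static void bind_bo_example(int fd, __u32 vm_id, __u32 bo_handle, __u64 gpu_va)
{
	struct drm_i915_bind bind = {
		.start  = gpu_va,                /* desired address in the vm's ppGTT */
		.type   = I915_BIND_SVM_GEM_OBJ, /* bind a BO, not a malloc'd range */
		.handle = bo_handle,
		.vm_id  = vm_id,
	};

	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);

	/*
	 * Execbuf on a context using this vm no longer needs to list the BO:
	 * the kernel prepends every object on the vm's svm_list to the exec
	 * list as EXEC_OBJECT_PINNED (see the execbuffer changes above).
	 */

	bind.flags = I915_BIND_UNBIND;           /* later: drop the mapping */
	ioctl(fd, DRM_IOCTL_I915_BIND, &bind);
}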
Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call.
The ioctl, though, is not SVM specific; it is to do with "bulk residency" and can be used to reduce execbuf traffic and to provide virtual address layout controls to e.g. Vulkan clients.
I915_VM_BIND {
	uint32_t vm_id;
	int32_t fd;		/* or -1 for anon, or buf depending on flags */
	uint64_t flags;
	uint64_t offset;	/* offset into fd [page aligned] */
	uint64_t length;	/* page aligned */
	uint64_t iova;		/* page aligned */
	uint64_t extensions;
};
/* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
as I recall. I also recall it being part of a future command stream interface to reduce ioctls, but that is another story. -Chris
On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call.
The ioctl though is not svm specific, it is to do with "bulk residency" and can be used to reduce execbuf traffic to provide virtual address layout controls to e.g. Vulkan clients.
I915_VM_BIND { uint32_t vm_id; int32_t fd; /* or -1 for anon, or buf depending on flags */ uint64_t flags; uint64_t offset; /* offset info fd [page aligned] */ uint64_t length; /* page aligned */ uint64_t iova; /* page aligned */ uint64_t extensions; }; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
as I recall. I also recall it being part of a future command stream interface to reduce ioctls, but that is another story.
Thanks Chris. I will change I915_BIND to I915_VM_BIND. Currently it only addresses binding SVM system (buffer) and runtime (BO) allocations, but it can be expanded for other bindings. I have a 'type' field instead of 'fd', and 'extensions' & 'iova' can be added later if required. Is that OK?
-Chris
Quoting Niranjan Vishwanathapura (2019-11-25 18:40:57)
On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call.
The ioctl though is not svm specific, it is to do with "bulk residency" and can be used to reduce execbuf traffic to provide virtual address layout controls to e.g. Vulkan clients.
I915_VM_BIND { uint32_t vm_id; int32_t fd; /* or -1 for anon, or buf depending on flags */ uint64_t flags; uint64_t offset; /* offset info fd [page aligned] */ uint64_t length; /* page aligned */ uint64_t iova; /* page aligned */ uint64_t extensions; }; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
as I recall. I also recall it being part of a future command stream interface to reduce ioctls, but that is another story.
Thanks Chris. I will change I915_BIND to I915_VM_BIND.
We're very much depending on GEM VM even if BOs wouldn't be used, so this is best called I915_GEM_VM_BIND to match I915_GEM_VM_CREATE and avoid user confusion.
Currently, it is only addressing binding SVM system (buffer) and runtime (BOs) allocations. But it can be expanded for other bindings. I have 'type' field instead of 'fd' and 'extensions' & 'iov' can be added later if required.
We should try to have the uAPI as final as early as possible. The change process is harder the later it is done, even for RFC :)
On the same note, I'm inclined to have I915_SVM_MIGRATE called I915_GEM_VM_PREFAULT from the start, as I already have got some confused questions from folks who mix it with memory pressure initiated migration. And it's inherently racy as anybody could race with it, so PREFAULT would give an impression of that.
And I think I915_GEM_VM_PREFAULT would be a generally applicable function, just like I915_GEM_VM_BIND. So we should use struct member names that are close to I915_GEM_VM_BIND.
Better ideas for a name to emphasize the nature of what is being done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
Regards, Joonas
Is that OK?
-Chris
On Tue, Nov 26, 2019 at 03:59:31PM +0200, Joonas Lahtinen wrote:
Quoting Niranjan Vishwanathapura (2019-11-25 18:40:57)
On Mon, Nov 25, 2019 at 09:59:37AM +0000, Chris Wilson wrote:
Quoting Niranjana Vishwanathapura (2019-11-22 20:57:24)
Shared Virtual Memory (SVM) runtime allocator support allows binding a shared virtual address to a buffer object (BO) in the device page table through an ioctl call.
The ioctl though is not svm specific, it is to do with "bulk residency" and can be used to reduce execbuf traffic to provide virtual address layout controls to e.g. Vulkan clients.
I915_VM_BIND { uint32_t vm_id; int32_t fd; /* or -1 for anon, or buf depending on flags */ uint64_t flags; uint64_t offset; /* offset info fd [page aligned] */ uint64_t length; /* page aligned */ uint64_t iova; /* page aligned */ uint64_t extensions; }; /* where page aligned is actually more I915_GTT_PAGE_ALIGNMENT */
as I recall. I also recall it being part of a future command stream interface to reduce ioctls, but that is another story.
Thanks Chris. I will change I915_BIND to I915_VM_BIND.
We're very much depending on GEM VM even if BOs wouldn't be used, so this is best called I915_GEM_VM_BIND to match I915_GEM_VM_CREATE and avoid user confusion.
Thanks Joonas. Ok, makes sense. I will make it as such.
Currently, it is only addressing binding SVM system (buffer) and runtime (BOs) allocations. But it can be expanded for other bindings. I have 'type' field instead of 'fd' and 'extensions' & 'iov' can be added later if required.
We should try to have the uAPI as final as early as possible. The change process is harder the later it is done, even for RFC :)
On the same note, I'm inclined to have I915_SVM_MIGRATE called I915_GEM_VM_PREFAULT from the start, as I already have got some confused questions from folks who mix it with memory pressure initiated migration. And it's inherently racy as anybody could race with it, so PREFAULT would give an impression of that.
And I think I915_GEM_VM_PREFAULT would be a generally applicable function, just like I915_GEM_VM_BIND. So we should use a struct member names that are close to I915_GEM_VM_BIND.
Better ideas for name to emphasis the nature of what is being done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
Though the current patchset only supports migrating pages from host to device memory, I intend to support migrating from device to host memory as well with the same ioctl. Users will want that. Not sure what would be a good name then, _MIGRATE/_PREFETCH/_MOVE?
Also, migrating gem objects is currently handled by a separate ioctl which is part of the LMEM patch series. I am open to merging these ioctls together (similar to VM_BIND) into a single MIGRATE ioctl.
Niranjana
Regards, Joonas
Is that OK?
-Chris
Quoting Niranjan Vishwanathapura (2019-11-27 21:23:56)
We should try to have the uAPI as final as early as possible. The change process is harder the later it is done, even for RFC :)
On the same note, I'm inclined to have I915_SVM_MIGRATE called I915_GEM_VM_PREFAULT from the start, as I already have got some confused questions from folks who mix it with memory pressure initiated migration. And it's inherently racy as anybody could race with it, so PREFAULT would give an impression of that.
And I think I915_GEM_VM_PREFAULT would be a generally applicable function, just like I915_GEM_VM_BIND. So we should use a struct member names that are close to I915_GEM_VM_BIND.
Better ideas for name to emphasis the nature of what is being done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
Though current patchset only supports migrating pages from host to device memory, I intend to support migrating from device to host memory as well with same ioctl. User would want that. Not sure what would be a good name then, _MIGRATE/_PREFETCH/_MOVE?
In the HMM concept the CPU access would trigger a fault and trigger the transition, wouldn't it? But you're correct that it is kind of tied to the HMM concept, and may be confusing for BOs.
_PREFETCH is a good suggestion for the term. That leads to the discussion of avoiding an explosion of IOCTLs; Chris suggested consolidation, so maybe we should have I915_GEM_VM_{M,}ADVISE?
If we're looking at connections to fadvise(2), we're basically talking about the equivalent of FADV_WILLNEED. That concept would be quite familiar to users. GEM_VM_{M,}ADVISE with WILLNEED and explicitly passing the memory region? Because we can't decipher that from the running thread like the CPU can.
Thoughts?
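Purely as an illustration of that direction (nothing below is from this series; the name and layout are hypothetical), such a payload could look roughly like:

struct drm_i915_gem_vm_advise {	/* hypothetical, illustrative only */
	__u32 vm_id;
	__u32 advice;		/* e.g. a WILLNEED-style hint */
	__u64 start;		/* VA range the hint applies to */
	__u64 length;
	__u32 region;		/* target memory region, passed explicitly */
	__u32 rsvd;
};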
Also, migrating gem objects is currently handled by separate ioctl which is part of LMEM patch series. I am open to merging these ioctls together (similart to VM_BIND) into a single MIGRATE ioctl.
The IOCTL in the LMEM series is about setting the allowed backing store types of a BO, not about the residency. There was some discussion around doing explicit migrations by changing that list. The current thinking is that we only need to allow setting it once at creation, which also means it might be convertible to a creation-time-only property.
That would eliminate the need for the BO freeze IOCTL that was discussed with Mesa folks.
Regards, Joonas
On Thu, Nov 28, 2019 at 02:12:30PM +0200, Joonas Lahtinen wrote:
Quoting Niranjan Vishwanathapura (2019-11-27 21:23:56)
We should try to have the uAPI as final as early as possible. The change process is harder the later it is done, even for RFC :)
On the same note, I'm inclined to have I915_SVM_MIGRATE called I915_GEM_VM_PREFAULT from the start, as I already have got some confused questions from folks who mix it with memory pressure initiated migration. And it's inherently racy as anybody could race with it, so PREFAULT would give an impression of that.
And I think I915_GEM_VM_PREFAULT would be a generally applicable function, just like I915_GEM_VM_BIND. So we should use a struct member names that are close to I915_GEM_VM_BIND.
Better ideas for name to emphasis the nature of what is being done? I915_GEM_VM_FAULT/I915_GEM_VM_{M,}ADVICE...
Though current patchset only supports migrating pages from host to device memory, I intend to support migrating from device to host memory as well with same ioctl. User would want that. Not sure what would be a good name then, _MIGRATE/_PREFETCH/_MOVE?
In the HMM concept the CPU access would trigger fault, and trigger the transition, wouldn't it? But you're correct that it is kind of tied to the HMM concept, and may be confusing for BOs.
Yes it does. But I think we should give the user a mechanism to explicitly migrate/prefetch it back to system memory.
_PREFETCH is a good suggestion for the term, which lead to discussion to avoid explosion of IOCTLs, Chris suggested consolidation, maybe we should have I915_GEM_VM_{M,}ADVISE?
If we're looking at connections to fadvise(2), we're basically talking about equivalent of FADV_WILLNEED. That concept would be quite familiar to users. GEM_VM_{M,}ADVISE with WILLNEED and explicitly passing the memory region? Because we can't decipher that from the running thread like CPU.
Thoughts?
Yeah, it is closer to mbind (instead of a nodemask, we specify memory region(s)). So, I915_GEM_VM_MBIND? I am ok with _PREFETCH also.
Also, migrating gem objects is currently handled by separate ioctl which is part of LMEM patch series. I am open to merging these ioctls together (similart to VM_BIND) into a single MIGRATE ioctl.
The IOCTL in the LMEM series is about setting the allowed backing store types of a BO, not about the residency. There was some discussion around doing explicit migrations by changing that list. Current thinking is that we only need to allow setting it once at creation. That also means it might be convertible to creation time only property.
That'd eliminate the need for BO freeze IOCTL that was discussed with Mesa folks.
Ok.
Thanks, Niranjana
Regard, Joonas
From: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
As PCIe is a non-coherent link, do not allow direct access to buffer objects across the PCIe link in the SVM case. Upon CPU access (mmap, pread), migrate the buffer object to host memory.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Cc: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
---
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 10 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 29 +++++++++++++++++-----
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  3 +++
 drivers/gpu/drm/i915/intel_memory_region.c |  4 ---
 drivers/gpu/drm/i915/intel_memory_region.h |  4 +++
 5 files changed, 40 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index d60973603cc1..0b41e2f3098c 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -13,6 +13,7 @@ #include "i915_drv.h" #include "i915_gem_gtt.h" #include "i915_gem_ioctls.h" +#include "i915_gem_lmem.h" #include "i915_gem_object.h" #include "i915_trace.h" #include "i915_vma.h" @@ -235,6 +236,15 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf) if (i915_gem_object_is_readonly(obj) && write) return VM_FAULT_SIGBUS;
+ /* Implicitly migrate BO to SMEM if it is SVM mapped */ + if (i915_gem_object_svm_mapped(obj) && i915_gem_object_is_lmem(obj)) { + u32 regions[] = { REGION_MAP(INTEL_MEMORY_SYSTEM, 0) }; + + ret = i915_gem_object_migrate_region(obj, regions, 1); + if (ret) + goto err; + } + /* We don't use vmf->pgoff since that has the fake offset */ page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 10defa723bc9..a7dee1b749cb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -493,12 +493,17 @@ __region_id(u32 region) return INTEL_REGION_UNKNOWN; }
+bool +i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj) +{ + return false; +} + static int i915_gem_object_region_select(struct drm_i915_private *dev_priv, struct drm_i915_gem_object_param *args, struct drm_file *file, struct drm_i915_gem_object *obj) { - struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context; u32 __user *uregions = u64_to_user_ptr(args->data); u32 uregions_copy[INTEL_REGION_UNKNOWN]; int i, ret; @@ -518,16 +523,28 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv, ++uregions; }
+ ret = i915_gem_object_migrate_region(obj, uregions_copy, + args->size); + + return ret; +} + +int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj, + u32 *regions, int size) +{ + struct drm_i915_private *dev_priv = to_i915(obj->base.dev); + struct intel_context *ce = dev_priv->engine[BCS0]->kernel_context; + int i, ret; + mutex_lock(&dev_priv->drm.struct_mutex); ret = i915_gem_object_prepare_move(obj); if (ret) { DRM_ERROR("Cannot set memory region, object in use\n"); - goto err; + goto err; }
- for (i = 0; i < args->size; i++) { - u32 region = uregions_copy[i]; - enum intel_region_id id = __region_id(region); + for (i = 0; i < size; i++) { + enum intel_region_id id = __region_id(regions[i]);
if (id == INTEL_REGION_UNKNOWN) { ret = -EINVAL; @@ -537,7 +554,7 @@ static int i915_gem_object_region_select(struct drm_i915_private *dev_priv, ret = i915_gem_object_migrate(obj, ce, id); if (!ret) { if (!i915_gem_object_has_pages(obj) && - MEMORY_TYPE_FROM_REGION(region) == + MEMORY_TYPE_FROM_REGION(regions[i]) == INTEL_MEMORY_LOCAL) { /* * TODO: this should be part of get_pages(), diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 4d6da91beaff..eec31d9a4fa2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -47,6 +47,9 @@ int i915_gem_object_prepare_move(struct drm_i915_gem_object *obj); int i915_gem_object_migrate(struct drm_i915_gem_object *obj, struct intel_context *ce, enum intel_region_id id); +bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj); +int i915_gem_object_migrate_region(struct drm_i915_gem_object *obj, + u32 *uregions, int size);
void i915_gem_flush_free_objects(struct drm_i915_private *i915);
diff --git a/drivers/gpu/drm/i915/intel_memory_region.c b/drivers/gpu/drm/i915/intel_memory_region.c index baaeaecc64af..049a1b482d9d 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.c +++ b/drivers/gpu/drm/i915/intel_memory_region.c @@ -6,10 +6,6 @@ #include "intel_memory_region.h" #include "i915_drv.h"
-/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */ -#define REGION_MAP(type, inst) \ - BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst) - const u32 intel_region_map[] = { [INTEL_REGION_SMEM] = REGION_MAP(INTEL_MEMORY_SYSTEM, 0), [INTEL_REGION_LMEM] = REGION_MAP(INTEL_MEMORY_LOCAL, 0), diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h index 238722009677..4622c086c06d 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.h +++ b/drivers/gpu/drm/i915/intel_memory_region.h @@ -41,6 +41,10 @@ enum intel_region_id {
#define INTEL_MEMORY_TYPE_SHIFT 16
+/* XXX: Hysterical raisins. BIT(inst) needs to just be (inst) at some point. */ +#define REGION_MAP(type, inst) \ + (BIT((type) + INTEL_MEMORY_TYPE_SHIFT) | BIT(inst)) + #define MEMORY_TYPE_FROM_REGION(r) (ilog2((r) >> INTEL_MEMORY_TYPE_SHIFT)) #define MEMORY_INSTANCE_FROM_REGION(r) (ilog2((r) & 0xffff))
For the Shared Virtual Memory (SVM) system (SYS) allocator, there is no backing buffer object (BO). Add support for binding a VA-to-PA mapping in the device page table.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 60 ++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_gtt.h | 10 +++++
 2 files changed, 68 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 44ff4074db12..d9e95229ba1d 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -183,6 +183,50 @@ static void ppgtt_unbind_vma(struct i915_vma *vma) vma->vm->clear_range(vma->vm, vma->node.start, vma->size); }
+int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size) +{ + return vm->allocate_va_range(vm, start, size); +} + +int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size, + u64 flags, struct sg_table *st, u32 sg_page_sizes) +{ + struct i915_vma vma = {0}; + u32 pte_flags = 0; + + /* use a vma wrapper */ + vma.page_sizes.sg = sg_page_sizes; + vma.node.start = start; + vma.node.size = size; + vma.pages = st; + vma.vm = vm; + + /* Applicable to VLV, and gen8+ */ + if (flags & I915_GTT_SVM_READONLY) + pte_flags |= PTE_READ_ONLY; + + vm->insert_entries(vm, &vma, 0, pte_flags); + return 0; +} + +int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size, + u64 flags, struct sg_table *st, u32 sg_page_sizes) +{ + int ret; + + ret = svm_bind_addr_prepare(vm, start, size); + if (ret) + return ret; + + return svm_bind_addr_commit(vm, start, size, flags, st, sg_page_sizes); +} + +void svm_unbind_addr(struct i915_address_space *vm, + u64 start, u64 size) +{ + vm->clear_range(vm, start, size); +} + static int ppgtt_set_pages(struct i915_vma *vma) { GEM_BUG_ON(vma->pages); @@ -973,11 +1017,21 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm, DBG("%s(%p):{ lvl:%d, start:%llx, end:%llx, idx:%d, len:%d, used:%d }\n", __func__, vm, lvl + 1, start, end, idx, len, atomic_read(px_used(pd))); - GEM_BUG_ON(!len || len >= atomic_read(px_used(pd))); + /* + * FIXME: In SVM case, during mmu invalidation, we need to clear ppgtt, + * but we don't know if the entry exist or not. So, we can't assume + * that it is called only when the entry exist. revisit. + * Also need to add the ebility to properly handle partial invalidations + * by downgrading the large mappings. + */ + GEM_BUG_ON(!len);
do { struct i915_page_table *pt = pd->entry[idx];
+ if (!pt) + continue; + if (atomic_fetch_inc(&pt->used) >> gen8_pd_shift(1) && gen8_pd_contains(start, end, lvl)) { DBG("%s(%p):{ lvl:%d, idx:%d, start:%llx, end:%llx } removing pd\n", @@ -1000,7 +1054,9 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm, __func__, vm, lvl, start, end, gen8_pd_index(start, 0), count, atomic_read(&pt->used)); - GEM_BUG_ON(!count || count >= atomic_read(&pt->used)); + GEM_BUG_ON(!count); + if (count > atomic_read(&pt->used)) + count = atomic_read(&pt->used);
vaddr = kmap_atomic_px(pt); memset64(vaddr + gen8_pd_index(start, 0), diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index d618a5787c61..6a8d55490e39 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -39,6 +39,7 @@ #include <linux/mm.h> #include <linux/pagevec.h> #include <linux/workqueue.h> +#include <linux/scatterlist.h>
#include <drm/drm_mm.h>
@@ -678,4 +679,13 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
#define PIN_OFFSET_MASK (-I915_GTT_PAGE_SIZE)
+/* SVM API */ +#define I915_GTT_SVM_READONLY BIT(0) + +int svm_bind_addr_prepare(struct i915_address_space *vm, u64 start, u64 size); +int svm_bind_addr_commit(struct i915_address_space *vm, u64 start, u64 size, + u64 flags, struct sg_table *st, u32 sg_page_sizes); +int svm_bind_addr(struct i915_address_space *vm, u64 start, u64 size, + u64 flags, struct sg_table *st, u32 sg_page_sizes); +void svm_unbind_addr(struct i915_address_space *vm, u64 start, u64 size); #endif
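A condensed sketch of how these helpers are meant to be used by a caller (the page table mirroring patch later in the series splits prepare and commit around the HMM fault; this is just an illustration with made-up caller arguments):

#include "i915_gem_gtt.h"

static int svm_bind_example(struct i915_address_space *vm, u64 start, u64 length,
			    struct sg_table *st, u32 sg_page_sizes, bool readonly)
{
	u64 flags = readonly ? I915_GTT_SVM_READONLY : 0;
	int ret;

	/* allocate the page directory levels covering [start, start + length) */
	ret = svm_bind_addr_prepare(vm, start, length);
	if (ret)
		return ret;

	/* write the PTEs for the pages described by st */
	ret = svm_bind_addr_commit(vm, start, length, flags, st, sg_page_sizes);
	if (ret)
		svm_unbind_addr(vm, start, length);	/* tear down on failure */

	return ret;
}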
Use HMM page table mirroring support to build the device page table. Implement the bind ioctl and bind the process address range into the specified context's ppGTT. Handle invalidation notifications by unbinding the address range.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com
Cc: Jon Bloomfield jon.bloomfield@intel.com
Cc: Daniel Vetter daniel.vetter@intel.com
Cc: Sudeep Dutt sudeep.dutt@intel.com
Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com
---
 drivers/gpu/drm/i915/Kconfig        |   3 +
 drivers/gpu/drm/i915/Makefile       |   3 +-
 drivers/gpu/drm/i915/i915_drv.c     |   3 +
 drivers/gpu/drm/i915/i915_gem_gtt.c |   5 +
 drivers/gpu/drm/i915/i915_gem_gtt.h |   4 +
 drivers/gpu/drm/i915/i915_svm.c     | 296 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_svm.h     |  50 +++++
 7 files changed, 363 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm.c
 create mode 100644 drivers/gpu/drm/i915/i915_svm.h
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index c2e48710eec8..689e57fe3973 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -141,6 +141,9 @@ config DRM_I915_SVM bool "Enable Shared Virtual Memory support in i915" depends on STAGING depends on DRM_I915 + depends on MMU + select HMM_MIRROR + select MMU_NOTIFIER default n help Choose this option if you want Shared Virtual Memory (SVM) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 75fe45633779..7d4cd9eefd12 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -154,7 +154,8 @@ i915-y += \ intel_wopcm.o
# SVM code -i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o +i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \ + i915_svm.o
# general-purpose microcontroller (GuC) support obj-y += gt/uc/ diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 9c525d3f694c..c190df614c48 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -73,6 +73,7 @@ #include "i915_perf.h" #include "i915_query.h" #include "i915_suspend.h" +#include "i915_svm.h" #include "i915_switcheroo.h" #include "i915_sysfs.h" #include "i915_trace.h" @@ -2700,6 +2701,8 @@ static int i915_bind_ioctl(struct drm_device *dev, void *data, return -ENOENT;
switch (args->type) { + case I915_BIND_SVM_BUFFER: + return i915_svm_bind(vm, args); case I915_BIND_SVM_GEM_OBJ: ret = i915_gem_svm_bind(vm, args, file); } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index d9e95229ba1d..08cbbb4d91cb 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -42,6 +42,7 @@
#include "i915_drv.h" #include "i915_scatterlist.h" +#include "i915_svm.h" #include "i915_trace.h" #include "i915_vgpu.h"
@@ -550,6 +551,7 @@ static void i915_address_space_fini(struct i915_address_space *vm) drm_mm_takedown(&vm->mm);
mutex_destroy(&vm->mutex); + mutex_destroy(&vm->svm_mutex); }
void __i915_vm_close(struct i915_address_space *vm) @@ -593,6 +595,7 @@ void i915_vm_release(struct kref *kref) GEM_BUG_ON(i915_is_ggtt(vm)); trace_i915_ppgtt_release(vm);
+ i915_svm_unbind_mm(vm); queue_rcu_work(vm->i915->wq, &vm->rcu); }
@@ -608,6 +611,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass) * attempt holding the lock is immediately reported by lockdep. */ mutex_init(&vm->mutex); + mutex_init(&vm->svm_mutex); lockdep_set_subclass(&vm->mutex, subclass); i915_gem_shrinker_taints_mutex(vm->i915, &vm->mutex);
@@ -619,6 +623,7 @@ static void i915_address_space_init(struct i915_address_space *vm, int subclass)
INIT_LIST_HEAD(&vm->bound_list); INIT_LIST_HEAD(&vm->svm_list); + RCU_INIT_POINTER(vm->svm, NULL); }
static int __setup_page_dma(struct i915_address_space *vm, diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 6a8d55490e39..722b1e7ce291 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -293,6 +293,8 @@ struct i915_svm_obj { u64 offset; };
+struct i915_svm; + struct i915_address_space { struct kref ref; struct rcu_work rcu; @@ -342,6 +344,8 @@ struct i915_address_space { */ struct list_head svm_list; unsigned int svm_count; + struct i915_svm *svm; + struct mutex svm_mutex; /* protects svm enabling */
struct pagestash free_pages;
diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c new file mode 100644 index 000000000000..4e414f257121 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm.c @@ -0,0 +1,296 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include <linux/mm_types.h> +#include <linux/sched/mm.h> + +#include "i915_svm.h" +#include "intel_memory_region.h" +#include "gem/i915_gem_context.h" + +static const u64 i915_range_flags[HMM_PFN_FLAG_MAX] = { + [HMM_PFN_VALID] = (1 << 0), /* HMM_PFN_VALID */ + [HMM_PFN_WRITE] = (1 << 1), /* HMM_PFN_WRITE */ + [HMM_PFN_DEVICE_PRIVATE] = (1 << 2) /* HMM_PFN_DEVICE_PRIVATE */ +}; + +static const u64 i915_range_values[HMM_PFN_VALUE_MAX] = { + [HMM_PFN_ERROR] = 0xfffffffffffffffeUL, /* HMM_PFN_ERROR */ + [HMM_PFN_NONE] = 0, /* HMM_PFN_NONE */ + [HMM_PFN_SPECIAL] = 0xfffffffffffffffcUL /* HMM_PFN_SPECIAL */ +}; + +static inline bool +i915_range_done(struct hmm_range *range) +{ + bool ret = hmm_range_valid(range); + + hmm_range_unregister(range); + return ret; +} + +static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{ + long ret; + + range->default_flags = 0; + range->pfn_flags_mask = -1UL; + + ret = hmm_range_register(range, &svm->mirror); + if (ret) { + up_read(&svm->mm->mmap_sem); + return (int)ret; + } + + if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) { + up_read(&svm->mm->mmap_sem); + return -EBUSY; + } + + ret = hmm_range_fault(range, 0); + if (ret <= 0) { + if (ret == 0) + ret = -EBUSY; + up_read(&svm->mm->mmap_sem); + hmm_range_unregister(range); + return ret; + } + return 0; +} + +static struct i915_svm *vm_get_svm(struct i915_address_space *vm) +{ + struct i915_svm *svm = vm->svm; + + mutex_lock(&vm->svm_mutex); + if (svm && !kref_get_unless_zero(&svm->ref)) + svm = NULL; + + mutex_unlock(&vm->svm_mutex); + return svm; +} + +static void release_svm(struct kref *ref) +{ + struct i915_svm *svm = container_of(ref, typeof(*svm), ref); + struct i915_address_space *vm = svm->vm; + + hmm_mirror_unregister(&svm->mirror); + mutex_destroy(&svm->mutex); + vm->svm = NULL; + kfree(svm); +} + +static void vm_put_svm(struct i915_address_space *vm) +{ + mutex_lock(&vm->svm_mutex); + if (vm->svm) + kref_put(&vm->svm->ref, release_svm); + mutex_unlock(&vm->svm_mutex); +} + +static u32 i915_svm_build_sg(struct i915_address_space *vm, + struct hmm_range *range, + struct sg_table *st) +{ + struct scatterlist *sg; + u32 sg_page_sizes = 0; + u64 i, npages; + + sg = NULL; + st->nents = 0; + npages = (range->end - range->start) / PAGE_SIZE; + + /* + * No needd to dma map the host pages and later unmap it, as + * GPU is not allowed to access it with SVM. Hence, no need + * of any intermediate data strucutre to hold the mappings. + */ + for (i = 0; i < npages; i++) { + u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1); + + if (sg && (addr == (sg_dma_address(sg) + sg->length))) { + sg->length += PAGE_SIZE; + sg_dma_len(sg) += PAGE_SIZE; + continue; + } + + if (sg) + sg_page_sizes |= sg->length; + + sg = sg ? 
__sg_next(sg) : st->sgl; + sg_dma_address(sg) = addr; + sg_dma_len(sg) = PAGE_SIZE; + sg->length = PAGE_SIZE; + st->nents++; + } + + sg_page_sizes |= sg->length; + sg_mark_end(sg); + return sg_page_sizes; +} + +static void i915_svm_unbind_addr(struct i915_address_space *vm, + u64 start, u64 length) +{ + struct i915_svm *svm = vm->svm; + + mutex_lock(&svm->mutex); + /* FIXME: Need to flush the TLB */ + svm_unbind_addr(vm, start, length); + mutex_unlock(&svm->mutex); +} + +int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args) +{ + struct vm_area_struct *vma; + struct hmm_range range; + struct i915_svm *svm; + u64 i, flags, npages; + struct sg_table st; + u32 sg_page_sizes; + int ret = 0; + + svm = vm_get_svm(vm); + if (!svm) + return -EINVAL; + + args->length += (args->start & ~PAGE_MASK); + args->start &= PAGE_MASK; + DRM_DEBUG_DRIVER("%sing start 0x%llx length 0x%llx vm_id 0x%x\n", + (args->flags & I915_BIND_UNBIND) ? "Unbind" : "Bind", + args->start, args->length, args->vm_id); + if (args->flags & I915_BIND_UNBIND) { + i915_svm_unbind_addr(vm, args->start, args->length); + goto bind_done; + } + + npages = args->length / PAGE_SIZE; + if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL))) { + ret = -ENOMEM; + goto bind_done; + } + + range.pfns = kvmalloc_array(npages, sizeof(uint64_t), GFP_KERNEL); + if (unlikely(!range.pfns)) { + ret = -ENOMEM; + goto range_done; + } + + ret = svm_bind_addr_prepare(vm, args->start, args->length); + if (unlikely(ret)) + goto prepare_done; + + range.pfn_shift = PAGE_SHIFT; + range.start = args->start; + range.flags = i915_range_flags; + range.values = i915_range_values; + range.end = range.start + args->length; + for (i = 0; i < npages; ++i) { + range.pfns[i] = range.flags[HMM_PFN_VALID]; + if (!(args->flags & I915_BIND_READONLY)) + range.pfns[i] |= range.flags[HMM_PFN_WRITE]; + } + + down_read(&svm->mm->mmap_sem); + vma = find_vma(svm->mm, range.start); + if (unlikely(!vma || vma->vm_end < range.end)) { + ret = -EFAULT; + goto vma_done; + } +again: + ret = i915_range_fault(svm, &range); + if (unlikely(ret)) + goto vma_done; + + sg_page_sizes = i915_svm_build_sg(vm, &range, &st); + + mutex_lock(&svm->mutex); + if (!i915_range_done(&range)) { + mutex_unlock(&svm->mutex); + goto again; + } + + flags = (args->flags & I915_BIND_READONLY) ? 
I915_GTT_SVM_READONLY : 0; + ret = svm_bind_addr_commit(vm, args->start, args->length, + flags, &st, sg_page_sizes); + mutex_unlock(&svm->mutex); +vma_done: + up_read(&svm->mm->mmap_sem); + if (unlikely(ret)) + i915_svm_unbind_addr(vm, args->start, args->length); +prepare_done: + kvfree(range.pfns); +range_done: + sg_free_table(&st); +bind_done: + vm_put_svm(vm); + return ret; +} + +static int +i915_sync_cpu_device_pagetables(struct hmm_mirror *mirror, + const struct mmu_notifier_range *update) +{ + struct i915_svm *svm = container_of(mirror, typeof(*svm), mirror); + unsigned long length = update->end - update->start; + + DRM_DEBUG_DRIVER("start 0x%lx length 0x%lx\n", update->start, length); + if (!mmu_notifier_range_blockable(update)) + return -EAGAIN; + + i915_svm_unbind_addr(svm->vm, update->start, length); + return 0; +} + +static void i915_mirror_release(struct hmm_mirror *mirror) +{ +} + +static const struct hmm_mirror_ops i915_mirror_ops = { + .sync_cpu_device_pagetables = &i915_sync_cpu_device_pagetables, + .release = &i915_mirror_release, +}; + +void i915_svm_unbind_mm(struct i915_address_space *vm) +{ + vm_put_svm(vm); +} + +int i915_svm_bind_mm(struct i915_address_space *vm) +{ + struct i915_svm *svm; + struct mm_struct *mm; + int ret = 0; + + mm = get_task_mm(current); + down_write(&mm->mmap_sem); + mutex_lock(&vm->svm_mutex); + if (vm->svm) + goto bind_out; + + svm = kzalloc(sizeof(*svm), GFP_KERNEL); + if (!svm) { + ret = -ENOMEM; + goto bind_out; + } + svm->mirror.ops = &i915_mirror_ops; + mutex_init(&svm->mutex); + kref_init(&svm->ref); + svm->mm = mm; + svm->vm = vm; + + ret = hmm_mirror_register(&svm->mirror, mm); + if (ret) + goto bind_out; + + vm->svm = svm; +bind_out: + mutex_unlock(&vm->svm_mutex); + up_write(&mm->mmap_sem); + mmput(mm); + return ret; +} diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h new file mode 100644 index 000000000000..f176f1dc493f --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm.h @@ -0,0 +1,50 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef __I915_SVM_H +#define __I915_SVM_H + +#include <linux/hmm.h> + +#include "i915_drv.h" + +#if defined(CONFIG_DRM_I915_SVM) +struct i915_svm { + /* i915 address space */ + struct i915_address_space *vm; + + /* Associated process mm */ + struct mm_struct *mm; + + /* hmm_mirror to track memory invalidations for a process */ + struct hmm_mirror mirror; + + struct mutex mutex; /* protects svm operations */ + struct kref ref; +}; + +int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args); +void i915_svm_unbind_mm(struct i915_address_space *vm); +int i915_svm_bind_mm(struct i915_address_space *vm); +static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) +{ + return vm->svm; +} + +#else + +struct i915_svm { }; +static inline int i915_svm_bind(struct i915_address_space *vm, + struct drm_i915_bind *args) +{ return -ENOTSUPP; } +static inline void i915_svm_unbind_mm(struct i915_address_space *vm) { } +static inline int i915_svm_bind_mm(struct i915_address_space *vm) +{ return -ENOTSUPP; } +static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) +{ return false; } + +#endif + +#endif /* __I915_SVM_H */
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
+static inline bool +i915_range_done(struct hmm_range *range) +{
- bool ret = hmm_range_valid(range);
- hmm_range_unregister(range);
- return ret;
+}
This needs to be updated to follow the new API, but this pattern is generally wrong: a failure of hmm_range_valid should retry the range_fault, and so forth. See hmm.rst.
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
+static u32 i915_svm_build_sg(struct i915_address_space *vm,
struct hmm_range *range,
struct sg_table *st)
+{
- struct scatterlist *sg;
- u32 sg_page_sizes = 0;
- u64 i, npages;
- sg = NULL;
- st->nents = 0;
- npages = (range->end - range->start) / PAGE_SIZE;
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
- range.pfn_shift = PAGE_SHIFT;
- range.start = args->start;
- range.flags = i915_range_flags;
- range.values = i915_range_values;
- range.end = range.start + args->length;
- for (i = 0; i < npages; ++i) {
range.pfns[i] = range.flags[HMM_PFN_VALID];
if (!(args->flags & I915_BIND_READONLY))
range.pfns[i] |= range.flags[HMM_PFN_WRITE];
- }
- down_read(&svm->mm->mmap_sem);
- vma = find_vma(svm->mm, range.start);
Why is find_vma needed?
- if (unlikely(!vma || vma->vm_end < range.end)) {
ret = -EFAULT;
goto vma_done;
- }
+again:
- ret = i915_range_fault(svm, &range);
- if (unlikely(ret))
goto vma_done;
- sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
Keep in mind what can be done with the range values is limited until the lock is obtained. Reading the pfns and flags is OK though.
+int i915_svm_bind_mm(struct i915_address_space *vm) +{
- struct i915_svm *svm;
- struct mm_struct *mm;
- int ret = 0;
- mm = get_task_mm(current);
I don't think the get_task_mm(current) is needed, the mmget is already done for current->mm ?
Jason
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
+static inline bool +i915_range_done(struct hmm_range *range) +{
- bool ret = hmm_range_valid(range);
- hmm_range_unregister(range);
- return ret;
+}
This needs to be updated to follow the new API, but this pattern is generally wrong: a failure of hmm_range_valid should retry the range_fault, and so forth. See hmm.rst.
Thanks Jason for the feedback. Yah, will update as per new API in the next revision. The caller of this function is indeed retrying if the range is not valid. But I got the point. I made changes in my local build as per hmm.rst and it is working fine. Will include the change in next revision.
Noticed that as hmm_range_fault() returns the number of valid pages, the hmm.rst example probably should be updated to check for that instead of a non-zero return value.
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
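For reference, a minimal sketch of that simpler setup, assuming the i915_range_flags/i915_range_values tables from the patch stay as they are: instead of pre-filling every entry of range.pfns, the requested permissions go into default_flags and pfn_flags_mask is cleared so the defaults apply to the whole range.

	range.default_flags = range.flags[HMM_PFN_VALID];
	if (!(args->flags & I915_BIND_READONLY))
		range.default_flags |= range.flags[HMM_PFN_WRITE];
	/* zero mask: ignore per-pfn requests and use default_flags throughout */
	range.pfn_flags_mask = 0;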
+static u32 i915_svm_build_sg(struct i915_address_space *vm,
struct hmm_range *range,
struct sg_table *st)
+{
- struct scatterlist *sg;
- u32 sg_page_sizes = 0;
- u64 i, npages;
- sg = NULL;
- st->nents = 0;
- npages = (range->end - range->start) / PAGE_SIZE;
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in a discrete card, we don't need a dma map, which is the case here. Here we don't expect any access to host memory, as mentioned in the comment above. For integrated gfx, if we need to use SVM (with the iommu enabled), we do need to dma map. This requires that we maintain the dma address for each page in some intermediate data structure. Currently, this is TBD.
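As a rough sketch of what that integrated-gfx path could look like (the helper name and its placement are assumptions, not part of this series): the sg_table would carry the struct pages returned by hmm_range_fault() and be handed to the DMA API, so that sg_dma_address()/sg_dma_len() are only ever filled in by a real dma map.

static int i915_svm_dma_map(struct device *dev, struct sg_table *st)
{
	int nents;

	/* st->sgl is assumed to carry struct pages set via sg_set_page() */
	nents = dma_map_sg(dev, st->sgl, st->nents, DMA_BIDIRECTIONAL);
	if (!nents)
		return -ENOMEM;

	/* only now are sg_dma_address()/sg_dma_len() valid for the device */
	return 0;
}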
- range.pfn_shift = PAGE_SHIFT;
- range.start = args->start;
- range.flags = i915_range_flags;
- range.values = i915_range_values;
- range.end = range.start + args->length;
- for (i = 0; i < npages; ++i) {
range.pfns[i] = range.flags[HMM_PFN_VALID];
if (!(args->flags & I915_BIND_READONLY))
range.pfns[i] |= range.flags[HMM_PFN_WRITE];
- }
- down_read(&svm->mm->mmap_sem);
- vma = find_vma(svm->mm, range.start);
Why is find_vma needed?
Ok, looks like not needed, will remove it.
- if (unlikely(!vma || vma->vm_end < range.end)) {
ret = -EFAULT;
goto vma_done;
- }
+again:
- ret = i915_range_fault(svm, &range);
- if (unlikely(ret))
goto vma_done;
- sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
Keep in mind what can be done with the range values is limited until the lock is obtained. Reading the pfns and flags is OK though.
Yah, I either have to build the sgl here now and discard it later if the range is not valid, or do it while holding the lock, which increases the contention period. So I settled on doing it here as it seemed the better choice to me.
+int i915_svm_bind_mm(struct i915_address_space *vm) +{
- struct i915_svm *svm;
- struct mm_struct *mm;
- int ret = 0;
- mm = get_task_mm(current);
I don't think the get_task_mm(current) is needed, the mmget is already done for current->mm ?
No, I don't think mmget is already done for current->mm here.
Jason
On Fri, Nov 22, 2019 at 08:44:18PM -0800, Niranjan Vishwanathapura wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
+static inline bool +i915_range_done(struct hmm_range *range) +{
- bool ret = hmm_range_valid(range);
- hmm_range_unregister(range);
- return ret;
+}
This needs to be updated to follow the new API, but this pattern is generally wrong: a failure of hmm_range_valid should retry the range_fault, and so forth. See hmm.rst.
Thanks Jason for the feedback. Yah, will update as per new API in the next revision. The caller of this function is indeed retrying if the range is not valid. But I got the point. I made changes in my local build as per hmm.rst and it is working fine. Will include the change in next revision.
Generally speaking this locking approach is a live-lockable collision-retry scheme.
For maintainability it is best if the retry loop is explicit and doesn't unregister the range during the retry. I also encourage adding an absolute time bound to the retry as it *could* live lock.
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
Sorry, I meant calling hmm_range_register during fault processing.
If your driver works around user space objects that cover a VA then the range should be created when the object is created.
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in discrete card, we don't need dma map which is the case here.
How did we get to device memory on a card? Isn't range->pfns a CPU PFN at this point?
I'm confused.
+int i915_svm_bind_mm(struct i915_address_space *vm) +{
- struct i915_svm *svm;
- struct mm_struct *mm;
- int ret = 0;
- mm = get_task_mm(current);
I don't think the get_task_mm(current) is needed, the mmget is already done for current->mm ?
No, I don't think mmget is already done for current->mm here.
I'm not certain, but I thought it is already done because it is 'current' and current cannot begin to destroy the mm while a syscall is currently running.
Jason
On Sat, Nov 23, 2019 at 11:53:52PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 08:44:18PM -0800, Niranjan Vishwanathapura wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
+static inline bool +i915_range_done(struct hmm_range *range) +{
- bool ret = hmm_range_valid(range);
- hmm_range_unregister(range);
- return ret;
+}
This needs to be updated to follow the new API, but this pattern is generally wrong: a failure of hmm_range_valid should retry the range_fault, and so forth. See hmm.rst.
Thanks Jason for the feedback. Yah, will update as per new API in the next revision. The caller of this function is indeed retrying if the range is not valid. But I got the point. I made changes in my local build as per hmm.rst and it is working fine. Will include the change in next revision.
Generally speaking this locking approach is a live-lockable collision-retry scheme.
For maintainability it is best if the retry loop is explicit and doesn't unregister the range during the retry. I also encourage adding an absolute time bound to the retry as it *could* live lock.
Ok, thanks for the suggestion, will do.
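For reference, a rough sketch of what an explicit, time-bounded retry loop could look like; i915_range_fault(), i915_svm_build_sg(), svm_bind_addr_commit() and svm->mutex are taken from the patch above, while the timeout bound and the hmm_range_valid() check are assumptions that would need to track whatever HMM API the target kernel provides.

static int i915_svm_bind_range(struct i915_address_space *vm,
			       struct i915_svm *svm, struct hmm_range *range,
			       struct sg_table *st, u64 flags)
{
	/* assumed upper bound so a racing invalidation cannot live-lock us */
	unsigned long timeout = jiffies + msecs_to_jiffies(1000);
	u32 sg_page_sizes;
	int ret;

	do {
		if (time_after(jiffies, timeout))
			return -EBUSY;

		ret = i915_range_fault(svm, range);
		if (ret)
			return ret;

		/* the snapshot is only usable once the driver lock confirms
		 * that no invalidation raced with the fault
		 */
		sg_page_sizes = i915_svm_build_sg(vm, range, st);

		mutex_lock(&svm->mutex);
		if (hmm_range_valid(range))
			break;
		mutex_unlock(&svm->mutex);
	} while (1);

	ret = svm_bind_addr_commit(vm, range->start, range->end - range->start,
				   flags, st, sg_page_sizes);
	mutex_unlock(&svm->mutex);
	return ret;
}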
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
Sorry, I meant calling hmm_range_register during fault processing.
If your driver works around user space objects that cover a VA then the range should be created when the object is created.
Oh ok. No, there is no user space object here. Binding the user space object to device page table is handled in patch 03 of this series (no HMM there). This is for binding a system allocated (malloc) memory. User calls the bind ioctl with the VA range.
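To make that model concrete, a hypothetical userspace flow for the malloc-plus-bind path; DRM_IOCTL_I915_BIND and the drm_i915_bind layout are inferred from the fields i915_svm_bind() consumes (start, length, vm_id, flags) and may not match the final UAPI.

	void *buf = malloc(size);               /* ordinary system allocation */

	struct drm_i915_bind bind = {
		.start  = (uint64_t)(uintptr_t)buf,
		.length = size,
		.vm_id  = vm_id,                /* a VM created with SVM enabled */
		.flags  = 0,                    /* or I915_BIND_READONLY */
	};

	if (ioctl(drm_fd, DRM_IOCTL_I915_BIND, &bind))
		perror("I915_BIND");
	/* the GPU can now access buf[0..size) through the bound VA range */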
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in discrete card, we don't need dma map which is the case here.
How did we get to device memory on a card? Isn't range->pfns a CPU PFN at this point?
I'm confused.
Device memory plugin is done through devm_memremap_pages() in patch 07 of this series. In that patch, we convert the CPU PFN to device PFN before building the sgl (this is similar to the nouveau driver).
+int i915_svm_bind_mm(struct i915_address_space *vm) +{
- struct i915_svm *svm;
- struct mm_struct *mm;
- int ret = 0;
- mm = get_task_mm(current);
I don't think the get_task_mm(current) is needed, the mmget is already done for current->mm ?
No, I don't think mmget is already done for current->mm here.
I'm not certain, but I thought it is already done because it is 'current' and current cannot begin to destroy the mm while a syscall is currently running.
Ok, I don't know, at least the driver is not calling it. But it makes sense to me. I will remove get_task_mm and directly use current->mm.
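A small sketch of the agreed change, for reference; the mmgrab()/mmdrop() remark is only an assumption about how a reference outliving the syscall would be kept, not something this revision does.

	/* inside the ioctl, current->mm cannot go away under us */
	struct mm_struct *mm = current->mm;

	down_write(&mm->mmap_sem);
	mutex_lock(&vm->svm_mutex);
	/* ... allocate svm, hmm_mirror_register(&svm->mirror, mm), ... */
	/* a reference held beyond the syscall would need mmgrab()/mmdrop() */
	mutex_unlock(&vm->svm_mutex);
	up_write(&mm->mmap_sem);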
Niranjana
Jason
On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
Sorry, I meant calling hmm_range_register during fault processing.
If your driver works around user space objects that cover a VA then the range should be created when the object is created.
Oh ok. No, there is no user space object here. Binding the user space object to device page table is handled in patch 03 of this series (no HMM there). This is for binding a system allocated (malloc) memory. User calls the bind ioctl with the VA range.
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in discrete card, we don't need dma map which is the case here.
How did we get to device memory on a card? Isn't range->pfns a CPU PFN at this point?
I'm confused.
Device memory plugin is done through devm_memremap_pages() in patch 07 of this series. In that patch, we convert the CPU PFN to device PFN before building the sgl (this is similar to the nouveau driver).
But earlier just called hmm_range_fault(), it can return all kinds of pages. If these are only allowed to be device pages here then that must be checked (under lock)
And putting the cpu PFN of a ZONE_DEVICE device page into sg_dma_address still looks very wrong to me
Jason
On Mon, Nov 25, 2019 at 01:24:18PM +0000, Jason Gunthorpe wrote:
On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
Sorry, I meant calling hmm_range_register during fault processing.
If your driver works around user space objects that cover a VA then the range should be created when the object is created.
Oh ok. No, there is no user space object here. Binding the user space object to device page table is handled in patch 03 of this series (no HMM there). This is for binding a system allocated (malloc) memory. User calls the bind ioctl with the VA range.
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in discrete card, we don't need dma map which is the case here.
How did we get to device memory on a card? Isn't range->pfns a CPU PFN at this point?
I'm confused.
Device memory plugin is done through devm_memremap_pages() in patch 07 of this series. In that patch, we convert the CPU PFN to device PFN before building the sgl (this is similar to the nouveau driver).
But earlier just called hmm_range_fault(), it can return all kinds of pages. If these are only allowed to be device pages here then that must be checked (under lock)
Yah, let me add the check. (I kept it unchecked for debug purposes and forgot to add the check before sending the patches.)
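A minimal sketch of that check, assuming it sits after i915_range_fault() with svm->mutex held and before the PFNs are converted to LMEM addresses; the label on the error path is hypothetical.

	for (i = 0; i < npages; i++) {
		struct page *page;

		page = hmm_device_entry_to_page(&range, range.pfns[i]);
		if (!page || !is_device_private_page(page) ||
		    page->pgmap->ops != &i915_devmem_pagemap_ops) {
			ret = -EFAULT;          /* not (our) device memory */
			goto bind_unlock;       /* hypothetical unwind label */
		}
	}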
And putting the cpu PFN of a ZONE_DEVICE device page into sg_dma_address still looks very wrong to me
The below call in patch 7 does convert any cpu PFN to device address. So, it won't be CPU PFN. i915_dmem_convert_pfn(vm->i915, &range);
Also, only reason to use sgl list is because i915 driver page table update functions takes an sgl, otherwise we can directly deal with range.pfns array.
Niranjana
Jason
On Mon, Nov 25, 2019 at 08:32:58AM -0800, Niranjan Vishwanathapura wrote:
And putting the cpu PFN of a ZONE_DEVICE device page into sg_dma_address still looks very wrong to me
The below call in patch 7 does convert any cpu PFN to device address. So, it won't be CPU PFN. i915_dmem_convert_pfn(vm->i915, &range);
Well, then it should be done here in patch 6, surely?
Also, only reason to use sgl list is because i915 driver page table update functions takes an sgl, otherwise we can directly deal with range.pfns array.
Maybe that should be fixed instead of abusing sgl
Jason
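For illustration, a hypothetical sketch of the sgl-free direction suggested here: walk range->pfns directly and hand each device address to a per-page insert helper. insert_pte() does not exist in i915; it only stands in for whatever page-table update interface would replace the sg_table-based one.

static void i915_svm_insert_range(struct i915_address_space *vm,
				  struct hmm_range *range)
{
	u64 npages = (range->end - range->start) >> PAGE_SHIFT;
	u64 i;

	for (i = 0; i < npages; i++) {
		/* same masking as i915_svm_build_sg(): strip the flag bits */
		u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);

		/* hypothetical helper: map one page at this GPU virtual address */
		insert_pte(vm, range->start + (i << PAGE_SHIFT), addr);
	}
}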
On Tue, Nov 26, 2019 at 06:45:14PM +0000, Jason Gunthorpe wrote:
On Mon, Nov 25, 2019 at 08:32:58AM -0800, Niranjan Vishwanathapura wrote:
And putting the cpu PFN of a ZONE_DEVICE device page into sg_dma_address still looks very wrong to me
The below call in patch 7 does convert any cpu PFN to device address. So, it won't be CPU PFN. i915_dmem_convert_pfn(vm->i915, &range);
Well, then it should be done here in patch 6, surely?
Also, only reason to use sgl list is because i915 driver page table update functions takes an sgl, otherwise we can directly deal with range.pfns array.
Maybe that should be fixed instead of abusing sgl
Sorry, missed these comments.
Well, this i915 SVM support can also be extended to work on Intel integrated gfx with host memory (no ZONE_DEVICE), though enabling it is not the primary focus of this patch series. There we need to dma map these CPU PFNs and assign them to sg_dma_address.
Device memory plugin for discrete graphics is added in patch 7 of this series, hence this convert_pfn function is added there.
Niranjana
Jason
On Mon, Nov 25, 2019 at 01:24:18PM +0000, Jason Gunthorpe wrote:
On Sun, Nov 24, 2019 at 01:12:47PM -0800, Niranjan Vishwanathapura wrote:
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Yah, not required. In my local build I tried with proper default_flags and set pfn_flags_mask to 0 and it is working fine.
Sorry, I meant calling hmm_range_register during fault processing.
If your driver works around user space objects that cover a VA then the range should be created when the object is created.
Oh ok. No, there is no user space object here. Binding the user space object to device page table is handled in patch 03 of this series (no HMM there). This is for binding a system allocated (malloc) memory. User calls the bind ioctl with the VA range.
- /*
* No need to dma map the host pages and later unmap it, as
* GPU is not allowed to access it with SVM. Hence, no need
* of any intermediate data structure to hold the mappings.
*/
- for (i = 0; i < npages; i++) {
u64 addr = range->pfns[i] & ~((1UL << range->pfn_shift) - 1);
if (sg && (addr == (sg_dma_address(sg) + sg->length))) {
sg->length += PAGE_SIZE;
sg_dma_len(sg) += PAGE_SIZE;
continue;
}
if (sg)
sg_page_sizes |= sg->length;
sg = sg ? __sg_next(sg) : st->sgl;
sg_dma_address(sg) = addr;
sg_dma_len(sg) = PAGE_SIZE;
sg->length = PAGE_SIZE;
st->nents++;
It is odd to build the range into a sgl.
IMHO it is not a good idea to use the sg_dma_address like this, that should only be filled in by a dma map. Where does it end up being used?
The sgl is used to plug into the page table update function in i915.
For the device memory in discrete card, we don't need dma map which is the case here.
How did we get to device memory on a card? Isn't range->pfns a CPU PFN at this point?
I'm confused.
Device memory plugin is done through devm_memremap_pages() in patch 07 of this series. In that patch, we convert the CPU PFN to device PFN before building the sgl (this is similar to the nouveau driver).
But earlier just called hmm_range_fault(), it can return all kinds of pages. If these are only allowed to be device pages here then that must be checked (under lock)
And putting the cpu PFN of a ZONE_DEVICE device page into sg_dma_address still looks very wrong to me
Yeah, nouveau has a different code path, but that is because the nouveau driver architecture allows it; I do not see any easy way to hammer this into the current i915 architecture. I will ponder on this a bit more.
Cheers, Jérôme
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
That being said, for best performance we still expect well behaving application to provide hint to kernel so that we know if a range of virtual address is likely to be use by the GPU or not. But this is not, and should not be a requirement.
I posted patchset and given talks about this, but long term i believe we want a common API to manage hint provided by userspace (see my talk at LPC this year about new syscall to bind memory to device). With such thing in place we could hang mmu notifier range to it. But the driver will still need to be able to handle the case where there is no hint provided by userspace and thus no before knowledge of what VA might be accessed.
Cheers, Jérôme
On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
That being said, for best performance we still expect well behaving application to provide hint to kernel so that we know if a range of virtual address is likely to be use by the GPU or not. But this is not, and should not be a requirement.
I posted patchset and given talks about this, but long term i believe we want a common API to manage hint provided by userspace (see my talk at LPC this year about new syscall to bind memory to device). With such thing in place we could hang mmu notifier range to it. But the driver will still need to be able to handle the case where there is no hint provided by userspace and thus no before knowledge of what VA might be accessed.
Thanks Jerome for the explanation. Will check out your LPC talk. Yes, I agree. When GPU faulting support is available, the driver will handle the fault, migrate the page if needed and bind the page using HMM. This patch series adds support for prefetch and bind hints (via explicit ioctls). Also, patch 12 of the series gives the user the ability to enable/disable SVM on a per-VM basis, and by default SVM is disabled.
Niranjana
Cheers, Jérôme
On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
No, RDMA has the full address space option as well, it is called 'implicit ODP'
This is implemented by registering ranges at a level in our page table (currently 512G) whenever that level has populated pages below it.
I think this is a better strategy than temporary ranges.
But other GPU drivers like AMD are using BO and TTM objects with fixed VA ranges and the range is tied to the BO/TTM.
I'm not sure if this i915 usage is closer to AMD or closer to nouveau.
Jason
On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
No, RDMA has the full address space option as well, it is called 'implicit ODP'
This is implemented by registering ranges at a level in our page table (currently 512G) whenever that level has populated pages below it.
I think this is a better strategy than temporary ranges.
But other GPU drivers like AMD are using BO and TTM objects with fixed VA ranges and the range is tied to the BO/TTM.
I'm not sure if this i915 usage is closer to AMD or closer to nouveau.
I don't know the full details of the HMM use cases in amd and nouveau. AMD seems to be using it for userptr objects, which are tied to a BO. I am not sure if nouveau has any BO tied to these address ranges.
Niranjana
Jason
On Tue, Dec 03, 2019 at 11:19:43AM -0800, Niranjan Vishwanathapura wrote:
On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
No, RDMA has the full address space option as well, it is called 'implicit ODP'
This is implemented by registering ranges at a level in our page table (currently 512G) whenever that level has populated pages below it.
I think this is a better strategy than temporary ranges.
HMM's original design did not have ranges and was well suited to nouveau. Recent changes make it more tied to the range and less suited to nouveau. I would not consider a 512GB implicit range a good thing. The plan I have is to create implicit ranges and align them on VMAs. I do not know when I will have time to get to that.
But other GPU drivers like AMD are using BO and TTM objects with fixed VA ranges and the range is tied to the BO/TTM.
I'm not sure if this i915 usage is closer to AMD or closer to nouveau.
I don't know the full details of the HMM use cases in amd and nouveau. AMD seems to be using it for userptr objects, which are tied to a BO. I am not sure if nouveau has any BO tied to these address ranges.
It is closer to nouveau; the AMD usage is the old userptr use case where you have a BO tied to a range. SVM means any valid CPU address, and thus implies that there is no BO tied to it (there is still the BO use case that must still work here).
Cheers, Jérôme
On Wed, Dec 04, 2019 at 04:51:36PM -0500, Jerome Glisse wrote:
On Tue, Dec 03, 2019 at 11:19:43AM -0800, Niranjan Vishwanathapura wrote:
On Tue, Nov 26, 2019 at 06:32:52PM +0000, Jason Gunthorpe wrote:
On Mon, Nov 25, 2019 at 11:33:27AM -0500, Jerome Glisse wrote:
On Fri, Nov 22, 2019 at 11:33:12PM +0000, Jason Gunthorpe wrote:
On Fri, Nov 22, 2019 at 12:57:27PM -0800, Niranjana Vishwanathapura wrote:
[...]
+static int +i915_range_fault(struct i915_svm *svm, struct hmm_range *range) +{
- long ret;
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, &svm->mirror);
- if (ret) {
up_read(&svm->mm->mmap_sem);
return (int)ret;
- }
Using a temporary range is the pattern from nouveau, is it really necessary in this driver?
Just to comment on this, for GPU the usage model is not application register range of virtual address it wants to use. It is GPU can access _any_ CPU valid address just like the CPU would (modulo mmap of device file).
This is because the API you want in userspace is application passing random pointer to the GPU and GPU being able to chase down any kind of random pointer chain (assuming all valid ie pointing to valid virtual address for the process).
This is unlike the RDMA case.
No, RDMA has the full address space option as well, it is called 'implicit ODP'
This is implemented by registering ranges at a level in our page table (currently 512G) whenever that level has populated pages below it.
I think this is a better strategy than temporary ranges.
HMM's original design did not have ranges and was well suited to nouveau. Recent changes make it more tied to the range and less suited to nouveau. I would not consider a 512GB implicit range a good thing. The plan I have is to create implicit ranges and align them on VMAs. I do not know when I will have time to get to that.
For mlx5 the 512G is aligned to the page table levels, so it is a reasonable approximation. GPU could do the same.
Not sure VMA-related ranges are really any better...
Jason
Plug in device memory through HMM as DEVICE_PRIVATE. Add support functions to allocate pages and free pages from device memory. Implement an ioctl to migrate pages from host to device memory. For now, only support migrating pages from host memory to device memory.
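For completeness, a hypothetical userspace flow for the migrate ioctl described above (the kernel side follows in the patch below); DRM_IOCTL_I915_SVM_MIGRATE and the drm_i915_svm_migrate layout are inferred from the fields the handler reads (start, length, region) and may differ from the final UAPI.

	void *buf = malloc(size);

	struct drm_i915_svm_migrate migrate = {
		.start  = (uint64_t)(uintptr_t)buf,
		.length = size,
		.region = lmem_region_id,       /* an INTEL_MEMORY_LOCAL region */
	};

	/* best-effort prefetch of the VA range into device memory */
	if (ioctl(drm_fd, DRM_IOCTL_I915_SVM_MIGRATE, &migrate))
		perror("I915_SVM_MIGRATE");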
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Sudeep Dutt sudeep.dutt@intel.com Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com --- drivers/gpu/drm/i915/Kconfig | 9 + drivers/gpu/drm/i915/Makefile | 3 +- drivers/gpu/drm/i915/gem/i915_gem_object.c | 13 - drivers/gpu/drm/i915/i915_buddy.h | 12 + drivers/gpu/drm/i915/i915_drv.c | 1 + drivers/gpu/drm/i915/i915_svm.c | 2 + drivers/gpu/drm/i915/i915_svm.h | 15 + drivers/gpu/drm/i915/i915_svm_devmem.c | 391 +++++++++++++++++++++ drivers/gpu/drm/i915/intel_memory_region.h | 14 + drivers/gpu/drm/i915/intel_region_lmem.c | 10 + 10 files changed, 456 insertions(+), 14 deletions(-) create mode 100644 drivers/gpu/drm/i915/i915_svm_devmem.c
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 689e57fe3973..66337f2ca2bf 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -141,9 +141,18 @@ config DRM_I915_SVM bool "Enable Shared Virtual Memory support in i915" depends on STAGING depends on DRM_I915 + depends on ARCH_ENABLE_MEMORY_HOTPLUG + depends on ARCH_ENABLE_MEMORY_HOTREMOVE + depends on MEMORY_HOTPLUG + depends on MEMORY_HOTREMOVE + depends on ARCH_HAS_PTE_DEVMAP + depends on SPARSEMEM_VMEMMAP + depends on ZONE_DEVICE + depends on DEVICE_PRIVATE depends on MMU select HMM_MIRROR select MMU_NOTIFIER + select MIGRATE_VMA_HELPER default n help Choose this option if you want Shared Virtual Memory (SVM) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 7d4cd9eefd12..b574ec31ea2e 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -155,7 +155,8 @@ i915-y += \
# SVM code i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \ - i915_svm.o + i915_svm.o \ + i915_svm_devmem.o
# general-purpose microcontroller (GuC) support obj-y += gt/uc/ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index a7dee1b749cb..dd88fa87b7fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -480,19 +480,6 @@ int __init i915_global_objects_init(void) return 0; }
-static enum intel_region_id -__region_id(u32 region) -{ - enum intel_region_id id; - - for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) { - if (intel_region_map[id] == region) - return id; - } - - return INTEL_REGION_UNKNOWN; -} - bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj) { diff --git a/drivers/gpu/drm/i915/i915_buddy.h b/drivers/gpu/drm/i915/i915_buddy.h index ed41f3507cdc..afc493e6c130 100644 --- a/drivers/gpu/drm/i915/i915_buddy.h +++ b/drivers/gpu/drm/i915/i915_buddy.h @@ -9,6 +9,9 @@ #include <linux/bitops.h> #include <linux/list.h>
+/* 512 bits (one per pages) supports 2MB blocks */ +#define I915_BUDDY_MAX_PAGES 512 + struct i915_buddy_block { #define I915_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12) #define I915_BUDDY_HEADER_STATE GENMASK_ULL(11, 10) @@ -32,6 +35,15 @@ struct i915_buddy_block { */ struct list_head link; struct list_head tmp_link; + + unsigned long pfn_first; + /* + * FIXME: There are other alternatives to bitmap. Like splitting the + * block into contiguous 4K sized blocks. But it is part of bigger + * issues involving partially invalidating large mapping, freeing the + * blocks etc., revisit. + */ + unsigned long bitmap[BITS_TO_LONGS(I915_BUDDY_MAX_PAGES)]; };
#define I915_BUDDY_MAX_ORDER I915_BUDDY_HEADER_ORDER diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index c190df614c48..740b4b9d39a8 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2771,6 +2771,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_BIND, i915_bind_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_SVM_MIGRATE, i915_svm_migrate_ioctl, DRM_RENDER_ALLOW) };
static struct drm_driver driver = { diff --git a/drivers/gpu/drm/i915/i915_svm.c b/drivers/gpu/drm/i915/i915_svm.c index 4e414f257121..fe7d53634606 100644 --- a/drivers/gpu/drm/i915/i915_svm.c +++ b/drivers/gpu/drm/i915/i915_svm.c @@ -206,6 +206,8 @@ int i915_svm_bind(struct i915_address_space *vm, struct drm_i915_bind *args) if (unlikely(ret)) goto vma_done;
+ /* XXX: Assuming the range is exclusively LMEM or SMEM, fix it */ + i915_dmem_convert_pfn(vm->i915, &range); sg_page_sizes = i915_svm_build_sg(vm, &range, &st);
mutex_lock(&svm->mutex); diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h index f176f1dc493f..a1b62997e925 100644 --- a/drivers/gpu/drm/i915/i915_svm.h +++ b/drivers/gpu/drm/i915/i915_svm.h @@ -33,6 +33,14 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) return vm->svm; }
+void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, + struct hmm_range *range); +int i915_svm_migrate_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv); +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, + u64 size); +void i915_svm_devmem_remove(struct i915_devmem *devmem); + #else
struct i915_svm { }; @@ -45,6 +53,13 @@ static inline int i915_svm_bind_mm(struct i915_address_space *vm) static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) { return false; }
+static inline int i915_svm_migrate_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ return -ENOTSUPP; } +static inline +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size) +{ return NULL; } +static inline void i915_svm_devmem_remove(struct i915_devmem *devmem) { } #endif
#endif /* __I915_SVM_H */ diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c new file mode 100644 index 000000000000..40c2f79ff614 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -0,0 +1,391 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include <linux/mm_types.h> +#include <linux/sched/mm.h> + +#include "i915_svm.h" +#include "intel_memory_region.h" + +struct i915_devmem_migrate { + struct drm_i915_private *i915; + struct migrate_vma *args; + + enum intel_region_id src_id; + enum intel_region_id dst_id; + u64 npages; +}; + +struct i915_devmem { + struct drm_i915_private *i915; + struct dev_pagemap pagemap; + unsigned long pfn_first; + unsigned long pfn_last; +}; + +static inline bool +i915_dmem_page(struct drm_i915_private *dev_priv, struct page *page) +{ + if (!is_device_private_page(page)) + return false; + + return true; +} + +void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, + struct hmm_range *range) +{ + unsigned long i, npages; + + npages = (range->end - range->start) >> PAGE_SHIFT; + for (i = 0; i < npages; ++i) { + struct i915_buddy_block *block; + struct intel_memory_region *mem; + struct page *page; + u64 addr; + + page = hmm_device_entry_to_page(range, range->pfns[i]); + if (!page) + continue; + + if (!(range->pfns[i] & range->flags[HMM_PFN_DEVICE_PRIVATE])) + continue; + + if (!i915_dmem_page(dev_priv, page)) { + WARN(1, "Some unknown device memory !\n"); + range->pfns[i] = 0; + continue; + } + + block = page->zone_device_data; + mem = block->private; + addr = mem->region.start + + i915_buddy_block_offset(block); + addr += (page_to_pfn(page) - block->pfn_first) << PAGE_SHIFT; + + range->pfns[i] &= ~range->flags[HMM_PFN_DEVICE_PRIVATE]; + range->pfns[i] &= ((1UL << range->pfn_shift) - 1); + range->pfns[i] |= (addr >> PAGE_SHIFT) << range->pfn_shift; + } +} + +static int +i915_devmem_page_alloc_locked(struct intel_memory_region *mem, + unsigned long npages, + struct list_head *blocks) +{ + unsigned long size = ALIGN((npages * PAGE_SIZE), mem->mm.chunk_size); + struct i915_buddy_block *block; + int ret; + + INIT_LIST_HEAD(blocks); + ret = __intel_memory_region_get_pages_buddy(mem, size, 0, blocks); + if (unlikely(ret)) + goto alloc_failed; + + list_for_each_entry(block, blocks, link) { + block->pfn_first = mem->devmem->pfn_first; + block->pfn_first += i915_buddy_block_offset(block) / + PAGE_SIZE; + bitmap_zero(block->bitmap, I915_BUDDY_MAX_PAGES); + DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n", + "Allocated block", block->pfn_first, + i915_buddy_block_offset(block), + i915_buddy_block_size(&mem->mm, block)); + } + +alloc_failed: + return ret; +} + +static struct page * +i915_devmem_page_get_locked(struct intel_memory_region *mem, + struct list_head *blocks) +{ + struct i915_buddy_block *block, *on; + + list_for_each_entry_safe(block, on, blocks, link) { + unsigned long weight, max; + unsigned long i, pfn; + struct page *page; + + max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE; + i = find_first_zero_bit(block->bitmap, max); + if (unlikely(i == max)) { + WARN(1, "Getting a page should have never failed\n"); + break; + } + + set_bit(i, block->bitmap); + pfn = block->pfn_first + i; + page = pfn_to_page(pfn); + get_page(page); + lock_page(page); + page->zone_device_data = block; + weight = bitmap_weight(block->bitmap, max); + if (weight == max) + list_del_init(&block->link); + DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n", + "Allocated page", 
pfn, weight); + return page; + } + return NULL; +} + +static void +i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, + struct page *page) +{ + unlock_page(page); + put_page(page); +} + +static int +i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) +{ + struct drm_i915_private *i915 = migrate->i915; + struct migrate_vma *args = migrate->args; + struct intel_memory_region *mem; + struct list_head blocks = {0}; + unsigned long i, npages, cnt; + struct page *page; + int ret; + + npages = (args->end - args->start) >> PAGE_SHIFT; + DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages); + + /* Check source pages */ + for (i = 0, cnt = 0; i < npages; i++) { + args->dst[i] = 0; + page = migrate_pfn_to_page(args->src[i]); + if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) + continue; + + args->dst[i] = MIGRATE_PFN_VALID; + cnt++; + } + + if (!cnt) { + ret = -ENOMEM; + goto migrate_out; + } + + mem = i915->mm.regions[migrate->dst_id]; + ret = i915_devmem_page_alloc_locked(mem, cnt, &blocks); + if (unlikely(ret)) + goto migrate_out; + + /* Allocate device memory */ + for (i = 0, cnt = 0; i < npages; i++) { + if (!args->dst[i]) + continue; + + page = i915_devmem_page_get_locked(mem, &blocks); + if (unlikely(!page)) { + WARN(1, "Failed to get dst page\n"); + args->dst[i] = 0; + continue; + } + + cnt++; + args->dst[i] = migrate_pfn(page_to_pfn(page)) | + MIGRATE_PFN_LOCKED; + } + + if (!cnt) { + ret = -ENOMEM; + goto migrate_out; + } + + /* Copy the pages */ + migrate->npages = npages; +migrate_out: + if (unlikely(ret)) { + for (i = 0; i < npages; i++) { + if (args->dst[i] & MIGRATE_PFN_LOCKED) { + page = migrate_pfn_to_page(args->dst[i]); + i915_devmem_page_free_locked(i915, page); + } + args->dst[i] = 0; + } + } + + return ret; +} + +void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate) +{ + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); +} + +static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate) +{ + int ret; + + ret = i915_devmem_migrate_alloc_and_copy(migrate); + if (!ret) { + migrate_vma_pages(migrate->args); + i915_devmem_migrate_finalize_and_map(migrate); + } + migrate_vma_finalize(migrate->args); +} + +int i915_devmem_migrate_vma(struct intel_memory_region *mem, + struct vm_area_struct *vma, + unsigned long start, + unsigned long end) +{ + unsigned long npages = (end - start) >> PAGE_SHIFT; + unsigned long max = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages); + struct i915_devmem_migrate migrate = {0}; + struct migrate_vma args = { + .vma = vma, + .start = start, + }; + unsigned long c, i; + int ret = 0; + + /* XXX: Opportunistically migrate additional pages? 
*/ + DRM_DEBUG_DRIVER("start 0x%lx end 0x%lx\n", start, end); + args.src = kcalloc(max, sizeof(args.src), GFP_KERNEL); + if (unlikely(!args.src)) + return -ENOMEM; + + args.dst = kcalloc(max, sizeof(args.dst), GFP_KERNEL); + if (unlikely(!args.dst)) { + kfree(args.src); + return -ENOMEM; + } + + /* XXX: Support migrating from LMEM to SMEM */ + migrate.args = &args; + migrate.i915 = mem->i915; + migrate.src_id = INTEL_REGION_SMEM; + migrate.dst_id = MEMORY_TYPE_FROM_REGION(mem->id); + for (i = 0; i < npages; i += c) { + c = min_t(unsigned long, I915_BUDDY_MAX_PAGES, npages); + args.end = start + (c << PAGE_SHIFT); + ret = migrate_vma_setup(&args); + if (unlikely(ret)) + goto migrate_done; + if (args.cpages) + i915_devmem_migrate_chunk(&migrate); + args.start = args.end; + } +migrate_done: + kfree(args.dst); + kfree(args.src); + return ret; +} + +static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + +static void i915_devmem_page_free(struct page *page) +{ + struct i915_buddy_block *block = page->zone_device_data; + struct intel_memory_region *mem = block->private; + unsigned long i, max, weight; + + max = i915_buddy_block_size(&mem->mm, block) / PAGE_SIZE; + i = page_to_pfn(page) - block->pfn_first; + clear_bit(i, block->bitmap); + weight = bitmap_weight(block->bitmap, max); + DRM_DEBUG_DRIVER("%s pfn 0x%lx block weight 0x%lx\n", + "Freeing page", page_to_pfn(page), weight); + if (!weight) { + DRM_DEBUG_DRIVER("%s pfn_first 0x%lx off 0x%llx size 0x%llx\n", + "Freeing block", block->pfn_first, + i915_buddy_block_offset(block), + i915_buddy_block_size(&mem->mm, block)); + __intel_memory_region_put_block_buddy(block); + } +} + +static const struct dev_pagemap_ops i915_devmem_pagemap_ops = { + .page_free = i915_devmem_page_free, + .migrate_to_ram = i915_devmem_migrate_to_ram, +}; + +struct i915_devmem *i915_svm_devmem_add(struct drm_i915_private *i915, u64 size) +{ + struct device *dev = &i915->drm.pdev->dev; + struct i915_devmem *devmem; + struct resource *res; + + devmem = kzalloc(sizeof(*devmem), GFP_KERNEL); + if (!devmem) + return NULL; + + devmem->i915 = i915; + res = devm_request_free_mem_region(dev, &iomem_resource, size); + if (IS_ERR(res)) + goto out_free; + + devmem->pagemap.type = MEMORY_DEVICE_PRIVATE; + devmem->pagemap.res = *res; + devmem->pagemap.ops = &i915_devmem_pagemap_ops; + if (IS_ERR(devm_memremap_pages(dev, &devmem->pagemap))) + goto out_free; + + devmem->pfn_first = res->start >> PAGE_SHIFT; + devmem->pfn_last = res->end >> PAGE_SHIFT; + return devmem; +out_free: + kfree(devmem); + return NULL; +} + +void i915_svm_devmem_remove(struct i915_devmem *devmem) +{ + /* XXX: Is it the right way to release? 
*/ + release_resource(&devmem->pagemap.res); + kfree(devmem); +} + +int i915_svm_migrate_ioctl(struct drm_device *dev, void *data, + struct drm_file *file_priv) +{ + struct drm_i915_private *i915 = to_i915(dev); + struct drm_i915_svm_migrate *args = data; + unsigned long addr, end, size = args->length; + struct intel_memory_region *mem; + enum intel_region_id id; + struct mm_struct *mm; + + DRM_DEBUG_DRIVER("start 0x%llx length 0x%llx region 0x%x\n", + args->start, args->length, args->region); + id = __region_id(args->region); + if ((MEMORY_TYPE_FROM_REGION(args->region) != INTEL_MEMORY_LOCAL) || + id == INTEL_REGION_UNKNOWN) + return -EINVAL; + + mem = i915->mm.regions[id]; + + mm = get_task_mm(current); + down_read(&mm->mmap_sem); + + for (addr = args->start, end = args->start + size; addr < end;) { + struct vm_area_struct *vma; + unsigned long next; + + vma = find_vma_intersection(mm, addr, end); + if (!vma) + break; + + addr &= PAGE_MASK; + next = min(vma->vm_end, end); + next = round_up(next, PAGE_SIZE); + /* This is a best effort so we ignore errors */ + i915_devmem_migrate_vma(mem, vma, addr, next); + addr = next; + } + + up_read(&mm->mmap_sem); + mmput(mm); + return 0; +} diff --git a/drivers/gpu/drm/i915/intel_memory_region.h b/drivers/gpu/drm/i915/intel_memory_region.h index 4622c086c06d..95e1eff0c0b0 100644 --- a/drivers/gpu/drm/i915/intel_memory_region.h +++ b/drivers/gpu/drm/i915/intel_memory_region.h @@ -71,6 +71,7 @@ struct intel_memory_region_ops { struct intel_memory_region { struct drm_i915_private *i915;
+ struct i915_devmem *devmem; const struct intel_memory_region_ops *ops;
struct io_mapping iomap; @@ -100,6 +101,19 @@ struct intel_memory_region { } objects; };
+static inline enum intel_region_id +__region_id(u32 region) +{ + enum intel_region_id id; + + for (id = 0; id < INTEL_REGION_UNKNOWN; ++id) { + if (intel_region_map[id] == region) + return id; + } + + return INTEL_REGION_UNKNOWN; +} + int intel_memory_region_init_buddy(struct intel_memory_region *mem); void intel_memory_region_release_buddy(struct intel_memory_region *mem);
diff --git a/drivers/gpu/drm/i915/intel_region_lmem.c b/drivers/gpu/drm/i915/intel_region_lmem.c index eddb392917aa..2ba4a4720eb6 100644 --- a/drivers/gpu/drm/i915/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/intel_region_lmem.c @@ -4,6 +4,7 @@ */
#include "i915_drv.h" +#include "i915_svm.h" #include "intel_memory_region.h" #include "gem/i915_gem_lmem.h" #include "gem/i915_gem_region.h" @@ -66,6 +67,7 @@ static void release_fake_lmem_bar(struct intel_memory_region *mem) static void region_lmem_release(struct intel_memory_region *mem) { + i915_svm_devmem_remove(mem->devmem); release_fake_lmem_bar(mem); io_mapping_fini(&mem->iomap); intel_memory_region_release_buddy(mem); @@ -122,6 +124,14 @@ intel_setup_fake_lmem(struct drm_i915_private *i915) PAGE_SIZE, io_start, &intel_region_lmem_ops); + if (!IS_ERR(mem)) { + mem->devmem = i915_svm_devmem_add(i915, mappable_end); + if (IS_ERR(mem->devmem)) { + intel_memory_region_put(mem); + mem = ERR_CAST(mem->devmem); + } + } + if (!IS_ERR(mem)) { DRM_INFO("Intel graphics fake LMEM: %pR\n", &mem->region); DRM_INFO("Intel graphics fake LMEM IO start: %llx\n",
As PCIe is a non-coherent link, do not allow direct memory access across the PCIe link. Handle CPU faults by migrating pages back to host memory.
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Sudeep Dutt sudeep.dutt@intel.com Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com --- drivers/gpu/drm/i915/i915_svm_devmem.c | 70 +++++++++++++++++++++++++- 1 file changed, 69 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 40c2f79ff614..37a205d3a82f 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -280,9 +280,77 @@ int i915_devmem_migrate_vma(struct intel_memory_region *mem, return ret; }
+static vm_fault_t +i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) +{ + struct device *dev = migrate->i915->drm.dev; + struct migrate_vma *args = migrate->args; + struct page *dpage, *spage; + + DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); + /* Allocate host page */ + spage = migrate_pfn_to_page(args->src[0]); + if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE))) + return 0; + + dpage = alloc_page_vma(GFP_HIGHUSER, args->vma, args->start); + if (unlikely(!dpage)) + return VM_FAULT_SIGBUS; + lock_page(dpage); + + args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED; + + /* Copy the pages */ + migrate->npages = 1; + + return 0; +} + +void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate) +{ + DRM_DEBUG_DRIVER("\n"); +} + +static inline struct i915_devmem *page_to_devmem(struct page *page) +{ + return container_of(page->pgmap, struct i915_devmem, pagemap); +} + static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) { - return VM_FAULT_SIGBUS; + struct i915_devmem *devmem = page_to_devmem(vmf->page); + struct drm_i915_private *i915 = devmem->i915; + struct i915_devmem_migrate migrate = {0}; + unsigned long src = 0, dst = 0; + vm_fault_t ret; + struct migrate_vma args = { + .vma = vmf->vma, + .start = vmf->address, + .end = vmf->address + PAGE_SIZE, + .src = &src, + .dst = &dst, + }; + + /* XXX: Opportunistically migrate more pages? */ + DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start); + migrate.i915 = i915; + migrate.args = &args; + migrate.src_id = INTEL_REGION_LMEM; + migrate.dst_id = INTEL_REGION_SMEM; + if (migrate_vma_setup(&args) < 0) + return VM_FAULT_SIGBUS; + if (!args.cpages) + return 0; + + ret = i915_devmem_fault_alloc_and_copy(&migrate); + if (ret || dst == 0) + goto done; + + migrate_vma_pages(&args); + i915_devmem_fault_finalize_and_map(&migrate); +done: + migrate_vma_finalize(&args); + return ret; }
static void i915_devmem_page_free(struct page *page)
Copy the pages during SVM migration using memcpy().
Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Jon Bloomfield jon.bloomfield@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Sudeep Dutt sudeep.dutt@intel.com Signed-off-by: Niranjana Vishwanathapura niranjana.vishwanathapura@intel.com --- drivers/gpu/drm/i915/i915_svm_devmem.c | 72 ++++++++++++++++++++++++++ 1 file changed, 72 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 37a205d3a82f..960b04091e44 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -142,6 +142,69 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, put_page(page); }
+static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate) +{ + const unsigned long *src = migrate->args->src; + unsigned long *dst = migrate->args->dst; + struct drm_i915_private *i915 = migrate->i915; + struct intel_memory_region *mem; + void *src_vaddr, *dst_vaddr; + u64 src_addr, dst_addr; + struct page *page; + int i, ret = 0; + + /* XXX: Copy multiple pages at a time */ + for (i = 0; !ret && i < migrate->npages; i++) { + if (!dst[i]) + continue; + + page = migrate_pfn_to_page(dst[i]); + mem = i915->mm.regions[migrate->dst_id]; + dst_addr = page_to_pfn(page); + if (migrate->dst_id != INTEL_REGION_SMEM) + dst_addr -= mem->devmem->pfn_first; + dst_addr <<= PAGE_SHIFT; + + if (migrate->dst_id == INTEL_REGION_SMEM) + dst_vaddr = phys_to_virt(dst_addr); + else + dst_vaddr = io_mapping_map_atomic_wc(&mem->iomap, + dst_addr); + if (unlikely(!dst_vaddr)) + return -ENOMEM; + + page = migrate_pfn_to_page(src[i]); + mem = i915->mm.regions[migrate->src_id]; + src_addr = page_to_pfn(page); + if (migrate->src_id != INTEL_REGION_SMEM) + src_addr -= mem->devmem->pfn_first; + src_addr <<= PAGE_SHIFT; + + if (migrate->src_id == INTEL_REGION_SMEM) + src_vaddr = phys_to_virt(src_addr); + else + src_vaddr = io_mapping_map_atomic_wc(&mem->iomap, + src_addr); + + if (likely(src_vaddr)) + memcpy(dst_vaddr, src_vaddr, PAGE_SIZE); + else + ret = -ENOMEM; + + if (migrate->dst_id != INTEL_REGION_SMEM) + io_mapping_unmap_atomic(dst_vaddr); + + if (migrate->src_id != INTEL_REGION_SMEM && src_vaddr) + io_mapping_unmap_atomic(src_vaddr); + + DRM_DEBUG_DRIVER("src [%d] 0x%llx, dst [%d] 0x%llx\n", + migrate->src_id, src_addr, + migrate->dst_id, dst_addr); + } + + return ret; +} + static int i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) { @@ -201,6 +264,8 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
/* Copy the pages */ migrate->npages = npages; + /* XXX: Flush the caches? */ + ret = i915_migrate_cpu_copy(migrate); migrate_out: if (unlikely(ret)) { for (i = 0; i < npages; i++) { @@ -286,6 +351,7 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) struct device *dev = migrate->i915->drm.dev; struct migrate_vma *args = migrate->args; struct page *dpage, *spage; + int err;
DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); /* Allocate host page */ @@ -302,6 +368,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
/* Copy the pages */ migrate->npages = 1; + err = i915_migrate_cpu_copy(migrate); + if (unlikely(err)) { + __free_page(dpage); + args->dst[0] = 0; + return VM_FAULT_SIGBUS; + }
return 0; }
Add a support function to blitter copy SVM VAs without requiring any GEM objects. Also add a function to wait for completion of the copy.
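A rough caller-side sketch of how the two new helpers are meant to be used together; the wrapper name below is hypothetical, and the source/destination ranges are assumed to already be bound in ce->vm:

#include "i915_drv.h"
#include "i915_svm.h"

/* Hypothetical wrapper, for illustration only: queue a blitter copy and wait. */
static int svm_blt_copy_sync(struct drm_i915_private *i915,
                             struct intel_context *ce,
                             u64 src_start, u64 dst_start, u64 size)
{
        struct dma_fence *fence = NULL;
        int err;

        /* Queue the copy; on success a fence for the issued request is returned */
        err = i915_svm_copy_blt(ce, src_start, dst_start, size, &fence);
        if (err)
                return err;

        /* Returns 0 once the request completes, or a negative error code */
        return i915_svm_copy_blt_wait(i915, fence);
}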
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Makefile              |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c   |   2 +-
 drivers/gpu/drm/i915/i915_svm.h            |   6 +
 drivers/gpu/drm/i915/i915_svm_copy.c       | 172 +++++++++++++++++++++
 5 files changed, 184 insertions(+), 2 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_svm_copy.c
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index b574ec31ea2e..97d40172bf27 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -156,7 +156,8 @@ i915-y += \ # SVM code i915-$(CONFIG_DRM_I915_SVM) += gem/i915_gem_svm.o \ i915_svm.o \ - i915_svm_devmem.o + i915_svm_devmem.o \ + i915_svm_copy.o
# general-purpose microcontroller (GuC) support obj-y += gt/uc/ diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index eec31d9a4fa2..3fe697659366 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -481,6 +481,9 @@ static inline void __start_cpu_write(struct drm_i915_gem_object *obj) int i915_gem_object_wait(struct drm_i915_gem_object *obj, unsigned int flags, long timeout); +long i915_gem_object_wait_fence(struct dma_fence *fence, + unsigned int flags, + long timeout); int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj, unsigned int flags, const struct i915_sched_attr *attr); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c index 8af55cd3e690..b7905aa8f821 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c @@ -12,7 +12,7 @@ #include "i915_gem_ioctls.h" #include "i915_gem_object.h"
-static long +long i915_gem_object_wait_fence(struct dma_fence *fence, unsigned int flags, long timeout) diff --git a/drivers/gpu/drm/i915/i915_svm.h b/drivers/gpu/drm/i915/i915_svm.h index a1b62997e925..5d28e531f737 100644 --- a/drivers/gpu/drm/i915/i915_svm.h +++ b/drivers/gpu/drm/i915/i915_svm.h @@ -33,6 +33,12 @@ static inline bool i915_vm_is_svm_enabled(struct i915_address_space *vm) return vm->svm; }
+int i915_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 size, + struct dma_fence **fence); +int i915_svm_copy_blt_wait(struct drm_i915_private *i915, + struct dma_fence *fence); + void i915_dmem_convert_pfn(struct drm_i915_private *dev_priv, struct hmm_range *range); int i915_svm_migrate_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/i915_svm_copy.c b/drivers/gpu/drm/i915/i915_svm_copy.c new file mode 100644 index 000000000000..42f7d563f6b4 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_svm_copy.c @@ -0,0 +1,172 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include "i915_drv.h" +#include "gem/i915_gem_object_blt.h" +#include "gt/intel_engine_pm.h" +#include "gt/intel_engine_pool.h" +#include "gt/intel_gpu_commands.h" +#include "gt/intel_gt.h" + +static struct i915_vma * +intel_emit_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 buff_size) +{ + struct drm_i915_private *i915 = ce->vm->i915; + const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */ + struct intel_engine_pool_node *pool; + struct i915_vma *batch; + u64 count, rem; + u32 size, *cmd; + int err; + + GEM_BUG_ON(intel_engine_is_virtual(ce->engine)); + intel_engine_pm_get(ce->engine); + + if (INTEL_GEN(i915) < 8) + return ERR_PTR(-ENOTSUPP); + + count = div_u64(round_up(buff_size, block_size), block_size); + size = (1 + 11 * count) * sizeof(u32); + size = round_up(size, PAGE_SIZE); + pool = intel_engine_get_pool(ce->engine, size); + if (IS_ERR(pool)) { + err = PTR_ERR(pool); + goto out_pm; + } + + cmd = i915_gem_object_pin_map(pool->obj, I915_MAP_FORCE_WC); + if (IS_ERR(cmd)) { + err = PTR_ERR(cmd); + goto out_put; + } + + rem = buff_size; + do { + size = min_t(u64, rem, block_size); + GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX); + + if (INTEL_GEN(i915) >= 9) { + *cmd++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2); + *cmd++ = BLT_DEPTH_32 | PAGE_SIZE; + *cmd++ = 0; + *cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cmd++ = lower_32_bits(dst_start); + *cmd++ = upper_32_bits(dst_start); + *cmd++ = 0; + *cmd++ = PAGE_SIZE; + *cmd++ = lower_32_bits(src_start); + *cmd++ = upper_32_bits(src_start); + } else if (INTEL_GEN(i915) >= 8) { + *cmd++ = XY_SRC_COPY_BLT_CMD | + BLT_WRITE_RGBA | (10 - 2); + *cmd++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | PAGE_SIZE; + *cmd++ = 0; + *cmd++ = size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4; + *cmd++ = lower_32_bits(dst_start); + *cmd++ = upper_32_bits(dst_start); + *cmd++ = 0; + *cmd++ = PAGE_SIZE; + *cmd++ = lower_32_bits(src_start); + *cmd++ = upper_32_bits(src_start); + } + + /* Allow ourselves to be preempted in between blocks */ + *cmd++ = MI_ARB_CHECK; + + src_start += size; + dst_start += size; + rem -= size; + } while (rem); + + *cmd = MI_BATCH_BUFFER_END; + intel_gt_chipset_flush(ce->vm->gt); + + i915_gem_object_unpin_map(pool->obj); + + batch = i915_vma_instance(pool->obj, ce->vm, NULL); + if (IS_ERR(batch)) { + err = PTR_ERR(batch); + goto out_put; + } + + err = i915_vma_pin(batch, 0, 0, PIN_USER); + if (unlikely(err)) + goto out_put; + + batch->private = pool; + return batch; + +out_put: + intel_engine_pool_put(pool); +out_pm: + intel_engine_pm_put(ce->engine); + return ERR_PTR(err); +} + +int i915_svm_copy_blt(struct intel_context *ce, + u64 src_start, u64 dst_start, u64 size, + struct dma_fence **fence) +{ + struct drm_i915_private *i915 = ce->gem_context->i915; + struct i915_request *rq; + struct i915_vma *batch; + int err; + + DRM_DEBUG_DRIVER("src_start 0x%llx 
dst_start 0x%llx size 0x%llx\n", + src_start, dst_start, size); + mutex_lock(&i915->drm.struct_mutex); + batch = intel_emit_svm_copy_blt(ce, src_start, dst_start, size); + if (IS_ERR(batch)) { + err = PTR_ERR(batch); + goto out_unlock; + } + + rq = i915_request_create(ce); + if (IS_ERR(rq)) { + err = PTR_ERR(rq); + goto out_batch; + } + + err = intel_emit_vma_mark_active(batch, rq); + if (unlikely(err)) + goto out_request; + + if (ce->engine->emit_init_breadcrumb) { + err = ce->engine->emit_init_breadcrumb(rq); + if (unlikely(err)) + goto out_request; + } + + err = rq->engine->emit_bb_start(rq, + batch->node.start, batch->node.size, + 0); +out_request: + if (unlikely(err)) + i915_request_skip(rq, err); + else + *fence = &rq->fence; + + i915_request_add(rq); +out_batch: + intel_emit_vma_release(ce, batch); +out_unlock: + mutex_unlock(&i915->drm.struct_mutex); + return err; +} + +int i915_svm_copy_blt_wait(struct drm_i915_private *i915, + struct dma_fence *fence) +{ + long timeout; + + mutex_lock(&i915->drm.struct_mutex); + timeout = i915_gem_object_wait_fence(fence, + I915_WAIT_INTERRUPTIBLE | + I915_WAIT_ALL, + MAX_SCHEDULE_TIMEOUT); + mutex_unlock(&i915->drm.struct_mutex); + return (timeout < 0) ? timeout : 0; +}
Use the blitter engine to copy pages during migration. As the blitter context's virtual address space is shared with other flows, ensure virtual addresses are allocated properly from that address space. Also ensure completion of the blitter copy by waiting on the fence of the issued request.
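Condensed for illustration, the alloc-and-copy half of the flow added below looks roughly like this (error handling and the CPU-copy fallback are omitted; the fence wait and the unbinds happen later in i915_devmem_migrate_finalize_and_map()):

/* Sketch only; see i915_migrate_blitter_copy() in the diff below. */
static int svm_blitter_migrate_sketch(struct i915_devmem_migrate *migrate)
{
        struct drm_i915_private *i915 = migrate->i915;

        /* Copies are issued on the blitter engine's kernel context */
        migrate->ce = i915->engine[BCS0]->kernel_context;

        /* Bind source and destination ranges into that context's VM */
        i915_devmem_bind_addr(migrate, true);
        i915_devmem_bind_addr(migrate, false);

        /* Queue the copy; migrate->fence is waited on in finalize_and_map() */
        return i915_svm_copy_blt(migrate->ce,
                                 migrate->src_node.start,
                                 migrate->dst_node.start,
                                 migrate->npages * PAGE_SIZE,
                                 &migrate->fence);
}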
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/i915_svm_devmem.c | 249 ++++++++++++++++++++++++-
 1 file changed, 245 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_svm_devmem.c b/drivers/gpu/drm/i915/i915_svm_devmem.c index 960b04091e44..4f772983679b 100644 --- a/drivers/gpu/drm/i915/i915_svm_devmem.c +++ b/drivers/gpu/drm/i915/i915_svm_devmem.c @@ -15,7 +15,15 @@ struct i915_devmem_migrate {
enum intel_region_id src_id; enum intel_region_id dst_id; + dma_addr_t *host_dma; + bool blitter_copy; u64 npages; + + /* required for blitter copy */ + struct drm_mm_node src_node; + struct drm_mm_node dst_node; + struct intel_context *ce; + struct dma_fence *fence; };
struct i915_devmem { @@ -142,6 +150,139 @@ i915_devmem_page_free_locked(struct drm_i915_private *dev_priv, put_page(page); }
+static int i915_devmem_bind_addr(struct i915_devmem_migrate *migrate, + bool is_src) +{ + struct i915_gem_context *ctx = migrate->ce->gem_context; + struct drm_i915_private *i915 = migrate->i915; + struct i915_address_space *vm = ctx->vm; + struct intel_memory_region *mem; + u64 npages = migrate->npages; + enum intel_region_id mem_id; + struct drm_mm_node *node; + struct scatterlist *sg; + u32 sg_page_sizes = 0; + struct sg_table st; + u64 flags = 0; + int i, ret; + + if (unlikely(sg_alloc_table(&st, npages, GFP_KERNEL))) + return -ENOMEM; + + if (is_src) { + node = &migrate->src_node; + mem_id = migrate->src_id; + flags |= I915_GTT_SVM_READONLY; + } else { + node = &migrate->dst_node; + mem_id = migrate->dst_id; + } + + mutex_lock(&vm->mutex); + ret = i915_gem_gtt_insert(vm, node, npages * PAGE_SIZE, + I915_GTT_PAGE_SIZE_2M, 0, + 0, vm->total, PIN_USER); + mutex_unlock(&vm->mutex); + if (unlikely(ret)) + return ret; + + sg = NULL; + st.nents = 0; + + /* + * XXX: If the source page is missing, source it from a temporary + * zero filled page. If needed, destination page missing scenarios + * can be similarly handled by draining data into a temporary page. + */ + for (i = 0; i < npages; i++) { + u64 addr; + + if (mem_id == INTEL_REGION_SMEM) { + addr = migrate->host_dma[i]; + } else { + struct page *page; + unsigned long mpfn; + + mpfn = is_src ? migrate->args->src[i] : + migrate->args->dst[i]; + page = migrate_pfn_to_page(mpfn); + mem = i915->mm.regions[mem_id]; + addr = page_to_pfn(page) - mem->devmem->pfn_first; + addr <<= PAGE_SHIFT; + addr += mem->region.start; + } + + if (sg && (addr == (sg_dma_address(sg) + sg->length))) { + sg->length += PAGE_SIZE; + sg_dma_len(sg) += PAGE_SIZE; + continue; + } + + if (sg) + sg_page_sizes |= sg->length; + + sg = sg ? __sg_next(sg) : st.sgl; + sg_dma_address(sg) = addr; + sg_dma_len(sg) = PAGE_SIZE; + sg->length = PAGE_SIZE; + st.nents++; + } + + sg_page_sizes |= sg->length; + sg_mark_end(sg); + i915_sg_trim(&st); + + ret = svm_bind_addr(vm, node->start, npages * PAGE_SIZE, + flags, &st, sg_page_sizes); + sg_free_table(&st); + + return ret; +} + +static void i915_devmem_unbind_addr(struct i915_devmem_migrate *migrate, + bool is_src) +{ + struct i915_gem_context *ctx = migrate->ce->gem_context; + struct i915_address_space *vm = ctx->vm; + struct drm_mm_node *node; + + node = is_src ? 
&migrate->src_node : &migrate->dst_node; + svm_unbind_addr(vm, node->start, migrate->npages * PAGE_SIZE); + mutex_lock(&vm->mutex); + drm_mm_remove_node(node); + mutex_unlock(&vm->mutex); +} + +static int i915_migrate_blitter_copy(struct i915_devmem_migrate *migrate) +{ + struct drm_i915_private *i915 = migrate->i915; + int ret; + + migrate->ce = i915->engine[BCS0]->kernel_context; + ret = i915_devmem_bind_addr(migrate, true); + if (unlikely(ret)) + return ret; + + ret = i915_devmem_bind_addr(migrate, false); + if (unlikely(ret)) { + i915_devmem_unbind_addr(migrate, true); + return ret; + } + + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); + ret = i915_svm_copy_blt(migrate->ce, + migrate->src_node.start, + migrate->dst_node.start, + migrate->npages * PAGE_SIZE, + &migrate->fence); + if (unlikely(ret)) { + i915_devmem_unbind_addr(migrate, false); + i915_devmem_unbind_addr(migrate, true); + } + + return ret; +} + static int i915_migrate_cpu_copy(struct i915_devmem_migrate *migrate) { const unsigned long *src = migrate->args->src; @@ -210,6 +351,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) { struct drm_i915_private *i915 = migrate->i915; struct migrate_vma *args = migrate->args; + struct device *dev = i915->drm.dev; struct intel_memory_region *mem; struct list_head blocks = {0}; unsigned long i, npages, cnt; @@ -218,13 +360,35 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate)
npages = (args->end - args->start) >> PAGE_SHIFT; DRM_DEBUG_DRIVER("start 0x%lx npages %ld\n", args->start, npages); + /* + * XXX: Set proper condition for cpu vs blitter copy for performance, + * functionality and to avoid any deadlocks around blitter usage. + */ + migrate->blitter_copy = true; + + if (migrate->blitter_copy) { + /* Allocate storage for DMA addresses, so we can unmap later. */ + migrate->host_dma = kcalloc(npages, sizeof(*migrate->host_dma), + GFP_KERNEL); + if (unlikely(!migrate->host_dma)) + migrate->blitter_copy = false; + }
- /* Check source pages */ + /* Check and DMA map source pages */ for (i = 0, cnt = 0; i < npages; i++) { args->dst[i] = 0; page = migrate_pfn_to_page(args->src[i]); - if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) + if (unlikely(!page || !(args->src[i] & MIGRATE_PFN_MIGRATE))) { + migrate->blitter_copy = false; continue; + } + + migrate->host_dma[i] = dma_map_page(dev, page, 0, PAGE_SIZE, + PCI_DMA_TODEVICE); + if (unlikely(dma_mapping_error(dev, migrate->host_dma[i]))) { + migrate->blitter_copy = false; + migrate->host_dma[i] = 0; + }
args->dst[i] = MIGRATE_PFN_VALID; cnt++; @@ -248,6 +412,7 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) page = i915_devmem_page_get_locked(mem, &blocks); if (unlikely(!page)) { WARN(1, "Failed to get dst page\n"); + migrate->blitter_copy = false; args->dst[i] = 0; continue; } @@ -265,7 +430,11 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) /* Copy the pages */ migrate->npages = npages; /* XXX: Flush the caches? */ - ret = i915_migrate_cpu_copy(migrate); + /* XXX: Fallback to cpu_copy if blitter_copy fails? */ + if (migrate->blitter_copy) + ret = i915_migrate_blitter_copy(migrate); + else + ret = i915_migrate_cpu_copy(migrate); migrate_out: if (unlikely(ret)) { for (i = 0; i < npages; i++) { @@ -277,12 +446,42 @@ i915_devmem_migrate_alloc_and_copy(struct i915_devmem_migrate *migrate) } }
+ if (unlikely(migrate->host_dma && (!migrate->blitter_copy || ret))) { + for (i = 0; i < npages; i++) { + if (migrate->host_dma[i]) + dma_unmap_page(dev, migrate->host_dma[i], + PAGE_SIZE, PCI_DMA_TODEVICE); + } + kfree(migrate->host_dma); + migrate->host_dma = NULL; + } + return ret; }
void i915_devmem_migrate_finalize_and_map(struct i915_devmem_migrate *migrate) { + struct drm_i915_private *i915 = migrate->i915; + unsigned long i; + int ret; + + if (!migrate->blitter_copy) + return; + DRM_DEBUG_DRIVER("npages %lld\n", migrate->npages); + ret = i915_svm_copy_blt_wait(i915, migrate->fence); + if (unlikely(ret < 0)) + WARN(1, "Waiting for dma fence failed %d\n", ret); + + i915_devmem_unbind_addr(migrate, true); + i915_devmem_unbind_addr(migrate, false); + + for (i = 0; i < migrate->npages; i++) + dma_unmap_page(i915->drm.dev, migrate->host_dma[i], + PAGE_SIZE, PCI_DMA_TODEVICE); + + kfree(migrate->host_dma); + migrate->host_dma = NULL; }
static void i915_devmem_migrate_chunk(struct i915_devmem_migrate *migrate) @@ -354,6 +553,12 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) int err;
DRM_DEBUG_DRIVER("start 0x%lx\n", args->start); + /* + * XXX: Set proper condition for cpu vs blitter copy for performance, + * functionality and to avoid any deadlocks around blitter usage. + */ + migrate->blitter_copy = false; + /* Allocate host page */ spage = migrate_pfn_to_page(args->src[0]); if (unlikely(!spage || !(args->src[0] & MIGRATE_PFN_MIGRATE))) @@ -364,12 +569,31 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate) return VM_FAULT_SIGBUS; lock_page(dpage);
+ /* DMA map host page */ + if (migrate->blitter_copy) { + migrate->host_dma[0] = dma_map_page(dev, dpage, 0, PAGE_SIZE, + PCI_DMA_FROMDEVICE); + if (unlikely(dma_mapping_error(dev, migrate->host_dma[0]))) { + migrate->host_dma[0] = 0; + err = -ENOMEM; + goto fault_out; + } + } + args->dst[0] = migrate_pfn(page_to_pfn(dpage)) | MIGRATE_PFN_LOCKED;
/* Copy the pages */ migrate->npages = 1; - err = i915_migrate_cpu_copy(migrate); + /* XXX: Fallback to cpu_copy if blitter_copy fails? */ + if (migrate->blitter_copy) + err = i915_migrate_blitter_copy(migrate); + else + err = i915_migrate_cpu_copy(migrate); +fault_out: if (unlikely(err)) { + if (migrate->host_dma[0]) + dma_unmap_page(dev, migrate->host_dma[0], + PAGE_SIZE, PCI_DMA_FROMDEVICE); __free_page(dpage); args->dst[0] = 0; return VM_FAULT_SIGBUS; @@ -380,7 +604,22 @@ i915_devmem_fault_alloc_and_copy(struct i915_devmem_migrate *migrate)
void i915_devmem_fault_finalize_and_map(struct i915_devmem_migrate *migrate) { + struct drm_i915_private *i915 = migrate->i915; + int err; + DRM_DEBUG_DRIVER("\n"); + if (!migrate->blitter_copy) + return; + + err = i915_svm_copy_blt_wait(i915, migrate->fence); + if (unlikely(err < 0)) + WARN(1, "Waiting for dma fence failed %d\n", err); + + i915_devmem_unbind_addr(migrate, true); + i915_devmem_unbind_addr(migrate, false); + + dma_unmap_page(i915->drm.dev, migrate->host_dma[0], + PAGE_SIZE, PCI_DMA_FROMDEVICE); }
static inline struct i915_devmem *page_to_devmem(struct page *page) @@ -394,6 +633,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) struct drm_i915_private *i915 = devmem->i915; struct i915_devmem_migrate migrate = {0}; unsigned long src = 0, dst = 0; + dma_addr_t dma_addr = 0; vm_fault_t ret; struct migrate_vma args = { .vma = vmf->vma, @@ -407,6 +647,7 @@ static vm_fault_t i915_devmem_migrate_to_ram(struct vm_fault *vmf) DRM_DEBUG_DRIVER("addr 0x%lx\n", args.start); migrate.i915 = i915; migrate.args = &args; + migrate.host_dma = &dma_addr; migrate.src_id = INTEL_REGION_LMEM; migrate.dst_id = INTEL_REGION_SMEM; if (migrate_vma_setup(&args) < 0)
Add SVM as a capability and allow the user to enable/disable SVM functionality on a per-context basis.
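For reference, a hedged userspace sketch of the intended flow. The struct and parameter names come from this series' UAPI patch; whether I915_GEM_VM_PARAM_SVM already carries the VM parameter class in its upper 32 bits, and whether struct drm_i915_gem_vm_param matches the CONTEXT_SETPARAM ioctl argument size, should be checked against the final uapi header:

#include <xf86drm.h>
#include <i915_drm.h>

/* Illustration only: query SVM support and enable it on an existing VM. */
static int enable_svm_on_vm(int fd, __u32 vm_id)
{
        int has_svm = 0;
        struct drm_i915_getparam gp = {
                .param = I915_PARAM_HAS_SVM,
                .value = &has_svm,
        };
        struct drm_i915_gem_vm_param vp = {
                .vm_id = vm_id,
                .param = I915_GEM_VM_PARAM_SVM,
                .value = 1,     /* 1 = enable, 0 = disable */
        };

        if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) || !has_svm)
                return -1;      /* SVM not supported by this device/kernel */

        /* VM parameters are routed through the CONTEXT_SETPARAM ioctl */
        return drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &vp);
}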
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 95 ++++++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  1 +
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  6 ++
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 11 +++
 drivers/gpu/drm/i915/i915_drv.c               |  4 +-
 drivers/gpu/drm/i915/i915_drv.h               | 10 ++
 drivers/gpu/drm/i915/i915_getparam.c          |  3 +
 8 files changed, 128 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 46926478ed2e..bc6cbf3b3d97 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -76,6 +76,7 @@
#include "i915_gem_context.h" #include "i915_globals.h" +#include "i915_svm.h" #include "i915_trace.h" #include "i915_user_extensions.h" #include "i915_gem_ioctls.h" @@ -965,6 +966,78 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data, return 0; }
+static int i915_gem_vm_setparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_vm_param *args = data; + struct i915_address_space *vm; + int err = 0; + u32 id; + + id = args->vm_id; + if (!id) + return -ENOENT; + + err = mutex_lock_interruptible(&file_priv->vm_idr_lock); + if (err) + return err; + + vm = idr_find(&file_priv->vm_idr, id); + + mutex_unlock(&file_priv->vm_idr_lock); + if (!vm) + return -ENOENT; + + switch (lower_32_bits(args->param)) { + case I915_GEM_VM_PARAM_SVM: + /* FIXME: Ensure ppgtt is empty before switching */ + if (!i915_has_svm(file_priv->dev_priv)) + err = -ENOTSUPP; + else if (args->value) + err = i915_svm_bind_mm(vm); + else + i915_svm_unbind_mm(vm); + break; + default: + err = -EINVAL; + } + return err; +} + +static int i915_gem_vm_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_file_private *file_priv = file->driver_priv; + struct drm_i915_gem_vm_param *args = data; + struct i915_address_space *vm; + int err = 0; + u32 id; + + id = args->vm_id; + if (!id) + return -ENOENT; + + err = mutex_lock_interruptible(&file_priv->vm_idr_lock); + if (err) + return err; + + vm = idr_find(&file_priv->vm_idr, id); + + mutex_unlock(&file_priv->vm_idr_lock); + if (!vm) + return -ENOENT; + + switch (lower_32_bits(args->param)) { + case I915_GEM_VM_PARAM_SVM: + args->value = i915_vm_is_svm_enabled(vm); + break; + default: + err = -EINVAL; + } + return err; +} + struct context_barrier_task { struct i915_active base; void (*task)(void *data); @@ -2379,6 +2452,21 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, return ret; }
+int i915_gem_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_gem_context_param *args = data; + u32 class = upper_32_bits(args->param); + + switch (class) { + case 0: + return i915_gem_context_getparam_ioctl(dev, data, file); + case 2: + return i915_gem_vm_getparam_ioctl(dev, data, file); + } + return -EINVAL; +} + int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { @@ -2401,14 +2489,15 @@ int i915_gem_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { struct drm_i915_gem_context_param *args = data; - u32 object_class = upper_32_bits(args->param); + u32 class = upper_32_bits(args->param);
- switch (object_class) { + switch (class) { case 0: return i915_gem_context_setparam_ioctl(dev, data, file); case 1: return i915_gem_object_setparam_ioctl(dev, data, file); - + case 2: + return i915_gem_vm_setparam_ioctl(dev, data, file); } return -EINVAL; } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h index 0b17636cfea2..59eedb9e8ae5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h @@ -175,6 +175,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +int i915_gem_getparam_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); int i915_gem_setparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file); int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index c060bc428f49..eefd2ee64c3f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -129,6 +129,7 @@ struct i915_gem_context { #define UCONTEXT_BANNABLE 2 #define UCONTEXT_RECOVERABLE 3 #define UCONTEXT_PERSISTENCE 4 +#define UCONTEXT_SVM_ENABLED 5
/** * @flags: small set of booleans diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 9d43ae6d643a..cbcfbf86ea61 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -25,6 +25,7 @@ #include "i915_gem_clflush.h" #include "i915_gem_context.h" #include "i915_gem_ioctls.h" +#include "i915_svm.h" #include "i915_trace.h"
enum { @@ -444,6 +445,11 @@ eb_validate_vma(struct i915_execbuffer *eb, if (unlikely(entry->alignment && !is_power_of_2(entry->alignment))) return -EINVAL;
+ /* Only allow user PINNED addresses for SVM enabled contexts */ + if (unlikely(i915_vm_is_svm_enabled(eb->gem_context->vm) && + !(entry->flags & EXEC_OBJECT_PINNED))) + return -EINVAL; + /* * Offset can be used as input (EXEC_OBJECT_PINNED), reject * any non-page-aligned or non-canonical addresses. diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index dd88fa87b7fe..93d8aef22b92 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -34,6 +34,7 @@ #include "i915_gem_object_blt.h" #include "i915_gem_region.h" #include "i915_globals.h" +#include "i915_svm.h" #include "i915_trace.h"
static struct i915_global_object { @@ -483,6 +484,16 @@ int __init i915_global_objects_init(void) bool i915_gem_object_svm_mapped(struct drm_i915_gem_object *obj) { + struct i915_vma *vma; + + spin_lock(&obj->vma.lock); + list_for_each_entry(vma, &obj->vma.list, obj_link) + if (i915_vm_is_svm_enabled(vma->vm)) { + spin_unlock(&obj->vma.lock); + return true; + } + + spin_unlock(&obj->vma.lock); return false; }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 740b4b9d39a8..f78912ffa03d 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -2699,6 +2699,8 @@ static int i915_bind_ioctl(struct drm_device *dev, void *data, vm = i915_gem_address_space_lookup(file->driver_priv, args->vm_id); if (unlikely(!vm)) return -ENOENT; + if (unlikely(!i915_vm_is_svm_enabled(vm))) + return -ENOTSUPP;
switch (args->type) { case I915_BIND_SVM_BUFFER: @@ -2762,7 +2764,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_RENDER_ALLOW), - DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW), + DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_getparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_setparam_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_RENDER_ALLOW), diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index f7051e6df656..2ea9d6ac9743 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -38,6 +38,7 @@ #include <linux/i2c-algo-bit.h> #include <linux/backlight.h> #include <linux/hash.h> +#include <linux/hmm.h> #include <linux/intel-iommu.h> #include <linux/kref.h> #include <linux/mm_types.h> @@ -1735,6 +1736,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, /* Only valid when HAS_DISPLAY() is true */ #define INTEL_DISPLAY_ENABLED(dev_priv) (WARN_ON(!HAS_DISPLAY(dev_priv)), !i915_modparams.disable_display)
+static inline bool i915_has_svm(struct drm_i915_private *dev_priv) +{ +#ifdef CONFIG_DRM_I915_SVM + return INTEL_GEN(dev_priv) >= 8; +#else + return false; +#endif +} + static inline bool intel_vtd_active(void) { #ifdef CONFIG_INTEL_IOMMU diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index cf8a8c3ef047..3023b7b40f66 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -160,6 +160,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, case I915_PARAM_PERF_REVISION: value = i915_perf_ioctl_version(); break; + case I915_PARAM_HAS_SVM: + value = i915_has_svm(i915); + break; default: DRM_DEBUG("Unknown parameter %d\n", param->param); return -EINVAL;
Add support to dump the page tables for debug purposes. Here is an example dump. The format is: [<page table index>] <start VA pfn>: <value>
Page Table dump start 0x0 len 0xffffffffffffffff
[0x0fe] 0x7f0000000: 0x6b0003
[0x1e6] 0x7f7980000: 0x6c0003
[0x16d] 0x7f79ada00: 0x5f0003
[0x000] 0x7f79ada00: 0x610803
[0x16e] 0x7f79adc00: 0x6d0003
[0x000] 0x7f79adc00: 0x630803
[0x100] 0x800000000: 0x6f0003
[0x000] 0x800000000: 0x700000
[0x000] 0x800000000: 0x710003
[0x000] 0x800000000: 0x5d0803
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug         | 14 +++
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 92 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.h        | 14 +++
 4 files changed, 121 insertions(+)
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug index 1140525da75a..06953c1d6fd4 100644 --- a/drivers/gpu/drm/i915/Kconfig.debug +++ b/drivers/gpu/drm/i915/Kconfig.debug @@ -223,3 +223,17 @@ config DRM_I915_DEBUG_RUNTIME_PM driver loading, suspend and resume operations.
If in doubt, say "N" + +config DRM_I915_DUMP_PPGTT + bool "Enable PPGTT Page Table dump support" + depends on DRM_I915 + default n + help + Choose this option to enable PPGTT page table dump support. + The page table snapshot helps developers to debug page table + related issues. This will affect performance and dumps a lot of + information, so only recommended for developer debug. + + Recommended for driver developers only. + + If in doubt, say "N". diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index cbcfbf86ea61..c771659d72e2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2707,6 +2707,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, intel_engine_pool_mark_active(eb.batch->private, eb.request);
trace_i915_request_queue(eb.request, eb.batch_flags); + ppgtt_dump(eb.context->vm, 0, eb.context->vm->total); err = eb_submit(&eb); err_request: add_to_client(eb.request, file); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 08cbbb4d91cb..6a7348af5717 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1215,6 +1215,97 @@ static int gen8_ppgtt_alloc(struct i915_address_space *vm, return err; }
+#ifdef CONFIG_DRM_I915_DUMP_PPGTT +static void __gen8_ppgtt_dump(struct i915_address_space * const vm, + struct i915_page_directory * const pd, + u64 start, u64 end, int lvl) +{ + char *prefix[4] = { "\t\t\t\t", "\t\t\t", "\t\t", "\t"}; + char *format = "%s [0x%03x] 0x%llx: 0x%llx\n"; + unsigned int idx, len; + gen8_pte_t *vaddr; + unsigned int pdpe; + bool is_large; + + GEM_BUG_ON(end > vm->total >> GEN8_PTE_SHIFT); + + len = gen8_pd_range(start, end, lvl--, &idx); + GEM_BUG_ON(!len || (idx + len - 1) >> gen8_pd_shift(1)); + + spin_lock(&pd->lock); + GEM_BUG_ON(!atomic_read(px_used(pd))); /* Must be pinned! */ + do { + struct i915_page_table *pt = pd->entry[idx]; + + if (!pt) { + start += (1 << gen8_pd_shift(lvl + 1)); + continue; + } + + vaddr = kmap_atomic_px(&pd->pt); + pdpe = gen8_pd_index(start, lvl + 1); + DRM_DEBUG_DRIVER(format, prefix[lvl + 1], pdpe, + start, vaddr[pdpe]); + is_large = (vaddr[pdpe] & GEN8_PDE_PS_2M); + kunmap_atomic(vaddr); + if (is_large) { + start += (1 << gen8_pd_shift(lvl + 1)); + continue; + } + + if (lvl) { + atomic_inc(&pt->used); + spin_unlock(&pd->lock); + + __gen8_ppgtt_dump(vm, as_pd(pt), + start, end, lvl); + + start += (1 << gen8_pd_shift(lvl + 1)); + spin_lock(&pd->lock); + atomic_dec(&pt->used); + GEM_BUG_ON(!atomic_read(&pt->used)); + } else { + unsigned int count = gen8_pt_count(start, end); + + pdpe = gen8_pd_index(start, lvl); + vaddr = kmap_atomic_px(pt); + while (count) { + if (vaddr[pdpe] != vm->scratch[lvl].encode) + DRM_DEBUG_DRIVER(format, prefix[lvl], + pdpe, start, + vaddr[pdpe]); + start++; + count--; + pdpe++; + } + + kunmap_atomic(vaddr); + GEM_BUG_ON(atomic_read(&pt->used) > I915_PDES); + } + } while (idx++, --len); + spin_unlock(&pd->lock); +} + +static void gen8_ppgtt_dump(struct i915_address_space *vm, + u64 start, u64 length) +{ + GEM_BUG_ON(!IS_ALIGNED(start, BIT_ULL(GEN8_PTE_SHIFT))); + GEM_BUG_ON(!IS_ALIGNED(length, BIT_ULL(GEN8_PTE_SHIFT))); + GEM_BUG_ON(range_overflows(start, length, vm->total)); + + start >>= GEN8_PTE_SHIFT; + length >>= GEN8_PTE_SHIFT; + GEM_BUG_ON(length == 0); + + DRM_DEBUG_DRIVER("PPGTT dump: start 0x%llx length 0x%llx\n", + start, length); + __gen8_ppgtt_dump(vm, i915_vm_to_ppgtt(vm)->pd, + start, start + length, vm->top); +} +#else +#define gen8_ppgtt_dump NULL +#endif + static inline struct sgt_dma { struct scatterlist *sg; dma_addr_t dma, max; @@ -1584,6 +1675,7 @@ static struct i915_ppgtt *gen8_ppgtt_create(struct drm_i915_private *i915) ppgtt->vm.insert_entries = gen8_ppgtt_insert; ppgtt->vm.allocate_va_range = gen8_ppgtt_alloc; ppgtt->vm.clear_range = gen8_ppgtt_clear; + ppgtt->vm.dump_va_range = gen8_ppgtt_dump;
if (intel_vgpu_active(i915)) gen8_ppgtt_notify_vgt(ppgtt, true); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 722b1e7ce291..a4358fb1c711 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -367,6 +367,8 @@ struct i915_address_space { u64 start, u64 length); void (*clear_range)(struct i915_address_space *vm, u64 start, u64 length); + void (*dump_va_range)(struct i915_address_space *vm, + u64 start, u64 length); void (*insert_page)(struct i915_address_space *vm, dma_addr_t addr, u64 offset, @@ -683,6 +685,18 @@ int i915_gem_gtt_insert(struct i915_address_space *vm,
#define PIN_OFFSET_MASK (-I915_GTT_PAGE_SIZE)
+#ifdef CONFIG_DRM_I915_DUMP_PPGTT +static inline void ppgtt_dump(struct i915_address_space *vm, + u64 start, u64 length) +{ + if (vm->dump_va_range) + vm->dump_va_range(vm, start, length); +} +#else +static inline void ppgtt_dump(struct i915_address_space *vm, + u64 start, u64 length) { } +#endif + /* SVM API */ #define I915_GTT_SVM_READONLY BIT(0)
Quoting Niranjana Vishwanathapura (2019-11-22 22:57:21)
Shared Virtual Memory (SVM) allows the programmer to use a single virtual address space which will be shared between threads executing on CPUs and GPUs. It abstracts away from the user the location of the backing memory, and hence simplifies the user programming model. SVM supports two types of virtual memory allocation methods. Runtime allocator requires the driver to provide memory allocation and management interface, like buffer object (BO) interface. Whereas system allocator makes use of default OS memory allocation and management support like malloc().
This patch series adds both SVM system and runtime allocator support to i915 driver.
The patch series includes
- SVM support for both system and runtime allocation.
- Plugin in device memory with the Linux kernel.
- User API advertising SVM capability and configuration by user on per vm basis.
- User API to bind an address range or a BO with a device page table.
- User API to migrate an address range to device memory.
- Implicit migration by moving pages or BOs back from device to host memory upon CPU access.
- CPU copy and blitter copy support for migrating the pages/BOs.
- Large page mapping support
- Page table dump support.
Link to the IGTs, any selftests? Link to userspace changes?
Regards, Joonas
On Tue, Nov 26, 2019 at 04:03:21PM +0200, Joonas Lahtinen wrote:
Quoting Niranjana Vishwanathapura (2019-11-22 22:57:21)
Shared Virtual Memory (SVM) allows the programmer to use a single virtual address space which will be shared between threads executing on CPUs and GPUs. It abstracts away from the user the location of the backing memory, and hence simplifies the user programming model. SVM supports two types of virtual memory allocation methods. Runtime allocator requires the driver to provide memory allocation and management interface, like buffer object (BO) interface. Whereas system allocator makes use of default OS memory allocation and management support like malloc().
This patch series adds both SVM system and runtime allocator support to i915 driver.
The patch series includes
- SVM support for both system and runtime allocation.
- Plugin in device memory with the Linux kernel.
- User API advertising SVM capability and configuration by user on per vm basis.
- User API to bind an address range or a BO with a device page table.
- User API to migrate an address range to device memory.
- Implicit migration by moving pages or BOs back from device to host memory upon CPU access.
- CPU copy and blitter copy support for migrating the pages/BOs.
- Large page mapping support
- Page table dump support.
Link to the IGTs, any selftests? Link to userspace changes?
I have some IGT tests; I will post them soon.
Niranjana
Regards, Joonas