This is already the case for our kernel-internal mappings, and since we now only support a single mode, this should always be WC if the object can be placed in lmem.
v2: rebase and also update set_domain
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  6 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   |  9 +++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 21 +++++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  4 ++++
 4 files changed, 40 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index 073822100da7..d0c91697bb22 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -571,6 +571,12 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (READ_ONCE(obj->write_domain) == read_domains)
 		goto out_unpin;
 
+	if (i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL) &&
+	    read_domains != I915_GEM_DOMAIN_WC) {
+		err = -EINVAL;
+		goto out_unpin;
+	}
+
 	if (read_domains & I915_GEM_DOMAIN_WC)
 		err = i915_gem_object_set_to_wc_domain(obj, write_domain);
 	else if (read_domains & I915_GEM_DOMAIN_GTT)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index a90f796e85c0..f3586b36dd53 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -688,6 +688,15 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
 	    !i915_gem_object_has_iomem(obj))
 		return -ENODEV;
 
+	/*
+	 * Note that even if the object can also be placed in smem then we still
+	 * map as WC here, since we can only support a single mode. On DG1 this
+	 * sucks since we can't turn off snooping for this case.
+	 */
+	if (mmap_type != I915_MMAP_TYPE_WC &&
+	    i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL))
+		return -ENODEV;
+
 	mmo = mmap_offset_attach(obj, mmap_type, file);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 07e8ff9a8aae..326956c18f76 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -513,6 +513,27 @@ bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj)
 	return obj->mem_flags & I915_BO_FLAG_IOMEM;
 }
 
+/**
+ * i915_gem_object_placements_contain_type - Check whether the object can be
+ * placed at certain memory type
+ * @obj: Pointer to the object
+ * @type: The memory type to check
+ *
+ * Return: True if the object can be placed in @type. False otherwise.
+ */
+bool i915_gem_object_placements_contain_type(struct drm_i915_gem_object *obj,
+					     enum intel_memory_type type)
+{
+	unsigned int i;
+
+	for (i = 0; i < obj->mm.n_placements; i++) {
+		if (obj->mm.placements[i]->type == type)
+			return true;
+	}
+
+	return false;
+}
+
 void i915_gem_init__objects(struct drm_i915_private *i915)
 {
 	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index ea3224a480c4..e1daa58bc225 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -12,6 +12,7 @@
 #include <drm/drm_device.h>
 
 #include "display/intel_frontbuffer.h"
+#include "intel_memory_region.h"
 #include "i915_gem_object_types.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_ww.h"
@@ -597,6 +598,9 @@ bool i915_gem_object_migratable(struct drm_i915_gem_object *obj);
 
 bool i915_gem_object_validates_to_lmem(struct drm_i915_gem_object *obj);
 
+bool i915_gem_object_placements_contain_type(struct drm_i915_gem_object *obj,
+					     enum intel_memory_type type);
+
 #ifdef CONFIG_MMU_NOTIFIER
 static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj)
We only support a single mode and this should be immutable. For smem-only placements on DGFX this should be WB. On DG1 everything is snooped, always, and so should be coherent.
I915_GEM_DOMAIN_GTT looks like it's for the aperture, which is now gone for DGFX, so hopefully it can also be safely rejected.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c |  7 +++++++
 drivers/gpu/drm/i915/gem/i915_gem_mman.c   | 10 ++++++++++
 2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index d0c91697bb22..e3459a524e64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -577,6 +577,13 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 		goto out_unpin;
 	}
 
+	if (IS_DGFX(to_i915(obj->base.dev)) && obj->mm.n_placements == 1 &&
+	    i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_SYSTEM) &&
+	    read_domains != I915_GEM_DOMAIN_CPU) {
+		err = -EINVAL;
+		goto out_unpin;
+	}
+
 	if (read_domains & I915_GEM_DOMAIN_WC)
 		err = i915_gem_object_set_to_wc_domain(obj, write_domain);
 	else if (read_domains & I915_GEM_DOMAIN_GTT)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index f3586b36dd53..afc9f3dc38b9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -673,6 +673,7 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
 			     enum i915_mmap_type mmap_type,
 			     u64 *offset, struct drm_file *file)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	struct i915_mmap_offset *mmo;
 
 	if (i915_gem_object_never_mmap(obj))
@@ -697,6 +698,15 @@ __assign_mmap_offset(struct drm_i915_gem_object *obj,
 	    i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_LOCAL))
 		return -ENODEV;
 
+	/*
+	 * For smem only placements on DGFX we need to default to WB. On DG1
+	 * everything is snooped always, so should always be coherent.
+	 */
+	if (IS_DGFX(i915) &&
+	    mmap_type != I915_MMAP_TYPE_WB && obj->mm.n_placements == 1 &&
+	    i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_SYSTEM))
+		return -ENODEV;
+
 	mmo = mmap_offset_attach(obj, mmap_type, file);
 	if (IS_ERR(mmo))
 		return PTR_ERR(mmo);
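For illustration only (not part of the patch): from userspace the expected behaviour with both patches applied is that an lmem-capable object only accepts a WC mapping, while a DGFX object with a system-memory-only placement only accepts WB; any other mmap_offset type should now be rejected. The fd, handle and size below are assumed to already exist.

#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <xf86drm.h>
#include <drm/i915_drm.h>

/*
 * Illustrative sketch, not part of the patch: map a GEM object through the
 * mmap_offset uAPI with an explicit mapping type. With the above patches an
 * lmem-capable object is expected to only succeed with I915_MMAP_OFFSET_WC,
 * and a DGFX smem-only object with I915_MMAP_OFFSET_WB; other types should
 * fail with -ENODEV.
 */
static void *gem_mmap_offset(int fd, uint32_t handle, size_t size, uint64_t type)
{
	struct drm_i915_gem_mmap_offset arg;

	memset(&arg, 0, sizeof(arg));
	arg.handle = handle;
	arg.flags = type;	/* e.g. I915_MMAP_OFFSET_WC or I915_MMAP_OFFSET_WB */

	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_MMAP_OFFSET, &arg))
		return NULL;	/* rejected mapping type, among other errors */

	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, arg.offset);
}

So, for example, requesting I915_MMAP_OFFSET_WB on an object created with an lmem placement would now fail where it previously could succeed.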
On 6/25/21 2:27 PM, Matthew Auld wrote:
Same thing here as in the previous patch.
Also, do we need to modify i915_coherent_map_type() to also include HAS_SNOOP()?
While we're at it, that "always_coherent" argument to i915_coherent_map_type() appears scary to me and probably needs some documentation. It seems to be used for page-tables. Is it because we know those are always snooped?
/Thomas
On 28/06/2021 08:41, Thomas Hellström wrote:
Yeah, it's either because the caller has/will mark the pages as coherent (which translates to some special ppGTT bits), or we manually flush ourselves. In i915_coherent_map_type() we should account for DG1 somehow.
Historically I don't think we enabled snooping by default since it's considered slow compared to shared LLC. On DG1 this is a different story though.
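Maybe something roughly like the below (just a sketch, not the actual i915_coherent_map_type() implementation; it assumes the current obj + always_coherent signature sticks around and that an IS_DG1()/HAS_SNOOP() style check is even the right gate):

/*
 * Rough sketch only of how the map-type selection could account for DG1,
 * where snooping is always on. Whether IS_DG1()/HAS_SNOOP() is the correct
 * condition still needs auditing against the clflush paths.
 */
static enum i915_map_type coherent_map_type_sketch(struct drm_i915_private *i915,
						   struct drm_i915_gem_object *obj,
						   bool always_coherent)
{
	/* lmem is iomem and gets mapped WC regardless */
	if (i915_gem_object_is_lmem(obj))
		return I915_MAP_WC;

	/* Shared LLC, always-snooped DG1, or callers that handle coherency themselves */
	if (HAS_LLC(i915) || IS_DG1(i915) || always_coherent)
		return I915_MAP_WB;

	return I915_MAP_WC;
}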
Also, the pin_map() interface is pretty much only for kernel-internal objects, so I don't think we have any users which try to map userspace objects with that interface. OK, except for vm_access it seems, but that should hopefully be a simple fix to use the correct caching mode? We can maybe add some sanity checking there if someone tries to map a userspace object?
For all the other callers of pin_map(), which should all be kernel internal, do we still need to force WB for system memory? By design we only support a single mm.mapping there. For lmem we already use WC only.
Hi,
On 6/28/21 11:12 AM, Matthew Auld wrote:
I'm not fully sure that's sufficient, see below.
We're only allowed to map with the same caching mode as the kernel's linear mapping for discrete. Otherwise things may blow up on non-Intel architectures. We can probably update i915_ttm_select_tt_caching() to always use WB for system pages for kernel objects, but then we must make sure we don't try to map these WC.
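To make that concrete, something along these lines is what I have in mind (sketch only, not existing code; the is_kernel_internal flag is purely hypothetical):

/*
 * Sketch only: let the TT caching choice always be cached (WB) for
 * kernel-internal objects backed by system memory, so the pages never
 * disagree with the kernel linear map. The is_kernel_internal flag is a
 * hypothetical way of identifying such objects; pin_map() would then also
 * have to refuse I915_MAP_WC for these pages.
 */
static enum ttm_caching select_tt_caching_sketch(struct drm_i915_gem_object *obj,
						 bool is_kernel_internal)
{
	if (is_kernel_internal &&
	    i915_gem_object_placements_contain_type(obj, INTEL_MEMORY_SYSTEM))
		return ttm_cached;

	/* Caching policy for userspace objects is a separate question. */
	return ttm_write_combined;
}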
/Thomas
On 28/06/2021 10:38, Thomas Hellström wrote:
OK, do you think that should be a separate series? It looks like our internal objects don't use ttm(?). Should they? If so, should we make a region for it, or can we just make create_internal use the ttm system region? It should be pretty much the same, except we don't want swapping, clearing or eviction, and ideally we would have some way of marking the pages as volatile (I think we can just keep IS_SHRINKABLE for that).
Or can we keep create_internal as is and then it's just a case of dealing with all the pin_map() callers?
On 6/28/21 12:20 PM, Matthew Auld wrote:
For now, I think, the internal objects don't really need to use ttm, and if we want to move them over, as you say, it's just a matter of reusing the SYSTEM region with suitable flags. I think we should restrict this to deal with pin_map() for now, and a separate series would work fine.
Although there seem to be callers, at least in the selftests, that use pin_map() with create_shmem(), which maps to TTM system memory on discrete.
/Thomas
On 6/25/21 2:27 PM, Matthew Auld wrote:
A couple of questions:
1) Since this now becomes uapi, are we completely sure that we are not going to want to map these bos cached when they are evicted or migrated? If we think we are going to want to do that, we should probably just silently accept any mapping mode user-space wants and then have the kernel override user-space's wishes.
Ping Daniel about this as well.
2) How are we going to handle kernel maps to make sure we don't use a different kernel map caching mode for these objects? Will that be a later series? Seems like we at least need to modify i915_coherent_map_type().
Apart from this, code looks good to me.
/Thomas