While porting i915 to arm64 we noticed some issues accessing lmem. Some writes were getting corrupted and the final state of the buffer didn't have exactly what we wrote. This became evident when enabling GuC submission: depending on the number of engines, the ADS struct was being corrupted and GuC would reject it, refusing to initialize.
From Documentation/core-api/bus-virt-phys-mapping.rst:
This memory is called "PCI memory" or "shared memory" or "IO memory" or whatever, and there is only one way to access it: the readb/writeb and related functions. You should never take the address of such memory, because there is really nothing you can do with such an address: it's not conceptually in the same memory space as "real memory" at all, so you cannot just dereference a pointer. (Sadly, on x86 it **is** in the same memory space, so on x86 it actually works to just deference a pointer, but it's not portable).
When reading or writing words directly to IO memory, in order to be portable the Linux kernel provides the abstraction detailed in section "Differences between I/O access functions" of Documentation/driver-api/device-io.rst.
This limits our ability to simply overlay our structs on top of a buffer and access it directly, since that buffer may come from IO memory rather than system memory. Hence the approach taken in intel_guc_ads.c needs to be refactored. This is not the only place in i915 that needs to be changed, but it is the one causing the most problems, with a real reproducer. This first set of patches focuses on fixing the gem object used to pass the ADS to GuC.
After the addition of a few helpers in the dma_buf_map API, most of intel_guc_ads.c can be converted to use it. The exception is the regset initialization: we'd incur a lot of extra indirection when reading/writing each register. So the regset is converted to use a temporary buffer allocated on probe, which is then copied to its final location when finishing the initialization or on gt reset.
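To illustrate the idea (a rough sketch only; the names below are made up and do not match the actual patches in the series): the regset is built in a plain system-memory buffer and then published into the, possibly IO, ADS mapping in a single copy using the offset helper added in patch 1:

struct temp_regset {
	struct guc_mmio_reg *storage;	/* plain kmalloc'ed buffer, filled during init */
	u32 used;
};

/* copy the temporary regset into its final location inside the ADS mapping */
static void regset_publish(struct dma_buf_map *ads_map, size_t regset_offset,
			   const struct temp_regset *t)
{
	dma_buf_map_memcpy_to_offset(ads_map, regset_offset, t->storage,
				     t->used * sizeof(*t->storage));
}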
Testing on some discrete cards, after this change we can correctly pass the ADS struct to GuC and have it initialized correctly.
thanks Lucas De Marchi
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-kernel@vger.kernel.org
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Lucas De Marchi (19):
  dma-buf-map: Add read/write helpers
  dma-buf-map: Add helper to initialize second map
  drm/i915/gt: Add helper for shmem copy to dma_buf_map
  drm/i915/guc: Keep dma_buf_map of ads_blob around
  drm/i915/guc: Add read/write helpers for ADS blob
  drm/i915/guc: Convert golden context init to dma_buf_map
  drm/i915/guc: Convert policies update to dma_buf_map
  drm/i915/guc: Convert engine record to dma_buf_map
  dma-buf-map: Add wrapper over memset
  drm/i915/guc: Convert guc_ads_private_data_reset to dma_buf_map
  drm/i915/guc: Convert golden context prep to dma_buf_map
  drm/i915/guc: Replace check for golden context size
  drm/i915/guc: Convert mapping table to dma_buf_map
  drm/i915/guc: Convert capture list to dma_buf_map
  drm/i915/guc: Prepare for error propagation
  drm/i915/guc: Use a single pass to calculate regset
  drm/i915/guc: Convert guc_mmio_reg_state_init to dma_buf_map
  drm/i915/guc: Convert __guc_ads_init to dma_buf_map
  drm/i915/guc: Remove plain ads_blob pointer
 drivers/gpu/drm/i915/gt/shmem_utils.c          |  32 ++
 drivers/gpu/drm/i915/gt/shmem_utils.h          |   3 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h         |  14 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c     | 374 +++++++++++-------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h     |   3 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c  |  11 +-
 include/linux/dma-buf-map.h                    | 127 ++++++
 7 files changed, 405 insertions(+), 159 deletions(-)
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
+
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
+
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
+
 #endif /* __DMA_BUF_MAP_H__ */
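For reference, a minimal usage sketch of the two field macros added above (the struct and field names here are made up for illustration and are not part of the patch):

struct fw_blob {
	u32 version;
	u64 base_addr;
};

static void example(struct dma_buf_map *map)
{
	u32 ver;

	/* write a single field at its offset within the mapped layout */
	dma_buf_map_write_field(map, struct fw_blob, base_addr, 0x100000);

	/* read it back the same way, regardless of system vs IO memory */
	ver = dma_buf_map_read_field(map, struct fw_blob, version);
	(void)ver;
}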
On 26.01.22 21:36, Lucas De Marchi wrote:
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
+
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
Well that's certainly a valid use case, but I suggest to change the implementation of the existing functions to call the new ones with offset=0.
This way we only have one implementation.
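For illustration, the suggested cleanup could look roughly like this (a sketch only, built on the offset helper added in this patch; the existing dma_buf_map_memcpy_to() keeps its current signature):

static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *src, size_t len)
{
	/* keep a single implementation: a plain copy is just offset 0 */
	dma_buf_map_memcpy_to_offset(dst, 0, src, len);
}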
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
Uff well that absolutely looks like overkill to me.
That's a rather special use case as far as I can see and I think we should only have this in the common framework if more than one driver is using it.
Regards, Christian.
#endif /* __DMA_BUF_MAP_H__ */
On Thu, Jan 27, 2022 at 08:24:04AM +0100, Christian König wrote:
On 26.01.22 21:36, Lucas De Marchi wrote:
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
+
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
Well that's certainly a valid use case, but I suggest to change the implementation of the existing functions to call the new ones with offset=0.
This way we only have one implementation.
Trivial - but I agree with Christian that this is a good cleanup.
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
Uff well that absolutely looks like overkill to me.
Hold on...
That's a rather special use case as far as I can see and I think we should only have this in the common framework if more than one driver is using it.
I disagree, this is rather elegant.
The i915 can't be the *only* driver that defines a struct which describes the layout of a dma_buf object.
IMO this base macro allows *all* other drivers to build on this and write directly to fields in structures those drivers have defined. Patches later in this series do this for the GuC ADS.
Matt
Regards, Christian.
#endif /* __DMA_BUF_MAP_H__ */
On 27.01.22 08:36, Matthew Brost wrote:
[SNIP]
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
Uff well that absolutely looks like overkill to me.
Hold on...
That's a rather special use case as far as I can see and I think we should only have this in the common framework if more than one driver is using it.
I disagree, this is rather elegant.
The i915 can't be the *only* driver that defines a struct which describes the layout of a dma_buf object.
That's not the problem, amdgpu as well as nouveau are doing that as well. The problem is DMA-buf is a buffer sharing framework between drivers.
In other words which importer is supposed to use this with a DMA-buf exported by another device?
IMO this base macro allows *all* other drivers to build on this and write directly to fields in structures those drivers have defined.
Exactly that's the point. This is something drivers should absolutely *NOT* do.
Those are driver internals and it is extremely questionable to move this into the common framework.
Regards, Christian.
Patches later in this series do this for the GuC ads.
Matt
Regards, Christian.
#endif /* __DMA_BUF_MAP_H__ */
On Thu, Jan 27, 2022 at 08:59:36AM +0100, Christian König wrote:
On 27.01.22 08:36, Matthew Brost wrote:
[SNIP]
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
Uff well that absolutely looks like overkill to me.
Hold on...
That's a rather special use case as far as I can see and I think we should only have this in the common framework if more than one driver is using it.
I disagree, this is rather elegant.
The i915 can't be the *only* driver that defines a struct which describes the layout of a dma_buf object.
That's not the problem, amdgpu as well as nouveau are doing that as well. The problem is DMA-buf is a buffer sharing framework between drivers.
In other words which importer is supposed to use this with a DMA-buf exported by another device?
IMO this base macro allows *all* other drivers to build on this and write directly to fields in structures those drivers have defined.
Exactly that's the point. This is something drivers should absolutely *NOT* do.
Those are driver internals and it is extremely questionable to move this into the common framework.
See my other reply.
This is about struct dma_buf_map, which is just a tagged pointer.
Which happens to be used by the dma_buf cross-driver interface, but it's also used plenty internally in buffer allocation helpers, fbdev, everything else. And it was _meant_ to be used like that - this thing is my idea, I know :-)
I guess we could move/rename it, but like I said I really don't have any good ideas. Got some? -Daniel
Hi
On 26.01.22 21:36, Lucas De Marchi wrote:
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
Please don't add a new function. Rather please add the offset parameter to dma_buf_map_memcpy_to() and update the callers. There are only two calls to dma_buf_map_memcpy_to() within the kernel. To make it clear what the offset applies to, I'd call the parameter 'dst_offset'.
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
With the dma_buf_map_memcpy_to() changes, please just call this function dma_buf_map_memcpy_from().
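Taken together, the two suggestions would amount to something like the following sketch (parameter names follow the review comments; the two existing callers of dma_buf_map_memcpy_to() would simply pass a 0 offset):

static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, size_t dst_offset,
					 const void *src, size_t len)
{
	if (dst->is_iomem)
		memcpy_toio(dst->vaddr_iomem + dst_offset, src, len);
	else
		memcpy(dst->vaddr + dst_offset, src, len);
}

static inline void dma_buf_map_memcpy_from(void *dst, const struct dma_buf_map *src,
					   size_t src_offset, size_t len)
{
	if (src->is_iomem)
		memcpy_fromio(dst, src->vaddr_iomem + src_offset, len);
	else
		memcpy(dst, src->vaddr + src_offset, len);
}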
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
As the original author of this file, I feel like this shouldn't be here. At least not until we have another driver using that pattern.
Best regards Thomas
 #endif /* __DMA_BUF_MAP_H__ */
On Thu, Jan 27, 2022 at 03:26:43PM +0100, Thomas Zimmermann wrote:
Hi
On 26.01.22 21:36, Lucas De Marchi wrote:
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
Please don't add a new function. Rather please add the offset parameter to dma_buf_map_memcpy_to() and update the callers. There are only two calls to dma_buf_map_memcpy_to() within the kernel. To make it clear what the offset applies to, I'd call the parameter 'dst_offset'.
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
With the dma_buf_map_memcpy_to() changes, please just call this function dma_buf_map_memcpy_from().
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
As the original author of this file, I feel like this shouldn't be here. At least not until we have another driver using that pattern.
Let me try to clear up the confusion. Then maybe I can extend the documentation of this function in v2 if I'm able to convince you this is useful here.

This is not about importer/exporter, or having this work cross-driver. This is about using dma_buf_map (which we are talking about renaming to iosys_map or something else) for driver-internal allocations/abstractions. The abstraction added by iosys_map helps with sharing the same functions we had before. And this macro here is very useful when the buffer is described by a struct layout. Example:
struct bla {
	struct inner inner1;
	struct inner inner2;
	u32 x, y, z;
};
Functions that would previously do:
struct bla *bla = ...;

bla->x = 100;
bla->y = 200;
bla->inner1.inner_inner_field = 30;
Can do the below, having the system/IO memory abstracted away (calling it iosys_map here instead of dma_buf_map, hopeful it helps):
struct iosys_map *map = ...;

iosys_map_write_field(map, struct bla, x, 100);
iosys_map_write_field(map, struct bla, y, 200);
iosys_map_write_field(map, struct bla, inner1.inner_inner_field, 30);
When we are using mostly the same map, the individual drivers can add quick helpers on top. See the ads_blob_write() added in this series, which guarantees the map it's working on is always the guc->ads_map, while reducing verbosity to use the API. From patch "drm/i915/guc: Add read/write helpers for ADS blob":
#define ads_blob_read(guc_, field_)					\
	dma_buf_map_read_field(&(guc_)->ads_map, struct __guc_ads_blob,	\
			       field_)

#define ads_blob_write(guc_, field_, val_)				\
	dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\
				field_, val_)
So in intel_guc_ads, we can have a lot of:
-	bla->x = 100;
+	ads_blob_write(guc, x, 10);
thanks Lucas De Marchi
Hi
On 27.01.22 17:34, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 03:26:43PM +0100, Thomas Zimmermann wrote:
Hi
On 26.01.22 21:36, Lucas De Marchi wrote:
In certain situations it's useful to be able to read or write to an offset that is calculated by having the memory layout given by a struct declaration. Usually we are going to read/write a u8, u16, u32 or u64.
Add a pair of macros dma_buf_map_read_field()/dma_buf_map_write_field() to calculate the offset of a struct member and memcpy the data from/to the dma_buf_map. We could use readb, readw, readl, readq and the write* counterparts, however due to alignment issues this may not work on all architectures. If alignment needs to be checked to call the right function, it's not possible to decide at compile-time which function to call: so just leave the decision to the memcpy function that will do exactly that on IO memory or dereference the pointer.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 81 +++++++++++++++++++++++++++++++++++++ 1 file changed, 81 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 19fa0b5ae5ec..65e927d9ce33 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -6,6 +6,7 @@
 #ifndef __DMA_BUF_MAP_H__
 #define __DMA_BUF_MAP_H__

+#include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/string.h>

@@ -229,6 +230,46 @@ static inline void dma_buf_map_clear(struct dma_buf_map *map)
 	}
 }

+/**
+ * dma_buf_map_memcpy_to_offset - Memcpy into offset of dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @offset: The offset to which to copy
+ * @src: The source buffer
+ * @len: The number of bytes in src
+ *
+ * Copies data into a dma-buf mapping with an offset. The source buffer is in
+ * system memory. Depending on the buffer's location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_to_offset(struct dma_buf_map *dst, size_t offset,
+						const void *src, size_t len)
+{
+	if (dst->is_iomem)
+		memcpy_toio(dst->vaddr_iomem + offset, src, len);
+	else
+		memcpy(dst->vaddr + offset, src, len);
+}
Please don't add a new function. Rather please add the offset parameter to dma_buf_map_memcpy_to() and update the callers. There are only two calls to dma_buf_map_memcpy_to() within the kernel. To make it clear what the offset applies to, I'd call the parameter 'dst_offset'.
+/**
+ * dma_buf_map_memcpy_from_offset - Memcpy from offset of dma-buf mapping into system memory
+ * @dst: Destination in system memory
+ * @src: The dma-buf mapping structure
+ * @offset: The offset from which to copy
+ * @len: The number of bytes to copy
+ *
+ * Copies data from a dma-buf mapping with an offset. The destination buffer is
+ * in system memory. Depending on the mapping location, the helper picks the
+ * correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memcpy_from_offset(void *dst, const struct dma_buf_map *src,
+						  size_t offset, size_t len)
+{
+	if (src->is_iomem)
+		memcpy_fromio(dst, src->vaddr_iomem + offset, len);
+	else
+		memcpy(dst, src->vaddr + offset, len);
+}
With the dma_buf_map_memcpy_to() changes, please just call this function dma_buf_map_memcpy_from().
 /**
  * dma_buf_map_memcpy_to - Memcpy into dma-buf mapping
  * @dst: The dma-buf mapping structure
@@ -263,4 +304,44 @@ static inline void dma_buf_map_incr(struct dma_buf_map *map, size_t incr)
 	map->vaddr += incr;
 }

+/**
+ * dma_buf_map_read_field - Read struct member from dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to read
+ * @field__: Member from struct we want to read
+ *
+ * Read a value from dma-buf mapping calculating the offset and size: this assumes
+ * the dma-buf mapping is aligned with a struct type__. A single u8, u16, u32
+ * or u64 can be read, based on the offset and size of type__.field__.
+ */
+#define dma_buf_map_read_field(map__, type__, field__) ({			\
+	type__ *t__;								\
+	typeof(t__->field__) val__;						\
+	dma_buf_map_memcpy_from_offset(&val__, map__, offsetof(type__, field__), \
+				       sizeof(t__->field__));			\
+	val__;									\
+})
+/**
+ * dma_buf_map_write_field - Write struct member to the dma-buf mapping with
+ * arbitrary size and handling un-aligned accesses
+ *
+ * @map__: The dma-buf mapping structure
+ * @type__: The struct to be used containing the field to write
+ * @field__: Member from struct we want to write
+ * @val__: Value to be written
+ *
+ * Write a value to the dma-buf mapping calculating the offset and size.
+ * A single u8, u16, u32 or u64 can be written based on the offset and size of
+ * type__.field__.
+ */
+#define dma_buf_map_write_field(map__, type__, field__, val__) ({		\
+	type__ *t__;								\
+	typeof(t__->field__) val____ = val__;					\
+	dma_buf_map_memcpy_to_offset(map__, offsetof(type__, field__),		\
+				     &val____, sizeof(t__->field__));		\
+})
As the original author of this file, I feel like this shouldn't be here. At least not until we have another driver using that pattern.
Let me try to clear out the confusion. Then maybe I can extend the documentation of this function in v2 if I'm able to convince this is useful here.
This is not about importer/exporter, having this to work cross-driver. This is about using dma_buf_map (which we are talking about on renaming to iosys_map or something else) for inner driver allocations/abstractions. The abstraction added by iosys_map helps on sharing the same functions we had before. And this macro here is very useful when the buffer is described by a struct layout. Example:
struct bla { struct inner inner1; struct inner inner2; u32 x, y ,z; };
Functions that would previously do:
struct bla *bla = ...;
bla->x = 100; bla->y = 200; bla->inner1.inner_inner_field = 30;
Can do the below, having the system/IO memory abstracted away (calling it iosys_map here instead of dma_buf_map, hopeful it helps):
struct iosys_map *map = ...;
Please don't start renaming anything here. If we want to do this, let's have a separate mail thread for coloring the bike shed.
iosys_map_write_field(map, struct bla, x, 100); iosys_map_write_field(map, struct bla, y, 200); iosys_map_write_field(map, struct bla, inner1.inner_inner_field, 30);
I don't have strong feelings about these macros. They just seemed not needed in general. But if we want to add them here, I'd like to propose a few small changes.
Again, please add an offset parameter for the map's pointer.
Then I'd call them either dma_buf_map_rd/dma_buf_map_wr for read/write OR dma_buf_map_ld/dma_buf_map_st for load/store. They should take a C type. Something like this
dma_buf_map_wr(map, offset, int32, 0x01234);
val = dma_buf_map_rd(map, offset, int32);
Hopefully, that's flexible enough for all users. On top of that, you can build additional helpers like dma_buf_map_rd_field() and dma_buf_map_wr_field().
Ok?
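One possible reading of that proposal, sketched on top of the offset-based copy helpers from this series (names and exact form are only an assumption for illustration, not an agreed design):

#define dma_buf_map_rd(map__, offset__, type__) ({				\
	type__ val__;								\
	dma_buf_map_memcpy_from_offset(&val__, map__, offset__, sizeof(val__));\
	val__;									\
})

#define dma_buf_map_wr(map__, offset__, type__, val__) ({			\
	type__ v__ = (val__);							\
	dma_buf_map_memcpy_to_offset(map__, offset__, &v__, sizeof(v__));	\
})

/* field helpers can then be layered on top of the typed accessors */
#define dma_buf_map_rd_field(map__, struct_offset__, struct_type__, field__)	\
	dma_buf_map_rd(map__,							\
		       (struct_offset__) + offsetof(struct_type__, field__),	\
		       typeof(((struct_type__ *)0)->field__))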
Best regards Thomas
When we are using mostly the same map, the individual drivers can add quick helpers on top. See the ads_blob_write() added in this series, which guarantees the map it's working on is always the guc->ads_map, while reducing verbosity to use the API. From patch "drm/i915/guc: Add read/write helpers for ADS blob":
#define ads_blob_read(guc_, field_) \ dma_buf_map_read_field(&(guc_)->ads_map, struct __guc_ads_blob, \ field_)
#define ads_blob_write(guc_, field_, val_) \ dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\ field_, val_)
So in intel_guc_ads, we can have a lot of:
- bla->x = 100; + ads_blob_write(guc, x, 10);
thanks Lucas De Marchi
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and adds the offset to the proper address.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 include/linux/dma-buf-map.h | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 65e927d9ce33..3514a859f628 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -131,6 +131,35 @@ struct dma_buf_map {
 		.is_iomem = false, \
 	}

+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_: The dma-buf mapping structure to copy from
+ * @offset_: Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_map based on another one. This is the equivalent of doing:
+ *
+ * .. code-block: c
+ *
+ *	struct dma_buf_map map = other_map;
+ *	dma_buf_map_incr(&map, offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ *	void foo(struct device *dev, struct dma_buf_map *base_map)
+ *	{
+ *		...
+ *		struct dma_buf_map map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ *		...
+ *	}
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map)	\
+	{								\
+		.vaddr = (map_)->vaddr + (offset_),			\
+		.is_iomem = (map_)->is_iomem,				\
+	}
+
 /**
  * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory
  * @map: The dma-buf mapping structure
On 26.01.22 21:36, Lucas De Marchi wrote:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Well what you propose here can lead to all kind of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
If you need to copy only a certain subset of the mapping use the functions you added in patch #1.
Regards, Christian.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h
index 65e927d9ce33..3514a859f628 100644
--- a/include/linux/dma-buf-map.h
+++ b/include/linux/dma-buf-map.h
@@ -131,6 +131,35 @@ struct dma_buf_map {
 		.is_iomem = false, \
 	}

+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_: The dma-buf mapping structure to copy from
+ * @offset_: Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_map based on another one. This is the equivalent of doing:
+ *
+ * .. code-block: c
+ *
+ *	struct dma_buf_map map = other_map;
+ *	dma_buf_map_incr(&map, offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ *	void foo(struct device *dev, struct dma_buf_map *base_map)
+ *	{
+ *		...
+ *		struct dma_buf_map map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ *		...
+ *	}
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map)	\
+	{								\
+		.vaddr = (map_)->vaddr + (offset_),			\
+		.is_iomem = (map_)->is_iomem,				\
+	}
+
 /**
  * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory
  * @map: The dma-buf mapping structure
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
On 26.01.22 21:36, Lucas De Marchi wrote:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Well what you propose here can lead to all kind of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
hmm... not sure if I was clear. There is no importer and exporter here. There is a delegation of roles for filling out and reading a buffer when that buffer represents a struct layout.

struct bla {
	int a;
	int b;
	int c;
	struct foo foo;
	struct bar bar;
	int d;
};
This implementation allows you to have:
fill_foo(struct dma_buf_map *bla_map) { ... }
fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're working on is relative to the struct they're supposed to write/read. Otherwise you're suggesting everything be relative to struct bla, or doing the same thing I'm doing, but IMO in a way more prone to error:

struct dma_buf_map map = *bla_map;
dma_buf_map_incr(&map, offsetof(...));

IMO this construct is worse because at some point in the function the map is pointing to something other than what the function is supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
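A concrete sketch of that pattern, combining the DMA_BUF_MAP_INIT_OFFSET helper from patch 2 with the field macros from patch 1 (the field names "a" and "enabled" and the helper names are hypothetical, for illustration only):

/* only ever sees a map that is already relative to struct foo */
static void fill_foo(struct dma_buf_map *foo_map)
{
	dma_buf_map_write_field(foo_map, struct foo, enabled, 1);
}

static void fill_bla(struct dma_buf_map *bla_map)
{
	struct dma_buf_map foo_map =
		DMA_BUF_MAP_INIT_OFFSET(bla_map, offsetof(struct bla, foo));

	/* update the global part of the layout... */
	dma_buf_map_write_field(bla_map, struct bla, a, 0);
	/* ...and delegate the embedded struct to its own helper */
	fill_foo(&foo_map);
}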
thanks Lucas De Marchi
On 27.01.22 08:57, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
On 26.01.22 21:36, Lucas De Marchi wrote:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Well what you propose here can lead to all kind of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Regards, Christian.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla { int a; int b; int c; struct foo foo; struct bar bar; int d; }
This implementation allows you to have:
fill_foo(struct dma_buf_map *bla_map) { ... } fill_bar(struct dma_buf_map *bla_map) { ... }
and the first thing these do is to make sure the map it's pointing to is relative to the struct it's supposed to write/read. Otherwise you're suggesting everything to be relative to struct bla, or to do the same I'm doing it, but IMO more prone to error:
struct dma_buf_map map = *bla_map; dma_buf_map_incr(map, offsetof(...));
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
On 27.01.22 08:57, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
On 26.01.22 21:36, Lucas De Marchi wrote:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Well what you propose here can lead to all kind of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
s/you guys/you/ if you have to blame anyone - I'm the only s-o-b in these patches. I'm sending these to _build consensus_ on what may be a good use for it showing a real problem it's helping to fix.
From its documentation:
* The type :c:type:`struct dma_buf_map <dma_buf_map>` and its helpers are * actually independent from the dma-buf infrastructure. When sharing buffers * among devices, drivers have to know the location of the memory to access * the buffers in a safe way. :c:type:`struct dma_buf_map <dma_buf_map>` * solves this problem for dma-buf and its users. If other drivers or * sub-systems require similar functionality, the type could be generalized * and moved to a more prominent header file.
if there is no consensus and a better alternative, I'm perfectly fine in throwing it out and using the better approach.
Lucas De Marchi
On 27.01.22 09:18, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
On 27.01.22 08:57, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
On 26.01.22 21:36, Lucas De Marchi wrote:
[SNIP]
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
s/you guys/you/ if you have to blame anyone - I'm the only s-o-b in these patches. I'm sending these to _build consensus_ on what may be a good use for it showing a real problem it's helping to fix.
Well a cover letter would have been helpful, my impression was that you have a larger set and just want to upstream some minor DMA-buf changes necessary for it.
Now I know why people are bugging me all the time to add cover letters to add more context to my sets.
From its documentation:
* The type :c:type:`struct dma_buf_map <dma_buf_map>` and its helpers are * actually independent from the dma-buf infrastructure. When sharing buffers * among devices, drivers have to know the location of the memory to access * the buffers in a safe way. :c:type:`struct dma_buf_map <dma_buf_map>` * solves this problem for dma-buf and its users. If other drivers or * sub-systems require similar functionality, the type could be generalized * and moved to a more prominent header file.
if there is no consensus and a better alternative, I'm perfectly fine in throwing it out and using the better approach.
When Thomas Zimmermann upstreamed the dma_buf_map work we had a discussion if that shouldn't be independent of the DMA-buf framework.
The consensus was that as soon as we have more widely use for it this should be made independent. So basically that is what's happening now.
I suggest the following approach:
1. Find a funky name for this, something like iomem_, kiomap_ or similar.
2. Separate this from all your driver-dependent work and move the dma_buf_map structure out of DMA-buf into this new whatever_ prefix.
3. Ping Thomas, LKML, me and probably a couple of other core people on whether this is the right idea or not.
4. Work on dropping the map parameter from dma_buf_vunmap(). This is basically why we can't modify the pointers returned from dma_buf_vmap() and it has already caused a few problems with dma_buf_map_incr().
Regards, Christian.
Lucas De Marchi
On Thu, Jan 27, 2022 at 09:55:05AM +0100, Christian König wrote:
Am 27.01.22 um 09:18 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
[SNIP]
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
s/you guys/you/ if you have to blame anyone - I'm the only s-o-b in these patches. I'm sending these to _build consensus_ on what may be a good use for it showing a real problem it's helping to fix.
Well a cover letter would have been helpful, my impression was that you have a larger set and just want to upstream some minor DMA-buf changes necessary for it.
I missed adding this sentence to the cover letter, as my impression was that dma-buf-map was already used outside the inter-driver framework. But there is actually a cover letter:
https://lore.kernel.org/all/20220126203702.1784589-1-lucas.demarchi@intel.co...
And looking at it now, it seems I missed adding Thomas Zimmermann to Cc.
Now I know why people are bugging me all the time to add cover letters to add more context to my sets.
From its documentation:
* The type :c:type:`struct dma_buf_map <dma_buf_map>` and its helpers are
* actually independent from the dma-buf infrastructure. When sharing buffers
* among devices, drivers have to know the location of the memory to access
* the buffers in a safe way. :c:type:`struct dma_buf_map <dma_buf_map>`
* solves this problem for dma-buf and its users. If other drivers or
* sub-systems require similar functionality, the type could be generalized
* and moved to a more prominent header file.
if there is no consensus and a better alternative, I'm perfectly fine in throwing it out and using the better approach.
When Thomas Zimmermann upstreamed the dma_buf_map work we had a discussion if that shouldn't be independent of the DMA-buf framework.
The consensus was that as soon as we have more widespread use for it, this should be made independent. So basically that is what's happening now.
I suggest the following approach:
- Find a funky name for this, something like iomem_, kiomap_ or similar.
iosys_map?
- Separate this from all your driver-dependent work and move the
dma_buf_map structure out of DMA-buf into this new whatever_ prefix.
should this be a follow up to the driver work or a prerequisite?
thanks Lucas De Marchi
- Ping Thomas, LKML, me and probably a couple of other core people on whether this is the right idea or not.

- Work on dropping the map parameter from dma_buf_vunmap(). This is basically why we can't modify the pointers returned from dma_buf_vmap(), and it has already caused a few problems with dma_buf_map_incr().
Regards, Christian.
Lucas De Marchi
Am 27.01.22 um 10:12 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 09:55:05AM +0100, Christian König wrote:
Am 27.01.22 um 09:18 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi: > [SNIP]
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
s/you guys/you/ if you have to blame anyone - I'm the only s-o-b in these patches. I'm sending these to _build consensus_ on what may be a good use for it showing a real problem it's helping to fix.
Well a cover letter would have been helpful, my impression was that you have a larger set and just want to upstream some minor DMA-buf changes necessary for it.
I missed adding this sentence to the cover letter, as my impression was that dma-buf-map was already used outside the inter-driver framework. But there is actually a cover letter:
https://lore.kernel.org/all/20220126203702.1784589-1-lucas.demarchi@intel.co...
And looking at it now, it seems I missed adding Thomas Zimmermann to Cc.
Now I know why people are bugging me all the time to add cover letters to add more context to my sets.
From its documentation:
* The type :c:type:`struct dma_buf_map <dma_buf_map>` and its helpers are
* actually independent from the dma-buf infrastructure. When sharing buffers
* among devices, drivers have to know the location of the memory to access
* the buffers in a safe way. :c:type:`struct dma_buf_map <dma_buf_map>`
* solves this problem for dma-buf and its users. If other drivers or
* sub-systems require similar functionality, the type could be generalized
* and moved to a more prominent header file.
if there is no consensus and a better alternative, I'm perfectly fine in throwing it out and using the better approach.
When Thomas Zimmermann upstreamed the dma_buf_map work we had a discussion if that shouldn't be independent of the DMA-buf framework.
The consensus was that as soon as we have more widespread use for it, this should be made independent. So basically that is what's happening now.
I suggest the following approach:
- Find a funky name for this, something like iomem_, kiomap_ or
similar.
iosys_map?
Works for me.
- Separate this from all your driver-dependent work and move the
dma_buf_map structure out of DMA-buf into this new whatever_ prefix.
should this be a follow up to the driver work or a prerequisite?
Prerequisite. Structural changes like this should always be kept separate from the actual work of switching over to them, because the latter needs a much smaller audience for review.
Regards, Christian.
thanks Lucas De Marchi
- Ping Thomas, LKML, me and probably a couple of other core people on whether this is the right idea or not.

- Work on dropping the map parameter from dma_buf_vunmap(). This is basically why we can't modify the pointers returned from dma_buf_vmap(), and it has already caused a few problems with dma_buf_map_incr().
Regards, Christian.
Lucas De Marchi
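For reference, the decoupled type behind the iosys_map name could keep exactly the layout struct dma_buf_map has today; the following is only a sketch of that idea, not the final header:

        /* sketch: today's struct dma_buf_map layout under the proposed iosys_map name */
        struct iosys_map {
                union {
                        void __iomem *vaddr_iomem;      /* I/O memory */
                        void *vaddr;                    /* system memory */
                };
                bool is_iomem;
        };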
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When a dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.

Add a helper that copies the struct and adds the offset to the proper address.
Well, what you propose here can lead to all kinds of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros. -Daniel
IMO this construct is worse because, at some point during the function, the map was pointing to something other than what the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
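As a concrete illustration of the pattern described above, a minimal sketch of fill_foo() in the copy-and-increment style, using only the existing dma_buf_map_incr() and dma_buf_map_memcpy_to() helpers; struct bla/struct foo are the example types from the mail and fill_foo() itself is hypothetical:

        #include <linux/dma-buf-map.h>
        #include <linux/stddef.h>

        static void fill_foo(struct dma_buf_map *bla_map, const struct foo *src)
        {
                struct dma_buf_map foo_map = *bla_map;

                /* make the map relative to bla->foo before any access */
                dma_buf_map_incr(&foo_map, offsetof(struct bla, foo));
                dma_buf_map_memcpy_to(&foo_map, src, sizeof(*src));
        }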
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When a dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.

Add a helper that copies the struct and adds the offset to the proper address.
Well, what you propose here can lead to all kinds of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Lucas De Marchi
-Daniel
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
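For comparison, the same hypothetical fill_foo() written with the initializer-style macro Lucas argues for (DMA_BUF_MAP_INIT_OFFSET() from the patch quoted at the end of this thread); the offset is applied at declaration time, so forgetting the initialization becomes a compiler warning instead of a silent write to the wrong location:

        static void fill_foo(struct dma_buf_map *bla_map, const struct foo *src)
        {
                struct dma_buf_map foo_map =
                        DMA_BUF_MAP_INIT_OFFSET(bla_map, offsetof(struct bla, foo));

                dma_buf_map_memcpy_to(&foo_map, src, sizeof(*src));
        }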
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When a dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.

Add a helper that copies the struct and adds the offset to the proper address.
Well, what you propose here can lead to all kinds of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed. -Daniel
Lucas De Marchi
-Daniel
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi: > When dma_buf_map struct is passed around, it's useful to be able to > initialize a second map that takes care of reading/writing to an offset > of the original map. > > Add a helper that copies the struct and add the offset to the proper > address. Well what you propose here can lead to all kind of problems and is rather bad design as far as I can see.
The struct dma_buf_map is only to be filled in by the exporter and should not be modified in this way by the importer.
humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_vmap(), otherwise we vunmap() the wrong area!

For all TTM-based drivers this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Christian.
-Daniel
Lucas De Marchi
-Daniel
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
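To make the pitfall concrete, a minimal sketch of the vmap/vunmap flow being discussed, assuming the usual int-returning dma_buf_vmap() and reusing the struct bla example from earlier in the thread; the copy keeps the original map intact for dma_buf_vunmap():

        struct dma_buf_map map, field_map;
        int ret;

        ret = dma_buf_vmap(dmabuf, &map);
        if (ret)
                return ret;

        /* work on a copy so the original mapping stays untouched */
        field_map = map;
        dma_buf_map_incr(&field_map, offsetof(struct bla, foo));
        /* ... read/write through field_map ... */

        /* must be handed the unmodified map, not field_map */
        dma_buf_vunmap(dmabuf, &map);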
On Thu, Jan 27, 2022 at 11:21:20AM +0100, Christian König wrote:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: > Am 26.01.22 um 21:36 schrieb Lucas De Marchi: > > When dma_buf_map struct is passed around, it's useful to be able to > > initialize a second map that takes care of reading/writing to an offset > > of the original map. > > > > Add a helper that copies the struct and add the offset to the proper > > address. > Well what you propose here can lead to all kind of problems and is > rather bad design as far as I can see. > > The struct dma_buf_map is only to be filled in by the exporter and > should not be modified in this way by the importer. humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
For all TTM based driver this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Yeah but isn't that the same if it's just a void *?
If you pass the wrong pointer to an unmap function, and not exactly what you got from the map function, then things go boom. This is like complaining that the following code won't work:

        u32 *stuff;

        stuff = kmap_local(some_page);
        *stuff++ = 0;
        *stuff = 1;
        kunmap_local(stuff);
It's just ... don't do that :-) Also since we pass dma_buf_map by value and not by pointer anywhere, the risk of this happening is pretty low since you tend to work on a copy. Same with void * pointers really.
Now if people start to pass around struct dma_buf_map * as pointers for anything else than out parameters, then we're screwed. But that's like passing around void ** for lolz, which is just wrong (except when it's an out parameter or actually an array of pointers ofc).
Or I really don't get your concern and you mean something else? -Daniel
Christian.
-Daniel
Lucas De Marchi
-Daniel
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Am 27.01.22 um 12:16 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 11:21:20AM +0100, Christian König wrote:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi: > On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: >> Am 26.01.22 um 21:36 schrieb Lucas De Marchi: >>> When dma_buf_map struct is passed around, it's useful to be able to >>> initialize a second map that takes care of reading/writing to an offset >>> of the original map. >>> >>> Add a helper that copies the struct and add the offset to the proper >>> address. >> Well what you propose here can lead to all kind of problems and is >> rather bad design as far as I can see. >> >> The struct dma_buf_map is only to be filled in by the exporter and >> should not be modified in this way by the importer. > humn... not sure if I was clear. There is no importer and exporter here. Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
> There is a role delegation on filling out and reading a buffer when > that buffer represents a struct layout. > > struct bla { > int a; > int b; > int c; > struct foo foo; > struct bar bar; > int d; > } > > > This implementation allows you to have: > > fill_foo(struct dma_buf_map *bla_map) { ... } > fill_bar(struct dma_buf_map *bla_map) { ... } > > and the first thing these do is to make sure the map it's pointing to > is relative to the struct it's supposed to write/read. Otherwise you're > suggesting everything to be relative to struct bla, or to do the same > I'm doing it, but IMO more prone to error: > > struct dma_buf_map map = *bla_map; > dma_buf_map_incr(map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
For all TTM based driver this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Yeah but isn't that the same if it's just a void *?
If you pass the wrong pointer to an unmap function, and not exactly what you got from the map function, then things go boom. This is like complaining that the following code won't work:

        u32 *stuff;

        stuff = kmap_local(some_page);
        *stuff++ = 0;
        *stuff = 1;
        kunmap_local(stuff);
It's just ... don't do that :-) Also since we pass dma_buf_map by value and not by pointer anywhere, the risk of this happening is pretty low since you tend to work on a copy. Same with void * pointers really.
Now if people start to pass around struct dma_buf_map * as pointers for anything else than out parameters, then we're screwed. But that's like passing around void ** for lolz, which is just wrong (except when it's an out parameter or actually an array of pointers ofc).
Or I really don't get your concern and you mean something else?
No that's pretty much it. It's just that we hide the pointer inside a structure and it is absolutely not obvious to a driver dev that you can't do:
        dma_buf_vmap(.., &map);
        dma_buf_map_incr(&map, x);
        dma_buf_vunmap(.., &map);

As a bare minimum I strongly suggest that we add some WARN_ONs to the framework to check that the pointer given to dma_buf_vunmap() is at least page aligned.
Christian.
-Daniel
Christian.
-Daniel
Lucas De Marchi
-Daniel
> IMO this construct is worse because at a point in time in the function > the map was pointing to the wrong thing the function was supposed to > read/write. > > It's also useful when the function has double duty, updating a global > part of the struct and a table inside it (see example in patch 6) > > thanks > Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Thu, Jan 27, 2022 at 12:44:21PM +0100, Christian König wrote:
Am 27.01.22 um 12:16 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 11:21:20AM +0100, Christian König wrote:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote: > Am 27.01.22 um 08:57 schrieb Lucas De Marchi: > > On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: > > > Am 26.01.22 um 21:36 schrieb Lucas De Marchi: > > > > When dma_buf_map struct is passed around, it's useful to be able to > > > > initialize a second map that takes care of reading/writing to an offset > > > > of the original map. > > > > > > > > Add a helper that copies the struct and add the offset to the proper > > > > address. > > > Well what you propose here can lead to all kind of problems and is > > > rather bad design as far as I can see. > > > > > > The struct dma_buf_map is only to be filled in by the exporter and > > > should not be modified in this way by the importer. > > humn... not sure if I was clear. There is no importer and exporter here. > Yeah, and exactly that's what I'm pointing out as problem here. > > You are using the inter driver framework for something internal to the > driver. That is an absolutely clear NAK! > > We could discuss that, but you guys are just sending around patches to do > this without any consensus that this is a good idea. Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
> > There is a role delegation on filling out and reading a buffer when > > that buffer represents a struct layout. > > > > struct bla { > > int a; > > int b; > > int c; > > struct foo foo; > > struct bar bar; > > int d; > > } > > > > > > This implementation allows you to have: > > > > fill_foo(struct dma_buf_map *bla_map) { ... } > > fill_bar(struct dma_buf_map *bla_map) { ... } > > > > and the first thing these do is to make sure the map it's pointing to > > is relative to the struct it's supposed to write/read. Otherwise you're > > suggesting everything to be relative to struct bla, or to do the same > > I'm doing it, but IMO more prone to error: > > > > struct dma_buf_map map = *bla_map; > > dma_buf_map_incr(map, offsetof(...)); Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
For all TTM based driver this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Yeah but isn't that the same if it's just a void *?
If you pass the wrong pointer to an unmap function, and not exactly what you got from the map function, then things go boom. This is like complaining that the following code won't work:

        u32 *stuff;

        stuff = kmap_local(some_page);
        *stuff++ = 0;
        *stuff = 1;
        kunmap_local(stuff);
It's just ... don't do that :-) Also since we pass dma_buf_map by value and not by pointer anywhere, the risk of this happening is pretty low since you tend to work on a copy. Same with void * pointers really.
Now if people start to pass around struct dma_buf_map * as pointers for anything else than out parameters, then we're screwed. But that's like passing around void ** for lolz, which is just wrong (except when it's an out parameter or actually an array of pointers ofc).
Or I really don't get your concern and you mean something else?
No that's pretty much it. It's just that we hide the pointer inside a structure and it is absolutely not obvious to a driver dev that you can't do:
        dma_buf_vmap(.., &map);
        dma_buf_map_incr(&map, x);
        dma_buf_vunmap(.., &map);

As a bare minimum I strongly suggest that we add some WARN_ONs to the framework to check that the pointer given to dma_buf_vunmap() is at least page aligned.
Yeah that might be a good idea. But then we also have to add that check to dma_buf_vmap, just in case a driver does something really funny :-) -Daniel
Christian.
-Daniel
Christian.
-Daniel
Lucas De Marchi
-Daniel
> > IMO this construct is worse because at a point in time in the function > > the map was pointing to the wrong thing the function was supposed to > > read/write. > > > > It's also useful when the function has double duty, updating a global > > part of the struct and a table inside it (see example in patch 6) > > > > thanks
> > Lucas De Marchi
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Thu, Jan 27, 2022 at 12:44:21PM +0100, Christian König wrote:
Am 27.01.22 um 12:16 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 11:21:20AM +0100, Christian König wrote:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote: >Am 27.01.22 um 08:57 schrieb Lucas De Marchi: >>On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: >>>Am 26.01.22 um 21:36 schrieb Lucas De Marchi: >>>>When dma_buf_map struct is passed around, it's useful to be able to >>>>initialize a second map that takes care of reading/writing to an offset >>>>of the original map. >>>> >>>>Add a helper that copies the struct and add the offset to the proper >>>>address. >>>Well what you propose here can lead to all kind of problems and is >>>rather bad design as far as I can see. >>> >>>The struct dma_buf_map is only to be filled in by the exporter and >>>should not be modified in this way by the importer. >>humn... not sure if I was clear. There is no importer and exporter here. >Yeah, and exactly that's what I'm pointing out as problem here. > >You are using the inter driver framework for something internal to the >driver. That is an absolutely clear NAK! > >We could discuss that, but you guys are just sending around patches to do >this without any consensus that this is a good idea. Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
>>There is a role delegation on filling out and reading a buffer when >>that buffer represents a struct layout. >> >>struct bla { >> int a; >> int b; >> int c; >> struct foo foo; >> struct bar bar; >> int d; >>} >> >> >>This implementation allows you to have: >> >> fill_foo(struct dma_buf_map *bla_map) { ... } >> fill_bar(struct dma_buf_map *bla_map) { ... } >> >>and the first thing these do is to make sure the map it's pointing to >>is relative to the struct it's supposed to write/read. Otherwise you're >>suggesting everything to be relative to struct bla, or to do the same >>I'm doing it, but IMO more prone to error: >> >> struct dma_buf_map map = *bla_map; >> dma_buf_map_incr(map, offsetof(...)); Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
For all TTM based driver this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Yeah but isn't that the same if it's just a void *?
If you pass the wrong pointer to an unmap function, and not exactly what you got from the map function, then things go boom. This is like complaining that the following code won't work:

        u32 *stuff;

        stuff = kmap_local(some_page);
        *stuff++ = 0;
        *stuff = 1;
        kunmap_local(stuff);
It's just ... don't do that :-) Also since we pass dma_buf_map by value and not by pointer anywhere, the risk of this happening is pretty low since you tend to work on a copy. Same with void * pointers really.
Now if people start to pass around struct dma_buf_map * as pointers for anything else than out parameters, then we're screwed. But that's like passing around void ** for lolz, which is just wrong (except when it's an out parameter or actually an array of pointers ofc).
Or I really don't get your concern and you mean something else?
No that's pretty much it. It's just that we hide the pointer inside a structure and it is absolutely not obvious to a driver dev that you can't do:
        dma_buf_vmap(.., &map);
        dma_buf_map_incr(&map, x);
        dma_buf_vunmap(.., &map);

As a bare minimum I strongly suggest that we add some WARN_ONs to the framework to check that the pointer given to dma_buf_vunmap() is at least page aligned.
Agreed, that should cover most of the cases. I can add a patch doing that.
thanks Lucas De Marchi
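One possible shape for that sanity check, sketched against the dma_buf_vunmap() signature quoted above; this is only an illustration of the idea, not the actual patch:

        void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
        {
                /* catch callers handing back a map advanced with dma_buf_map_incr() */
                if (map->is_iomem)
                        WARN_ON_ONCE(!PAGE_ALIGNED((void __force *)map->vaddr_iomem));
                else
                        WARN_ON_ONCE(!PAGE_ALIGNED(map->vaddr));

                /* ... existing unmap path ... */
        }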
Hi
Am 27.01.22 um 11:21 schrieb Christian König:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: > Am 26.01.22 um 21:36 schrieb Lucas De Marchi: >> When dma_buf_map struct is passed around, it's useful to be able to >> initialize a second map that takes care of reading/writing to an >> offset >> of the original map. >> >> Add a helper that copies the struct and add the offset to the >> proper >> address. > Well what you propose here can lead to all kind of problems and is > rather bad design as far as I can see. > > The struct dma_buf_map is only to be filled in by the exporter and > should not be modified in this way by the importer. humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
Indeed. It's always been a problem with that API, even when it still took raw pointers.
The IMHO correct solution would distinguish between a buffer (struct dma_buf_map) and a pointer into that buffer (struct dma_buf_ptr).
I don't feel like typing that.
Best regards Thomas
For all TTM based driver this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this made boom last year.
Christian.
-Daniel
Lucas De Marchi
-Daniel
IMO this construct is worse because at a point in time in the function the map was pointing to the wrong thing the function was supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
On Thu, Jan 27, 2022 at 11:21:20AM +0100, Christian König wrote:
Am 27.01.22 um 11:00 schrieb Daniel Vetter:
On Thu, Jan 27, 2022 at 01:33:32AM -0800, Lucas De Marchi wrote:
On Thu, Jan 27, 2022 at 09:57:25AM +0100, Daniel Vetter wrote:
On Thu, Jan 27, 2022 at 09:02:54AM +0100, Christian König wrote:
Am 27.01.22 um 08:57 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 08:27:11AM +0100, Christian König wrote: >Am 26.01.22 um 21:36 schrieb Lucas De Marchi: >>When dma_buf_map struct is passed around, it's useful to be able to >>initialize a second map that takes care of reading/writing to an offset >>of the original map. >> >>Add a helper that copies the struct and add the offset to the proper >>address. >Well what you propose here can lead to all kind of problems and is >rather bad design as far as I can see. > >The struct dma_buf_map is only to be filled in by the exporter and >should not be modified in this way by the importer. humn... not sure if I was clear. There is no importer and exporter here.
Yeah, and exactly that's what I'm pointing out as problem here.
You are using the inter driver framework for something internal to the driver. That is an absolutely clear NAK!
We could discuss that, but you guys are just sending around patches to do this without any consensus that this is a good idea.
Uh I suggested this, also we're already using dma_buf_map all over the place as a convenient abstraction. So imo that's all fine, it should allow drivers to simplify some code where on igpu it's in normal kernel memory and on dgpu it's behind some pci bar.
Maybe we should have a better name for that struct (and maybe also a better place), but way back when we discussed that bikeshed I didn't come up with anything better really.
I suggest iosys_map since it abstracts access to IO and system memory.
There is a role delegation on filling out and reading a buffer when that buffer represents a struct layout.
struct bla {
        int a;
        int b;
        int c;
        struct foo foo;
        struct bar bar;
        int d;
}

This implementation allows you to have:

        fill_foo(struct dma_buf_map *bla_map) { ... }
        fill_bar(struct dma_buf_map *bla_map) { ... }

and the first thing these do is to make sure the map they're pointing to is relative to the struct they're supposed to write/read. Otherwise you're suggesting that everything be relative to struct bla, or to do the same thing I'm doing, but in a way that is IMO more prone to error:

        struct dma_buf_map map = *bla_map;
        dma_buf_map_incr(&map, offsetof(...));
Wrt the issue at hand I think the above is perfectly fine code. The idea with dma_buf_map is really that it's just a special pointer, so writing the code exactly as pointer code feels best. Unfortunately you cannot make them typesafe (because of C), so the code sometimes looks a bit ugly. Otherwise we could do stuff like container_of and all that with typechecking in the macros.
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something that is initialized correctly from the start.

        struct dma_buf_map other_map = *bla_map;
        /* poor Lucas forgetting dma_buf_map_incr(&map, offsetof(...)); */

is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding.

While with the construct below

        struct dma_buf_map other_map;
        ...
        other_map = INITIALIZER()

I can rely on the compiler complaining about an uninitialized var. And in most of the cases I can just have this single line at the beginning of the function when the offset is constant:

        struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
Hm yeah that's a good point that this allows us to rely on the compiler to check for uninitialized variables.
Maybe include the above (with editing, but keeping the examples) in the kerneldoc to explain why/how to use this? With that the concept at least has my
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
I'll leave it up to you & Christian to find a prettier color choice for the naming bikeshed.
There is one major issue remaining with this and that is dma_buf_vunmap():
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
Here we expect the original pointer as returned by dma_buf_map(), otherwise we vunmap() the wrong area!
yeah... I think the most confusing aspect here is about the name.
void dma_buf_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map);
this function is the implementation of the dma_buf, not dma_buf_map, which is another thing entirely. I think the rename will be benefitial for this to be cleared out, because then it's more obvious the shallow copy of the map is the equivalent of having
u8 *p = buffer;
...
p += 10;
Etc. You can't kfree(p) and expect it to work.
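As a sketch of that point, using only the interfaces already quoted in this thread (error handling trimmed, the 10-byte offset is arbitrary):

	struct dma_buf_map map, cursor;

	if (dma_buf_vmap(dmabuf, &map))
		return;

	cursor = map;                   /* shallow copy, like u8 *p = buffer */
	dma_buf_map_incr(&cursor, 10);  /* advance the copy, never 'map' itself */
	/* ... read/write the buffer through 'cursor' ... */

	dma_buf_vunmap(dmabuf, &map);   /* must be handed the original map */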
Lucas De Marchi
For all TTM-based drivers this doesn't matter since we keep the vmap base separately in the BO anyway (IIRC), but we had at least one case where this went boom last year.
Christian.
-Daniel
Lucas De Marchi
-Daniel
IMO this construct is worse because, for part of the function, the map points to something other than what the function is supposed to read/write.
It's also useful when the function has double duty, updating a global part of the struct and a table inside it (see example in patch 6)
thanks Lucas De Marchi
-- Daniel Vetter Software Engineer, Intel Corporation https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll....
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When a dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and adds the offset to the proper address.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 65e927d9ce33..3514a859f628 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -131,6 +131,35 @@ struct dma_buf_map { .is_iomem = false, \ }
+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_: The dma-buf mapping structure to copy from
+ * @offset: Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_struct based on another. This is the equivalent of doing:
+ *
+ * .. code-block: c
+ *
+ *	dma_buf_map map = other_map;
+ *	dma_buf_map_incr(&map, &offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ *	void foo(struct device *dev, struct dma_buf_map *base_map)
+ *	{
+ *		...
+ *		struct dma_buf_map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ *		...
+ *	}
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map) \
+	{ \
+		.vaddr = (map_)->vaddr + (offset_), \
+		.is_iomem = (map_)->is_iomem, \
+	}
It's illegal to access .vaddr with raw pointer. Always use a dma_buf_memcpy_() interface. So why would you need this macro when you have dma_buf_memcpy_*() with an offset parameter?
I've also been very careful to distinguish between .vaddr and .vaddr_iomem, even in places where I wouldn't have to. This macro breaks the assumption.
Best regards Thomas
 /**
  * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory
  * @map: The dma-buf mapping structure
On Thu, Jan 27, 2022 at 03:33:12PM +0100, Thomas Zimmermann wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 65e927d9ce33..3514a859f628 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -131,6 +131,35 @@ struct dma_buf_map { .is_iomem = false, \ }
+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_: The dma-buf mapping structure to copy from
+ * @offset: Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_struct based on another. This is the equivalent of doing:
+ *
+ * .. code-block: c
+ *
+ *	dma_buf_map map = other_map;
+ *	dma_buf_map_incr(&map, &offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ *	void foo(struct device *dev, struct dma_buf_map *base_map)
+ *	{
+ *		...
+ *		struct dma_buf_map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ *		...
+ *	}
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map) \
+	{ \
+		.vaddr = (map_)->vaddr + (offset_), \
+		.is_iomem = (map_)->is_iomem, \
+	}
It's illegal to access .vaddr with raw pointer. Always use a dma_buf_memcpy_() interface. So why would you need this macro when you have dma_buf_memcpy_*() with an offset parameter?
I did a better job with an example in 20220127093332.wnkd2qy4tvwg5i5l@ldmartin-desk2
While doing this series I had code like this when using the API in a function to parse/update part of the mapped struct:
int bla_parse_foo(struct dma_buf_map *bla_map)
{
	struct dma_buf_map foo_map = *bla_map;
	...
	dma_buf_map_incr(&foo_map, offsetof(struct bla, foo));
	...
}
Pasting the rest of the reply here:
I had exactly this code above, but after writing quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something to initialize correctly from the start
struct dma_buf_map other_map = *bla_map; /* poor Lucas forgetting dma_buf_map_incr(map, offsetof(...)); */
is error prone and hard to debug, since you will be reading/writing from/to another location rather than exploding
While with the construct below
other_map; ... other_map = INITIALIZER()
I can rely on the compiler complaining about an uninitialized var. And in most cases I can just have this single line at the beginning of the function when the offset is constant:
struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
This is useful when you have several small functions in charge of updating/reading inner struct members.
I've also been very careful to distinguish between .vaddr and .vaddr_iomem, even in places where I wouldn't have to. This macro breaks the assumption.
That's one reason I think if we have this macro, it should be in the dma_buf_map.h header (or whatever we rename these APIs to). It's the only place where we can safely add code that relies on the implementation of the "private" fields in struct dma_buf_map.
Lucas De Marchi
Best regards Thomas
 /**
  * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory
  * @map: The dma-buf mapping structure
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Hi
Am 27.01.22 um 16:59 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 03:33:12PM +0100, Thomas Zimmermann wrote:
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
When dma_buf_map struct is passed around, it's useful to be able to initialize a second map that takes care of reading/writing to an offset of the original map.
Add a helper that copies the struct and add the offset to the proper address.
Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 65e927d9ce33..3514a859f628 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -131,6 +131,35 @@ struct dma_buf_map { .is_iomem = false, \ }
+/**
+ * DMA_BUF_MAP_INIT_OFFSET - Initializes struct dma_buf_map from another dma_buf_map
+ * @map_: The dma-buf mapping structure to copy from
+ * @offset: Offset to add to the other mapping
+ *
+ * Initializes a new dma_buf_struct based on another. This is the equivalent of doing:
+ *
+ * .. code-block: c
+ *
+ *	dma_buf_map map = other_map;
+ *	dma_buf_map_incr(&map, &offset);
+ *
+ * Example usage:
+ *
+ * .. code-block: c
+ *
+ *	void foo(struct device *dev, struct dma_buf_map *base_map)
+ *	{
+ *		...
+ *		struct dma_buf_map = DMA_BUF_MAP_INIT_OFFSET(base_map, FIELD_OFFSET);
+ *		...
+ *	}
+ */
+#define DMA_BUF_MAP_INIT_OFFSET(map_, offset_) (struct dma_buf_map) \
+	{ \
+		.vaddr = (map_)->vaddr + (offset_), \
+		.is_iomem = (map_)->is_iomem, \
+	}
It's illegal to access .vaddr with raw pointer. Always use a dma_buf_memcpy_() interface. So why would you need this macro when you have dma_buf_memcpy_*() with an offset parameter?
I did a better job with an example in 20220127093332.wnkd2qy4tvwg5i5l@ldmartin-desk2
While doing this series I had code like this when using the API in a function to parse/update part of the struct mapped:
int bla_parse_foo(struct dma_buf_map *bla_map) { struct dma_buf_map foo_map = *bla_map; ...
dma_buf_map_incr(&foo_map, offsetof(struct bla, foo));
... }
Pasting the rest of the reply here:
I had exactly this code above, but after writting quite a few patches using it, particularly with functions that have to write to 2 maps (see patch 6 for example), it felt much better to have something to initialize correctly from the start
struct dma_buf_map other_map = *bla_map; /* poor Lucas forgetting dma_buf_map_incr(map, offsetof(...)); */
is error prone and hard to debug since you will be reading/writting from/to another location rather than exploding
Indeed. We have some very specific use cases in graphics code where dma_buf_map_incr() makes sense. But it's really bad for others. I guess the docs should talk about this.
While with the construct below
other_map; ... other_map = INITIALIZER()
I can rely on the compiler complaining about uninitialized var. And in most of the cases I can just have this single line in the beggining of the function when the offset is constant:
struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
This is useful when you have several small functions in charge of updating/reading inner struct members.
You won't need an extra variable or the initializer macro if you add an offset parameter to dma_buf_memcpy_{from,to}. Simply pass offsetof(..) to that parameter and it will do the right thing.
It avoids the problems of the current macro and is even more flexible. On top of that, you can build whatever convenience macros you need for i915.
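For illustration, the suggestion amounts to something like the sketch below; this is not the helper as posted in this series (which, as quoted earlier, takes no offset), just the shape being argued for:

static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, size_t dst_offset,
					 const void *src, size_t len)
{
	if (dst->is_iomem)
		memcpy_toio(dst->vaddr_iomem + dst_offset, src, len);
	else
		memcpy(dst->vaddr + dst_offset, src, len);
}

A caller that previously created a temporary map would then just do dma_buf_map_memcpy_to(bla_map, offsetof(struct bla, foo), src, len).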
Best regards Thomas
I've also been very careful to distinguish between .vaddr and .vaddr_iomem, even in places where I wouldn't have to. This macro breaks the assumption.
That's one reason I think if we have this macro, it should be in the dma_buf_map.h header (or whatever we rename these APIs to). It's the only place where we can safely add code that relies on the implementation of the "private" fields in struct dma_buf_map.
Lucas De Marchi
Best regards Thomas
/** * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory * @map: The dma-buf mapping structure
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Hi
Am 28.01.22 um 09:15 schrieb Thomas Zimmermann: ...
While with the construct below
other_map; ... other_map = INITIALIZER()
I can rely on the compiler complaining about uninitialized var. And in most of the cases I can just have this single line in the beggining of the function when the offset is constant:
struct dma_buf_map other_map = INITIALIZER(bla_map, offsetof(..));
This is useful when you have several small functions in charge of updating/reading inner struct members.
You won't need an extra variable or the initializer macro if you add an offset parameter to dma_buf_memcpy_{from,to}. Simple pass offsetof(..) to that parameter and it will do the right thing.
It avoids the problems of the current macro and is even more flexible. On top of that, you can build whatever convenience macros you need for i915.
And maybe put all changes to the dma_buf_map interface into a single patch. It makes it easier to review and discuss.
Best regards Thomas
Best regards Thomas
I've also been very careful to distinguish between .vaddr and .vaddr_iomem, even in places where I wouldn't have to. This macro breaks the assumption.
That's one reason I think if we have this macro, it should be in the dma_buf_map.h header (or whatever we rename these APIs to). It's the only place where we can safely add code that relies on the implementation of the "private" fields in struct dma_buf_map.
Lucas De Marchi
Best regards Thomas
/** * dma_buf_map_set_vaddr - Sets a dma-buf mapping structure to an address in system memory * @map: The dma-buf mapping structure
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Add a variant of shmem_read() that takes a dma_buf_map pointer rather than a plain pointer as argument. It's mostly a copy of __shmem_rw(), but adapting the API and removing the write support since there's currently only a need to use dma_buf_map as a destination.
Reworking __shmem_rw() to share the implementation was tempting, but finding a good balance between reuse and clarity pushed towards a little code duplication. Since the function is small, just add the similar function with a copy/paste/adapt approach.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: David Airlie airlied@linux.ie Cc: Daniel Vetter daniel@ffwll.ch Cc: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/shmem_utils.c | 32 +++++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/shmem_utils.h | 3 +++ 2 files changed, 35 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c index 0683b27a3890..d7968e68ccfb 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.c +++ b/drivers/gpu/drm/i915/gt/shmem_utils.c @@ -3,6 +3,7 @@ * Copyright © 2020 Intel Corporation */
+#include <linux/dma-buf-map.h> #include <linux/mm.h> #include <linux/pagemap.h> #include <linux/shmem_fs.h> @@ -123,6 +124,37 @@ static int __shmem_rw(struct file *file, loff_t off, return 0; }
+int shmem_read_to_dma_buf_map(struct file *file, loff_t off, + struct dma_buf_map *map, size_t len) +{ + struct dma_buf_map map_iter = *map; + unsigned long pfn; + + for (pfn = off >> PAGE_SHIFT; len; pfn++) { + unsigned int this = + min_t(size_t, PAGE_SIZE - offset_in_page(off), len); + struct page *page; + void *vaddr; + + page = shmem_read_mapping_page_gfp(file->f_mapping, pfn, + GFP_KERNEL); + if (IS_ERR(page)) + return PTR_ERR(page); + + vaddr = kmap(page); + dma_buf_map_memcpy_to(&map_iter, vaddr + offset_in_page(off), this); + mark_page_accessed(page); + kunmap(page); + put_page(page); + + len -= this; + dma_buf_map_incr(&map_iter, this); + off = 0; + } + + return 0; +} + int shmem_read(struct file *file, loff_t off, void *dst, size_t len) { return __shmem_rw(file, off, dst, len, false); diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.h b/drivers/gpu/drm/i915/gt/shmem_utils.h index c1669170c351..a3d4ce966f74 100644 --- a/drivers/gpu/drm/i915/gt/shmem_utils.h +++ b/drivers/gpu/drm/i915/gt/shmem_utils.h @@ -8,6 +8,7 @@
#include <linux/types.h>
+struct dma_buf_map; struct drm_i915_gem_object; struct file;
@@ -17,6 +18,8 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj); void *shmem_pin_map(struct file *file); void shmem_unpin_map(struct file *file, void *ptr);
+int shmem_read_to_dma_buf_map(struct file *file, loff_t off, + struct dma_buf_map *map, size_t len); int shmem_read(struct file *file, loff_t off, void *dst, size_t len); int shmem_write(struct file *file, loff_t off, void *src, size_t len);
Convert intel_guc_ads_create() and the initialization to use dma_buf_map rather than a plain pointer, and save it in the guc struct. This will help with additional updates to the ads_blob after creation/initialization by abstracting IO vs system memory.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 4 +++- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 ++++++ 2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 697d9d66acef..e2e0df1c3d91 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -6,8 +6,9 @@ #ifndef _INTEL_GUC_H_ #define _INTEL_GUC_H_
-#include <linux/xarray.h> #include <linux/delay.h> +#include <linux/dma-buf.h> +#include <linux/xarray.h>
#include "intel_uncore.h" #include "intel_guc_fw.h" @@ -148,6 +149,7 @@ struct intel_guc { struct i915_vma *ads_vma; /** @ads_blob: contents of the GuC ADS */ struct __guc_ads_blob *ads_blob; + struct dma_buf_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 668bf4ac9b0c..c012858376f0 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -623,6 +623,11 @@ int intel_guc_ads_create(struct intel_guc *guc) if (ret) return ret;
+ if (i915_gem_object_is_lmem(guc->ads_vma->obj)) + dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)guc->ads_blob); + else + dma_buf_map_set_vaddr(&guc->ads_map, guc->ads_blob); + __guc_ads_init(guc);
return 0; @@ -644,6 +649,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc) { i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); guc->ads_blob = NULL; + dma_buf_map_clear(&guc->ads_map); }
static void guc_ads_private_data_reset(struct intel_guc *guc)
Add helpers on top of dma_buf_map_read_field() / dma_buf_map_write_field() functions so they always use the right arguments and make code easier to read.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index c012858376f0..01d2c1ead680 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -59,6 +59,14 @@ struct __guc_ads_blob { struct guc_mmio_reg regset[0]; } __packed;
+#define ads_blob_read(guc_, field_) \ + dma_buf_map_read_field(&(guc_)->ads_map, struct __guc_ads_blob, \ + field_) + +#define ads_blob_write(guc_, field_, val_) \ + dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\ + field_, val_) + static u32 guc_ads_regset_size(struct intel_guc *guc) { GEM_BUG_ON(!guc->ads_regset_size);
Now the map is saved during creation, so use it to initialize the golden context, reading from shmem and writing to either system or IO memory.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 +++++++++++----------- 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 01d2c1ead680..bcf52ac4fe35 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -473,18 +473,17 @@ static struct intel_engine_cs *find_engine_state(struct intel_gt *gt, u8 engine_
static void guc_init_golden_context(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; struct intel_engine_cs *engine; struct intel_gt *gt = guc_to_gt(guc); + struct dma_buf_map golden_context_map; u32 addr_ggtt, offset; u32 total_size = 0, alloc_size, real_size; u8 engine_class, guc_class; - u8 *ptr;
if (!intel_uc_uses_guc_submission(>->uc)) return;
- GEM_BUG_ON(!blob); + GEM_BUG_ON(dma_buf_map_is_null(&guc->ads_map));
/* * Go back and fill in the golden context data now that it is @@ -492,15 +491,15 @@ static void guc_init_golden_context(struct intel_guc *guc) */ offset = guc_ads_golden_ctxt_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - ptr = ((u8 *)blob) + offset; + + golden_context_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) { if (engine_class == OTHER_CLASS) continue;
guc_class = engine_class_to_guc_class(engine_class); - - if (!blob->system_info.engine_enabled_masks[guc_class]) + if (!ads_blob_read(guc, system_info.engine_enabled_masks[guc_class])) continue;
real_size = intel_engine_context_size(gt, engine_class); @@ -511,18 +510,20 @@ static void guc_init_golden_context(struct intel_guc *guc) if (!engine) { drm_err(>->i915->drm, "No engine state recorded for class %d!\n", engine_class); - blob->ads.eng_state_size[guc_class] = 0; - blob->ads.golden_context_lrca[guc_class] = 0; + ads_blob_write(guc, ads.eng_state_size[guc_class], 0); + ads_blob_write(guc, ads.golden_context_lrca[guc_class], 0); continue; }
- GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != + GEM_BUG_ON(ads_blob_read(guc, ads.eng_state_size[guc_class]) != real_size - LRC_SKIP_SIZE); - GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt); + GEM_BUG_ON(ads_blob_read(guc, ads.golden_context_lrca[guc_class]) != addr_ggtt); + addr_ggtt += alloc_size;
- shmem_read(engine->default_state, 0, ptr, real_size); - ptr += alloc_size; + shmem_read_to_dma_buf_map(engine->default_state, 0, + &golden_context_map, real_size); + dma_buf_map_incr(&golden_context_map, alloc_size); }
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
Use dma_buf_map to write the policies update so access to IO and system memory is abstracted away.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 41 ++++++++++++---------- 1 file changed, 23 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index bcf52ac4fe35..2ffe5836f95e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -130,33 +130,37 @@ static u32 guc_ads_blob_size(struct intel_guc *guc) guc_ads_private_data_size(guc); }
-static void guc_policies_init(struct intel_guc *guc, struct guc_policies *policies) +static void guc_policies_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; + u32 global_flags = 0;
- policies->dpc_promote_time = GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US; - policies->max_num_work_items = GLOBAL_POLICY_MAX_NUM_WI; + ads_blob_write(guc, policies.dpc_promote_time, + GLOBAL_POLICY_DEFAULT_DPC_PROMOTE_TIME_US); + ads_blob_write(guc, policies.max_num_work_items, + GLOBAL_POLICY_MAX_NUM_WI);
- policies->global_flags = 0; if (i915->params.reset < 2) - policies->global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET; + global_flags |= GLOBAL_POLICY_DISABLE_ENGINE_RESET;
- policies->is_valid = 1; + ads_blob_write(guc, policies.global_flags, global_flags); + ads_blob_write(guc, policies.is_valid, 1); }
void intel_guc_ads_print_policy_info(struct intel_guc *guc, struct drm_printer *dp) { - struct __guc_ads_blob *blob = guc->ads_blob; - - if (unlikely(!blob)) + if (unlikely(dma_buf_map_is_null(&guc->ads_map))) return;
drm_printf(dp, "Global scheduling policies:\n"); - drm_printf(dp, " DPC promote time = %u\n", blob->policies.dpc_promote_time); - drm_printf(dp, " Max num work items = %u\n", blob->policies.max_num_work_items); - drm_printf(dp, " Flags = %u\n", blob->policies.global_flags); + drm_printf(dp, " DPC promote time = %u\n", + ads_blob_read(guc, policies.dpc_promote_time)); + drm_printf(dp, " Max num work items = %u\n", + ads_blob_read(guc, policies.max_num_work_items)); + drm_printf(dp, " Flags = %u\n", + ads_blob_read(guc, policies.global_flags)); }
static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset) @@ -171,23 +175,24 @@ static int guc_action_policies_update(struct intel_guc *guc, u32 policy_offset)
int intel_guc_global_policies_update(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; struct intel_gt *gt = guc_to_gt(guc); + u32 scheduler_policies; intel_wakeref_t wakeref; int ret;
- if (!blob) + if (dma_buf_map_is_null(&guc->ads_map)) return -EOPNOTSUPP;
- GEM_BUG_ON(!blob->ads.scheduler_policies); + scheduler_policies = ads_blob_read(guc, ads.scheduler_policies); + GEM_BUG_ON(!scheduler_policies);
- guc_policies_init(guc, &blob->policies); + guc_policies_init(guc);
if (!intel_guc_is_ready(guc)) return 0;
with_intel_runtime_pm(>->i915->runtime_pm, wakeref) - ret = guc_action_policies_update(guc, blob->ads.scheduler_policies); + ret = guc_action_policies_update(guc, scheduler_policies);
return ret; } @@ -557,7 +562,7 @@ static void __guc_ads_init(struct intel_guc *guc) u32 base;
/* GuC scheduling policies */ - guc_policies_init(guc, &blob->policies); + guc_policies_init(guc);
/* System info */ fill_engine_enable_masks(gt, &blob->system_info);
Use dma_buf_map to read fields from the ads_blob so access to IO and system memory is abstracted away.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 14 ++++++-------- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h | 3 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 11 +++++++---- 3 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 2ffe5836f95e..fe1e71adfca1 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -698,18 +698,16 @@ void intel_guc_ads_reset(struct intel_guc *guc)
u32 intel_guc_engine_usage_offset(struct intel_guc *guc) { - struct __guc_ads_blob *blob = guc->ads_blob; - u32 base = intel_guc_ggtt_offset(guc, guc->ads_vma); - u32 offset = base + ptr_offset(blob, engine_usage); - - return offset; + return intel_guc_ggtt_offset(guc, guc->ads_vma) + + offsetof(struct __guc_ads_blob, engine_usage); }
-struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine) +struct dma_buf_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine) { struct intel_guc *guc = &engine->gt->uc.guc; - struct __guc_ads_blob *blob = guc->ads_blob; u8 guc_class = engine_class_to_guc_class(engine->class); + size_t offset = offsetof(struct __guc_ads_blob, + engine_usage.engines[guc_class][ilog2(engine->logical_mask)]);
- return &blob->engine_usage.engines[guc_class][ilog2(engine->logical_mask)]; + return DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset); } diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h index e74c110facff..27f5b1f9ddac 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h @@ -7,6 +7,7 @@ #define _INTEL_GUC_ADS_H_
#include <linux/types.h> +#include <linux/dma-buf-map.h>
struct intel_guc; struct drm_printer; @@ -18,7 +19,7 @@ void intel_guc_ads_init_late(struct intel_guc *guc); void intel_guc_ads_reset(struct intel_guc *guc); void intel_guc_ads_print_policy_info(struct intel_guc *guc, struct drm_printer *p); -struct guc_engine_usage_record *intel_guc_engine_usage(struct intel_engine_cs *engine); +struct dma_buf_map intel_guc_engine_usage_record_map(struct intel_engine_cs *engine); u32 intel_guc_engine_usage_offset(struct intel_guc *guc);
#endif diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index db9615dcb0ec..57bfb4ad0ab8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -1125,14 +1125,17 @@ __extend_last_switch(struct intel_guc *guc, u64 *prev_start, u32 new_start) *prev_start = ((u64)gt_stamp_hi << 32) | new_start; }
+#define record_read(map_, field_) \ + dma_buf_map_read_field(map_, struct guc_engine_usage_record, field_) + static void guc_update_engine_gt_clks(struct intel_engine_cs *engine) { - struct guc_engine_usage_record *rec = intel_guc_engine_usage(engine); + struct dma_buf_map rec_map = intel_guc_engine_usage_record_map(engine); struct intel_engine_guc_stats *stats = &engine->stats.guc; struct intel_guc *guc = &engine->gt->uc.guc; - u32 last_switch = rec->last_switch_in_stamp; - u32 ctx_id = rec->current_context_index; - u32 total = rec->total_runtime; + u32 last_switch = record_read(&rec_map, last_switch_in_stamp); + u32 ctx_id = record_read(&rec_map, current_context_index); + u32 total = record_read(&rec_map, total_runtime);
lockdep_assert_held(&guc->timestamp.lock);
Just like memcpy_toio(), there is also a need to write a direct value to a memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io()
Cc: Matt Roper matthew.d.roper@intel.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- include/linux/dma-buf-map.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 3514a859f628..c9fb04264cd0 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *sr memcpy(dst->vaddr, src, len); }
+/** + * dma_buf_map_memset - Memset into dma-buf mapping + * @dst: The dma-buf mapping structure + * @value: The value to set + * @len: The number of bytes to set in dst + * + * Set value in dma-buf mapping. Depending on the buffer's location, the helper + * picks the correct method of accessing the memory. + */ +static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, size_t len) +{ + if (dst->is_iomem) + memset_io(dst->vaddr_iomem, value, len); + else + memset(dst->vaddr, value, len); +} + /** * dma_buf_map_incr - Increments the address stored in a dma-buf mapping * @map: The dma-buf mapping structure
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
Just like memcpy_toio(), there is also need to write a direct value to a memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io()
Cc: Matt Roper matthew.d.roper@intel.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 3514a859f628..c9fb04264cd0 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *sr memcpy(dst->vaddr, src, len); }
+/**
+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @value: The value to set
+ * @len: The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, size_t len)
+{
+	if (dst->is_iomem)
+		memset_io(dst->vaddr_iomem, value, len);
+	else
+		memset(dst->vaddr, value, len);
+}
Yeah, that's certainly a valid use case. But maybe directly add a dma_buf_map_memset_with_offset() variant as well when that helps to avoid patch #2.
Regards, Christian.
 /**
  * dma_buf_map_incr - Increments the address stored in a dma-buf mapping
  * @map: The dma-buf mapping structure
Hi
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
Just like memcpy_toio(), there is also need to write a direct value to a memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io()
Cc: Matt Roper matthew.d.roper@intel.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 3514a859f628..c9fb04264cd0 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *sr memcpy(dst->vaddr, src, len); }
+/**
+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @value: The value to set
+ * @len: The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, size_t len)
+{
+	if (dst->is_iomem)
+		memset_io(dst->vaddr_iomem, value, len);
+	else
+		memset(dst->vaddr, value, len);
+}
Maybe add an offset parameter here.
Best regards Thomas
 /**
  * dma_buf_map_incr - Increments the address stored in a dma-buf mapping
  * @map: The dma-buf mapping structure
On Thu, Jan 27, 2022 at 03:54:21PM +0100, Thomas Zimmermann wrote:
Hi
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
Just like memcpy_toio(), there is also need to write a direct value to a memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io()
Cc: Matt Roper matthew.d.roper@intel.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 3514a859f628..c9fb04264cd0 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *sr memcpy(dst->vaddr, src, len); }
+/**
+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @value: The value to set
+ * @len: The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, size_t len)
+{
+	if (dst->is_iomem)
+		memset_io(dst->vaddr_iomem, value, len);
+	else
+		memset(dst->vaddr, value, len);
+}
Maybe add an offset parameter here.
yep, on v2 I will have 2 APIs, one with and one without offset.
thanks Lucas De Marchi
Best regards Thomas
 /**
  * dma_buf_map_incr - Increments the address stored in a dma-buf mapping
  * @map: The dma-buf mapping structure
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Am 27.01.22 um 16:38 schrieb Lucas De Marchi:
On Thu, Jan 27, 2022 at 03:54:21PM +0100, Thomas Zimmermann wrote:
Hi
Am 26.01.22 um 21:36 schrieb Lucas De Marchi:
Just like memcpy_toio(), there is also need to write a direct value to a memory block. Add dma_buf_map_memset() to abstract memset() vs memset_io()
Cc: Matt Roper matthew.d.roper@intel.com Cc: Sumit Semwal sumit.semwal@linaro.org Cc: Christian König christian.koenig@amd.com Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
include/linux/dma-buf-map.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/include/linux/dma-buf-map.h b/include/linux/dma-buf-map.h index 3514a859f628..c9fb04264cd0 100644 --- a/include/linux/dma-buf-map.h +++ b/include/linux/dma-buf-map.h @@ -317,6 +317,23 @@ static inline void dma_buf_map_memcpy_to(struct dma_buf_map *dst, const void *sr memcpy(dst->vaddr, src, len); }
+/**
+ * dma_buf_map_memset - Memset into dma-buf mapping
+ * @dst: The dma-buf mapping structure
+ * @value: The value to set
+ * @len: The number of bytes to set in dst
+ *
+ * Set value in dma-buf mapping. Depending on the buffer's location, the helper
+ * picks the correct method of accessing the memory.
+ */
+static inline void dma_buf_map_memset(struct dma_buf_map *dst, int value, size_t len)
+{
+	if (dst->is_iomem)
+		memset_io(dst->vaddr_iomem, value, len);
+	else
+		memset(dst->vaddr, value, len);
+}
Maybe add an offset parameter here.
yep, on v2 I will have 2 APIs, one with and one without offset.
Please, no. Just add the parameter here and pass 0 if you don't need it.
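i.e. something along the lines of the sketch below (an illustration of the suggestion, not the code that was posted):

static inline void dma_buf_map_memset(struct dma_buf_map *dst, size_t offset,
				      int value, size_t len)
{
	if (dst->is_iomem)
		memset_io(dst->vaddr_iomem + offset, value, len);
	else
		memset(dst->vaddr + offset, value, len);
}

Callers that don't need the offset simply pass 0.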
Best regards Thomas
thanks Lucas De Marchi
Best regards Thomas
/** * dma_buf_map_incr - Increments the address stored in a dma-buf mapping * @map: The dma-buf mapping structure
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev
Use dma_buf_map_memset() to zero the private data, as the ADS may be in either system or IO memory.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index fe1e71adfca1..15990c229b54 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -668,14 +668,15 @@ void intel_guc_ads_destroy(struct intel_guc *guc)
static void guc_ads_private_data_reset(struct intel_guc *guc) { + struct dma_buf_map map = + DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, guc_ads_private_data_offset(guc)); u32 size;
size = guc_ads_private_data_size(guc); if (!size) return;
- memset((void *)guc->ads_blob + guc_ads_private_data_offset(guc), 0, - size); + dma_buf_map_memset(&map, 0, size); }
/**
Use the saved ads_map to prepare the golden context. One difference from the init context is that this function can be called before there is a gem object (and thus guc->ads_map) in order to calculate the size of the golden context that should be allocated for that object.
So in this case the function needs to be prepared for not having the system_info with enabled engines filled out. To accomplish that, an info_map is prepared on the side to point either to the gem object or to a local variable on the stack. This allows fill_engine_enable_masks() to always operate on a dma_buf_map argument.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 52 +++++++++++++--------- 1 file changed, 32 insertions(+), 20 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 15990c229b54..dd9ec47eed16 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -67,6 +67,12 @@ struct __guc_ads_blob { dma_buf_map_write_field(&(guc_)->ads_map, struct __guc_ads_blob,\ field_, val_)
+#define info_map_write(map_, field_, val_) \ + dma_buf_map_write_field(map_, struct guc_gt_system_info, field_, val_) + +#define info_map_read(map_, field_) \ + dma_buf_map_read_field(map_, struct guc_gt_system_info, field_) + static u32 guc_ads_regset_size(struct intel_guc *guc) { GEM_BUG_ON(!guc->ads_regset_size); @@ -378,24 +384,24 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc, }
static void fill_engine_enable_masks(struct intel_gt *gt, - struct guc_gt_system_info *info) + struct dma_buf_map *info_map) { - info->engine_enabled_masks[GUC_RENDER_CLASS] = 1; - info->engine_enabled_masks[GUC_BLITTER_CLASS] = 1; - info->engine_enabled_masks[GUC_VIDEO_CLASS] = VDBOX_MASK(gt); - info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt); + info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], 1); + info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1); + info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt)); + info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], VEBOX_MASK(gt)); }
#define LR_HW_CONTEXT_SIZE (80 * sizeof(u32)) #define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE) -static int guc_prep_golden_context(struct intel_guc *guc, - struct __guc_ads_blob *blob) +static int guc_prep_golden_context(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); u32 addr_ggtt, offset; u32 total_size = 0, alloc_size, real_size; u8 engine_class, guc_class; - struct guc_gt_system_info *info, local_info; + struct guc_gt_system_info local_info; + struct dma_buf_map info_map;
/* * Reserve the memory for the golden contexts and point GuC at it but @@ -409,14 +415,15 @@ static int guc_prep_golden_context(struct intel_guc *guc, * GuC will also validate that the LRC base + size fall within the * allowed GGTT range. */ - if (blob) { + if (!dma_buf_map_is_null(&guc->ads_map)) { offset = guc_ads_golden_ctxt_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - info = &blob->system_info; + info_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, + offsetof(struct __guc_ads_blob, system_info)); } else { memset(&local_info, 0, sizeof(local_info)); - info = &local_info; - fill_engine_enable_masks(gt, info); + dma_buf_map_set_vaddr(&info_map, &local_info); + fill_engine_enable_masks(gt, &info_map); }
for (engine_class = 0; engine_class <= MAX_ENGINE_CLASS; ++engine_class) { @@ -425,14 +432,14 @@ static int guc_prep_golden_context(struct intel_guc *guc,
guc_class = engine_class_to_guc_class(engine_class);
- if (!info->engine_enabled_masks[guc_class]) + if (!info_map_read(&info_map, engine_enabled_masks[guc_class])) continue;
real_size = intel_engine_context_size(gt, engine_class); alloc_size = PAGE_ALIGN(real_size); total_size += alloc_size;
- if (!blob) + if (dma_buf_map_is_null(&guc->ads_map)) continue;
/* @@ -446,12 +453,15 @@ static int guc_prep_golden_context(struct intel_guc *guc, * what comes before it in the context image (which is identical * on all engines). */ - blob->ads.eng_state_size[guc_class] = real_size - LRC_SKIP_SIZE; - blob->ads.golden_context_lrca[guc_class] = addr_ggtt; + ads_blob_write(guc, ads.eng_state_size[guc_class], + real_size - LRC_SKIP_SIZE); + ads_blob_write(guc, ads.golden_context_lrca[guc_class], + addr_ggtt); + addr_ggtt += alloc_size; }
- if (!blob) + if (dma_buf_map_is_null(&guc->ads_map)) return total_size;
GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); @@ -559,13 +569,15 @@ static void __guc_ads_init(struct intel_guc *guc) struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; struct __guc_ads_blob *blob = guc->ads_blob; + struct dma_buf_map info_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, + offsetof(struct __guc_ads_blob, system_info)); u32 base;
/* GuC scheduling policies */ guc_policies_init(guc);
/* System info */ - fill_engine_enable_masks(gt, &blob->system_info); + fill_engine_enable_masks(gt, &info_map);
blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] = hweight8(gt->info.sseu.slice_mask); @@ -581,7 +593,7 @@ static void __guc_ads_init(struct intel_guc *guc) }
/* Golden contexts for re-initialising after a watchdog reset */ - guc_prep_golden_context(guc, blob); + guc_prep_golden_context(guc);
guc_mapping_table_init(guc_to_gt(guc), &blob->system_info);
@@ -624,7 +636,7 @@ int intel_guc_ads_create(struct intel_guc *guc) guc->ads_regset_size = ret;
/* Likewise the golden contexts: */ - ret = guc_prep_golden_context(guc, NULL); + ret = guc_prep_golden_context(guc); if (ret < 0) return ret; guc->ads_golden_ctxt_size = ret;
In the other places in this function, guc->ads_map is being protected from access when it's not yet set. However, the last check is actually about guc->ads_golden_ctxt_size having been set before. These checks should always match, as the size is initialized on the first call to guc_prep_golden_context(), but it's clearer if we have a single return and check for guc->ads_golden_ctxt_size.
This is just a readability improvement, no change in behavior.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index dd9ec47eed16..8e4768289792 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -461,10 +461,10 @@ static int guc_prep_golden_context(struct intel_guc *guc) addr_ggtt += alloc_size; }
- if (dma_buf_map_is_null(&guc->ads_map)) - return total_size; + /* Make sure current size matches what we calculated previously */ + if (guc->ads_golden_ctxt_size) + GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size);
- GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); return total_size; }
Use dma_buf_map to write the fields system_info.mapping_table[][]. Since we already have the info_map around where needed, just use it instead of going through guc->ads_map.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 8e4768289792..dca7c3db9cdd 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -204,7 +204,7 @@ int intel_guc_global_policies_update(struct intel_guc *guc) }
static void guc_mapping_table_init(struct intel_gt *gt, - struct guc_gt_system_info *system_info) + struct dma_buf_map *info_map) { unsigned int i, j; struct intel_engine_cs *engine; @@ -213,14 +213,14 @@ static void guc_mapping_table_init(struct intel_gt *gt, /* Table must be set to invalid values for entries not used */ for (i = 0; i < GUC_MAX_ENGINE_CLASSES; ++i) for (j = 0; j < GUC_MAX_INSTANCES_PER_CLASS; ++j) - system_info->mapping_table[i][j] = - GUC_MAX_INSTANCES_PER_CLASS; + info_map_write(info_map, mapping_table[i][j], + GUC_MAX_INSTANCES_PER_CLASS);
for_each_engine(engine, gt, id) { u8 guc_class = engine_class_to_guc_class(engine->class);
- system_info->mapping_table[guc_class][ilog2(engine->logical_mask)] = - engine->instance; + info_map_write(info_map, mapping_table[guc_class][ilog2(engine->logical_mask)], + engine->instance); } }
@@ -595,7 +595,7 @@ static void __guc_ads_init(struct intel_guc *guc) /* Golden contexts for re-initialising after a watchdog reset */ guc_prep_golden_context(guc);
- guc_mapping_table_init(guc_to_gt(guc), &blob->system_info); + guc_mapping_table_init(guc_to_gt(guc), &info_map);
base = intel_guc_ggtt_offset(guc, guc->ads_vma);
Use dma_buf_map to write the fields ads.capture_*.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index dca7c3db9cdd..cad1e325656e 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -544,7 +544,7 @@ static void guc_init_golden_context(struct intel_guc *guc) GEM_BUG_ON(guc->ads_golden_ctxt_size != total_size); }
-static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob *blob) +static void guc_capture_list_init(struct intel_guc *guc) { int i, j; u32 addr_ggtt, offset; @@ -556,11 +556,11 @@ static void guc_capture_list_init(struct intel_guc *guc, struct __guc_ads_blob *
for (i = 0; i < GUC_CAPTURE_LIST_INDEX_MAX; i++) { for (j = 0; j < GUC_MAX_ENGINE_CLASSES; j++) { - blob->ads.capture_instance[i][j] = addr_ggtt; - blob->ads.capture_class[i][j] = addr_ggtt; + ads_blob_write(guc, ads.capture_instance[i][j], addr_ggtt); + ads_blob_write(guc, ads.capture_class[i][j], addr_ggtt); }
- blob->ads.capture_global[i] = addr_ggtt; + ads_blob_write(guc, ads.capture_global[i], addr_ggtt); } }
@@ -600,7 +600,7 @@ static void __guc_ads_init(struct intel_guc *guc) base = intel_guc_ggtt_offset(guc, guc->ads_vma);
/* Capture list for hang debug */ - guc_capture_list_init(guc, blob); + guc_capture_list_init(guc);
/* ADS */ blob->ads.scheduler_policies = base + ptr_offset(blob, policies);
Currently guc_mmio_reg_add() relies on having enough memory available in the array to add a new slot. It uses `GEM_BUG_ON(count >= regset->size);` to protect against going above the threshold.
In order to allow guc_mmio_reg_add() to handle the memory allocation by itself, it must return an error in case of failures. Adjust return code so this error can be propagated to the callers of guc_mmio_reg_add() and guc_mmio_regset_init().
No intended change in behavior.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 31 +++++++++++++--------- 1 file changed, 18 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index cad1e325656e..73ca34de44f7 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -244,8 +244,8 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; }
-static void guc_mmio_reg_add(struct temp_regset *regset, - u32 offset, u32 flags) +static long __must_check guc_mmio_reg_add(struct temp_regset *regset, + u32 offset, u32 flags) { u32 count = regset->used; struct guc_mmio_reg reg = { @@ -264,7 +264,7 @@ static void guc_mmio_reg_add(struct temp_regset *regset, */ if (bsearch(®, regset->registers, count, sizeof(reg), guc_mmio_reg_cmp)) - return; + return 0;
slot = ®set->registers[count]; regset->used++; @@ -277,6 +277,8 @@ static void guc_mmio_reg_add(struct temp_regset *regset,
swap(slot[1], slot[0]); } + + return 0; }
#define GUC_MMIO_REG_ADD(regset, reg, masked) \ @@ -284,32 +286,35 @@ static void guc_mmio_reg_add(struct temp_regset *regset, i915_mmio_reg_offset((reg)), \ (masked) ? GUC_REGSET_MASKED : 0)
-static void guc_mmio_regset_init(struct temp_regset *regset, - struct intel_engine_cs *engine) +static int guc_mmio_regset_init(struct temp_regset *regset, + struct intel_engine_cs *engine) { const u32 base = engine->mmio_base; struct i915_wa_list *wal = &engine->wa_list; struct i915_wa *wa; unsigned int i; + int ret = 0;
regset->used = 0;
- GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); - GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false); - GUC_MMIO_REG_ADD(regset, RING_IMR(base), false); + ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); + ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false); + ret |= GUC_MMIO_REG_ADD(regset, RING_IMR(base), false);
for (i = 0, wa = wal->list; i < wal->count; i++, wa++) - GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg); + ret |= GUC_MMIO_REG_ADD(regset, wa->reg, wa->masked_reg);
/* Be extra paranoid and include all whitelist registers. */ for (i = 0; i < RING_MAX_NONPRIV_SLOTS; i++) - GUC_MMIO_REG_ADD(regset, - RING_FORCE_TO_NONPRIV(base, i), - false); + ret |= GUC_MMIO_REG_ADD(regset, + RING_FORCE_TO_NONPRIV(base, i), + false);
/* add in local MOCS registers */ for (i = 0; i < GEN9_LNCFCMOCS_REG_COUNT; i++) - GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false); + ret |= GUC_MMIO_REG_ADD(regset, GEN9_LNCFCMOCS(i), false); + + return ret ? -1 : 0; }
static int guc_mmio_reg_state_query(struct intel_guc *guc)
The ADS initialization was using 2 passes to calculate the regset sent to GuC to initialize each engine: the first pass just to obtain the final object size and the second to set each register in place in the final gem object.
However, in order to maintain an ordered set of registers to pass to GuC, each register needs to be added and moved within the final array. The second phase may actually happen in IO memory rather than system memory, and accessing IO memory by simply dereferencing the pointer doesn't work on all architectures. Other places in the ADS initialization were converted to use the dma_buf_map API, but here there may be a lot more accesses to IO memory. So, instead of following that same approach, convert the regset initialization to calculate the final array in the first pass; in the second pass that array is just copied to its final location, updating the pointers for each engine written to the ADS blob.
One important thing is that struct temp_regset now has different semantics: `registers` continues to track the registers of a single engine, while the other fields are updated together, according to the newly added `storage`, which tracks the memory allocated for all the registers. So rename some of these fields and add __mmio_reg_add(): this function (possibly) allocates memory and operates on the storage pointer, while guc_mmio_reg_add() continues to manage the registers pointer.
On a Tiger Lake system using enable_guc=3, the following log message is now seen:
[ 187.334310] i915 0000:00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset
This change has also been tested on an ARM64 host with DG2 and other discrete graphics cards.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 7 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 117 +++++++++++++-------- 2 files changed, 79 insertions(+), 45 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index e2e0df1c3d91..4c852eee3ad8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -152,6 +152,13 @@ struct intel_guc { struct dma_buf_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; + /** + * @ads_regset_count: number of save/restore registers in the ADS for + * each engine + */ + u32 ads_regset_count[I915_NUM_ENGINES]; + /** @ads_regset: save/restore regsets in the ADS */ + struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; /** @ads_engine_usage_size: size of engine usage in the ADS */ diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 73ca34de44f7..390101ee3661 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -226,14 +226,13 @@ static void guc_mapping_table_init(struct intel_gt *gt,
/* * The save/restore register list must be pre-calculated to a temporary - * buffer of driver defined size before it can be generated in place - * inside the ADS. + * buffer before it can be copied inside the ADS. */ -#define MAX_MMIO_REGS 128 /* Arbitrary size, increase as needed */ struct temp_regset { struct guc_mmio_reg *registers; - u32 used; - u32 size; + struct guc_mmio_reg *storage; + u32 storage_used; + u32 storage_max; };
static int guc_mmio_reg_cmp(const void *a, const void *b) @@ -244,18 +243,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; }
+static struct guc_mmio_reg * __must_check +__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) +{ + u32 pos = regset->storage_used; + struct guc_mmio_reg *slot; + + if (pos >= regset->storage_max) { + size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE); + struct guc_mmio_reg *r = krealloc(regset->storage, + size, GFP_KERNEL); + if (!r) { + WARN_ONCE(1, "Incomplete regset list: can't add register (%d)\n", + -ENOMEM); + return ERR_PTR(-ENOMEM); + } + + regset->registers = r + (regset->registers - regset->storage); + regset->storage = r; + regset->storage_max = size / sizeof(*slot); + } + + slot = ®set->storage[pos]; + regset->storage_used++; + *slot = *reg; + + return slot; +} + static long __must_check guc_mmio_reg_add(struct temp_regset *regset, u32 offset, u32 flags) { - u32 count = regset->used; + u32 count = regset->storage_used - (regset->registers - regset->storage); struct guc_mmio_reg reg = { .offset = offset, .flags = flags, }; struct guc_mmio_reg *slot;
- GEM_BUG_ON(count >= regset->size); - /* * The mmio list is built using separate lists within the driver. * It's possible that at some point we may attempt to add the same @@ -266,9 +291,9 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset, sizeof(reg), guc_mmio_reg_cmp)) return 0;
- slot = ®set->registers[count]; - regset->used++; - *slot = reg; + slot = __mmio_reg_add(regset, ®); + if (IS_ERR(slot)) + return PTR_ERR(slot);
while (slot-- > regset->registers) { GEM_BUG_ON(slot[0].offset == slot[1].offset); @@ -295,7 +320,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset, unsigned int i; int ret = 0;
- regset->used = 0; + /* + * Each engine's registers point to a new start relative to + * storage + */ + regset->registers = regset->storage + regset->storage_used;
ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false); @@ -317,32 +346,28 @@ static int guc_mmio_regset_init(struct temp_regset *regset, return ret ? -1 : 0; }
-static int guc_mmio_reg_state_query(struct intel_guc *guc) +static long guc_mmio_reg_state_create(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; enum intel_engine_id id; - struct temp_regset temp_set; - u32 total; + struct temp_regset temp_set = {}; + long total = 0;
- /* - * Need to actually build the list in order to filter out - * duplicates and other such data dependent constructions. - */ - temp_set.size = MAX_MMIO_REGS; - temp_set.registers = kmalloc_array(temp_set.size, - sizeof(*temp_set.registers), - GFP_KERNEL); - if (!temp_set.registers) - return -ENOMEM; - - total = 0; for_each_engine(engine, gt, id) { - guc_mmio_regset_init(&temp_set, engine); - total += temp_set.used; + u32 used = temp_set.storage_used; + + if (guc_mmio_regset_init(&temp_set, engine) < 0) + return -1; + + guc->ads_regset_count[id] = temp_set.storage_used - used; + total += guc->ads_regset_count[id]; }
- kfree(temp_set.registers); + guc->ads_regset = temp_set.storage; + + drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n", + (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
return total * sizeof(struct guc_mmio_reg); } @@ -352,40 +377,38 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc, { struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; + struct guc_mmio_reg *ads_registers; enum intel_engine_id id; - struct temp_regset temp_set; - struct guc_mmio_reg_set *ads_reg_set; u32 addr_ggtt, offset; - u8 guc_class;
offset = guc_ads_regset_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - temp_set.registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset); - temp_set.size = guc->ads_regset_size / sizeof(temp_set.registers[0]); + ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset); + + memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size);
for_each_engine(engine, gt, id) { + u32 count = guc->ads_regset_count[id]; + struct guc_mmio_reg_set *ads_reg_set; + u8 guc_class; + /* Class index is checked in class converter */ GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
guc_class = engine_class_to_guc_class(engine->class); ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance];
- guc_mmio_regset_init(&temp_set, engine); - if (!temp_set.used) { + if (!count) { ads_reg_set->address = 0; ads_reg_set->count = 0; continue; }
ads_reg_set->address = addr_ggtt; - ads_reg_set->count = temp_set.used; + ads_reg_set->count = count;
- temp_set.size -= temp_set.used; - temp_set.registers += temp_set.used; - addr_ggtt += temp_set.used * sizeof(struct guc_mmio_reg); + addr_ggtt += count * sizeof(struct guc_mmio_reg); } - - GEM_BUG_ON(temp_set.size); }
static void fill_engine_enable_masks(struct intel_gt *gt, @@ -634,8 +657,11 @@ int intel_guc_ads_create(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_vma);
- /* Need to calculate the reg state size dynamically: */ - ret = guc_mmio_reg_state_query(guc); + /* + * Create reg state size dynamically on system memory to be copied to + * the final ads blob on gt init/reset + */ + ret = guc_mmio_reg_state_create(guc); if (ret < 0) return ret; guc->ads_regset_size = ret; @@ -681,6 +707,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc) i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); guc->ads_blob = NULL; dma_buf_map_clear(&guc->ads_map); + kfree(guc->ads_regset); }
static void guc_ads_private_data_reset(struct intel_guc *guc)
Hi Lucas,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on drm-tip/drm-tip] [also build test WARNING on next-20220125] [cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next v5.17-rc1] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refact... base: git://anongit.freedesktop.org/drm/drm-tip drm-tip config: i386-randconfig-m021-20220124 (https://download.01.org/0day-ci/archive/20220127/202201270827.CLIhfdPe-lkp@i...) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efc... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912 git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc # save the config file to linux build tree mkdir build_dir make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All warnings (new ones prefixed by >>):
In file included from include/drm/drm_mm.h:51, from drivers/gpu/drm/i915/i915_vma.h:31, from drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h:13, from drivers/gpu/drm/i915/gt/uc/intel_guc.h:20, from drivers/gpu/drm/i915/gt/uc/intel_uc.h:9, from drivers/gpu/drm/i915/gt/intel_gt_types.h:18, from drivers/gpu/drm/i915/gt/intel_gt.h:10, from drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:9: drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c: In function 'guc_mmio_reg_state_create':
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:38: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Wformat=]
369 | drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 370 | (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | u32 {aka unsigned int} include/drm/drm_print.h:461:56: note: in definition of macro 'drm_dbg' 461 | drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, ##__VA_ARGS__) | ^~~ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:46: note: format string is defined here 369 | drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n", | ~~^ | | | long unsigned int | %u
vim +369 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
348 349 static long guc_mmio_reg_state_create(struct intel_guc *guc) 350 { 351 struct intel_gt *gt = guc_to_gt(guc); 352 struct intel_engine_cs *engine; 353 enum intel_engine_id id; 354 struct temp_regset temp_set = {}; 355 long total = 0; 356 357 for_each_engine(engine, gt, id) { 358 u32 used = temp_set.storage_used; 359 360 if (guc_mmio_regset_init(&temp_set, engine) < 0) 361 return -1; 362 363 guc->ads_regset_count[id] = temp_set.storage_used - used; 364 total += guc->ads_regset_count[id]; 365 } 366 367 guc->ads_regset = temp_set.storage; 368
369 drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
370 (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10); 371 372 return total * sizeof(struct guc_mmio_reg); 373 } 374
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Lucas,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on drm-tip/drm-tip] [also build test WARNING on next-20220125] [cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next v5.17-rc1] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refact... base: git://anongit.freedesktop.org/drm/drm-tip drm-tip config: i386-randconfig-a011 (https://download.01.org/0day-ci/archive/20220127/202201270902.HcRe2frP-lkp@i...) compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 2a1b7aa016c0f4b5598806205bdfbab1ea2d92c4) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efc... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912 git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/gpu/drm/i915/
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All warnings (new ones prefixed by >>):
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:370:3: warning: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Wformat]
(temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ include/drm/drm_print.h:461:63: note: expanded from macro 'drm_dbg' drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, ##__VA_ARGS__) ~~~ ^~~~~~~~~~~ 1 warning generated.
vim +370 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
348 349 static long guc_mmio_reg_state_create(struct intel_guc *guc) 350 { 351 struct intel_gt *gt = guc_to_gt(guc); 352 struct intel_engine_cs *engine; 353 enum intel_engine_id id; 354 struct temp_regset temp_set = {}; 355 long total = 0; 356 357 for_each_engine(engine, gt, id) { 358 u32 used = temp_set.storage_used; 359 360 if (guc_mmio_regset_init(&temp_set, engine) < 0) 361 return -1; 362 363 guc->ads_regset_count[id] = temp_set.storage_used - used; 364 total += guc->ads_regset_count[id]; 365 } 366 367 guc->ads_regset = temp_set.storage; 368 369 drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
370 (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
371 372 return total * sizeof(struct guc_mmio_reg); 373 } 374
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Lucas,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-tip/drm-tip] [also build test ERROR on next-20220125] [cannot apply to drm-intel/for-linux-next drm-exynos/exynos-drm-next drm/drm-next tegra-drm/drm/tegra/for-next linus/master airlied/drm-next v5.17-rc1] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Lucas-De-Marchi/drm-i915-guc-Refact... base: git://anongit.freedesktop.org/drm/drm-tip drm-tip config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20220127/202201271208.kELpe3Mn-lkp@i...) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): # https://github.com/0day-ci/linux/commit/313757d9ed833acea4ee2bb0e3f3565d6efc... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Lucas-De-Marchi/drm-i915-guc-Refactor-ADS-access-to-use-dma_buf_map/20220127-043912 git checkout 313757d9ed833acea4ee2bb0e3f3565d6efcf3cc # save the config file to linux build tree mkdir build_dir make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
In file included from include/drm/drm_mm.h:51, from drivers/gpu/drm/i915/i915_vma.h:31, from drivers/gpu/drm/i915/gt/uc/intel_uc_fw.h:13, from drivers/gpu/drm/i915/gt/uc/intel_guc.h:20, from drivers/gpu/drm/i915/gt/uc/intel_uc.h:9, from drivers/gpu/drm/i915/gt/intel_gt_types.h:18, from drivers/gpu/drm/i915/gt/intel_gt.h:10, from drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:9: drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c: In function 'guc_mmio_reg_state_create':
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:38: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u32' {aka 'unsigned int'} [-Werror=format=]
369 | drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 370 | (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | u32 {aka unsigned int} include/drm/drm_print.h:461:56: note: in definition of macro 'drm_dbg' 461 | drm_dev_dbg((drm) ? (drm)->dev : NULL, DRM_UT_DRIVER, fmt, ##__VA_ARGS__) | ^~~ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c:369:46: note: format string is defined here 369 | drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n", | ~~^ | | | long unsigned int | %u cc1: all warnings being treated as errors
vim +369 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
348 349 static long guc_mmio_reg_state_create(struct intel_guc *guc) 350 { 351 struct intel_gt *gt = guc_to_gt(guc); 352 struct intel_engine_cs *engine; 353 enum intel_engine_id id; 354 struct temp_regset temp_set = {}; 355 long total = 0; 356 357 for_each_engine(engine, gt, id) { 358 u32 used = temp_set.storage_used; 359 360 if (guc_mmio_regset_init(&temp_set, engine) < 0) 361 return -1; 362 363 guc->ads_regset_count[id] = temp_set.storage_used - used; 364 total += guc->ads_regset_count[id]; 365 } 366 367 guc->ads_regset = temp_set.storage; 368
369 drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
370 (temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10); 371 372 return total * sizeof(struct guc_mmio_reg); 373 } 374
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On 1/26/2022 12:36 PM, Lucas De Marchi wrote:
The ADS initialization was using two passes to calculate the regset sent to GuC to initialize each engine: a first pass just to compute the final object size and a second to set each register in place in the final gem object.
However, in order to maintain an ordered set of registers to pass to GuC, each register needs to be added and moved within the final array. That second phase may actually happen in IO memory rather than system memory, and accessing IO memory by simply dereferencing the pointer doesn't work on all architectures. Other places of the ADS initialization were converted to use the dma_buf_map API, but here there may be many more accesses to IO memory. So, instead of following that same approach, convert the regset initialization to calculate the final array in a single pass; in the second pass that array is just copied to its final location, updating the pointers for each engine written to the ADS blob.
One important thing is that struct temp_regset now has different semantics: `registers` continues to track the registers of a single engine, while the other fields are updated together, according to the newly added `storage`, which tracks the memory allocated for all the registers. So rename some of these fields and add __mmio_reg_add(): this function (possibly) allocates memory and operates on the storage pointer, while guc_mmio_reg_add() continues to manage the registers pointer.
On a Tiger Lake system using enable_guc=3, the following log message is now seen:
[ 187.334310] i915 0000:00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset
This change has also been tested on an ARM64 host with DG2 and other discrete graphics cards.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
drivers/gpu/drm/i915/gt/uc/intel_guc.h | 7 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 117 +++++++++++++-------- 2 files changed, 79 insertions(+), 45 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index e2e0df1c3d91..4c852eee3ad8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -152,6 +152,13 @@ struct intel_guc { struct dma_buf_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size;
- /**
* @ads_regset_count: number of save/restore registers in the ADS for
* each engine
*/
- u32 ads_regset_count[I915_NUM_ENGINES];
- /** @ads_regset: save/restore regsets in the ADS */
- struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; /** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 73ca34de44f7..390101ee3661 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -226,14 +226,13 @@ static void guc_mapping_table_init(struct intel_gt *gt,
/*
- The save/restore register list must be pre-calculated to a temporary
- buffer of driver defined size before it can be generated in place
- inside the ADS.
*/
- buffer before it can be copied inside the ADS.
-#define MAX_MMIO_REGS 128 /* Arbitrary size, increase as needed */ struct temp_regset { struct guc_mmio_reg *registers;
- u32 used;
- u32 size;
- struct guc_mmio_reg *storage;
I think this could use a comment to distinguish between registers and storage. Something like:
/* ptr to the base of the allocated storage for all engines */ struct guc_mmio_reg *storage;
/* ptr to the section of the storage for the engine currently being worked on */ struct guc_mmio_reg *registers;
u32 storage_used;
u32 storage_max; };
static int guc_mmio_reg_cmp(const void *a, const void *b)
@@ -244,18 +243,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; }
+static struct guc_mmio_reg * __must_check +__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) +{
- u32 pos = regset->storage_used;
- struct guc_mmio_reg *slot;
- if (pos >= regset->storage_max) {
size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE);
struct guc_mmio_reg *r = krealloc(regset->storage,
size, GFP_KERNEL);
if (!r) {
WARN_ONCE(1, "Incomplete regset list: can't add register (%d)\n",
-ENOMEM);
return ERR_PTR(-ENOMEM);
}
regset->registers = r + (regset->registers - regset->storage);
regset->storage = r;
regset->storage_max = size / sizeof(*slot);
- }
- slot = ®set->storage[pos];
- regset->storage_used++;
- *slot = *reg;
- return slot;
+}
- static long __must_check guc_mmio_reg_add(struct temp_regset *regset, u32 offset, u32 flags) {
- u32 count = regset->used;
- u32 count = regset->storage_used - (regset->registers - regset->storage); struct guc_mmio_reg reg = { .offset = offset, .flags = flags, }; struct guc_mmio_reg *slot;
- GEM_BUG_ON(count >= regset->size);
- /*
- The mmio list is built using separate lists within the driver.
- It's possible that at some point we may attempt to add the same
@@ -266,9 +291,9 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset, sizeof(reg), guc_mmio_reg_cmp)) return 0;
- slot = ®set->registers[count];
- regset->used++;
- *slot = reg;
slot = __mmio_reg_add(regset, ®);
if (IS_ERR(slot))
return PTR_ERR(slot);
while (slot-- > regset->registers) { GEM_BUG_ON(slot[0].offset == slot[1].offset);
@@ -295,7 +320,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset, unsigned int i; int ret = 0;
- regset->used = 0;
/*
* Each engine's registers point to a new start relative to
* storage
*/
regset->registers = regset->storage + regset->storage_used;
ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
@@ -317,32 +346,28 @@ static int guc_mmio_regset_init(struct temp_regset *regset, return ret ? -1 : 0; }
-static int guc_mmio_reg_state_query(struct intel_guc *guc) +static long guc_mmio_reg_state_create(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; enum intel_engine_id id;
- struct temp_regset temp_set;
- u32 total;
- struct temp_regset temp_set = {};
- long total = 0;
- /*
* Need to actually build the list in order to filter out
* duplicates and other such data dependent constructions.
*/
- temp_set.size = MAX_MMIO_REGS;
- temp_set.registers = kmalloc_array(temp_set.size,
sizeof(*temp_set.registers),
GFP_KERNEL);
- if (!temp_set.registers)
return -ENOMEM;
- total = 0; for_each_engine(engine, gt, id) {
guc_mmio_regset_init(&temp_set, engine);
total += temp_set.used;
u32 used = temp_set.storage_used;
if (guc_mmio_regset_init(&temp_set, engine) < 0)
return -1;
If you fail here you're leaking temp_set.storage. Also, any reason not to just return the return code from guc_mmio_regset_init?
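A minimal sketch of one way to address both points, i.e. freeing the storage on the error path and propagating the return code from guc_mmio_regset_init() (illustrative only, not necessarily the final fix):

    for_each_engine(engine, gt, id) {
            u32 used = temp_set.storage_used;
            int ret;

            ret = guc_mmio_regset_init(&temp_set, engine);
            if (ret < 0) {
                    kfree(temp_set.storage);
                    return ret;
            }

            guc->ads_regset_count[id] = temp_set.storage_used - used;
            total += guc->ads_regset_count[id];
    }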
Apart from these minor comments, the change LGTM. IMO we could also merge this patch on its own ahead of the rest of the dma_buf code, because not having to recreate the regset on every reset/resume is still helpful.
Daniele
guc->ads_regset_count[id] = temp_set.storage_used - used;
total += guc->ads_regset_count[id]; }
- kfree(temp_set.registers);
guc->ads_regset = temp_set.storage;
drm_dbg(&guc_to_gt(guc)->i915->drm, "Used %lu KB for temporary ADS regset\n",
(temp_set.storage_max * sizeof(struct guc_mmio_reg)) >> 10);
return total * sizeof(struct guc_mmio_reg); }
@@ -352,40 +377,38 @@ static void guc_mmio_reg_state_init(struct intel_guc *guc, { struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine;
- struct guc_mmio_reg *ads_registers; enum intel_engine_id id;
struct temp_regset temp_set;
struct guc_mmio_reg_set *ads_reg_set; u32 addr_ggtt, offset;
u8 guc_class;
offset = guc_ads_regset_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset;
temp_set.registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
temp_set.size = guc->ads_regset_size / sizeof(temp_set.registers[0]);
ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset);
memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size);
for_each_engine(engine, gt, id) {
u32 count = guc->ads_regset_count[id];
struct guc_mmio_reg_set *ads_reg_set;
u8 guc_class;
/* Class index is checked in class converter */ GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
guc_class = engine_class_to_guc_class(engine->class); ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance];
guc_mmio_regset_init(&temp_set, engine);
if (!temp_set.used) {
if (!count) { ads_reg_set->address = 0; ads_reg_set->count = 0; continue;
}
ads_reg_set->address = addr_ggtt;
ads_reg_set->count = temp_set.used;
ads_reg_set->count = count;
temp_set.size -= temp_set.used;
temp_set.registers += temp_set.used;
addr_ggtt += temp_set.used * sizeof(struct guc_mmio_reg);
addr_ggtt += count * sizeof(struct guc_mmio_reg); }
GEM_BUG_ON(temp_set.size); }
static void fill_engine_enable_masks(struct intel_gt *gt,
@@ -634,8 +657,11 @@ int intel_guc_ads_create(struct intel_guc *guc)
GEM_BUG_ON(guc->ads_vma);
- /* Need to calculate the reg state size dynamically: */
- ret = guc_mmio_reg_state_query(guc);
- /*
* Create reg state size dynamically on system memory to be copied to
* the final ads blob on gt init/reset
*/
- ret = guc_mmio_reg_state_create(guc); if (ret < 0) return ret; guc->ads_regset_size = ret;
@@ -681,6 +707,7 @@ void intel_guc_ads_destroy(struct intel_guc *guc) i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); guc->ads_blob = NULL; dma_buf_map_clear(&guc->ads_map);
kfree(guc->ads_regset); }
static void guc_ads_private_data_reset(struct intel_guc *guc)
On Tue, Feb 01, 2022 at 02:42:20PM -0800, Daniele Ceraolo Spurio wrote:
On 1/26/2022 12:36 PM, Lucas De Marchi wrote:
The ADS initialization was using two passes to calculate the regset sent to GuC to initialize each engine: a first pass just to compute the final object size and a second to set each register in place in the final gem object.
However, in order to maintain an ordered set of registers to pass to GuC, each register needs to be added and moved within the final array. That second phase may actually happen in IO memory rather than system memory, and accessing IO memory by simply dereferencing the pointer doesn't work on all architectures. Other places of the ADS initialization were converted to use the dma_buf_map API, but here there may be many more accesses to IO memory. So, instead of following that same approach, convert the regset initialization to calculate the final array in a single pass; in the second pass that array is just copied to its final location, updating the pointers for each engine written to the ADS blob.
One important thing is that struct temp_regset now has different semantics: `registers` continues to track the registers of a single engine, while the other fields are updated together, according to the newly added `storage`, which tracks the memory allocated for all the registers. So rename some of these fields and add __mmio_reg_add(): this function (possibly) allocates memory and operates on the storage pointer, while guc_mmio_reg_add() continues to manage the registers pointer.
On a Tiger Lake system using enable_guc=3, the following log message is now seen:
[ 187.334310] i915 0000:00:02.0: [drm:intel_guc_ads_create [i915]] Used 4 KB for temporary ADS regset
This change has also been tested on an ARM64 host with DG2 and other discrete graphics cards.
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
drivers/gpu/drm/i915/gt/uc/intel_guc.h | 7 ++ drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 117 +++++++++++++-------- 2 files changed, 79 insertions(+), 45 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index e2e0df1c3d91..4c852eee3ad8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -152,6 +152,13 @@ struct intel_guc { struct dma_buf_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size;
- /**
* @ads_regset_count: number of save/restore registers in the ADS for
* each engine
*/
- u32 ads_regset_count[I915_NUM_ENGINES];
- /** @ads_regset: save/restore regsets in the ADS */
- struct guc_mmio_reg *ads_regset; /** @ads_golden_ctxt_size: size of the golden contexts in the ADS */ u32 ads_golden_ctxt_size; /** @ads_engine_usage_size: size of engine usage in the ADS */
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 73ca34de44f7..390101ee3661 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -226,14 +226,13 @@ static void guc_mapping_table_init(struct intel_gt *gt, /*
- The save/restore register list must be pre-calculated to a temporary
- buffer of driver defined size before it can be generated in place
- inside the ADS.
*/
- buffer before it can be copied inside the ADS.
-#define MAX_MMIO_REGS 128 /* Arbitrary size, increase as needed */ struct temp_regset { struct guc_mmio_reg *registers;
- u32 used;
- u32 size;
- struct guc_mmio_reg *storage;
I think this could use a comment to distinguish between registers and storage. Something like:
/* ptr to the base of the allocated storage for all engines */ struct guc_mmio_reg *storage;
/* ptr to the section of the storage for the engine currently being worked on */ struct guc_mmio_reg *registers;
agreed, I will add that
- u32 storage_used;
- u32 storage_max;
}; static int guc_mmio_reg_cmp(const void *a, const void *b) @@ -244,18 +243,44 @@ static int guc_mmio_reg_cmp(const void *a, const void *b) return (int)ra->offset - (int)rb->offset; } +static struct guc_mmio_reg * __must_check +__mmio_reg_add(struct temp_regset *regset, struct guc_mmio_reg *reg) +{
- u32 pos = regset->storage_used;
- struct guc_mmio_reg *slot;
- if (pos >= regset->storage_max) {
size_t size = ALIGN((pos + 1) * sizeof(*slot), PAGE_SIZE);
struct guc_mmio_reg *r = krealloc(regset->storage,
size, GFP_KERNEL);
if (!r) {
WARN_ONCE(1, "Incomplete regset list: can't add register (%d)\n",
-ENOMEM);
return ERR_PTR(-ENOMEM);
}
regset->registers = r + (regset->registers - regset->storage);
regset->storage = r;
regset->storage_max = size / sizeof(*slot);
- }
- slot = ®set->storage[pos];
- regset->storage_used++;
- *slot = *reg;
- return slot;
+}
static long __must_check guc_mmio_reg_add(struct temp_regset *regset, u32 offset, u32 flags) {
- u32 count = regset->used;
- u32 count = regset->storage_used - (regset->registers - regset->storage); struct guc_mmio_reg reg = { .offset = offset, .flags = flags, }; struct guc_mmio_reg *slot;
- GEM_BUG_ON(count >= regset->size);
- /*
- The mmio list is built using separate lists within the driver.
- It's possible that at some point we may attempt to add the same
@@ -266,9 +291,9 @@ static long __must_check guc_mmio_reg_add(struct temp_regset *regset, sizeof(reg), guc_mmio_reg_cmp)) return 0;
- slot = ®set->registers[count];
- regset->used++;
- *slot = reg;
- slot = __mmio_reg_add(regset, ®);
- if (IS_ERR(slot))
return PTR_ERR(slot); while (slot-- > regset->registers) { GEM_BUG_ON(slot[0].offset == slot[1].offset);
@@ -295,7 +320,11 @@ static int guc_mmio_regset_init(struct temp_regset *regset, unsigned int i; int ret = 0;
- regset->used = 0;
- /*
* Each engine's registers point to a new start relative to
* storage
*/
- regset->registers = regset->storage + regset->storage_used; ret |= GUC_MMIO_REG_ADD(regset, RING_MODE_GEN7(base), true); ret |= GUC_MMIO_REG_ADD(regset, RING_HWS_PGA(base), false);
@@ -317,32 +346,28 @@ static int guc_mmio_regset_init(struct temp_regset *regset, return ret ? -1 : 0; } -static int guc_mmio_reg_state_query(struct intel_guc *guc) +static long guc_mmio_reg_state_create(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; enum intel_engine_id id;
- struct temp_regset temp_set;
- u32 total;
- struct temp_regset temp_set = {};
- long total = 0;
- /*
* Need to actually build the list in order to filter out
* duplicates and other such data dependent constructions.
*/
- temp_set.size = MAX_MMIO_REGS;
- temp_set.registers = kmalloc_array(temp_set.size,
sizeof(*temp_set.registers),
GFP_KERNEL);
- if (!temp_set.registers)
return -ENOMEM;
- total = 0; for_each_engine(engine, gt, id) {
guc_mmio_regset_init(&temp_set, engine);
total += temp_set.used;
u32 used = temp_set.storage_used;
if (guc_mmio_regset_init(&temp_set, engine) < 0)
return -1;
If you fail here you're leaking temp_set.storage.
thanks for catching this. I fixed it for next version
Also, any reason not to just return the return code from guc_mmio_regset_init?
no
Apart from these minor comments, the change LGTM. IMO we could also merge this patch on its own ahead of the rest of the dma_buf code, because not having to recreate the regset on every reset/resume is still helpful.
now the dma_buf_map (renamed to iosys_map) is more settled. I will send everything together once more and then probably split on next versions if needed.
thanks Lucas De Marchi
Now that the regset list is prepared, convert guc_mmio_reg_state_init() to use dma_buf_map to copy the array to the final location and initialize additional fields in ads.reg_state_list.
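In rough terms the conversion follows the pattern below, with names taken from this series (simplified from the diff that follows):

    struct dma_buf_map ads_regset_map;

    ads_regset_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);

    /* bulk copy of the pre-computed regset into the blob */
    dma_buf_map_memcpy_to(&ads_regset_map, guc->ads_regset,
                          guc->ads_regset_size);

    /* per-field updates go through the ads_blob_write() helper */
    ads_blob_write(guc,
                   ads.reg_state_list[guc_class][engine->instance].address,
                   addr_ggtt);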
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 30 +++++++++++++--------- 1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 390101ee3661..cb0f543b0e86 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -372,40 +372,46 @@ static long guc_mmio_reg_state_create(struct intel_guc *guc) return total * sizeof(struct guc_mmio_reg); }
-static void guc_mmio_reg_state_init(struct intel_guc *guc, - struct __guc_ads_blob *blob) +static void guc_mmio_reg_state_init(struct intel_guc *guc) { + struct dma_buf_map ads_regset_map; struct intel_gt *gt = guc_to_gt(guc); struct intel_engine_cs *engine; - struct guc_mmio_reg *ads_registers; enum intel_engine_id id; u32 addr_ggtt, offset;
offset = guc_ads_regset_offset(guc); addr_ggtt = intel_guc_ggtt_offset(guc, guc->ads_vma) + offset; - ads_registers = (struct guc_mmio_reg *)(((u8 *)blob) + offset); + ads_regset_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offset);
- memcpy(ads_registers, guc->ads_regset, guc->ads_regset_size); + dma_buf_map_memcpy_to(&ads_regset_map, guc->ads_regset, + guc->ads_regset_size);
for_each_engine(engine, gt, id) { u32 count = guc->ads_regset_count[id]; - struct guc_mmio_reg_set *ads_reg_set; u8 guc_class;
/* Class index is checked in class converter */ GEM_BUG_ON(engine->instance >= GUC_MAX_INSTANCES_PER_CLASS);
guc_class = engine_class_to_guc_class(engine->class); - ads_reg_set = &blob->ads.reg_state_list[guc_class][engine->instance];
if (!count) { - ads_reg_set->address = 0; - ads_reg_set->count = 0; + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].address, + 0); + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].count, + 0); continue; }
- ads_reg_set->address = addr_ggtt; - ads_reg_set->count = count; + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].address, + addr_ggtt); + ads_blob_write(guc, + ads.reg_state_list[guc_class][engine->instance].count, + count);
addr_ggtt += count * sizeof(struct guc_mmio_reg); } @@ -635,7 +641,7 @@ static void __guc_ads_init(struct intel_guc *guc) blob->ads.gt_system_info = base + ptr_offset(blob, system_info);
/* MMIO save/restore list */ - guc_mmio_reg_state_init(guc, blob); + guc_mmio_reg_state_init(guc);
/* Private Data */ blob->ads.private_data = base + guc_ads_private_data_offset(guc);
Now that all the functions called from __guc_ads_init() are converted to use ads_map, stop using ads_blob in __guc_ads_init().
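Conceptually every remaining direct store through the blob pointer becomes a write through the ads_map helpers, e.g. (simplified; see the diff below for the actual conversion):

    /* before: direct dereference, only safe when the ADS is in system memory */
    blob->ads.private_data = base + guc_ads_private_data_offset(guc);

    /* after: goes through guc->ads_map and also works for IO memory */
    ads_blob_write(guc, ads.private_data,
                   base + guc_ads_private_data_offset(guc));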
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 25 ++++++++++++---------- 1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index cb0f543b0e86..30edac93afbf 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -602,7 +602,6 @@ static void __guc_ads_init(struct intel_guc *guc) { struct intel_gt *gt = guc_to_gt(guc); struct drm_i915_private *i915 = gt->i915; - struct __guc_ads_blob *blob = guc->ads_blob; struct dma_buf_map info_map = DMA_BUF_MAP_INIT_OFFSET(&guc->ads_map, offsetof(struct __guc_ads_blob, system_info)); u32 base; @@ -613,17 +612,18 @@ static void __guc_ads_init(struct intel_guc *guc) /* System info */ fill_engine_enable_masks(gt, &info_map);
- blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED] = - hweight8(gt->info.sseu.slice_mask); - blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK] = - gt->info.vdbox_sfc_access; + ads_blob_write(guc, system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_SLICE_ENABLED], + hweight8(gt->info.sseu.slice_mask)); + ads_blob_write(guc, system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_VDBOX_SFC_SUPPORT_MASK], + gt->info.vdbox_sfc_access);
if (GRAPHICS_VER(i915) >= 12 && !IS_DGFX(i915)) { u32 distdbreg = intel_uncore_read(gt->uncore, GEN12_DIST_DBS_POPULATED); - blob->system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI] = - ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) & - GEN12_DOORBELLS_PER_SQIDI) + 1; + ads_blob_write(guc, + system_info.generic_gt_sysinfo[GUC_GENERIC_GT_SYSINFO_DOORBELL_COUNT_PER_SQIDI], + ((distdbreg >> GEN12_DOORBELLS_PER_SQIDI_SHIFT) + & GEN12_DOORBELLS_PER_SQIDI) + 1); }
/* Golden contexts for re-initialising after a watchdog reset */ @@ -637,14 +637,17 @@ static void __guc_ads_init(struct intel_guc *guc) guc_capture_list_init(guc);
/* ADS */ - blob->ads.scheduler_policies = base + ptr_offset(blob, policies); - blob->ads.gt_system_info = base + ptr_offset(blob, system_info); + ads_blob_write(guc, ads.scheduler_policies, base + + offsetof(struct __guc_ads_blob, policies)); + ads_blob_write(guc, ads.gt_system_info, base + + offsetof(struct __guc_ads_blob, system_info));
/* MMIO save/restore list */ guc_mmio_reg_state_init(guc);
/* Private Data */ - blob->ads.private_data = base + guc_ads_private_data_offset(guc); + ads_blob_write(guc, ads.private_data, base + + guc_ads_private_data_offset(guc));
i915_gem_object_flush_map(guc->ads_vma->obj); }
Now all accesses to the content of the GuC ADS go either through the dma_buf_map API or through a temporary buffer. Remove guc->ads_blob as there shouldn't be any updates through the bare pointer anymore.
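After this patch the only handle kept around is guc->ads_map; a minimal sketch of its setup and use, assembled from hunks in this series:

    /* intel_guc_ads_create(): map the vma and record it in ads_map */
    ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma, &ads_blob);
    if (ret)
            return ret;

    if (i915_gem_object_is_lmem(guc->ads_vma->obj))
            dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)ads_blob);
    else
            dma_buf_map_set_vaddr(&guc->ads_map, ads_blob);

    /* all later updates go through the map, e.g. */
    ads_blob_write(guc, ads.private_data, base + guc_ads_private_data_offset(guc));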
Cc: Matt Roper matthew.d.roper@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Daniel Vetter daniel@ffwll.ch Cc: John Harrison John.C.Harrison@Intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/uc/intel_guc.h | 3 +-- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 8 ++++---- 2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 4c852eee3ad8..7349483d0e35 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -147,8 +147,7 @@ struct intel_guc {
/** @ads_vma: object allocated to hold the GuC ADS */ struct i915_vma *ads_vma; - /** @ads_blob: contents of the GuC ADS */ - struct __guc_ads_blob *ads_blob; + /** @ads_map: contents of the GuC ADS */ struct dma_buf_map ads_map; /** @ads_regset_size: size of the save/restore regsets in the ADS */ u32 ads_regset_size; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c index 30edac93afbf..b87269081650 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c @@ -661,6 +661,7 @@ static void __guc_ads_init(struct intel_guc *guc) */ int intel_guc_ads_create(struct intel_guc *guc) { + void *ads_blob; u32 size; int ret;
@@ -685,14 +686,14 @@ int intel_guc_ads_create(struct intel_guc *guc) size = guc_ads_blob_size(guc);
ret = intel_guc_allocate_and_map_vma(guc, size, &guc->ads_vma, - (void **)&guc->ads_blob); + &ads_blob); if (ret) return ret;
if (i915_gem_object_is_lmem(guc->ads_vma->obj)) - dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)guc->ads_blob); + dma_buf_map_set_vaddr_iomem(&guc->ads_map, (void __iomem *)ads_blob); else - dma_buf_map_set_vaddr(&guc->ads_map, guc->ads_blob); + dma_buf_map_set_vaddr(&guc->ads_map, ads_blob);
__guc_ads_init(guc);
@@ -714,7 +715,6 @@ void intel_guc_ads_init_late(struct intel_guc *guc) void intel_guc_ads_destroy(struct intel_guc *guc) { i915_vma_unpin_and_release(&guc->ads_vma, I915_VMA_RELEASE_MAP); - guc->ads_blob = NULL; dma_buf_map_clear(&guc->ads_map); kfree(guc->ads_regset); }