original: https://patchwork.freedesktop.org/series/99378/ v2: https://patchwork.freedesktop.org/series/99711/#rev1, https://patchwork.freedesktop.org/series/99711/#rev2
Main changes from previous version:
- Unrelated patches to iosys-map conversion have landed - Remove unecessary kernel.h include from iosys-map.h - Rebase on latest drm-tip
Original cover letter:
While porting i915 to arm64 we noticed some issues accessing lmem. Some writes were getting corrupted and the final state of the buffer didn't have exactly what we wrote. This became evident when enabling GuC submission: depending on the number of engines the ADS struct was being corrupted and GuC would reject it, refusin to initialize.
From Documentation/core-api/bus-virt-phys-mapping.rst:
This memory is called "PCI memory" or "shared memory" or "IO memory" or whatever, and there is only one way to access it: the readb/writeb and related functions. You should never take the address of such memory, because there is really nothing you can do with such an address: it's not conceptually in the same memory space as "real memory" at all, so you cannot just dereference a pointer. (Sadly, on x86 it **is** in the same memory space, so on x86 it actually works to just deference a pointer, but it's not portable).
When reading or writing words directly to IO memory, in order to be portable the Linux kernel provides the abstraction detailed in section "Differences between I/O access functions" of Documentation/driver-api/device-io.rst.
This limits our ability to simply overlay our structs on top a buffer and directly access it since that buffer may come from IO memory rather than system memory. Hence the approach taken in intel_guc_ads.c needs to be refactored. This is not the only place in i915 that need to be changed, but the one causing the most problems, with a real reproducer. This first set of patch focuses on fixing the gem object to pass the ADS
After the addition of a few helpers in the dma_buf_map API, most of intel_guc_ads.c can be converted to use it. The exception is the regset initialization: we'd incur into a lot of extra indirection when reading/writing each register. So the regset is converted to use a temporary buffer allocated on probe, which is then copied to its final location when finishing the initialization or on gt reset. [v3: the part unrelated to iosys-map has already landed]
Testing on some discrete cards, after this change we can correctly pass the ADS struct to GuC and have it initialized correctly.
thanks Lucas De Marchi
Lucas De Marchi (16): iosys-map: Add offset to iosys_map_memcpy_to() iosys-map: Add a few more helpers drm/i915/gt: Add helper for shmem copy to iosys_map drm/i915/guc: Keep iosys_map of ads_blob around drm/i915/guc: Add read/write helpers for ADS blob drm/i915/guc: Convert golden context init to iosys_map drm/i915/guc: Convert policies update to iosys_map drm/i915/guc: Convert engine record to iosys_map drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map drm/i915/guc: Convert golden context prep to iosys_map drm/i915/guc: Replace check for golden context size drm/i915/guc: Convert mapping table to iosys_map drm/i915/guc: Convert capture list to iosys_map drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map drm/i915/guc: Convert __guc_ads_init to iosys_map drm/i915/guc: Remove plain ads_blob pointer
drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_fb_helper.c | 2 +- drivers/gpu/drm/i915/gt/shmem_utils.c | 32 +++ drivers/gpu/drm/i915/gt/shmem_utils.h | 3 + drivers/gpu/drm/i915/gt/uc/intel_guc.h | 7 +- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 233 ++++++++++-------- drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h | 3 +- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 17 +- include/linux/iosys-map.h | 218 +++++++++++++++- 9 files changed, 396 insertions(+), 121 deletions(-)