This series provides some of the initial enablement patches for two upcoming discrete GPUs: * XeHP SDV: Xe_HP (version 12.50) graphics IP, no display IP * DG2: Xe_HPG (version 12.55) graphics IP, Xe_LPD (version 13) display IP
Both platforms will need additional enablement patches beyond what's present in this series before they're truly usable, including various LMEM and GuC work that's already happening separately. The new features/functionality that these platforms bring (such as multi-tile support, dedicated compute engines, etc.) may be referenced in passing in some of these patches but will be fully enabled in future series.
Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: James Ausmus james.ausmus@intel.com
Akeem G Abodunrin (1): drm/i915/dg2: Add new LRI reg offsets
Animesh Manna (1): drm/i915/dg2: Update to bigjoiner path
Ankit Nautiyal (1): drm/i915/dg2: Configure PCON in DP pre-enable path
Anusha Srivatsa (2): drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable drm/i915/display/dsc: Set BPP in the kernel
Daniele Ceraolo Spurio (1): drm/i915/xehp: handle new steering options
Gwan-gyeong Mun (1): drm/i915/dg2: Update lane disable power state during PSR
John Harrison (4): drm/i915/selftests: Allow for larger engine counts drm/i915/xehp: Extra media engines - Part 1 (engine definitions) drm/i915/xehp: Extra media engines - Part 2 (interrupts) drm/i915/xehp: Extra media engines - Part 3 (reset)
José Roberto de Souza (1): drm/i915/dg2: Add DG2 to the PSR2 defeature list
Lucas De Marchi (5): drm/i915: Add "release id" version drm/i915: Add XE_HP initial definitions drm/i915/xehpsdv: add initial XeHP SDV definitions drm/i915/xehpsdv: Define MOCS table for XeHP SDV drm/i915/xehpsdv: factor out function to read RP_STATE_CAP
Matt Roper (29): drm/i915/xehp: Xe_HP forcewake support drm/i915/xehp: Define multicast register ranges drm/i915/xehp: Loop over all gslices for INSTDONE processing drm/i915/xehpsdv: Add maximum sseu limits drm/i915/xehpsdv: Define steering tables drm/i915/xehpsdv: Read correct RP_STATE_CAP register drm/i915/dg2: add DG2 platform info drm/i915/dg2: DG2 uses the same sseu limits as XeHP SDV drm/i915/dg2: Add forcewake table drm/i915/dg2: Update LNCF steering ranges drm/i915/dg2: Add SQIDI steering drm/i915/dg2: Maintain backward-compatible nested batch behavior drm/i915/dg2: Report INSTDONE_GEOM values in error state drm/i915/dg2: Define MOCS table for DG2 drm/i915/dg2: Add fake PCH drm/i915/dg2: Add cdclk table and reference clock drm/i915/dg2: Skip shared DPLL handling drm/i915/dg2: Don't wait for AUX power well enable ACKs drm/i915/dg2: Setup display outputs drm/i915/dg2: Add dbuf programming drm/i915/dg2: Don't program BW_BUDDY registers drm/i915/dg2: Don't read DRAM info drm/i915/dg2: DG2 has fixed memory bandwidth drm/i915/dg2: Add MPLLB programming for SNPS PHY drm/i915/dg2: Add MPLLB programming for HDMI drm/i915/dg2: Add vswing programming for SNPS phys drm/i915/dg2: Update modeset sequences drm/i915/dg2: Classify DG2 PHY types drm/i915/dg2: Wait for SNPS PHY calibration during display init
Matthew Auld (1): drm/i915/xehp: Changes to ss/eu definitions
Paulo Zanoni (1): drm/i915: Fork DG1 interrupt handler
Prathap Kumar Valsan (1): drm/i915/xehp: New engine context offsets
Stuart Summers (2): drm/i915/xehp: Handle new device context ID format drm/i915/xehpsdv: Add compute DSS type
Tvrtko Ursulin (1): drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based
Venkata Sandeep Dhanalakota (1): drm/i915/gen12: Use fuse info to enable SFC
drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/display/intel_bw.c | 24 +- drivers/gpu/drm/i915/display/intel_cdclk.c | 24 +- drivers/gpu/drm/i915/display/intel_ddi.c | 165 +++- drivers/gpu/drm/i915/display/intel_display.c | 94 +- drivers/gpu/drm/i915/display/intel_display.h | 1 + .../drm/i915/display/intel_display_debugfs.c | 103 ++- .../drm/i915/display/intel_display_power.c | 25 + .../drm/i915/display/intel_display_power.h | 10 + .../drm/i915/display/intel_display_types.h | 18 +- drivers/gpu/drm/i915/display/intel_dp.c | 23 +- drivers/gpu/drm/i915/display/intel_dpll.c | 12 +- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 5 +- drivers/gpu/drm/i915/display/intel_hdmi.c | 11 + drivers/gpu/drm/i915/display/intel_psr.c | 10 +- drivers/gpu/drm/i915/display/intel_snps_phy.c | 862 ++++++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 35 + drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 8 +- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 144 ++- drivers/gpu/drm/i915/gt/intel_engine_types.h | 29 +- .../drm/i915/gt/intel_execlists_submission.c | 78 +- drivers/gpu/drm/i915/gt/intel_gt.c | 66 +- drivers/gpu/drm/i915/gt/intel_gt.h | 1 + drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 +- drivers/gpu/drm/i915/gt/intel_gt_types.h | 7 + drivers/gpu/drm/i915/gt/intel_lrc.c | 156 +++- drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 2 + drivers/gpu/drm/i915/gt/intel_mocs.c | 66 +- drivers/gpu/drm/i915/gt/intel_region_lmem.c | 1 + drivers/gpu/drm/i915/gt/intel_reset.c | 6 + drivers/gpu/drm/i915/gt/intel_rps.c | 19 +- drivers/gpu/drm/i915/gt/intel_rps.h | 1 + drivers/gpu/drm/i915/gt/intel_sseu.c | 116 ++- drivers/gpu/drm/i915/gt/intel_sseu.h | 20 +- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +- drivers/gpu/drm/i915/gt/intel_workarounds.c | 175 +++- drivers/gpu/drm/i915/gt/selftest_execlists.c | 10 +- .../gpu/drm/i915/gt/selftest_workarounds.c | 32 +- drivers/gpu/drm/i915/i915_debugfs.c | 8 +- drivers/gpu/drm/i915/i915_drv.h | 48 +- drivers/gpu/drm/i915/i915_getparam.c | 6 +- drivers/gpu/drm/i915/i915_gpu_error.c | 36 +- drivers/gpu/drm/i915/i915_irq.c | 141 ++- drivers/gpu/drm/i915/i915_pci.c | 63 +- drivers/gpu/drm/i915/i915_perf.c | 29 +- drivers/gpu/drm/i915/i915_reg.h | 109 ++- drivers/gpu/drm/i915/intel_device_info.c | 4 + drivers/gpu/drm/i915/intel_device_info.h | 10 +- drivers/gpu/drm/i915/intel_dram.c | 6 +- drivers/gpu/drm/i915/intel_pch.c | 3 + drivers/gpu/drm/i915/intel_pch.h | 2 + drivers/gpu/drm/i915/intel_pm.c | 120 ++- drivers/gpu/drm/i915/intel_step.c | 20 +- drivers/gpu/drm/i915/intel_step.h | 1 + drivers/gpu/drm/i915/intel_uncore.c | 367 ++++++-- drivers/gpu/drm/i915/intel_uncore.h | 14 +- drivers/gpu/drm/i915/selftests/intel_uncore.c | 2 + include/uapi/drm/i915_drm.h | 3 - 59 files changed, 3085 insertions(+), 289 deletions(-) create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.c create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.h
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release)) + #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver) +#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \ + INTEL_INFO(i915)->graphics_ver_release) #define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver) +#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \ + INTEL_INFO(i915)->media_ver_release) #define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver); + drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release); drm_printf(p, "media_ver: %u\n", info->media_ver); + drm_printf(p, "media_ver_release: %u\n", info->media_ver_release); drm_printf(p, "display_ver: %u\n", info->display.ver); drm_printf(p, "gt: %d\n", info->gt); drm_printf(p, "iommu: %s\n", iommu_name()); diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index b326aff65cd6..944a5ff4df49 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -162,7 +162,9 @@ enum intel_ppgtt_type {
struct intel_device_info { u8 graphics_ver; + u8 graphics_ver_release; u8 media_ver; + u8 media_ver_release;
u8 gt; /* GT number, 0 if undefined */ intel_engine_mask_t platform_engine_mask; /* Engines supported by the HW */
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release))
- #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
INTEL_INFO(i915)->graphics_ver_release)
#define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \
#define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))INTEL_INFO(i915)->media_ver_release)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
- drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before.
Regards,
Tvrtko
drm_printf(p, "media_ver: %u\n", info->media_ver);
- drm_printf(p, "media_ver_release: %u\n", info->media_ver_release); drm_printf(p, "display_ver: %u\n", info->display.ver); drm_printf(p, "gt: %d\n", info->gt); drm_printf(p, "iommu: %s\n", iommu_name());
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index b326aff65cd6..944a5ff4df49 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -162,7 +162,9 @@ enum intel_ppgtt_type {
struct intel_device_info { u8 graphics_ver;
u8 graphics_ver_release; u8 media_ver;
u8 media_ver_release;
u8 gt; /* GT number, 0 if undefined */ intel_engine_mask_t platform_engine_mask; /* Engines supported by the HW */
On Fri, 02 Jul 2021, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release))
- #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
INTEL_INFO(i915)->graphics_ver_release)
#define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \
#define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))INTEL_INFO(i915)->media_ver_release)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
- drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before.
Just bikeshedding here, but I thought of:
if (info->grapics_ver_release) drm_printf(p, "graphics_ver: %u.%u\n", info->graphics_ver, info->graphics_ver_release); else drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
Also, I thought "x_ver" and "x_ver_release" sounds a bit odd, perhaps having "x_ver" and "x_rel" is more natural?
Ultimately I think we've historically always sucked at trying to figure this out up front, so maybe the right answer is merging something and fixing later...
BR, Jani.
Regards,
Tvrtko
drm_printf(p, "media_ver: %u\n", info->media_ver);
- drm_printf(p, "media_ver_release: %u\n", info->media_ver_release); drm_printf(p, "display_ver: %u\n", info->display.ver); drm_printf(p, "gt: %d\n", info->gt); drm_printf(p, "iommu: %s\n", iommu_name());
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index b326aff65cd6..944a5ff4df49 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -162,7 +162,9 @@ enum intel_ppgtt_type {
struct intel_device_info { u8 graphics_ver;
u8 graphics_ver_release; u8 media_ver;
u8 media_ver_release;
u8 gt; /* GT number, 0 if undefined */ intel_engine_mask_t platform_engine_mask; /* Engines supported by the HW */
On Mon, Jul 05, 2021 at 02:52:31PM +0300, Jani Nikula wrote:
On Fri, 02 Jul 2021, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release))
- #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
INTEL_INFO(i915)->graphics_ver_release)
#define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \
#define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))INTEL_INFO(i915)->media_ver_release)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
- drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before.
Just bikeshedding here, but I thought of:
if (info->grapics_ver_release) drm_printf(p, "graphics_ver: %u.%u\n", info->graphics_ver, info->graphics_ver_release); else drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
humn... a suggestion that I got internally a few week ago and I forgot to update this was that this doesn't need to be abbreviated in debugfs and could very well be:
drm_printf(p, "graphics version: %u\n", info->graphics_ver); drm_printf(p, "graphics release: %u\n", info->graphics_ver_release);
Also, I thought "x_ver" and "x_ver_release" sounds a bit odd, perhaps having "x_ver" and "x_rel" is more natural?
Not sure what direction to go now though. Maybe trying to put all suggestions together:
if (info->graphics_rel) drm_printf(p, "graphics version: %u.%u\n", info->graphics_ver, info->graphics_rel); else drm_printf(p, "graphics version: %u\n", info->graphics_ver);
One thing I like is that doing `| grep "graphics version"` gives all info you are searching for.
thanks Lucas De Marchi
On Tue, 06 Jul 2021, Lucas De Marchi lucas.demarchi@intel.com wrote:
On Mon, Jul 05, 2021 at 02:52:31PM +0300, Jani Nikula wrote:
On Fri, 02 Jul 2021, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release))
- #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
INTEL_INFO(i915)->graphics_ver_release)
#define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \
#define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))INTEL_INFO(i915)->media_ver_release)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
- drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before.
Just bikeshedding here, but I thought of:
if (info->grapics_ver_release) drm_printf(p, "graphics_ver: %u.%u\n", info->graphics_ver, info->graphics_ver_release); else drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
humn... a suggestion that I got internally a few week ago and I forgot to update this was that this doesn't need to be abbreviated in debugfs and could very well be:
drm_printf(p, "graphics version: %u\n", info->graphics_ver); drm_printf(p, "graphics release: %u\n", info->graphics_ver_release);
Also, I thought "x_ver" and "x_ver_release" sounds a bit odd, perhaps having "x_ver" and "x_rel" is more natural?
Not sure what direction to go now though. Maybe trying to put all suggestions together:
if (info->graphics_rel) drm_printf(p, "graphics version: %u.%u\n", info->graphics_ver, info->graphics_rel); else drm_printf(p, "graphics version: %u\n", info->graphics_ver);
One thing I like is that doing `| grep "graphics version"` gives all info you are searching for.
I'd like that, but this is all bikeshedding, really.
BR, Jani.
thanks Lucas De Marchi
On Wed, Jul 07, 2021 at 11:34:36AM +0300, Jani Nikula wrote:
On Tue, 06 Jul 2021, Lucas De Marchi lucas.demarchi@intel.com wrote:
On Mon, Jul 05, 2021 at 02:52:31PM +0300, Jani Nikula wrote:
On Fri, 02 Jul 2021, Tvrtko Ursulin tvrtko.ursulin@linux.intel.com wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
However the release id number is not defined by hardware until we start using the GMD_ID register. For the platforms before that register is useful we will set the values in software and we can set them as we please. So the plan is to set them so we can group different features under a single GRAPHICS_VER_FULL() check.
After GMD_ID is used, the usefulness of a "full version check" will be greatly reduced and will be mostly used for deciding workarounds and a few code paths. So it makes sense to keep it as a separate field from graphics_ver.
Also, currently there is not much use for the release id in media and display, so keep them out.
This is a mix of 2 independent changes: one by me and the other by Matt Roper.
Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 6 ++++++ drivers/gpu/drm/i915/intel_device_info.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 2 ++ 3 files changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..9639800485b9 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1258,11 +1258,17 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) */ #define IS_GEN(dev_priv, n) (GRAPHICS_VER(dev_priv) == (n))
+#define IP_VER(ver, release) ((ver) << 8 | (release))
- #define GRAPHICS_VER(i915) (INTEL_INFO(i915)->graphics_ver)
+#define GRAPHICS_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->graphics_ver, \
INTEL_INFO(i915)->graphics_ver_release)
#define IS_GRAPHICS_VER(i915, from, until) \ (GRAPHICS_VER(i915) >= (from) && GRAPHICS_VER(i915) <= (until))
#define MEDIA_VER(i915) (INTEL_INFO(i915)->media_ver)
+#define MEDIA_VER_FULL(i915) IP_VER(INTEL_INFO(i915)->media_ver, \
#define IS_MEDIA_VER(i915, from, until) \ (MEDIA_VER(i915) >= (from) && MEDIA_VER(i915) <= (until))INTEL_INFO(i915)->media_ver_release)
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7eaa92fee421..e8ad14f002c1 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -97,7 +97,9 @@ void intel_device_info_print_static(const struct intel_device_info *info, struct drm_printer *p) { drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
- drm_printf(p, "graphics_ver_release: %u\n", info->graphics_ver_release);
I get the VER and VER_FULL in the macros but could 'ver' and 'ver_release' here and in the code simply be renamed to 'ver'/'version' and 'release'? Maybe it is just me but don't think I encountered the term "version release" before.
Just bikeshedding here, but I thought of:
if (info->grapics_ver_release) drm_printf(p, "graphics_ver: %u.%u\n", info->graphics_ver, info->graphics_ver_release); else drm_printf(p, "graphics_ver: %u\n", info->graphics_ver);
humn... a suggestion that I got internally a few week ago and I forgot to update this was that this doesn't need to be abbreviated in debugfs and could very well be:
drm_printf(p, "graphics version: %u\n", info->graphics_ver); drm_printf(p, "graphics release: %u\n", info->graphics_ver_release);
Also, I thought "x_ver" and "x_ver_release" sounds a bit odd, perhaps having "x_ver" and "x_rel" is more natural?
Not sure what direction to go now though. Maybe trying to put all suggestions together:
if (info->graphics_rel) drm_printf(p, "graphics version: %u.%u\n", info->graphics_ver, info->graphics_rel); else drm_printf(p, "graphics version: %u\n", info->graphics_ver);
One thing I like is that doing `| grep "graphics version"` gives all info you are searching for.
I'd like that, but this is all bikeshedding, really.
I already updated our local copy with all of this, so I'm fine including that in the next version... since this is the first patch, I can also send it standalone.
Lucas De Marchi
On Fri, Jul 02, 2021 at 01:33:50PM +0100, Tvrtko Ursulin wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
Besides the arch version returned by GRAPHICS_VER(), new platforms contain a "release id" to make clear the difference from one platform to another. Although for the first ones we may use them as if they were a
What does "first ones" refer to here?
XeHP-SDV and DG2. The additional register that commit 01eb15c9165e ("drm/i915: Add DISPLAY_VER() and related macros") talks about is not available for these platforms. Here we hardcode them in software to something that makes sense and it allows the driver to be properly prepared for future platforms.
major/minor version, that is not true for all platforms: we may have a `release_id == n` that is closer to `n - 2` than to `n - 1`.
Hm this is a bit confusing. Is the sentence simply trying to say that, as the release id number is growing, hw capabilities are not simply accumulating but can be removed as well? Otherwise I am not sure how the user of these macros is supposed to act on this sentence.
this is explaining why those numbers can't be interpreted as major/minor and hence why here it's called "release" rather than "minor".
Your interpretation is correct, except that a feature not being there doesn't necessarily mean it got removed at some point. I might just had never been in that particular release: i.e. GRAPHICS_VER_FULL() == 14.5 doesn't necessarily mean it comes after 14.3, with additional features/fixes.
Lucas De Marchi
From: Lucas De Marchi lucas.demarchi@intel.com
Our _FEATURES macro went back to GEN7, extending each other, making it difficult to grasp what was really enabled/disabled. Take the opportunity of the GEN -> XE_HP name break and also break with the feature inheritance.
For XE_HP this basically goes from GEN12 back to GEN7 coalescing the features making sure the overrides remain, remove all the display-specific features and sort it.
Then also remove the definitions that would be overridden by DGFX_FEATURES and those that were 0 (since that is the default). Exception here is has_master_unit_irq: although it is a feature that started with DG1 and is true for all DGFX platforms, it's also true for XE_HP in general.
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/i915_pci.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index a7bfdd827bc8..dc0883bad9cf 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -995,6 +995,30 @@ static const struct intel_device_info adl_p_info = { };
#undef GEN + +#define XE_HP_PAGE_SIZES \ + .page_sizes = I915_GTT_PAGE_SIZE_4K | \ + I915_GTT_PAGE_SIZE_64K | \ + I915_GTT_PAGE_SIZE_2M + +#define XE_HP_FEATURES \ + .graphics_ver = 12, \ + .graphics_ver_release = 50, \ + XE_HP_PAGE_SIZES, \ + .dma_mask_size = 46, \ + .has_64bit_reloc = 1, \ + .has_global_mocs = 1, \ + .has_gt_uc = 1, \ + .has_llc = 1, \ + .has_logical_ring_contexts = 1, \ + .has_logical_ring_elsq = 1, \ + .has_rc6 = 1, \ + .has_reset_engine = 1, \ + .has_rps = 1, \ + .has_runtime_pm = 1, \ + .ppgtt_size = 48, \ + .ppgtt_type = INTEL_PPGTT_FULL + #undef PLATFORM
/*
From: Paulo Zanoni paulo.r.zanoni@intel.com
The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes.
Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 has version "12.10" and replace the has_master_unit_irq feature flag with an IP version test.
Bspec: 50875 Cc: Daniele Spurio Ceraolo daniele.ceraolospurio@intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Tomasz Lis tomasz.lis@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 2 - drivers/gpu/drm/i915/i915_irq.c | 139 +++++++++++++++-------- drivers/gpu/drm/i915/i915_pci.c | 2 +- drivers/gpu/drm/i915/i915_reg.h | 4 +- drivers/gpu/drm/i915/intel_device_info.h | 1 - 5 files changed, 95 insertions(+), 53 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9639800485b9..519cce702f4b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1601,8 +1601,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_LOGICAL_RING_ELSQ(dev_priv) \ (INTEL_INFO(dev_priv)->has_logical_ring_elsq)
-#define HAS_MASTER_UNIT_IRQ(dev_priv) (INTEL_INFO(dev_priv)->has_master_unit_irq) - #define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
#define INTEL_PPGTT(dev_priv) (INTEL_INFO(dev_priv)->ppgtt_type) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7d0ce8b9f8ed..9d47ffa39093 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2699,11 +2699,9 @@ gen11_display_irq_handler(struct drm_i915_private *i915) enable_rpm_wakeref_asserts(&i915->runtime_pm); }
-static __always_inline irqreturn_t -__gen11_irq_handler(struct drm_i915_private * const i915, - u32 (*intr_disable)(void __iomem * const regs), - void (*intr_enable)(void __iomem * const regs)) +static irqreturn_t gen11_irq_handler(int irq, void *arg) { + struct drm_i915_private *i915 = arg; void __iomem * const regs = i915->uncore.regs; struct intel_gt *gt = &i915->gt; u32 master_ctl; @@ -2712,9 +2710,9 @@ __gen11_irq_handler(struct drm_i915_private * const i915, if (!intel_irqs_enabled(i915)) return IRQ_NONE;
- master_ctl = intr_disable(regs); + master_ctl = gen11_master_intr_disable(regs); if (!master_ctl) { - intr_enable(regs); + gen11_master_intr_enable(regs); return IRQ_NONE; }
@@ -2727,7 +2725,7 @@ __gen11_irq_handler(struct drm_i915_private * const i915,
gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl);
- intr_enable(regs); + gen11_master_intr_enable(regs);
gen11_gu_misc_irq_handler(gt, gu_misc_iir);
@@ -2736,51 +2734,69 @@ __gen11_irq_handler(struct drm_i915_private * const i915, return IRQ_HANDLED; }
-static irqreturn_t gen11_irq_handler(int irq, void *arg) -{ - return __gen11_irq_handler(arg, - gen11_master_intr_disable, - gen11_master_intr_enable); -} - -static u32 dg1_master_intr_disable_and_ack(void __iomem * const regs) +static inline u32 dg1_master_intr_disable(void __iomem * const regs) { u32 val;
/* First disable interrupts */ - raw_reg_write(regs, DG1_MSTR_UNIT_INTR, 0); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, 0);
/* Get the indication levels and ack the master unit */ - val = raw_reg_read(regs, DG1_MSTR_UNIT_INTR); + val = raw_reg_read(regs, DG1_MSTR_TILE_INTR); if (unlikely(!val)) return 0;
- raw_reg_write(regs, DG1_MSTR_UNIT_INTR, val); - - /* - * Now with master disabled, get a sample of level indications - * for this interrupt and ack them right away - we keep GEN11_MASTER_IRQ - * out as this bit doesn't exist anymore for DG1 - */ - val = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ) & ~GEN11_MASTER_IRQ; - if (unlikely(!val)) - return 0; - - raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, val); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, val);
return val; }
static inline void dg1_master_intr_enable(void __iomem * const regs) { - raw_reg_write(regs, DG1_MSTR_UNIT_INTR, DG1_MSTR_IRQ); + raw_reg_write(regs, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ); }
static irqreturn_t dg1_irq_handler(int irq, void *arg) { - return __gen11_irq_handler(arg, - dg1_master_intr_disable_and_ack, - dg1_master_intr_enable); + struct drm_i915_private * const i915 = arg; + struct intel_gt *gt = &i915->gt; + void __iomem * const regs = i915->uncore.regs; + u32 master_tile_ctl, master_ctl; + u32 gu_misc_iir; + + if (!intel_irqs_enabled(i915)) + return IRQ_NONE; + + master_tile_ctl = dg1_master_intr_disable(regs); + if (!master_tile_ctl) { + dg1_master_intr_enable(regs); + return IRQ_NONE; + } + + /* FIXME: we only support tile 0 for now. */ + if (master_tile_ctl & DG1_MSTR_TILE(0)) { + master_ctl = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ); + raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, master_ctl); + } else { + DRM_ERROR("Tile not supported: 0x%08x\n", master_tile_ctl); + dg1_master_intr_enable(regs); + return IRQ_NONE; + } + + gen11_gt_irq_handler(gt, master_ctl); + + if (master_ctl & GEN11_DISPLAY_IRQ) + gen11_display_irq_handler(i915); + + gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl); + + dg1_master_intr_enable(regs); + + gen11_gu_misc_irq_handler(gt, gu_misc_iir); + + pmu_irq_stats(i915, IRQ_HANDLED); + + return IRQ_HANDLED; }
/* Called from drm generic code, passed 'crtc' which @@ -3168,10 +3184,20 @@ static void gen11_irq_reset(struct drm_i915_private *dev_priv) { struct intel_uncore *uncore = &dev_priv->uncore;
- if (HAS_MASTER_UNIT_IRQ(dev_priv)) - dg1_master_intr_disable_and_ack(dev_priv->uncore.regs); - else - gen11_master_intr_disable(dev_priv->uncore.regs); + gen11_master_intr_disable(dev_priv->uncore.regs); + + gen11_gt_irq_reset(&dev_priv->gt); + gen11_display_irq_reset(dev_priv); + + GEN3_IRQ_RESET(uncore, GEN11_GU_MISC_); + GEN3_IRQ_RESET(uncore, GEN8_PCU_); +} + +static void dg1_irq_reset(struct drm_i915_private *dev_priv) +{ + struct intel_uncore *uncore = &dev_priv->uncore; + + dg1_master_intr_disable(dev_priv->uncore.regs);
gen11_gt_irq_reset(&dev_priv->gt); gen11_display_irq_reset(dev_priv); @@ -3863,13 +3889,28 @@ static void gen11_irq_postinstall(struct drm_i915_private *dev_priv)
GEN3_IRQ_INIT(uncore, GEN11_GU_MISC_, ~gu_misc_masked, gu_misc_masked);
- if (HAS_MASTER_UNIT_IRQ(dev_priv)) { - dg1_master_intr_enable(uncore->regs); - intel_uncore_posting_read(&dev_priv->uncore, DG1_MSTR_UNIT_INTR); - } else { - gen11_master_intr_enable(uncore->regs); - intel_uncore_posting_read(&dev_priv->uncore, GEN11_GFX_MSTR_IRQ); + gen11_master_intr_enable(uncore->regs); + intel_uncore_posting_read(&dev_priv->uncore, GEN11_GFX_MSTR_IRQ); +} + +static void dg1_irq_postinstall(struct drm_i915_private *dev_priv) +{ + struct intel_uncore *uncore = &dev_priv->uncore; + u32 gu_misc_masked = GEN11_GU_MISC_GSE; + + gen11_gt_irq_postinstall(&dev_priv->gt); + + GEN3_IRQ_INIT(uncore, GEN11_GU_MISC_, ~gu_misc_masked, gu_misc_masked); + + if (HAS_DISPLAY(dev_priv)) { + icp_irq_postinstall(dev_priv); + gen8_de_irq_postinstall(dev_priv); + intel_uncore_write(&dev_priv->uncore, GEN11_DISPLAY_INT_CTL, + GEN11_DISPLAY_IRQ_ENABLE); } + + dg1_master_intr_enable(dev_priv->uncore.regs); + intel_uncore_posting_read(&dev_priv->uncore, DG1_MSTR_TILE_INTR); }
static void cherryview_irq_postinstall(struct drm_i915_private *dev_priv) @@ -4408,9 +4449,9 @@ static irq_handler_t intel_irq_handler(struct drm_i915_private *dev_priv) else return i8xx_irq_handler; } else { - if (HAS_MASTER_UNIT_IRQ(dev_priv)) + if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10)) return dg1_irq_handler; - if (GRAPHICS_VER(dev_priv) >= 11) + else if (GRAPHICS_VER(dev_priv) >= 11) return gen11_irq_handler; else if (GRAPHICS_VER(dev_priv) >= 8) return gen8_irq_handler; @@ -4433,7 +4474,9 @@ static void intel_irq_reset(struct drm_i915_private *dev_priv) else i8xx_irq_reset(dev_priv); } else { - if (GRAPHICS_VER(dev_priv) >= 11) + if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10)) + dg1_irq_reset(dev_priv); + else if (GRAPHICS_VER(dev_priv) >= 11) gen11_irq_reset(dev_priv); else if (GRAPHICS_VER(dev_priv) >= 8) gen8_irq_reset(dev_priv); @@ -4456,7 +4499,9 @@ static void intel_irq_postinstall(struct drm_i915_private *dev_priv) else i8xx_irq_postinstall(dev_priv); } else { - if (GRAPHICS_VER(dev_priv) >= 11) + if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10)) + dg1_irq_postinstall(dev_priv); + else if (GRAPHICS_VER(dev_priv) >= 11) gen11_irq_postinstall(dev_priv); else if (GRAPHICS_VER(dev_priv) >= 8) gen8_irq_postinstall(dev_priv); diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index dc0883bad9cf..9684b647fd04 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -907,7 +907,6 @@ static const struct intel_device_info rkl_info = {
#define DGFX_FEATURES \ .memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \ - .has_master_unit_irq = 1, \ .has_llc = 0, \ .has_snoop = 1, \ .is_dgfx = 1 @@ -915,6 +914,7 @@ static const struct intel_device_info rkl_info = { static const struct intel_device_info dg1_info __maybe_unused = { GEN12_FEATURES, DGFX_FEATURES, + .graphics_ver_release = 10, PLATFORM(INTEL_DG1), .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), .require_force_probe = 1, diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 16a19239d86d..f7dad8541417 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7985,9 +7985,9 @@ enum { #define GEN11_GT_DW1_IRQ (1 << 1) #define GEN11_GT_DW0_IRQ (1 << 0)
-#define DG1_MSTR_UNIT_INTR _MMIO(0x190008) +#define DG1_MSTR_TILE_INTR _MMIO(0x190008) #define DG1_MSTR_IRQ REG_BIT(31) -#define DG1_MSTR_UNIT(u) REG_BIT(u) +#define DG1_MSTR_TILE(t) REG_BIT(t)
#define GEN11_DISPLAY_INT_CTL _MMIO(0x44200) #define GEN11_DISPLAY_IRQ_ENABLE (1 << 31) diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 944a5ff4df49..b00249906c28 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -127,7 +127,6 @@ enum intel_ppgtt_type { func(has_llc); \ func(has_logical_ring_contexts); \ func(has_logical_ring_elsq); \ - func(has_master_unit_irq); \ func(has_pooled_eu); \ func(has_rc6); \ func(has_rc6p); \
On Thu, Jul 1, 2021 at 10:26 PM Matt Roper matthew.d.roper@intel.com wrote:
From: Paulo Zanoni paulo.r.zanoni@intel.com
The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes.
Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 has version "12.10" and replace the has_master_unit_irq feature flag with an IP version test.
Bspec: 50875 Cc: Daniele Spurio Ceraolo daniele.ceraolospurio@intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Tomasz Lis tomasz.lis@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
So I know DG1 upstream is decidedly non-smooth, but basic infrastructure we've had since forever ...
Why was this prep work not upstreamed earlier with some benign commit message that doesn't mention DG2? There's zero DG2 stuff in here, this could have landed months/years ago even. Bringing this up since right this moment we have an internal chat about trees diverging a bit much. -Daniel
drivers/gpu/drm/i915/i915_drv.h | 2 - drivers/gpu/drm/i915/i915_irq.c | 139 +++++++++++++++-------- drivers/gpu/drm/i915/i915_pci.c | 2 +- drivers/gpu/drm/i915/i915_reg.h | 4 +- drivers/gpu/drm/i915/intel_device_info.h | 1 - 5 files changed, 95 insertions(+), 53 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 9639800485b9..519cce702f4b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1601,8 +1601,6 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_LOGICAL_RING_ELSQ(dev_priv) \ (INTEL_INFO(dev_priv)->has_logical_ring_elsq)
-#define HAS_MASTER_UNIT_IRQ(dev_priv) (INTEL_INFO(dev_priv)->has_master_unit_irq)
#define HAS_EXECLISTS(dev_priv) HAS_LOGICAL_RING_CONTEXTS(dev_priv)
#define INTEL_PPGTT(dev_priv) (INTEL_INFO(dev_priv)->ppgtt_type) diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7d0ce8b9f8ed..9d47ffa39093 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -2699,11 +2699,9 @@ gen11_display_irq_handler(struct drm_i915_private *i915) enable_rpm_wakeref_asserts(&i915->runtime_pm); }
-static __always_inline irqreturn_t -__gen11_irq_handler(struct drm_i915_private * const i915,
u32 (*intr_disable)(void __iomem * const regs),
void (*intr_enable)(void __iomem * const regs))
+static irqreturn_t gen11_irq_handler(int irq, void *arg) {
struct drm_i915_private *i915 = arg; void __iomem * const regs = i915->uncore.regs; struct intel_gt *gt = &i915->gt; u32 master_ctl;
@@ -2712,9 +2710,9 @@ __gen11_irq_handler(struct drm_i915_private * const i915, if (!intel_irqs_enabled(i915)) return IRQ_NONE;
master_ctl = intr_disable(regs);
master_ctl = gen11_master_intr_disable(regs); if (!master_ctl) {
intr_enable(regs);
gen11_master_intr_enable(regs); return IRQ_NONE; }
@@ -2727,7 +2725,7 @@ __gen11_irq_handler(struct drm_i915_private * const i915,
gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl);
intr_enable(regs);
gen11_master_intr_enable(regs); gen11_gu_misc_irq_handler(gt, gu_misc_iir);
@@ -2736,51 +2734,69 @@ __gen11_irq_handler(struct drm_i915_private * const i915, return IRQ_HANDLED; }
-static irqreturn_t gen11_irq_handler(int irq, void *arg) -{
return __gen11_irq_handler(arg,
gen11_master_intr_disable,
gen11_master_intr_enable);
-}
-static u32 dg1_master_intr_disable_and_ack(void __iomem * const regs) +static inline u32 dg1_master_intr_disable(void __iomem * const regs) { u32 val;
/* First disable interrupts */
raw_reg_write(regs, DG1_MSTR_UNIT_INTR, 0);
raw_reg_write(regs, DG1_MSTR_TILE_INTR, 0); /* Get the indication levels and ack the master unit */
val = raw_reg_read(regs, DG1_MSTR_UNIT_INTR);
val = raw_reg_read(regs, DG1_MSTR_TILE_INTR); if (unlikely(!val)) return 0;
raw_reg_write(regs, DG1_MSTR_UNIT_INTR, val);
/*
* Now with master disabled, get a sample of level indications
* for this interrupt and ack them right away - we keep GEN11_MASTER_IRQ
* out as this bit doesn't exist anymore for DG1
*/
val = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ) & ~GEN11_MASTER_IRQ;
if (unlikely(!val))
return 0;
raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, val);
raw_reg_write(regs, DG1_MSTR_TILE_INTR, val); return val;
}
static inline void dg1_master_intr_enable(void __iomem * const regs) {
raw_reg_write(regs, DG1_MSTR_UNIT_INTR, DG1_MSTR_IRQ);
raw_reg_write(regs, DG1_MSTR_TILE_INTR, DG1_MSTR_IRQ);
}
static irqreturn_t dg1_irq_handler(int irq, void *arg) {
return __gen11_irq_handler(arg,
dg1_master_intr_disable_and_ack,
dg1_master_intr_enable);
struct drm_i915_private * const i915 = arg;
struct intel_gt *gt = &i915->gt;
void __iomem * const regs = i915->uncore.regs;
u32 master_tile_ctl, master_ctl;
u32 gu_misc_iir;
if (!intel_irqs_enabled(i915))
return IRQ_NONE;
master_tile_ctl = dg1_master_intr_disable(regs);
if (!master_tile_ctl) {
dg1_master_intr_enable(regs);
return IRQ_NONE;
}
/* FIXME: we only support tile 0 for now. */
if (master_tile_ctl & DG1_MSTR_TILE(0)) {
master_ctl = raw_reg_read(regs, GEN11_GFX_MSTR_IRQ);
raw_reg_write(regs, GEN11_GFX_MSTR_IRQ, master_ctl);
} else {
DRM_ERROR("Tile not supported: 0x%08x\n", master_tile_ctl);
dg1_master_intr_enable(regs);
return IRQ_NONE;
}
gen11_gt_irq_handler(gt, master_ctl);
if (master_ctl & GEN11_DISPLAY_IRQ)
gen11_display_irq_handler(i915);
gu_misc_iir = gen11_gu_misc_irq_ack(gt, master_ctl);
dg1_master_intr_enable(regs);
gen11_gu_misc_irq_handler(gt, gu_misc_iir);
pmu_irq_stats(i915, IRQ_HANDLED);
return IRQ_HANDLED;
}
/* Called from drm generic code, passed 'crtc' which @@ -3168,10 +3184,20 @@ static void gen11_irq_reset(struct drm_i915_private *dev_priv) { struct intel_uncore *uncore = &dev_priv->uncore;
if (HAS_MASTER_UNIT_IRQ(dev_priv))
dg1_master_intr_disable_and_ack(dev_priv->uncore.regs);
else
gen11_master_intr_disable(dev_priv->uncore.regs);
gen11_master_intr_disable(dev_priv->uncore.regs);
gen11_gt_irq_reset(&dev_priv->gt);
gen11_display_irq_reset(dev_priv);
GEN3_IRQ_RESET(uncore, GEN11_GU_MISC_);
GEN3_IRQ_RESET(uncore, GEN8_PCU_);
+}
+static void dg1_irq_reset(struct drm_i915_private *dev_priv) +{
struct intel_uncore *uncore = &dev_priv->uncore;
dg1_master_intr_disable(dev_priv->uncore.regs); gen11_gt_irq_reset(&dev_priv->gt); gen11_display_irq_reset(dev_priv);
@@ -3863,13 +3889,28 @@ static void gen11_irq_postinstall(struct drm_i915_private *dev_priv)
GEN3_IRQ_INIT(uncore, GEN11_GU_MISC_, ~gu_misc_masked, gu_misc_masked);
if (HAS_MASTER_UNIT_IRQ(dev_priv)) {
dg1_master_intr_enable(uncore->regs);
intel_uncore_posting_read(&dev_priv->uncore, DG1_MSTR_UNIT_INTR);
} else {
gen11_master_intr_enable(uncore->regs);
intel_uncore_posting_read(&dev_priv->uncore, GEN11_GFX_MSTR_IRQ);
gen11_master_intr_enable(uncore->regs);
intel_uncore_posting_read(&dev_priv->uncore, GEN11_GFX_MSTR_IRQ);
+}
+static void dg1_irq_postinstall(struct drm_i915_private *dev_priv) +{
struct intel_uncore *uncore = &dev_priv->uncore;
u32 gu_misc_masked = GEN11_GU_MISC_GSE;
gen11_gt_irq_postinstall(&dev_priv->gt);
GEN3_IRQ_INIT(uncore, GEN11_GU_MISC_, ~gu_misc_masked, gu_misc_masked);
if (HAS_DISPLAY(dev_priv)) {
icp_irq_postinstall(dev_priv);
gen8_de_irq_postinstall(dev_priv);
intel_uncore_write(&dev_priv->uncore, GEN11_DISPLAY_INT_CTL,
GEN11_DISPLAY_IRQ_ENABLE); }
dg1_master_intr_enable(dev_priv->uncore.regs);
intel_uncore_posting_read(&dev_priv->uncore, DG1_MSTR_TILE_INTR);
}
static void cherryview_irq_postinstall(struct drm_i915_private *dev_priv) @@ -4408,9 +4449,9 @@ static irq_handler_t intel_irq_handler(struct drm_i915_private *dev_priv) else return i8xx_irq_handler; } else {
if (HAS_MASTER_UNIT_IRQ(dev_priv))
if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10)) return dg1_irq_handler;
if (GRAPHICS_VER(dev_priv) >= 11)
else if (GRAPHICS_VER(dev_priv) >= 11) return gen11_irq_handler; else if (GRAPHICS_VER(dev_priv) >= 8) return gen8_irq_handler;
@@ -4433,7 +4474,9 @@ static void intel_irq_reset(struct drm_i915_private *dev_priv) else i8xx_irq_reset(dev_priv); } else {
if (GRAPHICS_VER(dev_priv) >= 11)
if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10))
dg1_irq_reset(dev_priv);
else if (GRAPHICS_VER(dev_priv) >= 11) gen11_irq_reset(dev_priv); else if (GRAPHICS_VER(dev_priv) >= 8) gen8_irq_reset(dev_priv);
@@ -4456,7 +4499,9 @@ static void intel_irq_postinstall(struct drm_i915_private *dev_priv) else i8xx_irq_postinstall(dev_priv); } else {
if (GRAPHICS_VER(dev_priv) >= 11)
if (GRAPHICS_VER_FULL(dev_priv) >= IP_VER(12, 10))
dg1_irq_postinstall(dev_priv);
else if (GRAPHICS_VER(dev_priv) >= 11) gen11_irq_postinstall(dev_priv); else if (GRAPHICS_VER(dev_priv) >= 8) gen8_irq_postinstall(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index dc0883bad9cf..9684b647fd04 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -907,7 +907,6 @@ static const struct intel_device_info rkl_info = {
#define DGFX_FEATURES \ .memory_regions = REGION_SMEM | REGION_LMEM | REGION_STOLEN_LMEM, \
.has_master_unit_irq = 1, \ .has_llc = 0, \ .has_snoop = 1, \ .is_dgfx = 1
@@ -915,6 +914,7 @@ static const struct intel_device_info rkl_info = { static const struct intel_device_info dg1_info __maybe_unused = { GEN12_FEATURES, DGFX_FEATURES,
.graphics_ver_release = 10, PLATFORM(INTEL_DG1), .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), .require_force_probe = 1,
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 16a19239d86d..f7dad8541417 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -7985,9 +7985,9 @@ enum { #define GEN11_GT_DW1_IRQ (1 << 1) #define GEN11_GT_DW0_IRQ (1 << 0)
-#define DG1_MSTR_UNIT_INTR _MMIO(0x190008) +#define DG1_MSTR_TILE_INTR _MMIO(0x190008) #define DG1_MSTR_IRQ REG_BIT(31) -#define DG1_MSTR_UNIT(u) REG_BIT(u) +#define DG1_MSTR_TILE(t) REG_BIT(t)
#define GEN11_DISPLAY_INT_CTL _MMIO(0x44200) #define GEN11_DISPLAY_IRQ_ENABLE (1 << 31) diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 944a5ff4df49..b00249906c28 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -127,7 +127,6 @@ enum intel_ppgtt_type { func(has_llc); \ func(has_logical_ring_contexts); \ func(has_logical_ring_elsq); \
func(has_master_unit_irq); \ func(has_pooled_eu); \ func(has_rc6); \ func(has_rc6p); \
-- 2.25.4
On Fri, Jul 02, 2021 at 11:21:10AM +0200, Daniel Vetter wrote:
On Thu, Jul 1, 2021 at 10:26 PM Matt Roper matthew.d.roper@intel.com wrote:
From: Paulo Zanoni paulo.r.zanoni@intel.com
The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes.
Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 has version "12.10" and replace the has_master_unit_irq feature flag with an IP version test.
Bspec: 50875 Cc: Daniele Spurio Ceraolo daniele.ceraolospurio@intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Tomasz Lis tomasz.lis@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
So I know DG1 upstream is decidedly non-smooth, but basic infrastructure we've had since forever ...
Why was this prep work not upstreamed earlier with some benign commit message that doesn't mention DG2? There's zero DG2 stuff in here, this could have landed months/years ago even. Bringing this up since right this moment we have an internal chat about trees diverging a bit much.
history isn't linear and this commit, the way it is now, didn't exist 1 month ago, so your timescale is misleading. has_master_unit_irq was what we thought we would need to share as much code as possible.
The biggest reason to fork the irq handler is actually not DG1 nor DG2, but XEHPSDV and without those changes it would basically be a 95% copy of the gen11 handler... for someone not looking to what is in the pipeline, it can be a perfect argument to "consolidate these into a single handler".
Lucas De Marchi
On Wed, Jul 7, 2021 at 12:48 AM Lucas De Marchi lucas.demarchi@intel.com wrote:
On Fri, Jul 02, 2021 at 11:21:10AM +0200, Daniel Vetter wrote:
On Thu, Jul 1, 2021 at 10:26 PM Matt Roper matthew.d.roper@intel.com wrote:
From: Paulo Zanoni paulo.r.zanoni@intel.com
The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes.
Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 has version "12.10" and replace the has_master_unit_irq feature flag with an IP version test.
Bspec: 50875 Cc: Daniele Spurio Ceraolo daniele.ceraolospurio@intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Tomasz Lis tomasz.lis@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
So I know DG1 upstream is decidedly non-smooth, but basic infrastructure we've had since forever ...
Why was this prep work not upstreamed earlier with some benign commit message that doesn't mention DG2? There's zero DG2 stuff in here, this could have landed months/years ago even. Bringing this up since right this moment we have an internal chat about trees diverging a bit much.
history isn't linear and this commit, the way it is now, didn't exist 1 month ago, so your timescale is misleading. has_master_unit_irq was what we thought we would need to share as much code as possible.
The biggest reason to fork the irq handler is actually not DG1 nor DG2, but XEHPSDV and without those changes it would basically be a 95% copy of the gen11 handler... for someone not looking to what is in the pipeline, it can be a perfect argument to "consolidate these into a single handler".
At least in the past we've done tons of upstream refactor prep for exactly just these "prep for future platform" reasons. Everyone understand that's necessary and generally trusts us we're not just moving code for fun. But then 1-2 years ago we just kinda stopped pushing prep work to upstream because everyone got way too busy with other things, and now we're paying the price.
I mean even if the reason to fork it is a platform we can't even talk about, the fork patch should go upstream way ahead so that there's less patches in internal. Ideally platform enabling is zero code shuffling, 100% just plugging code into existing neat places. -Daniel
On Wed, Jul 07, 2021 at 09:39:03AM +0200, Daniel Vetter wrote:
On Wed, Jul 7, 2021 at 12:48 AM Lucas De Marchi lucas.demarchi@intel.com wrote:
On Fri, Jul 02, 2021 at 11:21:10AM +0200, Daniel Vetter wrote:
On Thu, Jul 1, 2021 at 10:26 PM Matt Roper matthew.d.roper@intel.com wrote:
From: Paulo Zanoni paulo.r.zanoni@intel.com
The current interrupt handler is getting increasingly complicated and Xe_HP changes will bring even more complexity. Let's split off a new interrupt handler starting with DG1 (i.e., when the master tile interrupt register was added to the design) and use that as the basis for the new Xe_HP changes.
Now that we track the hardware IP's release number as well as the version number, we can also properly define DG1 has version "12.10" and replace the has_master_unit_irq feature flag with an IP version test.
Bspec: 50875 Cc: Daniele Spurio Ceraolo daniele.ceraolospurio@intel.com Cc: Stuart Summers stuart.summers@intel.com Signed-off-by: Paulo Zanoni paulo.r.zanoni@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Tomasz Lis tomasz.lis@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
So I know DG1 upstream is decidedly non-smooth, but basic infrastructure we've had since forever ...
Why was this prep work not upstreamed earlier with some benign commit message that doesn't mention DG2? There's zero DG2 stuff in here, this could have landed months/years ago even. Bringing this up since right this moment we have an internal chat about trees diverging a bit much.
history isn't linear and this commit, the way it is now, didn't exist 1 month ago, so your timescale is misleading. has_master_unit_irq was what we thought we would need to share as much code as possible.
The biggest reason to fork the irq handler is actually not DG1 nor DG2, but XEHPSDV and without those changes it would basically be a 95% copy of the gen11 handler... for someone not looking to what is in the pipeline, it can be a perfect argument to "consolidate these into a single handler".
At least in the past we've done tons of upstream refactor prep for exactly just these "prep for future platform" reasons. Everyone understand that's necessary and generally trusts us we're not just moving code for fun. But then 1-2 years ago we just kinda stopped pushing prep work to upstream because everyone got way too busy with other things, and now we're paying the price.
we both know this is not the only reason and looking to the people in Cc here my impression is you're preaching to the choir. Because the people in Cc here either moved to other teams before there was something working related to irq to share or continued to do prep work in upstream as much as they can.
Lucas De Marchi
I mean even if the reason to fork it is a platform we can't even talk about, the fork patch should go upstream way ahead so that there's less patches in internal. Ideally platform enabling is zero code shuffling, 100% just plugging code into existing neat places.
-Daniel
Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
On Xe_HP the fusing register is renamed and changed to have the "enable" semantics, but otherwise remains compatible (mmio address, bitmask ranges) with older platforms.
To simplify things we do not add a new register definition but just stop inverting the fusing masks before processing them.
Bspec: 33288 Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 88694822716a..151870d8fdd3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) if (GRAPHICS_VER(i915) < 11) return info->engine_mask;
- media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE); + /* + * On newer platforms the fusing register is called 'enable' and has + * enable semantics, while on older platforms it is called 'disable' + * and bits have disable semantices. + */ + media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE); + if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) + media_fuse = ~media_fuse;
vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK; vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
On Thu, Jul 01, 2021 at 01:23:38PM -0700, Matt Roper wrote:
From: Tvrtko Ursulin tvrtko.ursulin@intel.com
On Xe_HP the fusing register is renamed and changed to have the "enable" semantics, but otherwise remains compatible (mmio address, bitmask ranges) with older platforms.
To simplify things we do not add a new register definition but just stop inverting the fusing masks before processing them.
Bspec: 33288
This is now:
Bspec: 52615
Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
this change above,
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 88694822716a..151870d8fdd3 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -468,7 +468,14 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) if (GRAPHICS_VER(i915) < 11) return info->engine_mask;
- media_fuse = ~intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
/*
* On newer platforms the fusing register is called 'enable' and has
* enable semantics, while on older platforms it is called 'disable'
* and bits have disable semantices.
*/
media_fuse = intel_uncore_read(uncore, GEN11_GT_VEBOX_VDBOX_DISABLE);
if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
media_fuse = ~media_fuse;
vdbox_mask = media_fuse & GEN11_GT_VDBOX_DISABLE_MASK; vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
-- 2.25.4
From: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
In Gen12 there are various fuse combinations and in each configuration vdbox engine may be connected to SFC depending on which engines are available, so we need to set the SFC capability based on fuse value from the hardware. Even numbered phyical instance always have SFC, odd numbered physical instances have SFC only if previous even instance is fused off.
Bspec: 48028 Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++++++++++++++++++----- 1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 151870d8fdd3..4ab2c9abb943 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt) } }
+static inline +bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox, + unsigned int logical_vdbox, u16 vdbox_mask) +{ + /* + * In Gen11, only even numbered logical VDBOXes are hooked + * up to an SFC (Scaler & Format Converter) unit. + * In Gen12, Even numbered phyical instance always are connected + * to an SFC. Odd numbered physical instances have SFC only if + * previous even instance is fused off. + */ + if (GRAPHICS_VER(i915) == 12) { + return (physical_vdbox % 2 == 0) || + !(BIT(physical_vdbox - 1) & vdbox_mask); + } else if (GRAPHICS_VER(i915) == 11) { + return logical_vdbox % 2 == 0; + } + + MISSING_CASE(GRAPHICS_VER(i915)); + return false; +} + /* * Determine which engines are fused off in our particular hardware. * Note that we have a catch-22 situation where we need to be able to access @@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) continue; }
- /* - * In Gen11, only even numbered logical VDBOXes are - * hooked up to an SFC (Scaler & Format Converter) unit. - * In TGL each VDBOX has access to an SFC. - */ - if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0) + if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask)) gt->info.vdbox_sfc_access |= BIT(i); + logical_vdbox++; } drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n", vdbox_mask, VDBOX_MASK(gt));
On Thu, Jul 01, 2021 at 01:23:39PM -0700, Matt Roper wrote:
From: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
In Gen12 there are various fuse combinations and in each configuration vdbox engine may be connected to SFC depending on which engines are available, so we need to set the SFC capability based on fuse value from the hardware. Even numbered phyical instance always have SFC, odd numbered physical instances have SFC only if previous even instance is fused off.
Bspec: 48028
considering that in TGL we have physical instances 0 and 2 (both even), we can use this logic, so it's correct correct for GRAPHICS_VER(i915) == 12. Although I wonder ifwe should be using MEDIA_VER(i915) here.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++++++++++++++++++----- 1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 151870d8fdd3..4ab2c9abb943 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt) } }
+static inline +bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
unsigned int logical_vdbox, u16 vdbox_mask)
+{
- /*
* In Gen11, only even numbered logical VDBOXes are hooked
* up to an SFC (Scaler & Format Converter) unit.
* In Gen12, Even numbered phyical instance always are connected
* to an SFC. Odd numbered physical instances have SFC only if
* previous even instance is fused off.
*/
- if (GRAPHICS_VER(i915) == 12) {
return (physical_vdbox % 2 == 0) ||
!(BIT(physical_vdbox - 1) & vdbox_mask);
- } else if (GRAPHICS_VER(i915) == 11) {
return logical_vdbox % 2 == 0;
- }
- MISSING_CASE(GRAPHICS_VER(i915));
- return false;
+}
/*
- Determine which engines are fused off in our particular hardware.
- Note that we have a catch-22 situation where we need to be able to access
@@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) continue; }
/*
* In Gen11, only even numbered logical VDBOXes are
* hooked up to an SFC (Scaler & Format Converter) unit.
* In TGL each VDBOX has access to an SFC.
*/
if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask)) gt->info.vdbox_sfc_access |= BIT(i);
} drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n", vdbox_mask, VDBOX_MASK(gt));logical_vdbox++;
-- 2.25.4
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 01/07/2021 21:23, Matt Roper wrote:
From: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com
In Gen12 there are various fuse combinations and in each configuration vdbox engine may be connected to SFC depending on which engines are available, so we need to set the SFC capability based on fuse value from the hardware. Even numbered phyical instance always have SFC, odd
physical
numbered physical instances have SFC only if previous even instance is fused off.
Just a few nits.
Bspec: 48028 Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Venkata Sandeep Dhanalakota venkata.s.dhanalakota@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 30 ++++++++++++++++++----- 1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 151870d8fdd3..4ab2c9abb943 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -442,6 +442,28 @@ void intel_engines_free(struct intel_gt *gt) } }
+static inline
Inline is not desired here.
+bool vdbox_has_sfc(struct drm_i915_private *i915, unsigned int physical_vdbox,
unsigned int logical_vdbox, u16 vdbox_mask)
+{
I'd be tempted to prefix the function name with gen11_ so it is clearer it does not apply to earlier gens. Because if looking just at the diff out of context below, one can wonder if there is a functional change or not. There isn't, because there is a bailout for gen < 11 early in init_engine_mask(), but perhaps gen11 function name prefix would make this a bit more self-documenting.
- /*
* In Gen11, only even numbered logical VDBOXes are hooked
* up to an SFC (Scaler & Format Converter) unit.
* In Gen12, Even numbered phyical instance always are connected
physical
* to an SFC. Odd numbered physical instances have SFC only if
* previous even instance is fused off.
*/
- if (GRAPHICS_VER(i915) == 12) {
return (physical_vdbox % 2 == 0) ||
!(BIT(physical_vdbox - 1) & vdbox_mask);
- } else if (GRAPHICS_VER(i915) == 11) {
return logical_vdbox % 2 == 0;
- }
Not need for curlies on these branches.
- MISSING_CASE(GRAPHICS_VER(i915));
- return false;
+}
- /*
- Determine which engines are fused off in our particular hardware.
- Note that we have a catch-22 situation where we need to be able to access
@@ -493,13 +515,9 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt) continue; }
/*
* In Gen11, only even numbered logical VDBOXes are
* hooked up to an SFC (Scaler & Format Converter) unit.
* In TGL each VDBOX has access to an SFC.
*/
if (GRAPHICS_VER(i915) >= 12 || logical_vdbox++ % 2 == 0)
if (vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask)) gt->info.vdbox_sfc_access |= BIT(i);
} drm_dbg(&i915->drm, "vdbox enable: %04x, instances: %04lx\n", vdbox_mask, VDBOX_MASK(gt));logical_vdbox++;
Regards,
Tvrtko
From: John Harrison John.C.Harrison@Intel.com
Increasing the engine count causes a couple of local array variables to exceed the kernel stack limit. So make them dynamic allocations instead.
Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/selftest_execlists.c | 10 ++++-- .../gpu/drm/i915/gt/selftest_workarounds.c | 32 ++++++++++++------- 2 files changed, 29 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 08896ae027d5..1e7fe2222479 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) #define BATCH BIT(0) { struct task_struct *tsk[I915_NUM_ENGINES] = {}; - struct preempt_smoke arg[I915_NUM_ENGINES]; + struct preempt_smoke *arg; struct intel_engine_cs *engine; enum intel_engine_id id; unsigned long count; int err = 0;
+ arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL); + if (!arg) + return -ENOMEM; + for_each_engine(engine, smoke->gt, id) { arg[id] = *smoke; arg[id].engine = engine; @@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) arg[id].batch = NULL; arg[id].count = 0;
- tsk[id] = kthread_run(smoke_crescendo_thread, &arg, + tsk[id] = kthread_run(smoke_crescendo_thread, arg, "igt/smoke:%d", id); if (IS_ERR(tsk[id])) { err = PTR_ERR(tsk[id]); @@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n", count, flags, smoke->gt->info.num_engines, smoke->ncontext); + + kfree(arg); return 0; }
diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c b/drivers/gpu/drm/i915/gt/selftest_workarounds.c index 7ebc4edb8ecf..7a38ce40feb2 100644 --- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c +++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c @@ -1175,31 +1175,36 @@ live_gpu_reset_workarounds(void *arg) { struct intel_gt *gt = arg; intel_wakeref_t wakeref; - struct wa_lists lists; + struct wa_lists *lists; bool ok;
if (!intel_has_gpu_reset(gt)) return 0;
+ lists = kzalloc(sizeof(*lists), GFP_KERNEL); + if (!lists) + return -ENOMEM; + pr_info("Verifying after GPU reset...\n");
igt_global_reset_lock(gt); wakeref = intel_runtime_pm_get(gt->uncore->rpm);
- reference_lists_init(gt, &lists); + reference_lists_init(gt, lists);
- ok = verify_wa_lists(gt, &lists, "before reset"); + ok = verify_wa_lists(gt, lists, "before reset"); if (!ok) goto out;
intel_gt_reset(gt, ALL_ENGINES, "live_workarounds");
- ok = verify_wa_lists(gt, &lists, "after reset"); + ok = verify_wa_lists(gt, lists, "after reset");
out: - reference_lists_fini(gt, &lists); + reference_lists_fini(gt, lists); intel_runtime_pm_put(gt->uncore->rpm, wakeref); igt_global_reset_unlock(gt); + kfree(lists);
return ok ? 0 : -ESRCH; } @@ -1214,16 +1219,20 @@ live_engine_reset_workarounds(void *arg) struct igt_spinner spin; struct i915_request *rq; intel_wakeref_t wakeref; - struct wa_lists lists; + struct wa_lists *lists; int ret = 0;
if (!intel_has_reset_engine(gt)) return 0;
+ lists = kzalloc(sizeof(*lists), GFP_KERNEL); + if (!lists) + return -ENOMEM; + igt_global_reset_lock(gt); wakeref = intel_runtime_pm_get(gt->uncore->rpm);
- reference_lists_init(gt, &lists); + reference_lists_init(gt, lists);
for_each_engine(engine, gt, id) { bool ok; @@ -1235,7 +1244,7 @@ live_engine_reset_workarounds(void *arg) break; }
- ok = verify_wa_lists(gt, &lists, "before reset"); + ok = verify_wa_lists(gt, lists, "before reset"); if (!ok) { ret = -ESRCH; goto err; @@ -1247,7 +1256,7 @@ live_engine_reset_workarounds(void *arg) goto err; }
- ok = verify_wa_lists(gt, &lists, "after idle reset"); + ok = verify_wa_lists(gt, lists, "after idle reset"); if (!ok) { ret = -ESRCH; goto err; @@ -1282,7 +1291,7 @@ live_engine_reset_workarounds(void *arg) igt_spinner_end(&spin); igt_spinner_fini(&spin);
- ok = verify_wa_lists(gt, &lists, "after busy reset"); + ok = verify_wa_lists(gt, lists, "after busy reset"); if (!ok) { ret = -ESRCH; goto err; @@ -1294,9 +1303,10 @@ live_engine_reset_workarounds(void *arg) break; }
- reference_lists_fini(gt, &lists); + reference_lists_fini(gt, lists); intel_runtime_pm_put(gt->uncore->rpm, wakeref); igt_global_reset_unlock(gt); + kfree(lists);
igt_flush_test(gt->i915);
On Thu, Jul 01, 2021 at 01:23:40PM -0700, Matt Roper wrote:
From: John Harrison John.C.Harrison@Intel.com
Increasing the engine count causes a couple of local array variables to exceed the kernel stack limit. So make them dynamic allocations instead.
Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/selftest_execlists.c | 10 ++++-- .../gpu/drm/i915/gt/selftest_workarounds.c | 32 ++++++++++++------- 2 files changed, 29 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 08896ae027d5..1e7fe2222479 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -3561,12 +3561,16 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) #define BATCH BIT(0) { struct task_struct *tsk[I915_NUM_ENGINES] = {};
- struct preempt_smoke arg[I915_NUM_ENGINES];
struct preempt_smoke *arg; struct intel_engine_cs *engine; enum intel_engine_id id; unsigned long count; int err = 0;
arg = kmalloc_array(I915_NUM_ENGINES, sizeof(*arg), GFP_KERNEL);
if (!arg)
return -ENOMEM;
for_each_engine(engine, smoke->gt, id) { arg[id] = *smoke; arg[id].engine = engine;
@@ -3574,7 +3578,7 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags) arg[id].batch = NULL; arg[id].count = 0;
tsk[id] = kthread_run(smoke_crescendo_thread, &arg,
if (IS_ERR(tsk[id])) { err = PTR_ERR(tsk[id]);tsk[id] = kthread_run(smoke_crescendo_thread, arg, "igt/smoke:%d", id);
@@ -3603,6 +3607,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
pr_info("Submitted %lu crescendo:%x requests across %d engines and %d contexts\n", count, flags, smoke->gt->info.num_engines, smoke->ncontext);
- kfree(arg); return 0;
this looks correctly, but apparently this test doesn't test anything as `err` is write-only - there is only one read, but basically to avoid overriding an earlier error.
looks like this should be `return err;` ? +Chris
This patch itself looks good.
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the basic definitions for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 ++- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 50 ++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 ++++-- drivers/gpu/drm/i915/i915_reg.h | 6 +++ 4 files changed, 69 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 87b06572fd2e..35edc55720f4 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) if (mode & EMIT_INVALIDATE) aux_inv = rq->engine->mask & ~BIT(BCS0); if (aux_inv) - cmd += 2 * hweight8(aux_inv) + 2; + cmd += 2 * hweight32(aux_inv) + 2;
cs = intel_ring_begin(rq, cmd); if (IS_ERR(cs)) @@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) struct intel_engine_cs *engine; unsigned int tmp;
- *cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv)); - for_each_engine_masked(engine, rq->engine->gt, - aux_inv, tmp) { + *cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv)); + for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) { *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine)); *cs++ = AUX_INV; } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4ab2c9abb943..6e2aa1acc4d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE } }, }, + [VCS4] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 4, + .mmio_bases = { + { .graphics_ver = 11, .base = XEHP_BSD5_RING_BASE } + }, + }, + [VCS5] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 5, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE } + }, + }, + [VCS6] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 6, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE } + }, + }, + [VCS7] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_DECODE_CLASS, + .instance = 7, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE } + }, + }, [VECS0] = { .hw_id = VECS0_HW, .class = VIDEO_ENHANCEMENT_CLASS, @@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE } }, }, + [VECS2] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 2, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE } + }, + }, + [VECS3] = { + .hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */ + .class = VIDEO_ENHANCEMENT_CLASS, + .instance = 3, + .mmio_bases = { + { .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE } + }, + }, };
/** @@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH)); BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH)); + BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1)); + BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine))) return -EINVAL; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 5b91068ab277..b25f594a7e4b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -46,7 +46,7 @@ #define COPY_ENGINE_CLASS 3 #define OTHER_CLASS 4 #define MAX_ENGINE_CLASS 4 -#define MAX_ENGINE_INSTANCE 3 +#define MAX_ENGINE_INSTANCE 7
#define I915_MAX_SLICES 3 #define I915_MAX_SUBSLICES 8 @@ -64,7 +64,7 @@ struct intel_gt; struct intel_ring; struct intel_uncore;
-typedef u8 intel_engine_mask_t; +typedef u32 intel_engine_mask_t; #define ALL_ENGINES ((intel_engine_mask_t)~0ul)
struct intel_hw_status_page { @@ -101,8 +101,8 @@ struct i915_ctx_workarounds { struct i915_vma *vma; };
-#define I915_MAX_VCS 4 -#define I915_MAX_VECS 2 +#define I915_MAX_VCS 8 +#define I915_MAX_VECS 4
/* * Engine IDs definitions. @@ -115,9 +115,15 @@ enum intel_engine_id { VCS1, VCS2, VCS3, + VCS4, + VCS5, + VCS6, + VCS7, #define _VCS(n) (VCS0 + (n)) VECS0, VECS1, + VECS2, + VECS3, #define _VECS(n) (VECS0 + (n)) I915_NUM_ENGINES #define INVALID_ENGINE ((enum intel_engine_id)-1) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index f7dad8541417..d4546e871833 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2516,9 +2516,15 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN11_BSD2_RING_BASE 0x1c4000 #define GEN11_BSD3_RING_BASE 0x1d0000 #define GEN11_BSD4_RING_BASE 0x1d4000 +#define XEHP_BSD5_RING_BASE 0x1e0000 +#define XEHP_BSD6_RING_BASE 0x1e4000 +#define XEHP_BSD7_RING_BASE 0x1f0000 +#define XEHP_BSD8_RING_BASE 0x1f4000 #define VEBOX_RING_BASE 0x1a000 #define GEN11_VEBOX_RING_BASE 0x1c8000 #define GEN11_VEBOX2_RING_BASE 0x1d8000 +#define XEHP_VEBOX3_RING_BASE 0x1e8000 +#define XEHP_VEBOX4_RING_BASE 0x1f8000 #define BLT_RING_BASE 0x22000 #define RING_TAIL(base) _MMIO((base) + 0x30) #define RING_HEAD(base) _MMIO((base) + 0x34)
On 01/07/2021 21:23, Matt Roper wrote:
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the basic definitions for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 ++- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 50 ++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 14 ++++-- drivers/gpu/drm/i915/i915_reg.h | 6 +++ 4 files changed, 69 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c index 87b06572fd2e..35edc55720f4 100644 --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c @@ -279,7 +279,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) if (mode & EMIT_INVALIDATE) aux_inv = rq->engine->mask & ~BIT(BCS0); if (aux_inv)
cmd += 2 * hweight8(aux_inv) + 2;
cmd += 2 * hweight32(aux_inv) + 2;
cs = intel_ring_begin(rq, cmd); if (IS_ERR(cs))
@@ -313,9 +313,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode) struct intel_engine_cs *engine; unsigned int tmp;
*cs++ = MI_LOAD_REGISTER_IMM(hweight8(aux_inv));
for_each_engine_masked(engine, rq->engine->gt,
aux_inv, tmp) {
*cs++ = MI_LOAD_REGISTER_IMM(hweight32(aux_inv));
}for_each_engine_masked(engine, rq->engine->gt, aux_inv, tmp) { *cs++ = i915_mmio_reg_offset(aux_inv_reg(engine)); *cs++ = AUX_INV;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 4ab2c9abb943..6e2aa1acc4d4 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -104,6 +104,38 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_BSD4_RING_BASE } }, },
- [VCS4] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_DECODE_CLASS,
.instance = 4,
.mmio_bases = {
{ .graphics_ver = 11, .base = XEHP_BSD5_RING_BASE }
},
- },
- [VCS5] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_DECODE_CLASS,
.instance = 5,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHP_BSD6_RING_BASE }
},
- },
- [VCS6] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_DECODE_CLASS,
.instance = 6,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHP_BSD7_RING_BASE }
},
- },
- [VCS7] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_DECODE_CLASS,
.instance = 7,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHP_BSD8_RING_BASE }
},
- }, [VECS0] = { .hw_id = VECS0_HW, .class = VIDEO_ENHANCEMENT_CLASS,
@@ -121,6 +153,22 @@ static const struct engine_info intel_engines[] = { { .graphics_ver = 11, .base = GEN11_VEBOX2_RING_BASE } }, },
[VECS2] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_ENHANCEMENT_CLASS,
.instance = 2,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHP_VEBOX3_RING_BASE }
},
},
[VECS3] = {
.hw_id = 0, /* not used in GEN12+, see MI_SEMAPHORE_SIGNAL */
.class = VIDEO_ENHANCEMENT_CLASS,
.instance = 3,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHP_VEBOX4_RING_BASE }
},
}, };
/**
@@ -269,6 +317,8 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
BUILD_BUG_ON(MAX_ENGINE_CLASS >= BIT(GEN11_ENGINE_CLASS_WIDTH)); BUILD_BUG_ON(MAX_ENGINE_INSTANCE >= BIT(GEN11_ENGINE_INSTANCE_WIDTH));
BUILD_BUG_ON(I915_MAX_VCS > (MAX_ENGINE_INSTANCE + 1));
BUILD_BUG_ON(I915_MAX_VECS > (MAX_ENGINE_INSTANCE + 1));
if (GEM_DEBUG_WARN_ON(id >= ARRAY_SIZE(gt->engine))) return -EINVAL;
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 5b91068ab277..b25f594a7e4b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -46,7 +46,7 @@ #define COPY_ENGINE_CLASS 3 #define OTHER_CLASS 4 #define MAX_ENGINE_CLASS 4 -#define MAX_ENGINE_INSTANCE 3 +#define MAX_ENGINE_INSTANCE 7
#define I915_MAX_SLICES 3 #define I915_MAX_SUBSLICES 8 @@ -64,7 +64,7 @@ struct intel_gt; struct intel_ring; struct intel_uncore;
-typedef u8 intel_engine_mask_t; +typedef u32 intel_engine_mask_t;
u16 would be enough but it's probably pointless unless it makes it better considering what I'll write next.
What I'd do is reorder the fields in struct intel_gt_info to avoid padding, probably just pulling l3bank_mask up to be second is best.
Similar for struct intel_device_info because there's a ton of those and so historically we were actually laying it out with care. A perfect solution while keeping logical grouping might not be possible but worth having a look.
Regards,
Tvrtko
#define ALL_ENGINES ((intel_engine_mask_t)~0ul)
struct intel_hw_status_page { @@ -101,8 +101,8 @@ struct i915_ctx_workarounds { struct i915_vma *vma; };
-#define I915_MAX_VCS 4 -#define I915_MAX_VECS 2 +#define I915_MAX_VCS 8 +#define I915_MAX_VECS 4
/*
- Engine IDs definitions.
@@ -115,9 +115,15 @@ enum intel_engine_id { VCS1, VCS2, VCS3,
- VCS4,
- VCS5,
- VCS6,
- VCS7, #define _VCS(n) (VCS0 + (n)) VECS0, VECS1,
- VECS2,
- VECS3, #define _VECS(n) (VECS0 + (n)) I915_NUM_ENGINES #define INVALID_ENGINE ((enum intel_engine_id)-1)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index f7dad8541417..d4546e871833 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2516,9 +2516,15 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN11_BSD2_RING_BASE 0x1c4000 #define GEN11_BSD3_RING_BASE 0x1d0000 #define GEN11_BSD4_RING_BASE 0x1d4000 +#define XEHP_BSD5_RING_BASE 0x1e0000 +#define XEHP_BSD6_RING_BASE 0x1e4000 +#define XEHP_BSD7_RING_BASE 0x1f0000 +#define XEHP_BSD8_RING_BASE 0x1f4000 #define VEBOX_RING_BASE 0x1a000 #define GEN11_VEBOX_RING_BASE 0x1c8000 #define GEN11_VEBOX2_RING_BASE 0x1d8000 +#define XEHP_VEBOX3_RING_BASE 0x1e8000 +#define XEHP_VEBOX4_RING_BASE 0x1f8000 #define BLT_RING_BASE 0x22000 #define RING_TAIL(base) _MMIO((base) + 0x30) #define RING_HEAD(base) _MMIO((base) + 0x34)
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 ++++++++++++- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0); + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0); @@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) + intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask); + if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) + intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask); - + if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) + intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask); /* * RPS interrupts will get enabled/disabled on demand when RPS itself * is enabled/disabled. diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d4546e871833..cb1716b6ce72 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8076,7 +8076,10 @@ enum { #define GEN11_BCS_RSVD_INTR_MASK _MMIO(0x1900a0) #define GEN11_VCS0_VCS1_INTR_MASK _MMIO(0x1900a8) #define GEN11_VCS2_VCS3_INTR_MASK _MMIO(0x1900ac) +#define GEN12_VCS4_VCS5_INTR_MASK _MMIO(0x1900b0) +#define GEN12_VCS6_VCS7_INTR_MASK _MMIO(0x1900b4) #define GEN11_VECS0_VECS1_INTR_MASK _MMIO(0x1900d0) +#define GEN12_VECS2_VECS3_INTR_MASK _MMIO(0x1900d4) #define GEN11_GUC_SG_INTR_MASK _MMIO(0x1900e8) #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec) #define GEN11_CRYPTO_RSVD_INTR_MASK _MMIO(0x1900f0)
On 01/07/2021 21:23, Matt Roper wrote:
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 ++++++++++++- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0);
if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);
if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
- if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
- if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
Poor 0-1 sandwiched between 4-7 and 2-3. ;) With hopefully order restored:
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
- if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
/*intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~dmask);
- RPS interrupts will get enabled/disabled on demand when RPS itself
- is enabled/disabled.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d4546e871833..cb1716b6ce72 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -8076,7 +8076,10 @@ enum { #define GEN11_BCS_RSVD_INTR_MASK _MMIO(0x1900a0) #define GEN11_VCS0_VCS1_INTR_MASK _MMIO(0x1900a8) #define GEN11_VCS2_VCS3_INTR_MASK _MMIO(0x1900ac) +#define GEN12_VCS4_VCS5_INTR_MASK _MMIO(0x1900b0) +#define GEN12_VCS6_VCS7_INTR_MASK _MMIO(0x1900b4) #define GEN11_VECS0_VECS1_INTR_MASK _MMIO(0x1900d0) +#define GEN12_VECS2_VECS3_INTR_MASK _MMIO(0x1900d4) #define GEN11_GUC_SG_INTR_MASK _MMIO(0x1900e8) #define GEN11_GPM_WGBOXPERF_INTR_MASK _MMIO(0x1900ec) #define GEN11_CRYPTO_RSVD_INTR_MASK _MMIO(0x1900f0)
On Fri, Jul 02, 2021 at 01:42:59PM +0100, Tvrtko Ursulin wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 ++++++++++++- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
- if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~0);
- if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~0);intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~0);
- if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3))
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0); intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);
@@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt) intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask); intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask); intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
- if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask);
- if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7))
intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);
Poor 0-1 sandwiched between 4-7 and 2-3. ;) With hopefully order restored:
not sure I understand this, order looks correct to me. It handles all (possible) VCS engines, and later VECS
Lucas De Marchi
On 06/07/2021 22:15, Lucas De Marchi wrote:
On Fri, Jul 02, 2021 at 01:42:59PM +0100, Tvrtko Ursulin wrote:
On 01/07/2021 21:23, Matt Roper wrote:
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the interrupt handler support for them.
Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/gt/intel_gt_irq.c | 13 ++++++++++++-  drivers/gpu/drm/i915/i915_reg.h       | 3 +++  2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_irq.c b/drivers/gpu/drm/i915/gt/intel_gt_irq.c index c13462274fe8..b2de83be4d97 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_irq.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_irq.c @@ -184,7 +184,13 @@ void gen11_gt_irq_reset(struct intel_gt *gt)     intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK,   ~0);     intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK,   ~0);     intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK,   ~0); +   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) +       intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK,  ~0); +   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) +       intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK,  ~0);     intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK,   ~0); +   if (HAS_ENGINE(gt, VECS2) || HAS_ENGINE(gt, VECS3)) +       intel_uncore_write(uncore, GEN12_VECS2_VECS3_INTR_MASK, ~0);     intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_ENABLE, 0);     intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0); @@ -218,8 +224,13 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)     intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);     intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);     intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask); +   if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5)) +       intel_uncore_write(uncore, GEN12_VCS4_VCS5_INTR_MASK, ~dmask); +   if (HAS_ENGINE(gt, VCS6) || HAS_ENGINE(gt, VCS7)) +       intel_uncore_write(uncore, GEN12_VCS6_VCS7_INTR_MASK, ~dmask);     intel_uncore_write(uncore, GEN11_VECS0_VECS1_INTR_MASK, ~dmask);
Poor 0-1 sandwiched between 4-7 and 2-3. ;) With hopefully order restored:
not sure I understand this, order looks correct to me. It handles all (possible) VCS engines, and later VECS
Oops my bad, my eyes confused VCS and VECS blocks.
Regards,
Tvrtko
From: John Harrison John.C.Harrison@Intel.com
Xe_HP can have a lot of extra media engines. This patch adds the reset support for them.
Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_reset.c | 6 ++++++ drivers/gpu/drm/i915/i915_reg.h | 8 ++++++++ 2 files changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c index 72251638d4ea..9586613ee399 100644 --- a/drivers/gpu/drm/i915/gt/intel_reset.c +++ b/drivers/gpu/drm/i915/gt/intel_reset.c @@ -515,8 +515,14 @@ static int gen11_reset_engines(struct intel_gt *gt, [VCS1] = GEN11_GRDOM_MEDIA2, [VCS2] = GEN11_GRDOM_MEDIA3, [VCS3] = GEN11_GRDOM_MEDIA4, + [VCS4] = GEN11_GRDOM_MEDIA5, + [VCS5] = GEN11_GRDOM_MEDIA6, + [VCS6] = GEN11_GRDOM_MEDIA7, + [VCS7] = GEN11_GRDOM_MEDIA8, [VECS0] = GEN11_GRDOM_VECS, [VECS1] = GEN11_GRDOM_VECS2, + [VECS2] = GEN11_GRDOM_VECS3, + [VECS3] = GEN11_GRDOM_VECS4, }; struct intel_engine_cs *engine; intel_engine_mask_t tmp; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index cb1716b6ce72..dbc233442dd0 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -395,10 +395,18 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN11_GRDOM_MEDIA2 (1 << 6) #define GEN11_GRDOM_MEDIA3 (1 << 7) #define GEN11_GRDOM_MEDIA4 (1 << 8) +#define GEN11_GRDOM_MEDIA5 (1 << 9) +#define GEN11_GRDOM_MEDIA6 (1 << 10) +#define GEN11_GRDOM_MEDIA7 (1 << 11) +#define GEN11_GRDOM_MEDIA8 (1 << 12) #define GEN11_GRDOM_VECS (1 << 13) #define GEN11_GRDOM_VECS2 (1 << 14) +#define GEN11_GRDOM_VECS3 (1 << 15) +#define GEN11_GRDOM_VECS4 (1 << 16) #define GEN11_GRDOM_SFC0 (1 << 17) #define GEN11_GRDOM_SFC1 (1 << 18) +#define GEN11_GRDOM_SFC2 (1 << 19) +#define GEN11_GRDOM_SFC3 (1 << 20)
#define GEN11_VCS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << ((instance) >> 1)) #define GEN11_VECS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << (instance))
Implement Xe_HP forcewake handling. While we're at it, let's reorder to the forcewake assignment if/else ladder to match our usual driver conventions.
Co-authored-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../drm/i915/gt/intel_execlists_submission.c | 4 + drivers/gpu/drm/i915/intel_uncore.c | 336 +++++++++++++++--- drivers/gpu/drm/i915/intel_uncore.h | 14 +- drivers/gpu/drm/i915/selftests/intel_uncore.c | 2 + 4 files changed, 302 insertions(+), 54 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index cdb2126a159a..15ba0d83151a 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -3318,6 +3318,10 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) i915_mmio_reg_offset(RING_EXECLIST_SQ_CONTENTS(base)); execlists->ctrl_reg = uncore->regs + i915_mmio_reg_offset(RING_EXECLIST_CONTROL(base)); + + engine->fw_domain = intel_uncore_forcewake_for_reg(engine->uncore, + RING_EXECLIST_CONTROL(engine->mmio_base), + FW_REG_WRITE); } else { execlists->submit_reg = uncore->regs + i915_mmio_reg_offset(RING_ELSP(base)); diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index d067524f9162..676b0052f01e 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -24,6 +24,8 @@ #include <linux/pm_runtime.h> #include <asm/iosf_mbi.h>
+#include "gt/intel_lrc_reg.h" /* for shadow reg list */ + #include "i915_drv.h" #include "i915_trace.h" #include "i915_vgpu.h" @@ -68,8 +70,14 @@ static const char * const forcewake_domain_names[] = { "vdbox1", "vdbox2", "vdbox3", + "vdbox4", + "vdbox5", + "vdbox6", + "vdbox7", "vebox0", "vebox1", + "vebox2", + "vebox3", };
const char * @@ -952,30 +960,80 @@ static const i915_reg_t gen8_shadowed_regs[] = { };
static const i915_reg_t gen11_shadowed_regs[] = { - RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */ - GEN6_RPNSWREQ, /* 0xA008 */ - GEN6_RC_VIDEO_FREQ, /* 0xA00C */ - RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ - RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */ - RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */ - RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ - RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */ - RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */ - RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */ + RING_EXECLIST_CONTROL(RENDER_RING_BASE), /* 0x2550 */ + GEN6_RPNSWREQ, /* 0xA008 */ + GEN6_RC_VIDEO_FREQ, /* 0xA00C */ + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ + RING_EXECLIST_CONTROL(BLT_RING_BASE), /* 0x22550 */ + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD_RING_BASE), /* 0x1C0550 */ + RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD2_RING_BASE), /* 0x1C4550 */ + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX_RING_BASE), /* 0x1C8550 */ + RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD3_RING_BASE), /* 0x1D0550 */ + RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD4_RING_BASE), /* 0x1D4550 */ + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX2_RING_BASE), /* 0x1D8550 */ /* TODO: Other registers are not yet used */ };
static const i915_reg_t gen12_shadowed_regs[] = { - RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */ - GEN6_RPNSWREQ, /* 0xA008 */ - GEN6_RC_VIDEO_FREQ, /* 0xA00C */ - RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ - RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */ - RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */ - RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ - RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */ - RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */ - RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */ + RING_EXECLIST_CONTROL(RENDER_RING_BASE), /* 0x2550 */ + GEN6_RPNSWREQ, /* 0xA008 */ + GEN6_RC_VIDEO_FREQ, /* 0xA00C */ + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ + RING_EXECLIST_CONTROL(BLT_RING_BASE), /* 0x22550 */ + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD_RING_BASE), /* 0x1C0550 */ + RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD2_RING_BASE), /* 0x1C4550 */ + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX_RING_BASE), /* 0x1C8550 */ + RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD3_RING_BASE), /* 0x1D0550 */ + RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD4_RING_BASE), /* 0x1D4550 */ + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX2_RING_BASE), /* 0x1D8550 */ + /* TODO: Other registers are not yet used */ +}; + +static const i915_reg_t xehp_shadowed_regs[] = { + RING_TAIL(RENDER_RING_BASE), /* 0x2000 (base) */ + RING_EXECLIST_CONTROL(RENDER_RING_BASE), /* 0x2550 */ + GEN6_RPNSWREQ, /* 0xA008 */ + GEN6_RC_VIDEO_FREQ, /* 0xA00C */ + RING_TAIL(BLT_RING_BASE), /* 0x22000 (base) */ + RING_EXECLIST_CONTROL(BLT_RING_BASE), /* 0x22550 */ + RING_TAIL(GEN11_BSD_RING_BASE), /* 0x1C0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD_RING_BASE), /* 0x1C0550 */ + RING_TAIL(GEN11_BSD2_RING_BASE), /* 0x1C4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD2_RING_BASE), /* 0x1C4550 */ + RING_TAIL(GEN11_VEBOX_RING_BASE), /* 0x1C8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX_RING_BASE), /* 0x1C8550 */ + RING_TAIL(GEN11_BSD3_RING_BASE), /* 0x1D0000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD3_RING_BASE), /* 0x1D0550 */ + RING_TAIL(GEN11_BSD4_RING_BASE), /* 0x1D4000 (base) */ + RING_EXECLIST_CONTROL(GEN11_BSD4_RING_BASE), /* 0x1D4550 */ + RING_TAIL(GEN11_VEBOX2_RING_BASE), /* 0x1D8000 (base) */ + RING_EXECLIST_CONTROL(GEN11_VEBOX2_RING_BASE), /* 0x1D8550 */ + RING_TAIL(XEHP_BSD5_RING_BASE), /* 0x1E0000 (base) */ + RING_EXECLIST_CONTROL(XEHP_BSD5_RING_BASE), /* 0x1E0550 */ + RING_TAIL(XEHP_BSD6_RING_BASE), /* 0x1E4000 (base) */ + RING_EXECLIST_CONTROL(XEHP_BSD6_RING_BASE), /* 0x1E4550 */ + RING_TAIL(XEHP_VEBOX3_RING_BASE), /* 0x1E8000 (base) */ + RING_EXECLIST_CONTROL(XEHP_VEBOX3_RING_BASE), /* 0x1E8550 */ + RING_TAIL(XEHP_BSD7_RING_BASE), /* 0x1F0000 (base) */ + RING_EXECLIST_CONTROL(XEHP_BSD7_RING_BASE), /* 0x1F0550 */ + RING_TAIL(XEHP_BSD8_RING_BASE), /* 0x1F4000 (base) */ + RING_EXECLIST_CONTROL(XEHP_BSD8_RING_BASE), /* 0x1F4550 */ + RING_TAIL(XEHP_VEBOX4_RING_BASE), /* 0x1F8000 (base) */ + RING_EXECLIST_CONTROL(XEHP_VEBOX4_RING_BASE), /* 0x1F8550 */ /* TODO: Other registers are not yet used */ };
@@ -991,17 +1049,18 @@ static int mmio_reg_cmp(u32 key, const i915_reg_t *reg) return 0; }
-#define __is_genX_shadowed(x) \ -static bool is_gen##x##_shadowed(u32 offset) \ +#define __is_X_shadowed(x) \ +static bool is_##x##_shadowed(u32 offset) \ { \ - const i915_reg_t *regs = gen##x##_shadowed_regs; \ - return BSEARCH(offset, regs, ARRAY_SIZE(gen##x##_shadowed_regs), \ + const i915_reg_t *regs = x##_shadowed_regs; \ + return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \ mmio_reg_cmp); \ }
-__is_genX_shadowed(8) -__is_genX_shadowed(11) -__is_genX_shadowed(12) +__is_X_shadowed(gen8) +__is_X_shadowed(gen11) +__is_X_shadowed(gen12) +__is_X_shadowed(xehp)
static enum forcewake_domains gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) @@ -1065,6 +1124,15 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { __fwd; \ })
+#define __xehp_fwtable_reg_write_fw_domains(uncore, offset) \ +({ \ + enum forcewake_domains __fwd = 0; \ + const u32 __offset = (offset); \ + if (!is_xehp_shadowed(__offset)) \ + __fwd = find_fw_domain(uncore, __offset); \ + __fwd; \ +}) + /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */ static const struct intel_forcewake_range __gen9_fw_ranges[] = { GEN_FW_RANGE(0x0, 0xaff, FORCEWAKE_GT), @@ -1249,6 +1317,145 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = { 0x1d3f00 - 0x1d3fff: VD2 */ };
+/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */ +static const struct intel_forcewake_range __xehp_fw_ranges[] = { + GEN_FW_RANGE(0x0, 0x1fff, 0), /* + 0x0 - 0xaff: reserved + 0xb00 - 0x1fff: always on */ + GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER), + GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT), + GEN_FW_RANGE(0x4b00, 0x51ff, 0), /* + 0x4b00 - 0x4fff: reserved + 0x5000 - 0x51ff: always on */ + GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER), + GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT), + GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER), + GEN_FW_RANGE(0x8160, 0x81ff, 0), /* + 0x8160 - 0x817f: reserved + 0x8180 - 0x81ff: always on */ + GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT), + GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER), + GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /* + 0x8500 - 0x87ff: gt + 0x8800 - 0x8fff: reserved + 0x9000 - 0x947f: gt + 0x9480 - 0x94cf: reserved */ + GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER), + GEN_FW_RANGE(0x9560, 0x97ff, 0), /* + 0x9560 - 0x95ff: always on + 0x9600 - 0x97ff: reserved */ + GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /* + 0x9800 - 0xb4ff: gt + 0xb500 - 0xbfff: reserved + 0xc000 - 0xcfff: gt */ + GEN_FW_RANGE(0xd000, 0xd7ff, 0), + GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT), + GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER), + GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /* + 0xdd00 - 0xddff: gt + 0xde00 - 0xde7f: reserved */ + GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /* + 0xde80 - 0xdfff: render + 0xe000 - 0xe0ff: reserved + 0xe100 - 0xe8ff: render */ + GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /* + 0xe900 - 0xe9ff: gt + 0xea00 - 0xefff: reserved + 0xf000 - 0xffff: gt */ + GEN_FW_RANGE(0x10000, 0x13fff, 0), /* + 0x10000 - 0x11fff: reserved + 0x12000 - 0x127ff: always on + 0x12800 - 0x13fff: reserved */ + GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), + GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), + GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), + GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), + GEN_FW_RANGE(0x14800, 0x1ffff, FORCEWAKE_RENDER), /* + 0x14800 - 0x14fff: render + 0x15000 - 0x16dff: reserved + 0x16e00 - 0x1ffff: render */ + GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /* + 0x20000 - 0x20fff: VD0 + 0x21000 - 0x21fff: reserved */ + GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT), + GEN_FW_RANGE(0x24000, 0x2417f, 0), /* + 0x24000 - 0x2407f: always on + 0x24080 - 0x2417f: reserved */ + GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /* + 0x24180 - 0x241ff: gt + 0x24200 - 0x249ff: reserved */ + GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /* + 0x24a00 - 0x24a7f: render + 0x24a80 - 0x251ff: reserved */ + GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /* + 0x25200 - 0x252ff: gt + 0x25300 - 0x25fff: reserved */ + GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /* + 0x26000 - 0x27fff: render + 0x28000 - 0x29fff: reserved + 0x2a000 - 0x2ffff: undocumented */ + GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT), + GEN_FW_RANGE(0x40000, 0x1bffff, 0), + GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /* + 0x1c0000 - 0x1c2bff: VD0 + 0x1c2c00 - 0x1c2cff: reserved + 0x1c2d00 - 0x1c2dff: VD0 + 0x1c2e00 - 0x1c3eff: reserved + 0x1c3f00 - 0x1c3fff: VD0 */ + GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /* + 0x1c4000 - 0x1c6bff: VD1 + 0x1c6c00 - 0x1c6cff: reserved + 0x1c6d00 - 0x1c6dff: VD1 + 0x1c6e00 - 0x1c7fff: reserved */ + GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /* + 0x1c8000 - 0x1ca0ff: VE0 + 0x1ca100 - 0x1cbfff: reserved */ + GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0), + GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2), + GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4), + GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6), + GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /* + 0x1d0000 - 0x1d2bff: VD2 + 0x1d2c00 - 0x1d2cff: reserved + 0x1d2d00 - 0x1d2dff: VD2 + 0x1d2e00 - 0x1d3eff: reserved + 0x1d3f00 - 0x1d3fff: VD2 */ + GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /* + 0x1d4000 - 0x1d6bff: VD3 + 0x1d6c00 - 0x1d6cff: reserved + 0x1d6d00 - 0x1d6dff: VD3 + 0x1d6e00 - 0x1d7fff: reserved */ + GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /* + 0x1d8000 - 0x1da0ff: VE1 + 0x1da100 - 0x1dffff: reserved */ + GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /* + 0x1e0000 - 0x1e2bff: VD4 + 0x1e2c00 - 0x1e2cff: reserved + 0x1e2d00 - 0x1e2dff: VD4 + 0x1e2e00 - 0x1e3eff: reserved + 0x1e3f00 - 0x1e3fff: VD4 */ + GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /* + 0x1e4000 - 0x1e6bff: VD5 + 0x1e6c00 - 0x1e6cff: reserved + 0x1e6d00 - 0x1e6dff: VD5 + 0x1e6e00 - 0x1e7fff: reserved */ + GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /* + 0x1e8000 - 0x1ea0ff: VE2 + 0x1ea100 - 0x1effff: reserved */ + GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /* + 0x1f0000 - 0x1f2bff: VD6 + 0x1f2c00 - 0x1f2cff: reserved + 0x1f2d00 - 0x1f2dff: VD6 + 0x1f2e00 - 0x1f3eff: reserved + 0x1f3f00 - 0x1f3fff: VD6 */ + GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /* + 0x1f4000 - 0x1f6bff: VD7 + 0x1f6c00 - 0x1f6cff: reserved + 0x1f6d00 - 0x1f6dff: VD7 + 0x1f6e00 - 0x1f7fff: reserved */ + GEN_FW_RANGE(0x1f8000, 0x1fa0ff, FORCEWAKE_MEDIA_VEBOX3), +}; + static void ilk_dummy_write(struct intel_uncore *uncore) { @@ -1502,6 +1709,7 @@ __gen_write(func, 8) \ __gen_write(func, 16) \ __gen_write(func, 32)
+__gen_reg_write_funcs(xehp_fwtable); __gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); @@ -1582,8 +1790,14 @@ static int __fw_domain_init(struct intel_uncore *uncore, BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX1)); BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX2)); BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX3)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX4 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX4)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX5 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX5)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX6 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX6)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VDBOX7 != (1 << FW_DOMAIN_ID_MEDIA_VDBOX7)); BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX0 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX0)); BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX1 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX1)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX2 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX2)); + BUILD_BUG_ON(FORCEWAKE_MEDIA_VEBOX3 != (1 << FW_DOMAIN_ID_MEDIA_VEBOX3));
d->mask = BIT(domain_id);
@@ -1870,36 +2084,36 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) return ret; forcewake_early_sanitize(uncore, 0);
- if (IS_GRAPHICS_VER(i915, 6, 7)) { - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); - - if (IS_VALLEYVIEW(i915)) { - ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); - ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); - } else { - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); - } - } else if (GRAPHICS_VER(i915) == 8) { - if (IS_CHERRYVIEW(i915)) { - ASSIGN_FW_DOMAINS_TABLE(uncore, __chv_fw_ranges); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); - } else { - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); - } - } else if (IS_GRAPHICS_VER(i915, 9, 10)) { - ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); - } else if (GRAPHICS_VER(i915) == 11) { - ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); - } else { + } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen12_fwtable); + } else if (GRAPHICS_VER(i915) == 11) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + } else if (IS_GRAPHICS_VER(i915, 9, 10)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); + } else if (IS_CHERRYVIEW(i915)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __chv_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); + } else if (GRAPHICS_VER(i915) == 8) { + ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); + ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + } else if (IS_VALLEYVIEW(i915)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); + } else if (IS_GRAPHICS_VER(i915, 6, 7)) { + ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); }
uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier; @@ -1988,6 +2202,22 @@ void intel_uncore_prune_engine_fw_domains(struct intel_uncore *uncore, if (HAS_ENGINE(gt, _VCS(i))) continue;
+ /* + * Starting with XeHP, the power well for an even-numbered + * VDBOX is also used for shared units within the + * media slice such as SFC. So even if the engine + * itself is fused off, we still need to initialize + * the forcewake domain if any of the other engines + * in the same media slice are present. + */ + if (GRAPHICS_VER_FULL(uncore->i915) >= IP_VER(12, 50) && i % 2 == 0) { + if ((i + 1 < I915_MAX_VCS) && HAS_ENGINE(gt, _VCS(i + 1))) + continue; + + if (HAS_ENGINE(gt, _VECS(i / 2))) + continue; + } + if (fw_domains & BIT(domain_id)) fw_domain_fini(uncore, domain_id); } diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h index a18bdb57af7b..3c0b0a8b5250 100644 --- a/drivers/gpu/drm/i915/intel_uncore.h +++ b/drivers/gpu/drm/i915/intel_uncore.h @@ -52,8 +52,14 @@ enum forcewake_domain_id { FW_DOMAIN_ID_MEDIA_VDBOX1, FW_DOMAIN_ID_MEDIA_VDBOX2, FW_DOMAIN_ID_MEDIA_VDBOX3, + FW_DOMAIN_ID_MEDIA_VDBOX4, + FW_DOMAIN_ID_MEDIA_VDBOX5, + FW_DOMAIN_ID_MEDIA_VDBOX6, + FW_DOMAIN_ID_MEDIA_VDBOX7, FW_DOMAIN_ID_MEDIA_VEBOX0, FW_DOMAIN_ID_MEDIA_VEBOX1, + FW_DOMAIN_ID_MEDIA_VEBOX2, + FW_DOMAIN_ID_MEDIA_VEBOX3,
FW_DOMAIN_ID_COUNT }; @@ -66,10 +72,16 @@ enum forcewake_domains { FORCEWAKE_MEDIA_VDBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX1), FORCEWAKE_MEDIA_VDBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX2), FORCEWAKE_MEDIA_VDBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX3), + FORCEWAKE_MEDIA_VDBOX4 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX4), + FORCEWAKE_MEDIA_VDBOX5 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX5), + FORCEWAKE_MEDIA_VDBOX6 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX6), + FORCEWAKE_MEDIA_VDBOX7 = BIT(FW_DOMAIN_ID_MEDIA_VDBOX7), FORCEWAKE_MEDIA_VEBOX0 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX0), FORCEWAKE_MEDIA_VEBOX1 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX1), + FORCEWAKE_MEDIA_VEBOX2 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX2), + FORCEWAKE_MEDIA_VEBOX3 = BIT(FW_DOMAIN_ID_MEDIA_VEBOX3),
- FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1 + FORCEWAKE_ALL = BIT(FW_DOMAIN_ID_COUNT) - 1, };
struct intel_uncore_funcs { diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c index 8ef9e6a4ad05..720b60853f8b 100644 --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c @@ -68,6 +68,7 @@ static int intel_shadow_table_check(void) { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) }, { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) }, { gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) }, + { xehp_shadowed_regs, ARRAY_SIZE(xehp_shadowed_regs) }, }; const i915_reg_t *reg; unsigned int i, j; @@ -103,6 +104,7 @@ int intel_uncore_mock_selftests(void) { __gen9_fw_ranges, ARRAY_SIZE(__gen9_fw_ranges), true }, { __gen11_fw_ranges, ARRAY_SIZE(__gen11_fw_ranges), true }, { __gen12_fw_ranges, ARRAY_SIZE(__gen12_fw_ranges), true }, + { __xehp_fw_ranges, ARRAY_SIZE(__xehp_fw_ranges), true }, }; int err, i;
Since we can't steer multicast register reads during ring-based workaround verification, we need to define the multicast ranges where failure to steer could potentially cause us to read back from a fused-off register instance.
As with gen12, we can ignore the multicast ranges that the bspec describes as 'SQIDI' since all instances of those registers will always be present and we'll always be able to read back a workaround value that was written with multicast.
Bspec: 66534 Cc: José Roberto de Souza jose.souza@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index d9a5a445ceec..20c6ca28e407 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -2089,12 +2089,30 @@ static const struct mcr_range mcr_ranges_gen12[] = { {}, };
+static const struct mcr_range mcr_ranges_xehp[] = { + { .start = 0x4000, .end = 0x4aff }, + { .start = 0x5200, .end = 0x52ff }, + { .start = 0x5400, .end = 0x7fff }, + { .start = 0x8140, .end = 0x815f }, + { .start = 0x8c80, .end = 0x8dff }, + { .start = 0x94d0, .end = 0x955f }, + { .start = 0x9680, .end = 0x96ff }, + { .start = 0xb000, .end = 0xb3ff }, + { .start = 0xc800, .end = 0xcfff }, + { .start = 0xd800, .end = 0xd8ff }, + { .start = 0xdc00, .end = 0xffff }, + { .start = 0x17000, .end = 0x17fff }, + { .start = 0x24a00, .end = 0x24a7f }, +}; + static bool mcr_range(struct drm_i915_private *i915, u32 offset) { const struct mcr_range *mcr_ranges; int i;
- if (GRAPHICS_VER(i915) >= 12) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + mcr_ranges = mcr_ranges_xehp; + else if (GRAPHICS_VER(i915) >= 12) mcr_ranges = mcr_ranges_gen12; else if (GRAPHICS_VER(i915) >= 8) mcr_ranges = mcr_ranges_gen8;
From: Stuart Summers stuart.summers@intel.com
Xe_HP changes the format of the context ID from past platforms.
Cc: Robert M. Fosha robert.m.fosha@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../drm/i915/gt/intel_execlists_submission.c | 74 ++++++++++++++++--- drivers/gpu/drm/i915/gt/intel_lrc.c | 8 ++ drivers/gpu/drm/i915/gt/intel_lrc_reg.h | 2 + drivers/gpu/drm/i915/i915_perf.c | 29 +++++--- drivers/gpu/drm/i915/i915_reg.h | 5 ++ 5 files changed, 97 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 15ba0d83151a..3a9d99a69ed4 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -153,6 +153,12 @@ #define GEN12_CSB_CTX_VALID(csb_dw) \ (FIELD_GET(GEN12_CSB_SW_CTX_ID_MASK, csb_dw) != GEN12_IDLE_CTX_ID)
+#define XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE BIT(1) /* upper csb dword */ +#define XEHP_CSB_SW_CTX_ID_MASK GENMASK(31, 10) +#define XEHP_IDLE_CTX_ID 0xFFFF +#define XEHP_CSB_CTX_VALID(csb_dw) \ + (FIELD_GET(XEHP_CSB_SW_CTX_ID_MASK, csb_dw) != XEHP_IDLE_CTX_ID) + /* Typical size of the average request (2 pipecontrols and a MI_BB) */ #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
@@ -490,6 +496,16 @@ __execlists_schedule_in(struct i915_request *rq) /* Use a fixed tag for OA and friends */ GEM_BUG_ON(ce->tag <= BITS_PER_LONG); ce->lrc.ccid = ce->tag; + } else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) { + /* We don't need a strict matching tag, just different values */ + unsigned int tag = ffs(READ_ONCE(engine->context_tag)); + + GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG); + clear_bit(tag - 1, &engine->context_tag); + ce->lrc.ccid = tag << (XEHP_SW_CTX_ID_SHIFT - 32); + + BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID); + } else { /* We don't need a strict matching tag, just different values */ unsigned int tag = __ffs(engine->context_tag); @@ -600,8 +616,14 @@ static void __execlists_schedule_out(struct i915_request * const rq, intel_engine_add_retire(engine, ce->timeline);
ccid = ce->lrc.ccid; - ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; - ccid &= GEN12_MAX_CONTEXT_HW_ID; + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) { + ccid >>= XEHP_SW_CTX_ID_SHIFT - 32; + ccid &= XEHP_MAX_CONTEXT_HW_ID; + } else { + ccid >>= GEN11_SW_CTX_ID_SHIFT - 32; + ccid &= GEN12_MAX_CONTEXT_HW_ID; + } + if (ccid < BITS_PER_LONG) { GEM_BUG_ON(ccid == 0); GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag)); @@ -1660,13 +1682,24 @@ static void invalidate_csb_entries(const u64 *first, const u64 *last) * bits 44-46: reserved * bits 47-57: sw context id of the lrc the GT switched away from * bits 58-63: sw counter of the lrc the GT switched away from + * + * Xe_HP csb shuffles things around compared to TGL: + * + * bits 0-3: context switch detail (same possible values as TGL) + * bits 4-9: engine instance + * bits 10-25: sw context id of the lrc the GT switched to + * bits 26-31: sw counter of the lrc the GT switched to + * bit 32: semaphore wait mode (poll or signal), Only valid when + * switch detail is set to "wait on semaphore" + * bit 33: switched to new queue + * bits 34-41: wait detail (for switch detail 1 to 4) + * bits 42-57: sw context id of the lrc the GT switched away from + * bits 58-63: sw counter of the lrc the GT switched away from */ -static bool gen12_csb_parse(const u64 csb) +static inline bool +__gen12_csb_parse(bool ctx_to_valid, bool ctx_away_valid, bool new_queue, + u8 switch_detail) { - bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(csb)); - bool new_queue = - lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE; - /* * The context switch detail is not guaranteed to be 5 when a preemption * occurs, so we can't just check for that. The check below works for @@ -1675,7 +1708,7 @@ static bool gen12_csb_parse(const u64 csb) * would require some extra handling, but we don't support that. */ if (!ctx_away_valid || new_queue) { - GEM_BUG_ON(!GEN12_CSB_CTX_VALID(lower_32_bits(csb))); + GEM_BUG_ON(!ctx_to_valid); return true; }
@@ -1684,10 +1717,26 @@ static bool gen12_csb_parse(const u64 csb) * context switch on an unsuccessful wait instruction since we always * use polling mode. */ - GEM_BUG_ON(GEN12_CTX_SWITCH_DETAIL(upper_32_bits(csb))); + GEM_BUG_ON(switch_detail); return false; }
+static bool xehp_csb_parse(const u64 csb) +{ + return __gen12_csb_parse(XEHP_CSB_CTX_VALID(lower_32_bits(csb)), /* cxt to */ + XEHP_CSB_CTX_VALID(upper_32_bits(csb)), /* cxt away */ + upper_32_bits(csb) & XEHP_CTX_STATUS_SWITCHED_TO_NEW_QUEUE, + GEN12_CTX_SWITCH_DETAIL(lower_32_bits(csb))); +} + +static bool gen12_csb_parse(const u64 csb) +{ + return __gen12_csb_parse(GEN12_CSB_CTX_VALID(lower_32_bits(csb)), /* cxt to */ + GEN12_CSB_CTX_VALID(upper_32_bits(csb)), /* cxt away */ + lower_32_bits(csb) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE, + GEN12_CTX_SWITCH_DETAIL(upper_32_bits(csb))); +} + static bool gen8_csb_parse(const u64 csb) { return csb & (GEN8_CTX_STATUS_IDLE_ACTIVE | GEN8_CTX_STATUS_PREEMPTED); @@ -1852,7 +1901,9 @@ process_csb(struct intel_engine_cs *engine, struct i915_request **inactive) ENGINE_TRACE(engine, "csb[%d]: status=0x%08x:0x%08x\n", head, upper_32_bits(csb), lower_32_bits(csb));
- if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + promote = xehp_csb_parse(csb); + else if (GRAPHICS_VER(engine->i915) >= 12) promote = gen12_csb_parse(csb); else promote = gen8_csb_parse(csb); @@ -3339,7 +3390,8 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine) execlists->csb_size = GEN11_CSB_ENTRIES;
engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0); - if (GRAPHICS_VER(engine->i915) >= 11) { + if (GRAPHICS_VER(engine->i915) >= 11 && + GRAPHICS_VER_FULL(engine->i915) < IP_VER(12, 50)) { execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32); execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32); } diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index a27bac0a4bfb..e1c80e2c06d8 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -1101,6 +1101,14 @@ setup_indirect_ctx_bb(const struct intel_context *ce, * bits 55-60: SW counter * bits 61-63: engine class * + * On Xe_HP, the upper dword of the descriptor has a new format: + * + * bits 32-37: virtual function number + * bit 38: mbz, reserved for use by hardware + * bits 39-54: SW context ID + * bits 55-57: reserved + * bits 58-63: SW counter + * * engine info, SW context ID and SW counter need to form a unique number * (Context ID) per lrc. */ diff --git a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h index 41e5350a7a05..9548f4ade068 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc_reg.h +++ b/drivers/gpu/drm/i915/gt/intel_lrc_reg.h @@ -91,5 +91,7 @@ #define GEN11_MAX_CONTEXT_HW_ID (1 << 11) /* exclusive */ /* in Gen12 ID 0x7FF is reserved to indicate idle */ #define GEN12_MAX_CONTEXT_HW_ID (GEN11_MAX_CONTEXT_HW_ID - 1) +/* in Xe_HP ID 0xFFFF is reserved to indicate "invalid context" */ +#define XEHP_MAX_CONTEXT_HW_ID 0xFFFF
#endif /* _INTEL_LRC_REG_H_ */ diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 9f94914958c3..1f8bf00526a1 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1284,17 +1284,26 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream) break;
case 11: - case 12: { - stream->specific_ctx_id_mask = - ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); - /* - * Pick an unused context id - * 0 - BITS_PER_LONG are used by other contexts - * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context - */ - stream->specific_ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); + case 12: + if (GRAPHICS_VER_FULL(ce->engine->i915) >= IP_VER(12, 50)) { + stream->specific_ctx_id_mask = + ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + stream->specific_ctx_id = + (XEHP_MAX_CONTEXT_HW_ID - 1) << + (XEHP_SW_CTX_ID_SHIFT - 32); + } else { + stream->specific_ctx_id_mask = + ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); + /* + * Pick an unused context id + * 0 - BITS_PER_LONG are used by other contexts + * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context + */ + stream->specific_ctx_id = + (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32); + } break; - }
default: MISSING_CASE(GRAPHICS_VER(ce->engine->i915)); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index dbc233442dd0..c361a8f30531 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -4172,6 +4172,11 @@ enum { #define GEN11_ENGINE_INSTANCE_SHIFT 48 #define GEN11_ENGINE_INSTANCE_WIDTH 6
+#define XEHP_SW_CTX_ID_SHIFT 39 +#define XEHP_SW_CTX_ID_WIDTH 16 +#define XEHP_SW_COUNTER_SHIFT 58 +#define XEHP_SW_COUNTER_WIDTH 6 + #define CHV_CLK_CTL1 _MMIO(0x101100) #define VLV_CLK_CTL2 _MMIO(0x101104) #define CLK_CTL2_CZCOUNT_30NS_SHIFT 28
From: Prathap Kumar Valsan prathap.kumar.valsan@intel.com
The layout of some engine contexts has changed on Xe_HP. Define the new offsets.
Bspec: 45585, 46256 Signed-off-by: Prathap Kumar Valsan prathap.kumar.valsan@intel.com Signed-off-by: Ramalingam C ramalingam.c@intel.com Signed-off-by: Venkata Ramana Nayana venkata.ramana.nayana@intel.com Signed-off-by: Akeem G Abodunrin akeem.g.abodunrin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_lrc.c | 65 ++++++++++++++++++++++++++--- 1 file changed, 59 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index e1c80e2c06d8..fee735e2a524 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -484,6 +484,47 @@ static const u8 gen12_rcs_offsets[] = { END };
+static const u8 xehp_rcs_offsets[] = { + NOP(1), + LRI(13, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + + NOP(5), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + LRI(3, POSTED), + REG(0x1b0), + REG16(0x5a8), + REG16(0x5ac), + + NOP(6), + LRI(1, 0), + REG(0x0c8), + + END +}; + #undef END #undef REG16 #undef REG @@ -502,7 +543,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) !intel_engine_has_relative_mmio(engine));
if (engine->class == RENDER_CLASS) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return xehp_rcs_offsets; + else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_rcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 11) return gen11_rcs_offsets; @@ -522,7 +565,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
static int lrc_ring_mi_mode(const struct intel_engine_cs *engine) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return 0x70; + else if (GRAPHICS_VER(engine->i915) >= 12) return 0x60; else if (GRAPHICS_VER(engine->i915) >= 9) return 0x54; @@ -534,7 +579,9 @@ static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
static int lrc_ring_gpr0(const struct intel_engine_cs *engine) { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + return 0x84; + else if (GRAPHICS_VER(engine->i915) >= 12) return 0x74; else if (GRAPHICS_VER(engine->i915) >= 9) return 0x68; @@ -578,10 +625,16 @@ static int lrc_ring_indirect_offset(const struct intel_engine_cs *engine)
static int lrc_ring_cmd_buf_cctl(const struct intel_engine_cs *engine) { - if (engine->class != RENDER_CLASS) - return -1;
- if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + /* + * Note that the CSFE context has a dummy slot for CMD_BUF_CCTL + * simply to match the RCS context image layout. + */ + return 0xc6; + else if (engine->class != RENDER_CLASS) + return -1; + else if (GRAPHICS_VER(engine->i915) >= 12) return 0xb6; else if (GRAPHICS_VER(engine->i915) >= 11) return 0xaa;
From: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com
Xe_HP is more modular then its predecessors and as a consequence it has more types of replicated registers. As with l3bank regions on previous platforms, we may need to explicitly re-steer accesses to these new types of ranges at runtime if we can't find a single default steering value that satisfies the fusing of all types.
Bspec: 66534 Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_gt.c | 40 ++++++++- drivers/gpu/drm/i915/gt/intel_gt.h | 1 + drivers/gpu/drm/i915/gt/intel_gt_types.h | 7 ++ drivers/gpu/drm/i915/gt/intel_region_lmem.c | 1 + drivers/gpu/drm/i915/gt/intel_sseu.c | 18 +++++ drivers/gpu/drm/i915/gt/intel_sseu.h | 6 ++ drivers/gpu/drm/i915/gt/intel_workarounds.c | 89 +++++++++++++++++++-- drivers/gpu/drm/i915/i915_drv.h | 3 + drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/i915_reg.h | 4 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 11 files changed, 165 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index e714e21c0a4d..f59bcedbb80b 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -89,6 +89,13 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = { {}, };
+static u16 slicemask(struct intel_gt *gt, int count) +{ + u64 dss_mask = intel_sseu_get_subslices(>->info.sseu, 0); + + return intel_slicemask_from_dssmask(dss_mask, count); +} + int intel_gt_init_mmio(struct intel_gt *gt) { intel_gt_init_clock_frequency(gt); @@ -96,11 +103,24 @@ int intel_gt_init_mmio(struct intel_gt *gt) intel_uc_init_mmio(>->uc); intel_sseu_info_init(gt);
- if (GRAPHICS_VER(gt->i915) >= 11) { + /* + * An mslice is unavailable only if both the meml3 for the slice is + * disabled *and* all of the DSS in the slice (quadrant) are disabled. + */ + if (HAS_MSLICES(gt->i915)) + gt->info.mslice_mask = + slicemask(gt, GEN_DSS_PER_MSLICE) | + (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & + GEN12_MEML3_EN_MASK); + + if (GRAPHICS_VER(gt->i915) >= 11 && + GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) { gt->steering_table[L3BANK] = icl_l3bank_steering_table; gt->info.l3bank_mask = ~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN10_L3BANK_MASK; + } else if (HAS_MSLICES(gt->i915)) { + MISSING_CASE(INTEL_INFO(gt->i915)->platform); }
return intel_engines_init_mmio(gt); @@ -766,6 +786,24 @@ static void intel_gt_get_valid_steering(struct intel_gt *gt, *sliceid = 0; /* unused */ *subsliceid = __ffs(gt->info.l3bank_mask); break; + case MSLICE: + GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */ + + *sliceid = __ffs(gt->info.mslice_mask); + *subsliceid = 0; /* unused */ + break; + case LNCF: + GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */ + + /* + * 0xFDC[29:28] selects the mslice to steer to and 0xFDC[27] + * selects which LNCF within the mslice to steer to. An LNCF + * is always present if its mslice is present, so we can safely + * just steer to LNCF 0 in all cases. + */ + *sliceid = __ffs(gt->info.mslice_mask) << 1; + *subsliceid = 0; /* unused */ + break; default: MISSING_CASE(type); *sliceid = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index e7aabe0cc5bf..f9bcde31f697 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -82,6 +82,7 @@ static inline bool intel_gt_needs_read_steering(struct intel_gt *gt, }
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg); +u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p); diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h index d93d578a4105..b06d8eaf12ea 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_types.h +++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h @@ -47,9 +47,14 @@ struct intel_mmio_range { * of multicast registers. If another type of steering does not have any * overlap in valid steering targets with 'subslice' style registers, we will * need to explicitly re-steer reads of registers of the other type. + * + * Only the replication types that may need additional non-default steering + * are listed here. */ enum intel_steering_type { L3BANK, + MSLICE, + LNCF,
NUM_STEERING_TYPES }; @@ -183,6 +188,8 @@ struct intel_gt {
/* Slice/subslice/EU info */ struct sseu_dev_info sseu; + + unsigned long mslice_mask; } info; };
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c index 1f43aba2e9e2..196e7569a41a 100644 --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c @@ -10,6 +10,7 @@ #include "gem/i915_gem_lmem.h" #include "gem/i915_gem_region.h" #include "gem/i915_gem_ttm.h" +#include "gt/intel_gt.h"
static int init_fake_lmem_bar(struct intel_memory_region *mem) { diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 367fd44b81c8..bbed8e8625e1 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -759,3 +759,21 @@ void intel_sseu_print_topology(const struct sseu_dev_info *sseu, } } } + +u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice) +{ + u16 slice_mask = 0; + int i; + + WARN_ON(sizeof(dss_mask) * 8 / dss_per_slice > 8 * sizeof(slice_mask)); + + for (i = 0; dss_mask; i++) { + if (dss_mask & GENMASK(dss_per_slice - 1, 0)) + slice_mask |= BIT(i); + + dss_mask >>= dss_per_slice; + } + + return slice_mask; +} + diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 4cd1a8a7298a..1073471d1980 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -22,6 +22,10 @@ struct drm_printer; #define GEN_MAX_EUS (16) /* TGL upper bound */ #define GEN_MAX_EU_STRIDE GEN_SSEU_STRIDE(GEN_MAX_EUS)
+#define GEN_DSS_PER_GSLICE 4 +#define GEN_DSS_PER_CSLICE 8 +#define GEN_DSS_PER_MSLICE 8 + struct sseu_dev_info { u8 slice_mask; u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE]; @@ -104,4 +108,6 @@ void intel_sseu_dump(const struct sseu_dev_info *sseu, struct drm_printer *p); void intel_sseu_print_topology(const struct sseu_dev_info *sseu, struct drm_printer *p);
+u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice); + #endif /* __INTEL_SSEU_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 20c6ca28e407..060d84897635 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -944,12 +944,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); }
+static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal, + unsigned slice, unsigned subslice) +{ + u32 mcr, mcr_mask; + + mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice); + mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK; + + drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr); + + wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); +} + static void icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) { const struct sseu_dev_info *sseu = &i915->gt.info.sseu; unsigned int slice, subslice; - u32 mcr, mcr_mask;
GEM_BUG_ON(GRAPHICS_VER(i915) < 11); GEM_BUG_ON(hweight8(sseu->slice_mask) > 1); @@ -974,12 +986,79 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) if (i915->gt.info.l3bank_mask & BIT(subslice)) i915->gt.steering_table[L3BANK] = NULL;
- mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice); - mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK; + __add_mcr_wa(i915, wal, slice, subslice); +}
- drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr); +__maybe_unused +static void +xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) +{ + struct drm_i915_private *i915 = gt->i915; + const struct sseu_dev_info *sseu = >->info.sseu; + unsigned long slice, subslice = 0, slice_mask = 0; + u64 dss_mask = 0; + u32 lncf_mask = 0; + int i;
- wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); + /* + * On Xe_HP the steering increases in complexity. There are now several + * more units that require steering and we're not guaranteed to be able + * to find a common setting for all of them. These are: + * - GSLICE (fusable) + * - DSS (sub-unit within gslice; fusable) + * - L3 Bank (fusable) + * - MSLICE (fusable) + * - LNCF (sub-unit within mslice; always present if mslice is present) + * - SQIDI (always on) + * + * We'll do our default/implicit steering based on GSLICE (in the + * sliceid field) and DSS (in the subsliceid field). If we can + * find overlap between the valid MSLICE and/or LNCF values with + * a suitable GSLICE, then we can just re-use the default value and + * skip and explicit steering at runtime. + * + * We only need to look for overlap between GSLICE/MSLICE/LNCF to find + * a valid sliceid value. DSS steering is the only type of steering + * that utilizes the 'subsliceid' bits. + * + * Also note that, even though the steering domain is called "GSlice" + * and it is encoded in the register using the gslice format, the spec + * says that the combined (geometry | compute) fuse should be used to + * select the steering. + */ + + /* Find the potential gslice candidates */ + dss_mask = intel_sseu_get_subslices(sseu, 0); + slice_mask = intel_slicemask_from_dssmask(dss_mask, GEN_DSS_PER_GSLICE); + + /* + * Find the potential LNCF candidates. Either LNCF within a valid + * mslice is fine. + */ + for_each_set_bit(i, >->info.mslice_mask, GEN12_MAX_MSLICES) + lncf_mask |= (0x3 << (i * 2)); + + /* + * Are there any sliceid values that work for both GSLICE and LNCF + * steering? + */ + if (slice_mask & lncf_mask) { + slice_mask &= lncf_mask; + gt->steering_table[LNCF] = NULL; + } + + /* How about sliceid values that also work for MSLICE steering? */ + if (slice_mask & gt->info.mslice_mask) { + slice_mask &= gt->info.mslice_mask; + gt->steering_table[MSLICE] = NULL; + } + + slice = __ffs(slice_mask); + subslice = __ffs(dss_mask >> (slice * GEN_DSS_PER_GSLICE)); + WARN_ON(subslice > GEN_DSS_PER_GSLICE); + WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0); + + __add_mcr_wa(i915, wal, slice, subslice); }
static void diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 519cce702f4b..c02600850246 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1672,6 +1672,9 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_RUNTIME_PM(dev_priv) (INTEL_INFO(dev_priv)->has_runtime_pm) #define HAS_64BIT_RELOC(dev_priv) (INTEL_INFO(dev_priv)->has_64bit_reloc)
+#define HAS_MSLICES(dev_priv) \ + (INTEL_INFO(dev_priv)->has_mslices) + #define HAS_IPC(dev_priv) (INTEL_INFO(dev_priv)->display.has_ipc)
#define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 9684b647fd04..88b279452b87 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1012,6 +1012,7 @@ static const struct intel_device_info adl_p_info = { .has_llc = 1, \ .has_logical_ring_contexts = 1, \ .has_logical_ring_elsq = 1, \ + .has_mslices = 1, \ .has_rc6 = 1, \ .has_reset_engine = 1, \ .has_rps = 1, \ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c361a8f30531..43fdf63a2240 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2695,6 +2695,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN11_MCR_SLICE_MASK GEN11_MCR_SLICE(0xf) #define GEN11_MCR_SUBSLICE(subslice) (((subslice) & 0x7) << 24) #define GEN11_MCR_SUBSLICE_MASK GEN11_MCR_SUBSLICE(0x7) +#define GEN11_MCR_MULTICAST REG_BIT(31) #define RING_IPEIR(base) _MMIO((base) + 0x64) #define RING_IPEHR(base) _MMIO((base) + 0x68) #define RING_EIR(base) _MMIO((base) + 0xb0) @@ -3113,6 +3114,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN10_MIRROR_FUSE3 _MMIO(0x9118) #define GEN10_L3BANK_PAIR_COUNT 4 #define GEN10_L3BANK_MASK 0x0F +/* on Xe_HP the same fuses indicates mslices instead of L3 banks */ +#define GEN12_MAX_MSLICES 4 +#define GEN12_MEML3_EN_MASK 0x0F
#define GEN8_EU_DISABLE0 _MMIO(0x9134) #define GEN8_EU_DIS0_S0_MASK 0xffffff diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index b00249906c28..f824de632cfe 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -127,6 +127,7 @@ enum intel_ppgtt_type { func(has_llc); \ func(has_logical_ring_contexts); \ func(has_logical_ring_elsq); \ + func(has_mslices); \ func(has_pooled_eu); \ func(has_rc6); \ func(has_rc6p); \
We no longer have traditional slices on Xe_HP platforms, but the INSTDONE registers are replicated according to gslice representation which is similar. We can mostly re-use the existing instdone code with just a few modifications:
* Create an alternate instdone loop macro that will iterate over the flat DSS space, but still provide the gslice/dss steering values for compatibility with the legacy code.
* We should allocate INSTDONE storage space according to the maximum number of gslices rather than the maximum number of legacy slices to ensure we have enough storage space to hold all of the values. XeHP design has 8 gslices, whereas older platforms never had more than 3 slices.
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 48 +++++++++++--------- drivers/gpu/drm/i915/gt/intel_engine_types.h | 12 ++++- drivers/gpu/drm/i915/gt/intel_sseu.h | 7 +++ drivers/gpu/drm/i915/i915_gpu_error.c | 32 +++++++++---- 4 files changed, 66 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 6e2aa1acc4d4..e1302e9c168b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1181,16 +1181,16 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, u32 mmio_base = engine->mmio_base; int slice; int subslice; + int iter;
memset(instdone, 0, sizeof(*instdone));
- switch (GRAPHICS_VER(i915)) { - default: + if (GRAPHICS_VER(i915) >= 8) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id != RCS0) - break; + return;
instdone->slice_common = intel_uncore_read(uncore, GEN7_SC_INSTDONE); @@ -1200,21 +1200,32 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, instdone->slice_common_extra[1] = intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2); } - for_each_instdone_slice_subslice(i915, sseu, slice, subslice) { - instdone->sampler[slice][subslice] = - read_subslice_reg(engine, slice, subslice, - GEN7_SAMPLER_INSTDONE); - instdone->row[slice][subslice] = - read_subslice_reg(engine, slice, subslice, - GEN7_ROW_INSTDONE); + + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) { + instdone->sampler[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_SAMPLER_INSTDONE); + instdone->row[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_ROW_INSTDONE); + } + } else { + for_each_instdone_slice_subslice(i915, sseu, slice, subslice) { + instdone->sampler[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_SAMPLER_INSTDONE); + instdone->row[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + GEN7_ROW_INSTDONE); + } } - break; - case 7: + } else if (GRAPHICS_VER(i915) >= 7) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id != RCS0) - break; + return;
instdone->slice_common = intel_uncore_read(uncore, GEN7_SC_INSTDONE); @@ -1222,22 +1233,15 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE); instdone->row[0][0] = intel_uncore_read(uncore, GEN7_ROW_INSTDONE); - - break; - case 6: - case 5: - case 4: + } else if (GRAPHICS_VER(i915) >= 4) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); if (engine->id == RCS0) /* HACK: Using the wrong struct member */ instdone->slice_common = intel_uncore_read(uncore, GEN4_INSTDONE1); - break; - case 3: - case 2: + } else { instdone->instdone = intel_uncore_read(uncore, GEN2_INSTDONE); - break; } }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index b25f594a7e4b..e917b7519f2b 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -78,8 +78,8 @@ struct intel_instdone { /* The following exist only in the RCS engine */ u32 slice_common; u32 slice_common_extra[2]; - u32 sampler[I915_MAX_SLICES][I915_MAX_SUBSLICES]; - u32 row[I915_MAX_SLICES][I915_MAX_SUBSLICES]; + u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; + u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; };
/* @@ -586,4 +586,12 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine) for_each_if((instdone_has_slice(dev_priv_, sseu_, slice_)) && \ (instdone_has_subslice(dev_priv_, sseu_, slice_, \ subslice_))) + +#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \ + for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \ + (iter_) < GEN_MAX_SUBSLICES; \ + (iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \ + (dss_) = (iter_) % GEN_DSS_PER_GSLICE) \ + for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_))) + #endif /* __INTEL_ENGINE_TYPES_H__ */ diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 1073471d1980..74487650b08f 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -26,6 +26,9 @@ struct drm_printer; #define GEN_DSS_PER_CSLICE 8 #define GEN_DSS_PER_MSLICE 8
+#define GEN_MAX_GSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_GSLICE) +#define GEN_MAX_CSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_CSLICE) + struct sseu_dev_info { u8 slice_mask; u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE]; @@ -78,6 +81,10 @@ intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice, u8 mask; int ss_idx = subslice / BITS_PER_BYTE;
+ if (slice >= sseu->max_slices || + subslice >= sseu->max_subslices) + return false; + GEM_BUG_ON(ss_idx >= sseu->ss_stride);
mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx]; diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index a2c58b54a592..c1e744b5ab47 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -444,15 +444,29 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, if (GRAPHICS_VER(m->i915) <= 6) return;
- for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice) - err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", - slice, subslice, - ee->instdone.sampler[slice][subslice]); - - for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice) - err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n", - slice, subslice, - ee->instdone.row[slice][subslice]); + if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) { + int iter; + + for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) + err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.sampler[slice][subslice]); + + for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) + err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.row[slice][subslice]); + } else { + for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice) + err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.sampler[slice][subslice]); + + for_each_instdone_slice_subslice(m->i915, sseu, slice, subslice) + err_printf(m, " ROW_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.row[slice][subslice]); + }
if (GRAPHICS_VER(m->i915) < 12) return;
From: Lucas De Marchi lucas.demarchi@intel.com
XeHP SDV is a Intel® dGPU without display. This is just the definition of some basic platform macros, by large a copy of current state of Tigerlake which does not reflect the end state of this platform.
Bspec: 44467, 48077 Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 10 ++++++++++ drivers/gpu/drm/i915/i915_pci.c | 20 ++++++++++++++++++++ drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c02600850246..63bed18a2be7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_DG1(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG1) #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, (IS_ALDERLAKE_P(__i915) && \ IS_GT_STEP(__i915, since, until))
+#define XEHPSDV_REVID_A0 0x0 +#define XEHPSDV_REVID_A1 0x1 +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1 +#define XEHPSDV_REVID_B0 0x4 +#define XEHPSDV_REVID_C0 0x8 + +#define IS_XEHPSDV_REVID(p, since, until) \ + (IS_XEHPSDV(p) && IS_REVID(p, since, until)) + #define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 88b279452b87..046309e95f43 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = { .ppgtt_size = 48, \ .ppgtt_type = INTEL_PPGTT_FULL
+#define XE_HPM_FEATURES \ + .media_ver = 12, \ + .media_ver_release = 50 + +__maybe_unused +static const struct intel_device_info xehpsdv_info = { + XE_HP_FEATURES, + XE_HPM_FEATURES, + DGFX_FEATURES, + PLATFORM(INTEL_XEHPSDV), + .display = { }, + .pipe_mask = 0, + .platform_engine_mask = + BIT(RCS0) | BIT(BCS0) | + BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) | + BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) | + BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7), + .require_force_probe = 1, +}; + #undef PLATFORM
/* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index e8ad14f002c1..7b37b68f4548 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -68,6 +68,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(DG1), PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P), + PLATFORM_NAME(XEHPSDV), }; #undef PLATFORM_NAME
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index f824de632cfe..e8684199b0c9 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -88,6 +88,7 @@ enum intel_platform { INTEL_DG1, INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P, + INTEL_XEHPSDV, INTEL_MAX_PLATFORMS };
On Thu, Jul 01, 2021 at 01:23:50PM -0700, Matt Roper wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
XeHP SDV is a Intel® dGPU without display. This is just the definition of some basic platform macros, by large a copy of current state of Tigerlake which does not reflect the end state of this platform.
Bspec: 44467, 48077 Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com
drivers/gpu/drm/i915/i915_drv.h | 10 ++++++++++ drivers/gpu/drm/i915/i915_pci.c | 20 ++++++++++++++++++++ drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c02600850246..63bed18a2be7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_DG1(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG1) #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, (IS_ALDERLAKE_P(__i915) && \ IS_GT_STEP(__i915, since, until))
+#define XEHPSDV_REVID_A0 0x0 +#define XEHPSDV_REVID_A1 0x1 +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1 +#define XEHPSDV_REVID_B0 0x4 +#define XEHPSDV_REVID_C0 0x8
+#define IS_XEHPSDV_REVID(p, since, until) \
- (IS_XEHPSDV(p) && IS_REVID(p, since, until))
#define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 88b279452b87..046309e95f43 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = { .ppgtt_size = 48, \ .ppgtt_type = INTEL_PPGTT_FULL
+#define XE_HPM_FEATURES \
- .media_ver = 12, \
- .media_ver_release = 50
+__maybe_unused +static const struct intel_device_info xehpsdv_info = {
- XE_HP_FEATURES,
- XE_HPM_FEATURES,
- DGFX_FEATURES,
- PLATFORM(INTEL_XEHPSDV),
- .display = { },
- .pipe_mask = 0,
- .platform_engine_mask =
BIT(RCS0) | BIT(BCS0) |
BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) |
BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) |
BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7),
- .require_force_probe = 1,
+};
#undef PLATFORM
/* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index e8ad14f002c1..7b37b68f4548 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -68,6 +68,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(DG1), PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P),
- PLATFORM_NAME(XEHPSDV),
}; #undef PLATFORM_NAME
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index f824de632cfe..e8684199b0c9 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -88,6 +88,7 @@ enum intel_platform { INTEL_DG1, INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P,
- INTEL_XEHPSDV, INTEL_MAX_PLATFORMS
};
-- 2.25.4
On Thu, 01 Jul 2021, Matt Roper matthew.d.roper@intel.com wrote:
From: Lucas De Marchi lucas.demarchi@intel.com
XeHP SDV is a Intel® dGPU without display. This is just the definition of some basic platform macros, by large a copy of current state of Tigerlake which does not reflect the end state of this platform.
Bspec: 44467, 48077 Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Tomas Winkler tomas.winkler@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/i915_drv.h | 10 ++++++++++ drivers/gpu/drm/i915/i915_pci.c | 20 ++++++++++++++++++++ drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 4 files changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index c02600850246..63bed18a2be7 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1406,6 +1406,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_DG1(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG1) #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) +#define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1564,6 +1565,15 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, (IS_ALDERLAKE_P(__i915) && \ IS_GT_STEP(__i915, since, until))
+#define XEHPSDV_REVID_A0 0x0 +#define XEHPSDV_REVID_A1 0x1 +#define XEHPSDV_REVID_A_LAST XEHPSDV_REVID_A1 +#define XEHPSDV_REVID_B0 0x4 +#define XEHPSDV_REVID_C0 0x8
+#define IS_XEHPSDV_REVID(p, since, until) \
- (IS_XEHPSDV(p) && IS_REVID(p, since, until))
For new platforms we should be using the mechanisms in intel_step.[ch]. As well as converting the old ones.
BR, Jani.
#define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 88b279452b87..046309e95f43 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1020,6 +1020,26 @@ static const struct intel_device_info adl_p_info = { .ppgtt_size = 48, \ .ppgtt_type = INTEL_PPGTT_FULL
+#define XE_HPM_FEATURES \
- .media_ver = 12, \
- .media_ver_release = 50
+__maybe_unused +static const struct intel_device_info xehpsdv_info = {
- XE_HP_FEATURES,
- XE_HPM_FEATURES,
- DGFX_FEATURES,
- PLATFORM(INTEL_XEHPSDV),
- .display = { },
- .pipe_mask = 0,
- .platform_engine_mask =
BIT(RCS0) | BIT(BCS0) |
BIT(VECS0) | BIT(VECS1) | BIT(VECS2) | BIT(VECS3) |
BIT(VCS0) | BIT(VCS1) | BIT(VCS2) | BIT(VCS3) |
BIT(VCS4) | BIT(VCS5) | BIT(VCS6) | BIT(VCS7),
- .require_force_probe = 1,
+};
#undef PLATFORM
/* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index e8ad14f002c1..7b37b68f4548 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -68,6 +68,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(DG1), PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P),
- PLATFORM_NAME(XEHPSDV),
}; #undef PLATFORM_NAME
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index f824de632cfe..e8684199b0c9 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -88,6 +88,7 @@ enum intel_platform { INTEL_DG1, INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P,
- INTEL_XEHPSDV, INTEL_MAX_PLATFORMS
};
From: Matthew Auld matthew.auld@intel.com
Xe_HP no longer has "slices" in the same way that old platforms did. There are new concepts (gslices, cslices, mslices) that apply in various contexts, but for the purposes of fusing slices no longer exist and we just have one large pool of dual-subslices (DSS) to work with. Furthermore, the meaning of the DSS fuse is inverted compared to past platforms --- it now specifies which DSS are enabled rather than which ones are disabled.
Cc: Henryk Napiatek henryk.j.napiatek@intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Matthew Auld matthew.auld@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Radhakrishna Sripada radhakrishna.sripada@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Prasad Nallani prasad.nallani@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_sseu.c | 24 ++++++++++++++++++++---- drivers/gpu/drm/i915/i915_getparam.c | 6 ++++-- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 3 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index bbed8e8625e1..5d1b7d06c96b 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -139,17 +139,33 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * Gen12 has Dual-Subslices, which behave similarly to 2 gen11 SS. * Instead of splitting these, provide userspace with an array * of DSS to more closely represent the hardware resource. + * + * In addition, the concept of slice has been removed in Xe_HP. + * To be compatible with prior generations, assume a single slice + * across the entire device. Then calculate out the DSS for each + * workload type within that software slice. */ intel_sseu_set_info(sseu, 1, 6, 16);
- s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & - GEN11_GT_S_ENA_MASK; + /* + * As mentioned above, Xe_HP does not have the concept of a slice. + * Enable one for software backwards compatibility. + */ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + s_en = 0x1; + else + s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & + GEN11_GT_S_ENA_MASK;
dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
/* one bit per pair of EUs */ - eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & - GEN11_EU_DIS_MASK); + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + eu_en_fuse = intel_uncore_read(uncore, XEHP_EU_ENABLE) & XEHP_EU_ENA_MASK; + else + eu_en_fuse = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & + GEN11_EU_DIS_MASK); + for (eu = 0; eu < sseu->max_eus_per_subslice / 2; eu++) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1); diff --git a/drivers/gpu/drm/i915/i915_getparam.c b/drivers/gpu/drm/i915/i915_getparam.c index 24e18219eb50..e289397d9178 100644 --- a/drivers/gpu/drm/i915/i915_getparam.c +++ b/drivers/gpu/drm/i915/i915_getparam.c @@ -15,7 +15,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, struct pci_dev *pdev = to_pci_dev(dev->dev); const struct sseu_dev_info *sseu = &i915->gt.info.sseu; drm_i915_getparam_t *param = data; - int value; + int value = 0;
switch (param->param) { case I915_PARAM_IRQ_ACTIVE: @@ -150,7 +150,9 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data, return -ENODEV; break; case I915_PARAM_SUBSLICE_MASK: - value = sseu->subslice_mask[0]; + /* Only copy bits from the first slice */ + memcpy(&value, sseu->subslice_mask, + min(sseu->ss_stride, (u8)sizeof(value))); if (!value) return -ENODEV; break; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 43fdf63a2240..9edb58c796e8 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3151,6 +3151,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define GEN12_GT_DSS_ENABLE _MMIO(0x913C)
+#define XEHP_EU_ENABLE _MMIO(0x9134) +#define XEHP_EU_ENA_MASK 0xFF + #define GEN6_BSD_SLEEP_PSMI_CONTROL _MMIO(0x12050) #define GEN6_BSD_SLEEP_MSG_DISABLE (1 << 0) #define GEN6_BSD_SLEEP_FLUSH_DISABLE (1 << 2)
Due to the removal of legacy slices and the transition to a gslice/cslice/mslice/etc. design, we'll internally store all DSS under "slice0."
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_sseu.c | 5 ++++- drivers/gpu/drm/i915/gt/intel_sseu.h | 2 +- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 2 +- 3 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 5d1b7d06c96b..16c0552fcd1d 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -145,7 +145,10 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - intel_sseu_set_info(sseu, 1, 6, 16); + if (IS_XEHPSDV(gt->i915)) + intel_sseu_set_info(sseu, 1, 32, 16); + else + intel_sseu_set_info(sseu, 1, 6, 16);
/* * As mentioned above, Xe_HP does not have the concept of a slice. diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 74487650b08f..204ea6709460 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -16,7 +16,7 @@ struct intel_gt; struct drm_printer;
#define GEN_MAX_SLICES (6) /* CNL upper bound */ -#define GEN_MAX_SUBSLICES (8) /* ICL upper bound */ +#define GEN_MAX_SUBSLICES (32) /* XEHPSDV upper bound */ #define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE) #define GEN_MAX_SUBSLICE_STRIDE GEN_SSEU_STRIDE(GEN_MAX_SUBSLICES) #define GEN_MAX_EUS (16) /* TGL upper bound */ diff --git a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c index 714fe8495775..a424150b052e 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c @@ -53,7 +53,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt, static void gen10_sseu_device_status(struct intel_gt *gt, struct sseu_dev_info *sseu) { -#define SS_MAX 6 +#define SS_MAX 8 struct intel_uncore *uncore = gt->uncore; const struct intel_gt_info *info = >->info; u32 s_reg[SS_MAX], eu_reg[2 * SS_MAX], eu_mask[2];
From: Stuart Summers stuart.summers@intel.com
Starting in XeHP, the concept of slice has been removed in favor of DSS (Dual-Subslice) masks for various workload types. These workloads have been divided into those enabled for geometry and those enabled for compute.
i915 currently maintains a single set of S/SS/EU masks for the device. The goal of this patch set is to minimize the amount of impact to prior generations while still giving the user maximum flexibility.
Bspec: 33117, 33118, 20376 Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Cc: Matt Roper matthew.d.roper@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Steve Hampson steven.t.hampson@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_sseu.c | 73 ++++++++++++++++++++-------- drivers/gpu/drm/i915/gt/intel_sseu.h | 5 +- drivers/gpu/drm/i915/i915_reg.h | 3 +- include/uapi/drm/i915_drm.h | 3 -- 4 files changed, 59 insertions(+), 25 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 16c0552fcd1d..5d3b8dff464c 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice) }
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice, - u32 ss_mask) + u8 *subslice_mask, u32 ss_mask) { int offset = slice * sseu->ss_stride;
- memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride); + memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride); }
unsigned int @@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu) return total; }
-static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, - u8 s_en, u32 ss_en, u16 eu_en) +static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en) +{ + u32 ss_mask; + + ss_mask = ss_en >> (s * sseu->max_subslices); + ss_mask &= GENMASK(sseu->max_subslices - 1, 0); + + return ss_mask; +} + +static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en, + u32 g_ss_en, u32 c_ss_en, u16 eu_en) { int s, ss;
- /* ss_en represents entire subslice mask across all slices */ + /* g_ss_en/c_ss_en represent entire subslice mask across all slices */ GEM_BUG_ON(sseu->max_slices * sseu->max_subslices > - sizeof(ss_en) * BITS_PER_BYTE); + sizeof(g_ss_en) * BITS_PER_BYTE);
for (s = 0; s < sseu->max_slices; s++) { if ((s_en & BIT(s)) == 0) @@ -115,7 +125,23 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
sseu->slice_mask |= BIT(s);
- intel_sseu_set_subslices(sseu, s, ss_en); + /* + * XeHP introduces the concept of compute vs + * geometry DSS. To reduce variation between GENs + * around subslice usage, store a mask for both the + * geometry and compute enabled masks, to provide + * to user space later in QUERY_TOPOLOGY_INFO, and + * compute a total enabled subslice count for the + * purposes of selecting subslices to use in a + * particular GEM context. + */ + intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask, + get_ss_stride_mask(sseu, s, c_ss_en)); + intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask, + get_ss_stride_mask(sseu, s, g_ss_en)); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, + get_ss_stride_mask(sseu, s, + g_ss_en | c_ss_en));
for (ss = 0; ss < sseu->max_subslices; ss++) if (intel_sseu_has_subslice(sseu, s, ss)) @@ -129,7 +155,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) { struct sseu_dev_info *sseu = >->info.sseu; struct intel_uncore *uncore = gt->uncore; - u32 dss_en; + u32 g_dss_en, c_dss_en = 0; u16 eu_en = 0; u8 eu_en_fuse; u8 s_en; @@ -145,10 +171,12 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - if (IS_XEHPSDV(gt->i915)) + if (IS_XEHPSDV(gt->i915)) { intel_sseu_set_info(sseu, 1, 32, 16); - else + sseu->has_compute_dss = 1; + } else { intel_sseu_set_info(sseu, 1, 6, 16); + }
/* * As mentioned above, Xe_HP does not have the concept of a slice. @@ -160,7 +188,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt) s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) & GEN11_GT_S_ENA_MASK;
- dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE); + g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE); + if (sseu->has_compute_dss) + c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
/* one bit per pair of EUs */ if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) @@ -173,7 +203,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) if (eu_en_fuse & BIT(eu)) eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
- gen11_compute_sseu_info(sseu, s_en, dss_en, eu_en); + gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
/* TGL only supports slice-level power gating */ sseu->has_slice_pg = 1; @@ -199,7 +229,7 @@ static void gen11_sseu_info_init(struct intel_gt *gt) eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) & GEN11_EU_DIS_MASK);
- gen11_compute_sseu_info(sseu, s_en, ss_en, eu_en); + gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
/* ICL has no power gating restrictions. */ sseu->has_slice_pg = 1; @@ -260,9 +290,9 @@ static void gen10_sseu_info_init(struct intel_gt *gt) * Slice0 can have up to 3 subslices, but there are only 2 in * slice1/2. */ - intel_sseu_set_subslices(sseu, s, s == 0 ? - subslice_mask_with_eus : - subslice_mask_with_eus & 0x3); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, + s == 0 ? subslice_mask_with_eus : + subslice_mask_with_eus & 0x3); }
sseu->eu_total = compute_eu_total(sseu); @@ -317,7 +347,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt) sseu_set_eus(sseu, 0, 1, ~disabled_mask); }
- intel_sseu_set_subslices(sseu, 0, subslice_mask); + intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
sseu->eu_total = compute_eu_total(sseu);
@@ -373,7 +403,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt) /* skip disabled slice */ continue;
- intel_sseu_set_subslices(sseu, s, subslice_mask); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, + subslice_mask);
eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s)); for (ss = 0; ss < sseu->max_subslices; ss++) { @@ -485,7 +516,8 @@ static void bdw_sseu_info_init(struct intel_gt *gt) /* skip disabled slice */ continue;
- intel_sseu_set_subslices(sseu, s, subslice_mask); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, + subslice_mask);
for (ss = 0; ss < sseu->max_subslices; ss++) { u8 eu_disabled_mask; @@ -583,7 +615,8 @@ static void hsw_sseu_info_init(struct intel_gt *gt) sseu->eu_per_subslice);
for (s = 0; s < sseu->max_slices; s++) { - intel_sseu_set_subslices(sseu, s, subslice_mask); + intel_sseu_set_subslices(sseu, s, sseu->subslice_mask, + subslice_mask);
for (ss = 0; ss < sseu->max_subslices; ss++) { sseu_set_eus(sseu, s, ss, diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.h b/drivers/gpu/drm/i915/gt/intel_sseu.h index 204ea6709460..b383e7d97554 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.h +++ b/drivers/gpu/drm/i915/gt/intel_sseu.h @@ -32,6 +32,8 @@ struct drm_printer; struct sseu_dev_info { u8 slice_mask; u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE]; + u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE]; + u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE]; u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE]; u16 eu_total; u8 eu_per_subslice; @@ -41,6 +43,7 @@ struct sseu_dev_info { u8 has_slice_pg:1; u8 has_subslice_pg:1; u8 has_eu_pg:1; + u8 has_compute_dss:1;
/* Topology fields */ u8 max_slices; @@ -104,7 +107,7 @@ intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice); u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice, - u32 ss_mask); + u8 *subslice_mask, u32 ss_mask);
void intel_sseu_info_init(struct intel_gt *gt);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 9edb58c796e8..0231f42226db 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3149,7 +3149,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define GEN11_GT_SUBSLICE_DISABLE _MMIO(0x913C)
-#define GEN12_GT_DSS_ENABLE _MMIO(0x913C) +#define GEN12_GT_GEOMETRY_DSS_ENABLE _MMIO(0x913C) +#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
#define XEHP_EU_ENABLE _MMIO(0x9134) #define XEHP_EU_ENA_MASK 0xFF diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 2f70c48567c0..7431c7e6e29e 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -2460,9 +2460,6 @@ struct drm_i915_query { * Z / 8] >> (Z % 8)) & 1 */ struct drm_i915_query_topology_info { - /* - * Unused for now. Must be cleared to zero. - */ __u16 flags;
__u16 max_slices;
Define and initialize the MMIO ranges for which XeHP SDV requires MSLICE and LNCF steering.
Bspec: 66534 Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_gt.c | 19 ++++++++++++++++++- drivers/gpu/drm/i915/gt/intel_workarounds.c | 11 +++++++++-- 2 files changed, 27 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index f59bcedbb80b..9d1c99c9c0dd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -89,6 +89,20 @@ static const struct intel_mmio_range icl_l3bank_steering_table[] = { {}, };
+static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = { + { 0x004000, 0x004AFF }, + { 0x00C800, 0x00CFFF }, + { 0x00DD00, 0x00DDFF }, + { 0x00E900, 0x00FFFF }, /* 0xEA00 - OxEFFF is unused */ + {}, +}; + +static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = { + { 0x00B000, 0x00B0FF }, + { 0x00D800, 0x00D8FF }, + {}, +}; + static u16 slicemask(struct intel_gt *gt, int count) { u64 dss_mask = intel_sseu_get_subslices(>->info.sseu, 0); @@ -113,7 +127,10 @@ int intel_gt_init_mmio(struct intel_gt *gt) (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN12_MEML3_EN_MASK);
- if (GRAPHICS_VER(gt->i915) >= 11 && + if (IS_XEHPSDV(gt->i915)) { + gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; + gt->steering_table[LNCF] = xehpsdv_lncf_steering_table; + } else if (GRAPHICS_VER(gt->i915) >= 11 && GRAPHICS_VER_FULL(gt->i915) < IP_VER(12, 50)) { gt->steering_table[L3BANK] = icl_l3bank_steering_table; gt->info.l3bank_mask = diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 060d84897635..4302dc1b728e 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -989,7 +989,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) __add_mcr_wa(i915, wal, slice, subslice); }
-__maybe_unused static void xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) { @@ -1208,10 +1207,18 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) VSUNIT_CLKGATE_DIS_TGL); }
+static void +xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) +{ + xehp_init_mcr(&i915->gt, wal); +} + static void gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal) { - if (IS_DG1(i915)) + if (IS_XEHPSDV(i915)) + xehpsdv_gt_workarounds_init(i915, wal); + else if (IS_DG1(i915)) dg1_gt_workarounds_init(i915, wal); else if (IS_TIGERLAKE(i915)) tgl_gt_workarounds_init(i915, wal);
From: Lucas De Marchi lucas.demarchi@intel.com
Like DG1, XeHP SDV doesn't have LLC/eDRAM control values due to being a dgfx card. XeHP SDV adds 2 more bits: L3_GLBGO to "push the Go point to memory for L3 destined transaction" and L3_LKP to "enable Lookup for uncacheable accesses".
Bspec: 45101 Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Stuart Summers stuart.summers@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_mocs.c | 33 +++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 17848807f111..0c9d0b936c20 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -40,6 +40,8 @@ struct drm_i915_mocs_table { #define L3_ESC(value) ((value) << 0) #define L3_SCC(value) ((value) << 1) #define _L3_CACHEABILITY(value) ((value) << 4) +#define L3_GLBGO(value) ((value) << 6) +#define L3_LKUP(value) ((value) << 7)
/* Helper defines */ #define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */ @@ -314,6 +316,31 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = { MOCS_ENTRY(63, 0, L3_1_UC), };
+static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = { + /* wa_1608975824 */ + MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)), + + /* UC - Coherent; GO:L3 */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)), + /* UC - Non-Coherent; GO:L3 */ + MOCS_ENTRY(4, 0, L3_1_UC), + + /* WB */ + MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)), + + /* HW Reserved - SW program but never use. */ + MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)), + MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)), + MOCS_ENTRY(60, 0, L3_1_UC), + MOCS_ENTRY(61, 0, L3_1_UC), + MOCS_ENTRY(62, 0, L3_1_UC), + MOCS_ENTRY(63, 0, L3_1_UC), +}; + enum { HAS_GLOBAL_MOCS = BIT(0), HAS_ENGINE_MOCS = BIT(1), @@ -340,7 +367,11 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915, { unsigned int flags;
- if (IS_DG1(i915)) { + if (IS_XEHPSDV(i915)) { + table->size = ARRAY_SIZE(xehpsdv_mocs_table); + table->table = xehpsdv_mocs_table; + table->n_entries = GEN9_NUM_MOCS_ENTRIES; + } else if (IS_DG1(i915)) { table->size = ARRAY_SIZE(dg1_mocs_table); table->table = dg1_mocs_table; table->n_entries = GEN9_NUM_MOCS_ENTRIES;
From: Lucas De Marchi lucas.demarchi@intel.com
Instead of maintaining the same if ladder in 3 different places, add a function to read RP_STATE_CAP.
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/debugfs_gt_pm.c | 8 +++----- drivers/gpu/drm/i915/gt/intel_rps.c | 17 ++++++++++++----- drivers/gpu/drm/i915/gt/intel_rps.h | 1 + drivers/gpu/drm/i915/i915_debugfs.c | 8 +++----- 4 files changed, 19 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c index 4270b5a34a83..1061a62bdfce 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c @@ -309,13 +309,11 @@ static int frequency_show(struct seq_file *m, void *unused) int max_freq;
rp_state_limits = intel_uncore_read(uncore, GEN6_RP_STATE_LIMITS); - if (IS_GEN9_LP(i915)) { - rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP); + rp_state_cap = intel_rps_read_state_cap(rps); + if (IS_GEN9_LP(i915)) gt_perf_status = intel_uncore_read(uncore, BXT_GT_PERF_STATUS); - } else { - rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP); + else gt_perf_status = intel_uncore_read(uncore, GEN6_GT_PERF_STATUS); - }
/* RPSTAT1 is in the GT power well */ intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL); diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 06e9a8ed4e03..490bc1513480 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -975,20 +975,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val) static void gen6_rps_init(struct intel_rps *rps) { struct drm_i915_private *i915 = rps_to_i915(rps); - struct intel_uncore *uncore = rps_to_uncore(rps); + u32 rp_state_cap = intel_rps_read_state_cap(rps);
/* All of these values are in units of 50MHz */
/* static values from HW: RP0 > RP1 > RPn (min_freq) */ if (IS_GEN9_LP(i915)) { - u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP); - rps->rp0_freq = (rp_state_cap >> 16) & 0xff; rps->rp1_freq = (rp_state_cap >> 8) & 0xff; rps->min_freq = (rp_state_cap >> 0) & 0xff; } else { - u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP); - rps->rp0_freq = (rp_state_cap >> 0) & 0xff; rps->rp1_freq = (rp_state_cap >> 8) & 0xff; rps->min_freq = (rp_state_cap >> 16) & 0xff; @@ -1936,6 +1932,17 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps) return freq; }
+u32 intel_rps_read_state_cap(struct intel_rps *rps) +{ + struct drm_i915_private *i915 = rps_to_i915(rps); + struct intel_uncore *uncore = rps_to_uncore(rps); + + if (IS_GEN9_LP(i915)) + return intel_uncore_read(uncore, BXT_RP_STATE_CAP); + else + return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); +} + /* External interface for intel_ips.ko */
static struct drm_i915_private __rcu *ips_mchdev; diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h index 1d2cfc98b510..6e06dd61f818 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.h +++ b/drivers/gpu/drm/i915/gt/intel_rps.h @@ -31,6 +31,7 @@ int intel_gpu_freq(struct intel_rps *rps, int val); int intel_freq_opcode(struct intel_rps *rps, int val); u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1); u32 intel_rps_read_actual_frequency(struct intel_rps *rps); +u32 intel_rps_read_state_cap(struct intel_rps *rps);
void gen5_rps_irq_handler(struct intel_rps *rps); void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir); diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index cc745751ac53..6c83da3956b9 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -420,13 +420,11 @@ static int i915_frequency_info(struct seq_file *m, void *unused) int max_freq;
rp_state_limits = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_LIMITS); - if (IS_GEN9_LP(dev_priv)) { - rp_state_cap = intel_uncore_read(&dev_priv->uncore, BXT_RP_STATE_CAP); + rp_state_cap = intel_rps_read_state_cap(rps); + if (IS_GEN9_LP(dev_priv)) gt_perf_status = intel_uncore_read(&dev_priv->uncore, BXT_GT_PERF_STATUS); - } else { - rp_state_cap = intel_uncore_read(&dev_priv->uncore, GEN6_RP_STATE_CAP); + else gt_perf_status = intel_uncore_read(&dev_priv->uncore, GEN6_GT_PERF_STATUS); - }
/* RPSTAT1 is in the GT power well */ intel_uncore_forcewake_get(&dev_priv->uncore, FORCEWAKE_ALL);
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this register is now a per-tile register at GTTMMADDR offset 0x250014.
Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 490bc1513480..8e7b70248392 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps) struct drm_i915_private *i915 = rps_to_i915(rps); struct intel_uncore *uncore = rps_to_uncore(rps);
- if (IS_GEN9_LP(i915)) + if (IS_XEHPSDV(i915)) + return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP); + else if (IS_GEN9_LP(i915)) return intel_uncore_read(uncore, BXT_RP_STATE_CAP); else return intel_uncore_read(uncore, GEN6_RP_STATE_CAP); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 0231f42226db..2992e8585399 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN6_RP_STATE_CAP _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998) #define BXT_RP_STATE_CAP _MMIO(0x138170) #define GEN9_RP_STATE_LIMITS _MMIO(0x138148) +#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014)
/* * Logical Context regs
On Thu, Jul 01, 2021 at 01:23:57PM -0700, Matt Roper wrote:
The RP_STATE_CAP register is no longer part of the MCHBAR on XEHPSDV; this register is now a per-tile register at GTTMMADDR offset 0x250014.
Cc: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com
drivers/gpu/drm/i915/gt/intel_rps.c | 4 +++- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 490bc1513480..8e7b70248392 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -1937,7 +1937,9 @@ u32 intel_rps_read_state_cap(struct intel_rps *rps) struct drm_i915_private *i915 = rps_to_i915(rps); struct intel_uncore *uncore = rps_to_uncore(rps);
- if (IS_GEN9_LP(i915))
- if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
- else if (IS_GEN9_LP(i915)) return intel_uncore_read(uncore, BXT_RP_STATE_CAP); else return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 0231f42226db..2992e8585399 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -4110,6 +4110,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN6_RP_STATE_CAP _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5998) #define BXT_RP_STATE_CAP _MMIO(0x138170) #define GEN9_RP_STATE_LIMITS _MMIO(0x138148) +#define XEHPSDV_RP_STATE_CAP _MMIO(0x250014)
/*
- Logical Context regs
-- 2.25.4
DG2 has Xe_LPD display (version 13) and Xe_HPG (version 12.55) graphics. There are two variants (treated as subplatforms in the code): DG2-G10 and DG2-G11 that require independent programming in some areas (e.g., workarounds).
Bspec: 44472, 44474, 46197, 48028, 48077 Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/i915_drv.h | 27 ++++++++++++++++++++++++ drivers/gpu/drm/i915/i915_pci.c | 16 ++++++++++++++ drivers/gpu/drm/i915/intel_device_info.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 5 +++++ drivers/gpu/drm/i915/intel_step.c | 20 +++++++++++++++++- drivers/gpu/drm/i915/intel_step.h | 1 + 6 files changed, 69 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 63bed18a2be7..828ad607795a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1407,6 +1407,11 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_ALDERLAKE_S(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_S) #define IS_ALDERLAKE_P(dev_priv) IS_PLATFORM(dev_priv, INTEL_ALDERLAKE_P) #define IS_XEHPSDV(dev_priv) IS_PLATFORM(dev_priv, INTEL_XEHPSDV) +#define IS_DG2(dev_priv) IS_PLATFORM(dev_priv, INTEL_DG2) +#define IS_DG2_G10(dev_priv) \ + IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G10) +#define IS_DG2_G11(dev_priv) \ + IS_SUBPLATFORM(dev_priv, INTEL_DG2, INTEL_SUBPLATFORM_G11) #define IS_HSW_EARLY_SDV(dev_priv) (IS_HASWELL(dev_priv) && \ (INTEL_DEVID(dev_priv) & 0xFF00) == 0x0C00) #define IS_BDW_ULT(dev_priv) \ @@ -1574,6 +1579,28 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_XEHPSDV_REVID(p, since, until) \ (IS_XEHPSDV(p) && IS_REVID(p, since, until))
+/* + * DG2 hardware steppings are a bit unusual. The hardware design was forked + * to create two variants (G10 and G11) which have distinct workaround sets. + * The G11 fork of the DG2 design resets the GT stepping back to "A0" for its + * first iteration, even though it's more similar to a G10 B0 stepping in terms + * of functionality and workarounds. However the display stepping does not + * reset in the same manner --- a specific stepping like "B0" has a consistent + * meaning regardless of whether it belongs to a G10 or G11 DG2. + * + * TLDR: All GT workarounds and stepping-specific logic must be applied in + * relation to a specific subplatform (G10 or G11), whereas display workarounds + * and stepping-specific logic will be applied with a general DG2-wide stepping + * number. + */ +#define IS_DG2_GT_STEP(__i915, variant, since, until) \ + (IS_SUBPLATFORM(__i915, INTEL_DG2, INTEL_SUBPLATFORM_##variant) && \ + IS_GT_STEP(__i915, since, until)) + +#define IS_DG2_DISP_STEP(__i915, since, until) \ + (IS_DG2(__i915) && \ + IS_DISPLAY_STEP(__i915, since, until)) + #define IS_LP(dev_priv) (INTEL_INFO(dev_priv)->is_lp) #define IS_GEN9_LP(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && IS_LP(dev_priv)) #define IS_GEN9_BC(dev_priv) (GRAPHICS_VER(dev_priv) == 9 && !IS_LP(dev_priv)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 046309e95f43..a41e1792c0ff 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1040,6 +1040,22 @@ static const struct intel_device_info xehpsdv_info = { .require_force_probe = 1, };
+__maybe_unused +static const struct intel_device_info dg2_info = { + XE_HP_FEATURES, + XE_HPM_FEATURES, + XE_LPD_FEATURES, + DGFX_FEATURES, + .graphics_ver_release = 55, + .media_ver_release = 55, + PLATFORM(INTEL_DG2), + .platform_engine_mask = + BIT(RCS0) | BIT(BCS0) | + BIT(VECS0) | BIT(VECS1) | + BIT(VCS0) | BIT(VCS2), + .require_force_probe = 1, +}; + #undef PLATFORM
/* diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c index 7b37b68f4548..41205dc356b7 100644 --- a/drivers/gpu/drm/i915/intel_device_info.c +++ b/drivers/gpu/drm/i915/intel_device_info.c @@ -69,6 +69,7 @@ static const char * const platform_names[] = { PLATFORM_NAME(ALDERLAKE_S), PLATFORM_NAME(ALDERLAKE_P), PLATFORM_NAME(XEHPSDV), + PLATFORM_NAME(DG2), }; #undef PLATFORM_NAME
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index e8684199b0c9..856f0aa5d68f 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -89,6 +89,7 @@ enum intel_platform { INTEL_ALDERLAKE_S, INTEL_ALDERLAKE_P, INTEL_XEHPSDV, + INTEL_DG2, INTEL_MAX_PLATFORMS };
@@ -107,6 +108,10 @@ enum intel_platform { /* CNL/ICL */ #define INTEL_SUBPLATFORM_PORTF (0)
+/* DG2 */ +#define INTEL_SUBPLATFORM_G10 0 +#define INTEL_SUBPLATFORM_G11 1 + enum intel_ppgtt_type { INTEL_PPGTT_NONE = I915_GEM_PPGTT_NONE, INTEL_PPGTT_ALIASING = I915_GEM_PPGTT_ALIASING, diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index ba9479a67521..30a53d9b76c1 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -54,6 +54,18 @@ static const struct intel_step_info adlp_revid_step_tbl[] = { [0xC] = { .gt_step = STEP_C0, .display_step = STEP_D0 }, };
+static const struct intel_step_info dg2_g10_revid_step_tbl[] = { + [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, + [0x1] = { .gt_step = STEP_A1, .display_step = STEP_A0 }, + [0x4] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, + [0x8] = { .gt_step = STEP_C0, .display_step = STEP_C0 }, +}; + +static const struct intel_step_info dg2_g11_revid_step_tbl[] = { + [0x0] = { .gt_step = STEP_A0, .display_step = STEP_B0 }, + [0x4] = { .gt_step = STEP_B0, .display_step = STEP_C0 }, +}; + void intel_step_init(struct drm_i915_private *i915) { const struct intel_step_info *revids = NULL; @@ -61,7 +73,13 @@ void intel_step_init(struct drm_i915_private *i915) int revid = INTEL_REVID(i915); struct intel_step_info step = {};
- if (IS_ALDERLAKE_P(i915)) { + if (IS_DG2_G10(i915)) { + revids = dg2_g10_revid_step_tbl; + size = ARRAY_SIZE(dg2_g10_revid_step_tbl); + } else if (IS_DG2_G11(i915)) { + revids = dg2_g11_revid_step_tbl; + size = ARRAY_SIZE(dg2_g11_revid_step_tbl); + } else if (IS_ALDERLAKE_P(i915)) { revids = adlp_revid_step_tbl; size = ARRAY_SIZE(adlp_revid_step_tbl); } else if (IS_ALDERLAKE_S(i915)) { diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h index 958a8bb5d677..8efacef6ab31 100644 --- a/drivers/gpu/drm/i915/intel_step.h +++ b/drivers/gpu/drm/i915/intel_step.h @@ -22,6 +22,7 @@ struct intel_step_info { enum intel_step { STEP_NONE = 0, STEP_A0, + STEP_A1, STEP_A2, STEP_B0, STEP_B1,
DG2 supports compute DSS and has the same maximum number of DSS and EU as XeHP SDV.
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_sseu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_sseu.c b/drivers/gpu/drm/i915/gt/intel_sseu.c index 5d3b8dff464c..eaff221db5b0 100644 --- a/drivers/gpu/drm/i915/gt/intel_sseu.c +++ b/drivers/gpu/drm/i915/gt/intel_sseu.c @@ -171,7 +171,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt) * across the entire device. Then calculate out the DSS for each * workload type within that software slice. */ - if (IS_XEHPSDV(gt->i915)) { + if (IS_DG2(gt->i915) || IS_XEHPSDV(gt->i915)) { intel_sseu_set_info(sseu, 1, 32, 16); sseu->has_compute_dss = 1; } else {
The DG2 forcewake table is very similar to the one used by XeHP SDV (and both platforms are even presented as a single table in the bspec). For the most part DG2 starts using a few additional ranges that were 'reserved' on XeHP SDV and stops using some others. However there is a single range (0xd800-0xd87f) that needs to be handled differently between the two platforms (it needs GT wake on XeHP SDV, but render wake on DG2) so unless we want to wake both domains (which could waste power) or define new types of forcewake domains for this special case we need to have separate tables for the two platforms. Let's define the ranges for both platforms with a parameterized macro so that we don't actually need to duplicate everything in the code.
It should be fine for DG2 to re-use the Xe_HP shadow register list so we can continue to use the 'xehpsdv' MMIO write functions and don't need to spin up a separate DG2 instance.
Bspec: 66534 Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 305 +++++++++++++++------------- 1 file changed, 168 insertions(+), 137 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 676b0052f01e..0c35acfcd6da 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1317,143 +1317,170 @@ static const struct intel_forcewake_range __gen12_fw_ranges[] = { 0x1d3f00 - 0x1d3fff: VD2 */ };
-/* *Must* be sorted by offset ranges! See intel_fw_table_check(). */ -static const struct intel_forcewake_range __xehp_fw_ranges[] = { - GEN_FW_RANGE(0x0, 0x1fff, 0), /* - 0x0 - 0xaff: reserved - 0xb00 - 0x1fff: always on */ - GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT), - GEN_FW_RANGE(0x4b00, 0x51ff, 0), /* - 0x4b00 - 0x4fff: reserved - 0x5000 - 0x51ff: always on */ - GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT), - GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8160, 0x81ff, 0), /* - 0x8160 - 0x817f: reserved - 0x8180 - 0x81ff: always on */ - GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT), - GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x8500, 0x94cf, FORCEWAKE_GT), /* - 0x8500 - 0x87ff: gt - 0x8800 - 0x8fff: reserved - 0x9000 - 0x947f: gt - 0x9480 - 0x94cf: reserved */ - GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER), - GEN_FW_RANGE(0x9560, 0x97ff, 0), /* - 0x9560 - 0x95ff: always on - 0x9600 - 0x97ff: reserved */ - GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /* - 0x9800 - 0xb4ff: gt - 0xb500 - 0xbfff: reserved - 0xc000 - 0xcfff: gt */ - GEN_FW_RANGE(0xd000, 0xd7ff, 0), - GEN_FW_RANGE(0xd800, 0xdbff, FORCEWAKE_GT), - GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER), - GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /* - 0xdd00 - 0xddff: gt - 0xde00 - 0xde7f: reserved */ - GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /* - 0xde80 - 0xdfff: render - 0xe000 - 0xe0ff: reserved - 0xe100 - 0xe8ff: render */ - GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /* - 0xe900 - 0xe9ff: gt - 0xea00 - 0xefff: reserved - 0xf000 - 0xffff: gt */ - GEN_FW_RANGE(0x10000, 0x13fff, 0), /* - 0x10000 - 0x11fff: reserved - 0x12000 - 0x127ff: always on - 0x12800 - 0x13fff: reserved */ - GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), - GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), - GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), - GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), - GEN_FW_RANGE(0x14800, 0x1ffff, FORCEWAKE_RENDER), /* - 0x14800 - 0x14fff: render - 0x15000 - 0x16dff: reserved - 0x16e00 - 0x1ffff: render */ - GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /* - 0x20000 - 0x20fff: VD0 - 0x21000 - 0x21fff: reserved */ - GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT), - GEN_FW_RANGE(0x24000, 0x2417f, 0), /* - 0x24000 - 0x2407f: always on - 0x24080 - 0x2417f: reserved */ - GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /* - 0x24180 - 0x241ff: gt - 0x24200 - 0x249ff: reserved */ - GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /* - 0x24a00 - 0x24a7f: render - 0x24a80 - 0x251ff: reserved */ - GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /* - 0x25200 - 0x252ff: gt - 0x25300 - 0x25fff: reserved */ - GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /* - 0x26000 - 0x27fff: render - 0x28000 - 0x29fff: reserved - 0x2a000 - 0x2ffff: undocumented */ - GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT), - GEN_FW_RANGE(0x40000, 0x1bffff, 0), - GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /* - 0x1c0000 - 0x1c2bff: VD0 - 0x1c2c00 - 0x1c2cff: reserved - 0x1c2d00 - 0x1c2dff: VD0 - 0x1c2e00 - 0x1c3eff: reserved - 0x1c3f00 - 0x1c3fff: VD0 */ - GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /* - 0x1c4000 - 0x1c6bff: VD1 - 0x1c6c00 - 0x1c6cff: reserved - 0x1c6d00 - 0x1c6dff: VD1 - 0x1c6e00 - 0x1c7fff: reserved */ - GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /* - 0x1c8000 - 0x1ca0ff: VE0 - 0x1ca100 - 0x1cbfff: reserved */ - GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0), - GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2), - GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4), - GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6), - GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /* - 0x1d0000 - 0x1d2bff: VD2 - 0x1d2c00 - 0x1d2cff: reserved - 0x1d2d00 - 0x1d2dff: VD2 - 0x1d2e00 - 0x1d3eff: reserved - 0x1d3f00 - 0x1d3fff: VD2 */ - GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /* - 0x1d4000 - 0x1d6bff: VD3 - 0x1d6c00 - 0x1d6cff: reserved - 0x1d6d00 - 0x1d6dff: VD3 - 0x1d6e00 - 0x1d7fff: reserved */ - GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /* - 0x1d8000 - 0x1da0ff: VE1 - 0x1da100 - 0x1dffff: reserved */ - GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /* - 0x1e0000 - 0x1e2bff: VD4 - 0x1e2c00 - 0x1e2cff: reserved - 0x1e2d00 - 0x1e2dff: VD4 - 0x1e2e00 - 0x1e3eff: reserved - 0x1e3f00 - 0x1e3fff: VD4 */ - GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /* - 0x1e4000 - 0x1e6bff: VD5 - 0x1e6c00 - 0x1e6cff: reserved - 0x1e6d00 - 0x1e6dff: VD5 - 0x1e6e00 - 0x1e7fff: reserved */ - GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /* - 0x1e8000 - 0x1ea0ff: VE2 - 0x1ea100 - 0x1effff: reserved */ - GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /* - 0x1f0000 - 0x1f2bff: VD6 - 0x1f2c00 - 0x1f2cff: reserved - 0x1f2d00 - 0x1f2dff: VD6 - 0x1f2e00 - 0x1f3eff: reserved - 0x1f3f00 - 0x1f3fff: VD6 */ - GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /* - 0x1f4000 - 0x1f6bff: VD7 - 0x1f6c00 - 0x1f6cff: reserved - 0x1f6d00 - 0x1f6dff: VD7 - 0x1f6e00 - 0x1f7fff: reserved */ +/* + * Graphics IP version 12.55 brings a slight change to the 0xd800 range, + * switching it from the GT domain to the render domain. + * + * *Must* be sorted by offset ranges! See intel_fw_table_check(). + */ +#define XEHP_FWRANGES(FW_RANGE_D800) \ + GEN_FW_RANGE(0x0, 0x1fff, 0), /* \ + 0x0 - 0xaff: reserved \ + 0xb00 - 0x1fff: always on */ \ + GEN_FW_RANGE(0x2000, 0x26ff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x2700, 0x4aff, FORCEWAKE_GT), \ + GEN_FW_RANGE(0x4b00, 0x51ff, 0), /* \ + 0x4b00 - 0x4fff: reserved \ + 0x5000 - 0x51ff: always on */ \ + GEN_FW_RANGE(0x5200, 0x7fff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x8000, 0x813f, FORCEWAKE_GT), \ + GEN_FW_RANGE(0x8140, 0x815f, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x8160, 0x81ff, 0), /* \ + 0x8160 - 0x817f: reserved \ + 0x8180 - 0x81ff: always on */ \ + GEN_FW_RANGE(0x8200, 0x82ff, FORCEWAKE_GT), \ + GEN_FW_RANGE(0x8300, 0x84ff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x8500, 0x8cff, FORCEWAKE_GT), /* \ + 0x8500 - 0x87ff: gt \ + 0x8800 - 0x8c7f: reserved \ + 0x8c80 - 0x8cff: gt (DG2 only) */ \ + GEN_FW_RANGE(0x8d00, 0x8fff, FORCEWAKE_RENDER), /* \ + 0x8d00 - 0x8dff: render (DG2 only) \ + 0x8e00 - 0x8fff: reserved */ \ + GEN_FW_RANGE(0x9000, 0x94cf, FORCEWAKE_GT), /* \ + 0x9000 - 0x947f: gt \ + 0x9480 - 0x94cf: reserved */ \ + GEN_FW_RANGE(0x94d0, 0x955f, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x9560, 0x967f, 0), /* \ + 0x9560 - 0x95ff: always on \ + 0x9600 - 0x967f: reserved */ \ + GEN_FW_RANGE(0x9680, 0x97ff, FORCEWAKE_RENDER), /* \ + 0x9680 - 0x96ff: render (DG2 only) \ + 0x9700 - 0x97ff: reserved */ \ + GEN_FW_RANGE(0x9800, 0xcfff, FORCEWAKE_GT), /* \ + 0x9800 - 0xb4ff: gt \ + 0xb500 - 0xbfff: reserved \ + 0xc000 - 0xcfff: gt */ \ + GEN_FW_RANGE(0xd000, 0xd7ff, 0), \ + GEN_FW_RANGE(0xd800, 0xd87f, FW_RANGE_D800), \ + GEN_FW_RANGE(0xd880, 0xdbff, FORCEWAKE_GT), \ + GEN_FW_RANGE(0xdc00, 0xdcff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0xdd00, 0xde7f, FORCEWAKE_GT), /* \ + 0xdd00 - 0xddff: gt \ + 0xde00 - 0xde7f: reserved */ \ + GEN_FW_RANGE(0xde80, 0xe8ff, FORCEWAKE_RENDER), /* \ + 0xde80 - 0xdfff: render \ + 0xe000 - 0xe0ff: reserved \ + 0xe100 - 0xe8ff: render */ \ + GEN_FW_RANGE(0xe900, 0xffff, FORCEWAKE_GT), /* \ + 0xe900 - 0xe9ff: gt \ + 0xea00 - 0xefff: reserved \ + 0xf000 - 0xffff: gt */ \ + GEN_FW_RANGE(0x10000, 0x12fff, 0), /* \ + 0x10000 - 0x11fff: reserved \ + 0x12000 - 0x127ff: always on \ + 0x12800 - 0x12fff: reserved */ \ + GEN_FW_RANGE(0x13000, 0x131ff, FORCEWAKE_MEDIA_VDBOX0), /* DG2 only */ \ + GEN_FW_RANGE(0x13200, 0x13fff, FORCEWAKE_MEDIA_VDBOX2), /* \ + 0x13200 - 0x133ff: VD2 (DG2 only) \ + 0x13400 - 0x13fff: reserved */ \ + GEN_FW_RANGE(0x14000, 0x141ff, FORCEWAKE_MEDIA_VDBOX0), /* XEHPSDV only */ \ + GEN_FW_RANGE(0x14200, 0x143ff, FORCEWAKE_MEDIA_VDBOX2), /* XEHPSDV only */ \ + GEN_FW_RANGE(0x14400, 0x145ff, FORCEWAKE_MEDIA_VDBOX4), /* XEHPSDV only */ \ + GEN_FW_RANGE(0x14600, 0x147ff, FORCEWAKE_MEDIA_VDBOX6), /* XEHPSDV only */ \ + GEN_FW_RANGE(0x14800, 0x14fff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x15000, 0x16dff, FORCEWAKE_GT), /* \ + 0x15000 - 0x15fff: gt (DG2 only) \ + 0x16000 - 0x16dff: reserved */ \ + GEN_FW_RANGE(0x16e00, 0x1ffff, FORCEWAKE_RENDER), \ + GEN_FW_RANGE(0x20000, 0x21fff, FORCEWAKE_MEDIA_VDBOX0), /* \ + 0x20000 - 0x20fff: VD0 (XEHPSDV only) \ + 0x21000 - 0x21fff: reserved */ \ + GEN_FW_RANGE(0x22000, 0x23fff, FORCEWAKE_GT), \ + GEN_FW_RANGE(0x24000, 0x2417f, 0), /* \ + 0x24000 - 0x2407f: always on \ + 0x24080 - 0x2417f: reserved */ \ + GEN_FW_RANGE(0x24180, 0x249ff, FORCEWAKE_GT), /* \ + 0x24180 - 0x241ff: gt \ + 0x24200 - 0x249ff: reserved */ \ + GEN_FW_RANGE(0x24a00, 0x251ff, FORCEWAKE_RENDER), /* \ + 0x24a00 - 0x24a7f: render \ + 0x24a80 - 0x251ff: reserved */ \ + GEN_FW_RANGE(0x25200, 0x25fff, FORCEWAKE_GT), /* \ + 0x25200 - 0x252ff: gt \ + 0x25300 - 0x25fff: reserved */ \ + GEN_FW_RANGE(0x26000, 0x2ffff, FORCEWAKE_RENDER), /* \ + 0x26000 - 0x27fff: render \ + 0x28000 - 0x29fff: reserved \ + 0x2a000 - 0x2ffff: undocumented */ \ + GEN_FW_RANGE(0x30000, 0x3ffff, FORCEWAKE_GT), \ + GEN_FW_RANGE(0x40000, 0x1bffff, 0), \ + GEN_FW_RANGE(0x1c0000, 0x1c3fff, FORCEWAKE_MEDIA_VDBOX0), /* \ + 0x1c0000 - 0x1c2bff: VD0 \ + 0x1c2c00 - 0x1c2cff: reserved \ + 0x1c2d00 - 0x1c2dff: VD0 \ + 0x1c2e00 - 0x1c3eff: VD0 (DG2 only) \ + 0x1c3f00 - 0x1c3fff: VD0 */ \ + GEN_FW_RANGE(0x1c4000, 0x1c7fff, FORCEWAKE_MEDIA_VDBOX1), /* \ + 0x1c4000 - 0x1c6bff: VD1 \ + 0x1c6c00 - 0x1c6cff: reserved \ + 0x1c6d00 - 0x1c6dff: VD1 \ + 0x1c6e00 - 0x1c7fff: reserved */ \ + GEN_FW_RANGE(0x1c8000, 0x1cbfff, FORCEWAKE_MEDIA_VEBOX0), /* \ + 0x1c8000 - 0x1ca0ff: VE0 \ + 0x1ca100 - 0x1cbfff: reserved */ \ + GEN_FW_RANGE(0x1cc000, 0x1ccfff, FORCEWAKE_MEDIA_VDBOX0), \ + GEN_FW_RANGE(0x1cd000, 0x1cdfff, FORCEWAKE_MEDIA_VDBOX2), \ + GEN_FW_RANGE(0x1ce000, 0x1cefff, FORCEWAKE_MEDIA_VDBOX4), \ + GEN_FW_RANGE(0x1cf000, 0x1cffff, FORCEWAKE_MEDIA_VDBOX6), \ + GEN_FW_RANGE(0x1d0000, 0x1d3fff, FORCEWAKE_MEDIA_VDBOX2), /* \ + 0x1d0000 - 0x1d2bff: VD2 \ + 0x1d2c00 - 0x1d2cff: reserved \ + 0x1d2d00 - 0x1d2dff: VD2 \ + 0x1d2e00 - 0x1d3dff: VD2 (DG2 only) \ + 0x1d3e00 - 0x1d3eff: reserved \ + 0x1d3f00 - 0x1d3fff: VD2 */ \ + GEN_FW_RANGE(0x1d4000, 0x1d7fff, FORCEWAKE_MEDIA_VDBOX3), /* \ + 0x1d4000 - 0x1d6bff: VD3 \ + 0x1d6c00 - 0x1d6cff: reserved \ + 0x1d6d00 - 0x1d6dff: VD3 \ + 0x1d6e00 - 0x1d7fff: reserved */ \ + GEN_FW_RANGE(0x1d8000, 0x1dffff, FORCEWAKE_MEDIA_VEBOX1), /* \ + 0x1d8000 - 0x1da0ff: VE1 \ + 0x1da100 - 0x1dffff: reserved */ \ + GEN_FW_RANGE(0x1e0000, 0x1e3fff, FORCEWAKE_MEDIA_VDBOX4), /* \ + 0x1e0000 - 0x1e2bff: VD4 \ + 0x1e2c00 - 0x1e2cff: reserved \ + 0x1e2d00 - 0x1e2dff: VD4 \ + 0x1e2e00 - 0x1e3eff: reserved \ + 0x1e3f00 - 0x1e3fff: VD4 */ \ + GEN_FW_RANGE(0x1e4000, 0x1e7fff, FORCEWAKE_MEDIA_VDBOX5), /* \ + 0x1e4000 - 0x1e6bff: VD5 \ + 0x1e6c00 - 0x1e6cff: reserved \ + 0x1e6d00 - 0x1e6dff: VD5 \ + 0x1e6e00 - 0x1e7fff: reserved */ \ + GEN_FW_RANGE(0x1e8000, 0x1effff, FORCEWAKE_MEDIA_VEBOX2), /* \ + 0x1e8000 - 0x1ea0ff: VE2 \ + 0x1ea100 - 0x1effff: reserved */ \ + GEN_FW_RANGE(0x1f0000, 0x1f3fff, FORCEWAKE_MEDIA_VDBOX6), /* \ + 0x1f0000 - 0x1f2bff: VD6 \ + 0x1f2c00 - 0x1f2cff: reserved \ + 0x1f2d00 - 0x1f2dff: VD6 \ + 0x1f2e00 - 0x1f3eff: reserved \ + 0x1f3f00 - 0x1f3fff: VD6 */ \ + GEN_FW_RANGE(0x1f4000, 0x1f7fff, FORCEWAKE_MEDIA_VDBOX7), /* \ + 0x1f4000 - 0x1f6bff: VD7 \ + 0x1f6c00 - 0x1f6cff: reserved \ + 0x1f6d00 - 0x1f6dff: VD7 \ + 0x1f6e00 - 0x1f7fff: reserved */ \ GEN_FW_RANGE(0x1f8000, 0x1fa0ff, FORCEWAKE_MEDIA_VEBOX3), + +static const struct intel_forcewake_range __xehp_fw_ranges[] = { + XEHP_FWRANGES(FORCEWAKE_GT) +}; + +static const struct intel_forcewake_range __dg2_fw_ranges[] = { + XEHP_FWRANGES(FORCEWAKE_RENDER) };
static void @@ -2084,7 +2111,11 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) return ret; forcewake_early_sanitize(uncore, 0);
- if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, xehp_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
DG2's replicated register ranges are almost the same at XeHP SDV with the exception of one LNCF sub-range that switches to gslice steering. We can re-use the XeHP SDV mslice steering table and just provide a DG2-specific LNCF steering table.
Bspec: 66534 Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_gt.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 9d1c99c9c0dd..d640fd37792f 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -103,6 +103,12 @@ static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = { {}, };
+static const struct intel_mmio_range dg2_lncf_steering_table[] = { + { 0x00B000, 0x00B0FF }, + { 0x00D880, 0x00D8FF }, + {}, +}; + static u16 slicemask(struct intel_gt *gt, int count) { u64 dss_mask = intel_sseu_get_subslices(>->info.sseu, 0); @@ -127,7 +133,10 @@ int intel_gt_init_mmio(struct intel_gt *gt) (intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) & GEN12_MEML3_EN_MASK);
- if (IS_XEHPSDV(gt->i915)) { + if (IS_DG2(gt->i915)) { + gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; + gt->steering_table[LNCF] = dg2_lncf_steering_table; + } else if (IS_XEHPSDV(gt->i915)) { gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table; gt->steering_table[LNCF] = xehpsdv_lncf_steering_table; } else if (GRAPHICS_VER(gt->i915) >= 11 &&
Although DG2_G10 platforms will always have all SQIDI's present and don't need steering for registers in a SQIDI MMIO range, this isn't true for DG2_G11 platforms; only SQIDI's 2 and 3 can be used on those.
We handle SQIDI ranges a bit differently from other types of explicit steering. The SQIDI ranges belong to either the MCFG unit or the SF unit, both of which have their own dedicated steering registers and do not use the typical 0xFDC steering control that all other types of ranges use. Thus we only need to worry about picking a valid initial value for the MCFG and SF steering registers (0xFD0 and 0xFD8 resepectively) at driver init; they won't change after we set them up so we don't need to worry about re-steering them explicitly at runtime.
Given that any SQIDI value should work fine for DG2-G10 and XeHP SDV, while only values of 2 and 3 are valid for DG2-G11, we'll just initialize the MCFG and SF steering registers to a constant value of "2" for all XeHP-based platforms for simplicity --- that will work in all cases.
Bspec: 66534 Cc: Radhakrishna Sripada radhakrishna.sripada@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 28 +++++++++++++++++---- drivers/gpu/drm/i915/i915_reg.h | 2 ++ 2 files changed, 25 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 4302dc1b728e..f97ff2848122 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -944,17 +944,24 @@ cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); }
-static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal, - unsigned slice, unsigned subslice) +static void __set_mcr_steering(struct i915_wa_list *wal, + i915_reg_t steering_reg, + unsigned int slice, unsigned int subslice) { u32 mcr, mcr_mask;
mcr = GEN11_MCR_SLICE(slice) | GEN11_MCR_SUBSLICE(subslice); mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
- drm_dbg(&i915->drm, "MCR slice/subslice = %x\n", mcr); + wa_write_clr_set(wal, steering_reg, mcr_mask, mcr); +} + +static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal, + unsigned int slice, unsigned int subslice) +{ + drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
- wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); + __set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice); }
static void @@ -1008,7 +1015,6 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) * - L3 Bank (fusable) * - MSLICE (fusable) * - LNCF (sub-unit within mslice; always present if mslice is present) - * - SQIDI (always on) * * We'll do our default/implicit steering based on GSLICE (in the * sliceid field) and DSS (in the subsliceid field). If we can @@ -1058,6 +1064,18 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal) WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
__add_mcr_wa(i915, wal, slice, subslice); + + /* + * SQIDI ranges are special because they use different steering + * registers than everything else we work with. On XeHP SDV and + * DG2-G10, any value in the steering registers will work fine since + * all instances are present, but DG2-G11 only has SQIDI instances at + * ID's 2 and 3, so we need to steer to one of those. For simplicity + * we'll just steer to a hardcoded "2" since that value will work + * everywhere. + */ + __set_mcr_steering(wal, MCFG_MCR_SELECTOR, 0, 2); + __set_mcr_steering(wal, SF_MCR_SELECTOR, 0, 2); }
static void diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 2992e8585399..b19d102e0a01 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2686,6 +2686,8 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_SC_INSTDONE_EXTRA2 _MMIO(0x7108) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define MCFG_MCR_SELECTOR _MMIO(0xfd0) +#define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc) #define GEN8_MCR_SLICE(slice) (((slice) & 3) << 26) #define GEN8_MCR_SLICE_MASK GEN8_MCR_SLICE(3)
From: Akeem G Abodunrin akeem.g.abodunrin@intel.com
New LRI register offsets were introduced for DG2, this patch adds those extra registers, and create new register table for setting offsets to compare with HW generated context image - especially for gt_lrc test. Also updates general purpose register with scratch offset for DG2, in order to use it for live_lrc_fixed selftest.
Cc: Chris P Wilson chris.p.wilson@intel.com Cc: Prathap Kumar Valsan prathap.kumar.valsan@intel.com Signed-off-by: Akeem G Abodunrin akeem.g.abodunrin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_lrc.c | 85 ++++++++++++++++++++++++++++- 1 file changed, 83 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index fee735e2a524..da7ac1d970af 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = { END };
+static const u8 dg2_xcs_offsets[] = { + NOP(1), + LRI(15, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + REG(0x120), + REG(0x124), + + NOP(1), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + END +}; + static const u8 gen8_rcs_offsets[] = { NOP(1), LRI(14, POSTED), @@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = { END };
+static const u8 dg2_rcs_offsets[] = { + NOP(1), + LRI(15, POSTED), + REG16(0x244), + REG(0x034), + REG(0x030), + REG(0x038), + REG(0x03c), + REG(0x168), + REG(0x140), + REG(0x110), + REG(0x1c0), + REG(0x1c4), + REG(0x1c8), + REG(0x180), + REG16(0x2b4), + REG(0x120), + REG(0x124), + + NOP(1), + LRI(9, POSTED), + REG16(0x3a8), + REG16(0x28c), + REG16(0x288), + REG16(0x284), + REG16(0x280), + REG16(0x27c), + REG16(0x278), + REG16(0x274), + REG16(0x270), + + LRI(3, POSTED), + REG(0x1b0), + REG16(0x5a8), + REG16(0x5ac), + + NOP(6), + LRI(1, 0), + REG(0x0c8), + + END +}; + #undef END #undef REG16 #undef REG @@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) !intel_engine_has_relative_mmio(engine));
if (engine->class == RENDER_CLASS) { - if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) + return dg2_rcs_offsets; + else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50)) return xehp_rcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_rcs_offsets; @@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine) else return gen8_rcs_offsets; } else { - if (GRAPHICS_VER(engine->i915) >= 12) + if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55)) + return dg2_xcs_offsets; + else if (GRAPHICS_VER(engine->i915) >= 12) return gen12_xcs_offsets; else if (GRAPHICS_VER(engine->i915) >= 9) return gen9_xcs_offsets;
For tgl+, the per-context setting of MI_MODE[12] determines whether the bits of a nested MI_BATCH_BUFFER_START instruction should be interpreted in the traditional manner or whether they should instead use a new tgl+ meaning that breaks backward compatibility, but allows nesting into 3rd-level batchbuffers. For previous platforms, the hardware default for this register bit is to maintain backward-compatible behavior unless a context intentionally opts into the new behavior; however Xe_HPG flips the hardware default behavior.
From a SW perspective, we want to maintain the backward-compatible
behavior for userspace, so we'll apply a fake workaround to set it back to the legacy behavior on platforms where the hardware default is to break compatibility. At the moment there is no Linux userspace that utilizes third-level batchbuffers, so this will avoid userspace from needing to make any changes. using the legacy meaning is the correct thing to do. If/when we have userspace consumers that want to utilize third-level batch nesting, we can provide a context parameter to allow them to opt-in.
Bspec: 45974, 45718 Cc: John Harrison John.C.Harrison@Intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++++++++++++++++++-- drivers/gpu/drm/i915/i915_reg.h | 1 + 2 files changed, 38 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index f97ff2848122..43db766b0672 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -686,6 +686,37 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine, DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE); }
+static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine, + struct i915_wa_list *wal) +{ + /* + * This is a "fake" workaround defined by software to ensure we + * maintain reliable, backward-compatible behavior for userspace with + * regards to how nested MI_BATCH_BUFFER_START commands are handled. + * + * The per-context setting of MI_MODE[12] determines whether the bits + * of a nested MI_BATCH_BUFFER_START instruction should be interpreted + * in the traditional manner or whether they should instead use a new + * tgl+ meaning that breaks backward compatibility, but allows nesting + * into 3rd-level batchbuffers. When this new capability was first + * added in TGL, it remained off by default unless a context + * intentionally opted in to the new behavior. However Xe_HPG now + * flips this on by default and requires that we explicitly opt out if + * we don't want the new behavior. + * + * From a SW perspective, we want to maintain the backward-compatible + * behavior for userspace, so we'll apply a fake workaround to set it + * back to the legacy behavior on platforms where the hardware default + * is to break compatibility. At the moment there is no Linux + * userspace that utilizes third-level batchbuffers, so this will avoid + * userspace from needing to make any changes. using the legacy + * meaning is the correct thing to do. If/when we have userspace + * consumers that want to utilize third-level batch nesting, we can + * provide a context parameter to allow them to opt-in. + */ + wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN); +} + static void __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, struct i915_wa_list *wal, @@ -693,11 +724,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, { struct drm_i915_private *i915 = engine->i915;
+ wa_init_start(wal, name, engine->name); + + /* Applies to all engines */ + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) + fakewa_disable_nestedbb_mode(engine, wal); + if (engine->class != RENDER_CLASS) return;
- wa_init_start(wal, name, engine->name); - if (IS_DG1(i915)) dg1_ctx_workarounds_init(engine, wal); else if (GRAPHICS_VER(i915) == 12) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index b19d102e0a01..35a42df1f2aa 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define MI_MODE _MMIO(0x209c) # define VS_TIMER_DISPATCH (1 << 6) # define MI_FLUSH_ENABLE (1 << 12) +# define TGL_NESTED_BB_EN (1 << 12) # define ASYNC_FLIP_PERF_DISABLE (1 << 14) # define MODE_IDLE (1 << 9) # define STOP_RING (1 << 8)
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team has indicated that having these reported in the error state would be useful for debugging GPU hangs. These registers are replicated per-DSS with gslice steering.
Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 +++++++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 3 +++ drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++-- drivers/gpu/drm/i915/i915_reg.h | 1 + 4 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index e1302e9c168b..b3c002e4ae9f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1220,6 +1220,13 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, GEN7_ROW_INSTDONE); } } + + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { + for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) + instdone->geom_svg[slice][subslice] = + read_subslice_reg(engine, slice, subslice, + XEHPG_INSTDONE_GEOM_SVG); + } } else if (GRAPHICS_VER(i915) >= 7) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base)); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index e917b7519f2b..93609d797ac2 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -80,6 +80,9 @@ struct intel_instdone { u32 slice_common_extra[2]; u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; + + /* Added in XeHPG */ + u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; };
/* diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c1e744b5ab47..4de7edc451ef 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -431,6 +431,7 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu; int slice; int subslice; + int iter;
err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone.instdone); @@ -445,8 +446,6 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, return;
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) { - int iter; - for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", slice, subslice, @@ -471,6 +470,13 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, if (GRAPHICS_VER(m->i915) < 12) return;
+ if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) { + for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) + err_printf(m, " GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n", + slice, subslice, + ee->instdone.geom_svg[slice][subslice]); + } + err_printf(m, " SC_INSTDONE_EXTRA: 0x%08x\n", ee->instdone.slice_common_extra[0]); err_printf(m, " SC_INSTDONE_EXTRA2: 0x%08x\n", diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 35a42df1f2aa..d58864c7adc6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_SC_INSTDONE_EXTRA2 _MMIO(0x7108) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define XEHPG_INSTDONE_GEOM_SVG _MMIO(0x666c) #define MCFG_MCR_SELECTOR _MMIO(0xfd0) #define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc)
On 01/07/2021 23:24, Matt Roper wrote:
Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team has indicated that having these reported in the error state would be useful for debugging GPU hangs. These registers are replicated per-DSS with gslice steering.
Cc: Lionel Landwerlin lionel.g.landwerlin@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
Thanks,
Acked-by: Lionel Landwerlin lionel.g.landwerlin@intel.com
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 7 +++++++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 3 +++ drivers/gpu/drm/i915/i915_gpu_error.c | 10 ++++++++-- drivers/gpu/drm/i915/i915_reg.h | 1 + 4 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index e1302e9c168b..b3c002e4ae9f 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -1220,6 +1220,13 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine, GEN7_ROW_INSTDONE); } }
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
instdone->geom_svg[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
XEHPG_INSTDONE_GEOM_SVG);
} else if (GRAPHICS_VER(i915) >= 7) { instdone->instdone = intel_uncore_read(uncore, RING_INSTDONE(mmio_base));}
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index e917b7519f2b..93609d797ac2 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -80,6 +80,9 @@ struct intel_instdone { u32 slice_common_extra[2]; u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
/* Added in XeHPG */
u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES]; };
/*
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c index c1e744b5ab47..4de7edc451ef 100644 --- a/drivers/gpu/drm/i915/i915_gpu_error.c +++ b/drivers/gpu/drm/i915/i915_gpu_error.c @@ -431,6 +431,7 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, const struct sseu_dev_info *sseu = &ee->engine->gt->info.sseu; int slice; int subslice;
int iter;
err_printf(m, " INSTDONE: 0x%08x\n", ee->instdone.instdone);
@@ -445,8 +446,6 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, return;
if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 50)) {
int iter;
- for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice) err_printf(m, " SAMPLER_INSTDONE[%d][%d]: 0x%08x\n", slice, subslice,
@@ -471,6 +470,13 @@ static void error_print_instdone(struct drm_i915_error_state_buf *m, if (GRAPHICS_VER(m->i915) < 12) return;
- if (GRAPHICS_VER_FULL(m->i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(m->i915, sseu, iter, slice, subslice)
err_printf(m, " GEOM_SVGUNIT_INSTDONE[%d][%d]: 0x%08x\n",
slice, subslice,
ee->instdone.geom_svg[slice][subslice]);
- }
- err_printf(m, " SC_INSTDONE_EXTRA: 0x%08x\n", ee->instdone.slice_common_extra[0]); err_printf(m, " SC_INSTDONE_EXTRA2: 0x%08x\n",
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 35a42df1f2aa..d58864c7adc6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2686,6 +2686,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define GEN12_SC_INSTDONE_EXTRA2 _MMIO(0x7108) #define GEN7_SAMPLER_INSTDONE _MMIO(0xe160) #define GEN7_ROW_INSTDONE _MMIO(0xe164) +#define XEHPG_INSTDONE_GEOM_SVG _MMIO(0x666c) #define MCFG_MCR_SELECTOR _MMIO(0xfd0) #define SF_MCR_SELECTOR _MMIO(0xfd8) #define GEN8_MCR_SELECTOR _MMIO(0xfdc)
Bspec: 45101, 45427 Cc: Ramalingam C ramalingam.c@intel.com(v5) Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/gt/intel_mocs.c | 35 +++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_mocs.c b/drivers/gpu/drm/i915/gt/intel_mocs.c index 0c9d0b936c20..d22ca8212092 100644 --- a/drivers/gpu/drm/i915/gt/intel_mocs.c +++ b/drivers/gpu/drm/i915/gt/intel_mocs.c @@ -341,6 +341,30 @@ static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = { MOCS_ENTRY(63, 0, L3_1_UC), };
+static const struct drm_i915_mocs_entry dg2_mocs_table[] = { + /* UC - Coherent; GO:L3 */ + MOCS_ENTRY(0, 0, L3_1_UC | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)), + + /* WB - LC */ + MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)), +}; + +static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = { + /* Wa_14011441408: Set Go to Memory for MOCS#0 */ + MOCS_ENTRY(0, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Coherent; GO:Memory */ + MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)), + /* UC - Non-Coherent; GO:Memory */ + MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)), + + /* WB - LC */ + MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)), +}; + enum { HAS_GLOBAL_MOCS = BIT(0), HAS_ENGINE_MOCS = BIT(1), @@ -367,7 +391,16 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915, { unsigned int flags;
- if (IS_XEHPSDV(i915)) { + if (IS_DG2(i915)) { + if (IS_DG2_GT_STEP(i915, G10, STEP_A0, (STEP_B0 - 1))) { + table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax); + table->table = dg2_mocs_table_g10_ax; + } else { + table->size = ARRAY_SIZE(dg2_mocs_table); + table->table = dg2_mocs_table; + } + table->n_entries = GEN9_NUM_MOCS_ENTRIES; + } else if (IS_XEHPSDV(i915)) { table->size = ARRAY_SIZE(xehpsdv_mocs_table); table->table = xehpsdv_mocs_table; table->n_entries = GEN9_NUM_MOCS_ENTRIES;
As with DG1, DG2 has an ICL-style south display interface provided on the same PCI device. Add a fake PCH to ensure DG2 takes the appropriate codepaths for south display handling.
Bspec: 54871, 50062, 49961, 53673 Cc: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Aditya Swarup aditya.swarup@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com --- drivers/gpu/drm/i915/i915_irq.c | 2 +- drivers/gpu/drm/i915/intel_pch.c | 3 +++ drivers/gpu/drm/i915/intel_pch.h | 2 ++ 3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 9d47ffa39093..34a0d49e760e 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -208,7 +208,7 @@ static void intel_hpd_init_pins(struct drm_i915_private *dev_priv) (!HAS_PCH_SPLIT(dev_priv) || HAS_PCH_NOP(dev_priv))) return;
- if (HAS_PCH_DG1(dev_priv)) + if (INTEL_PCH_TYPE(dev_priv) >= PCH_DG1) hpd->pch_hpd = hpd_sde_dg1; else if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP) hpd->pch_hpd = hpd_icp; diff --git a/drivers/gpu/drm/i915/intel_pch.c b/drivers/gpu/drm/i915/intel_pch.c index 4e92ae19189e..cc44164e242b 100644 --- a/drivers/gpu/drm/i915/intel_pch.c +++ b/drivers/gpu/drm/i915/intel_pch.c @@ -211,6 +211,9 @@ void intel_detect_pch(struct drm_i915_private *dev_priv) if (IS_DG1(dev_priv)) { dev_priv->pch_type = PCH_DG1; return; + } else if (IS_DG2(dev_priv)) { + dev_priv->pch_type = PCH_DG2; + return; }
/* diff --git a/drivers/gpu/drm/i915/intel_pch.h b/drivers/gpu/drm/i915/intel_pch.h index e2f3f30c6445..7c0d83d292dc 100644 --- a/drivers/gpu/drm/i915/intel_pch.h +++ b/drivers/gpu/drm/i915/intel_pch.h @@ -30,6 +30,7 @@ enum intel_pch {
/* Fake PCHs, functionality handled on the same PCI dev */ PCH_DG1 = 1024, + PCH_DG2, };
#define INTEL_PCH_DEVICE_ID_MASK 0xff80 @@ -62,6 +63,7 @@ enum intel_pch {
#define INTEL_PCH_TYPE(dev_priv) ((dev_priv)->pch_type) #define INTEL_PCH_ID(dev_priv) ((dev_priv)->pch_id) +#define HAS_PCH_DG2(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_DG2) #define HAS_PCH_ADP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_ADP) #define HAS_PCH_DG1(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_DG1) #define HAS_PCH_JSP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_JSP)
On Thu, Jul 01, 2021 at 01:24:07PM -0700, Matt Roper wrote:
As with DG1, DG2 has an ICL-style south display interface provided on the same PCI device. Add a fake PCH to ensure DG2 takes the appropriate codepaths for south display handling.
Bspec: 54871, 50062, 49961, 53673 Cc: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Aditya Swarup aditya.swarup@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
drivers/gpu/drm/i915/i915_irq.c | 2 +- drivers/gpu/drm/i915/intel_pch.c | 3 +++ drivers/gpu/drm/i915/intel_pch.h | 2 ++ 3 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 9d47ffa39093..34a0d49e760e 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -208,7 +208,7 @@ static void intel_hpd_init_pins(struct drm_i915_private *dev_priv) (!HAS_PCH_SPLIT(dev_priv) || HAS_PCH_NOP(dev_priv))) return;
- if (HAS_PCH_DG1(dev_priv))
- if (INTEL_PCH_TYPE(dev_priv) >= PCH_DG1) hpd->pch_hpd = hpd_sde_dg1; else if (INTEL_PCH_TYPE(dev_priv) >= PCH_ICP) hpd->pch_hpd = hpd_icp;
diff --git a/drivers/gpu/drm/i915/intel_pch.c b/drivers/gpu/drm/i915/intel_pch.c index 4e92ae19189e..cc44164e242b 100644 --- a/drivers/gpu/drm/i915/intel_pch.c +++ b/drivers/gpu/drm/i915/intel_pch.c @@ -211,6 +211,9 @@ void intel_detect_pch(struct drm_i915_private *dev_priv) if (IS_DG1(dev_priv)) { dev_priv->pch_type = PCH_DG1; return;
} else if (IS_DG2(dev_priv)) {
dev_priv->pch_type = PCH_DG2;
return;
}
/*
diff --git a/drivers/gpu/drm/i915/intel_pch.h b/drivers/gpu/drm/i915/intel_pch.h index e2f3f30c6445..7c0d83d292dc 100644 --- a/drivers/gpu/drm/i915/intel_pch.h +++ b/drivers/gpu/drm/i915/intel_pch.h @@ -30,6 +30,7 @@ enum intel_pch {
/* Fake PCHs, functionality handled on the same PCI dev */ PCH_DG1 = 1024,
- PCH_DG2,
};
#define INTEL_PCH_DEVICE_ID_MASK 0xff80 @@ -62,6 +63,7 @@ enum intel_pch {
#define INTEL_PCH_TYPE(dev_priv) ((dev_priv)->pch_type) #define INTEL_PCH_ID(dev_priv) ((dev_priv)->pch_id) +#define HAS_PCH_DG2(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_DG2) #define HAS_PCH_ADP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_ADP) #define HAS_PCH_DG1(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_DG1)
#define HAS_PCH_JSP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_JSP)
2.25.4
Note that DG2 only has a single possible refclk frequency (38.4 MHz).
Bspec: 54034 Cc: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_cdclk.c | 24 ++++++++++++++++++++-- 1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c index 613ffcc68eba..08f34d87684f 100644 --- a/drivers/gpu/drm/i915/display/intel_cdclk.c +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c @@ -1290,6 +1290,18 @@ static const struct intel_cdclk_vals adlp_cdclk_table[] = { {} };
+static const struct intel_cdclk_vals dg2_cdclk_table[] = { + { .refclk = 38400, .cdclk = 172800, .divider = 2, .ratio = 9 }, + { .refclk = 38400, .cdclk = 179200, .divider = 3, .ratio = 14 }, + { .refclk = 38400, .cdclk = 192000, .divider = 2, .ratio = 10 }, + { .refclk = 38400, .cdclk = 192000, .divider = 3, .ratio = 15 }, + { .refclk = 38400, .cdclk = 307200, .divider = 2, .ratio = 16 }, + { .refclk = 38400, .cdclk = 326400, .divider = 4, .ratio = 34 }, + { .refclk = 38400, .cdclk = 556800, .divider = 2, .ratio = 29 }, + { .refclk = 38400, .cdclk = 652800, .divider = 2, .ratio = 34 }, + {} +}; + static int bxt_calc_cdclk(struct drm_i915_private *dev_priv, int min_cdclk) { const struct intel_cdclk_vals *table = dev_priv->cdclk.table; @@ -1408,7 +1420,9 @@ static void bxt_de_pll_readout(struct drm_i915_private *dev_priv, { u32 val, ratio;
- if (DISPLAY_VER(dev_priv) >= 11) + if (IS_DG2(dev_priv)) + cdclk_config->ref = 38400; + else if (DISPLAY_VER(dev_priv) >= 11) icl_readout_refclk(dev_priv, cdclk_config); else if (IS_CANNONLAKE(dev_priv)) cnl_readout_refclk(dev_priv, cdclk_config); @@ -2878,7 +2892,13 @@ u32 intel_read_rawclk(struct drm_i915_private *dev_priv) */ void intel_init_cdclk_hooks(struct drm_i915_private *dev_priv) { - if (IS_ALDERLAKE_P(dev_priv)) { + if (IS_DG2(dev_priv)) { + dev_priv->display.set_cdclk = bxt_set_cdclk; + dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk; + dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk; + dev_priv->display.calc_voltage_level = tgl_calc_voltage_level; + dev_priv->cdclk.table = dg2_cdclk_table; + } else if (IS_ALDERLAKE_P(dev_priv)) { dev_priv->display.set_cdclk = bxt_set_cdclk; dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk; dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk;
DG2 has no shared DPLL's or DDI clock muxing. The Port PLL is embedded within the PHY.
Bspec: 54032 Bspec: 54034 Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: Mohammed Khajapasha mohammed.khajapasha@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display.c | 10 +++++++--- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 5 ++++- 2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 026c28c612f0..c673d0c8fb4a 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3474,7 +3474,8 @@ static void icl_ddi_bigjoiner_pre_enable(struct intel_atomic_state *state, * Enable sequence steps 1-7 on bigjoiner master */ intel_encoders_pre_pll_enable(state, master); - intel_enable_shared_dpll(master_crtc_state); + if (master_crtc_state->shared_dpll) + intel_enable_shared_dpll(master_crtc_state); intel_encoders_pre_enable(state, master);
/* and DSC on slave */ @@ -8633,10 +8634,11 @@ intel_pipe_config_compare(const struct intel_crtc_state *current_config,
PIPE_CONF_CHECK_BOOL(double_wide);
- PIPE_CONF_CHECK_P(shared_dpll); + if (dev_priv->dpll.mgr) + PIPE_CONF_CHECK_P(shared_dpll);
/* FIXME do the readout properly and get rid of this quirk */ - if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { + if (dev_priv->dpll.mgr && !PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { PIPE_CONF_CHECK_X(dpll_hw_state.dpll); PIPE_CONF_CHECK_X(dpll_hw_state.dpll_md); PIPE_CONF_CHECK_X(dpll_hw_state.fp0); @@ -8668,7 +8670,9 @@ intel_pipe_config_compare(const struct intel_crtc_state *current_config, PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_ssc); PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_bias); PIPE_CONF_CHECK_X(dpll_hw_state.mg_pll_tdc_coldst_bias); + }
+ if (!PIPE_CONF_QUIRK(PIPE_CONFIG_QUIRK_BIGJOINER_SLAVE)) { PIPE_CONF_CHECK_X(dsi_pll.ctrl); PIPE_CONF_CHECK_X(dsi_pll.div);
diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c index 882bfd499e55..5688d9704636 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c @@ -4462,7 +4462,10 @@ void intel_shared_dpll_init(struct drm_device *dev) const struct dpll_info *dpll_info; int i;
- if (IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + /* No shared DPLLs on DG2; port PLLs are part of the PHY */ + dpll_mgr = NULL; + else if (IS_ALDERLAKE_P(dev_priv)) dpll_mgr = &adlp_pll_mgr; else if (IS_ALDERLAKE_S(dev_priv)) dpll_mgr = &adls_pll_mgr;
On DG2 we're supposed to just wait 600us after programming the well before moving on; there won't be an ack from the hardware.
Bspec: 49296 Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../gpu/drm/i915/display/intel_display_power.c | 16 ++++++++++++++++ .../gpu/drm/i915/display/intel_display_power.h | 6 ++++++ 2 files changed, 22 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 285380079aab..c34ff0947b85 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -341,6 +341,17 @@ static void hsw_wait_for_power_well_enable(struct drm_i915_private *dev_priv, { const struct i915_power_well_regs *regs = power_well->desc->hsw.regs; int pw_idx = power_well->desc->hsw.idx; + int enable_delay = power_well->desc->hsw.fixed_enable_delay; + + /* + * For some power wells we're not supposed to watch the status bit for + * an ack, but rather just wait a fixed amount of time and then + * proceed. This is only used on DG2. + */ + if (IS_DG2(dev_priv) && enable_delay) { + usleep_range(enable_delay, 2 * enable_delay); + return; + }
/* Timeout for PW1:10 us, AUX:not specified, other PWs:20 us. */ if (intel_de_wait_for_set(dev_priv, regs->driver, @@ -4828,6 +4839,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_A, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4838,6 +4850,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_B, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4848,6 +4861,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = ICL_PW_CTL_IDX_AUX_C, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4858,6 +4872,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = XELPD_PW_CTL_IDX_AUX_D, + .hsw.fixed_enable_delay = 600, }, }, { @@ -4878,6 +4893,7 @@ static const struct i915_power_well_desc xelpd_power_wells[] = { { .hsw.regs = &icl_aux_power_well_regs, .hsw.idx = TGL_PW_CTL_IDX_AUX_TC1, + .hsw.fixed_enable_delay = 600, }, }, { diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 4f0917df4375..22367b5cba96 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -223,6 +223,12 @@ struct i915_power_well_desc { u8 idx; /* Mask of pipes whose IRQ logic is backed by the pw */ u8 irq_pipe_mask; + /* + * Instead of waiting for the status bit to ack enables, + * just wait a specific amount of time and then consider + * the well enabled. + */ + u16 fixed_enable_delay; /* The pw is backing the VGA functionality */ bool has_vga:1; bool has_fuses:1;
DG2 has outputs on DDI A-D attached to what the bspec diagram shows as "Combo PHY A-D." Note that despite being labelled "combo" the PHYs on these outputs are Synopsys PHYs rather than traditional Intel combo PHY technology.
Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index c673d0c8fb4a..dc2b943a4e72 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -11329,7 +11329,12 @@ static void intel_setup_outputs(struct drm_i915_private *dev_priv) if (!HAS_DISPLAY(dev_priv)) return;
- if (IS_ALDERLAKE_P(dev_priv)) { + if (IS_DG2(dev_priv)) { + intel_ddi_init(dev_priv, PORT_A); + intel_ddi_init(dev_priv, PORT_B); + intel_ddi_init(dev_priv, PORT_C); + intel_ddi_init(dev_priv, PORT_D_XELPD); + } else if (IS_ALDERLAKE_P(dev_priv)) { intel_ddi_init(dev_priv, PORT_A); intel_ddi_init(dev_priv, PORT_B); intel_ddi_init(dev_priv, PORT_TC1);
DG2 extends our DDB to four DBuf slices; pipes A+B only have access to the first two slices, whereas pipes C+D only have access to the second two.
Confusingly, our bspec decided to switch from 1-based numbering of dbuf slices (S1, S2) to 0-based numbering (S0, S1, S2, S3) in Display13. At the moment we're using the 0-based number scheme for the DBUF_CTL_S() register addressing, but the 1-based number scheme in the actual slice assignment tables. We may want to consider switching the assignment over to 0-based numbering too at some point...
Bspec: 49255 Bspec: 50057 Cc: Stanislav Lisovskiy stanislav.lisovskiy@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../drm/i915/display/intel_display_power.h | 4 + drivers/gpu/drm/i915/intel_pm.c | 120 +++++++++++++++++- 2 files changed, 123 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.h b/drivers/gpu/drm/i915/display/intel_display_power.h index 22367b5cba96..ad788bbd727d 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.h +++ b/drivers/gpu/drm/i915/display/intel_display_power.h @@ -392,6 +392,10 @@ intel_display_power_put_all_in_set(struct drm_i915_private *i915, intel_display_power_put_mask_in_set(i915, power_domain_set, power_domain_set->mask); }
+/* + * FIXME: We should probably switch this to a 0-based scheme to be consistent + * with how we now name/number DBUF_CTL instances. + */ enum dbuf_slice { DBUF_S1, DBUF_S2, diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 5fdb96e7d266..ff8d89fff502 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4584,6 +4584,117 @@ static const struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {} };
+static const struct dbuf_slice_conf_entry dg2_allowed_dbufs[] = { + { + .active_pipes = BIT(PIPE_A), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_B), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + }, + }, + { + .active_pipes = BIT(PIPE_C), + .dbuf_mask = { + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_B) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_D), + .dbuf_mask = { + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_B) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + [PIPE_D] = BIT(DBUF_S3) | BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_C) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_C) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_B] = BIT(DBUF_S1) | BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + { + .active_pipes = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C) | BIT(PIPE_D), + .dbuf_mask = { + [PIPE_A] = BIT(DBUF_S1), + [PIPE_B] = BIT(DBUF_S2), + [PIPE_C] = BIT(DBUF_S3), + [PIPE_D] = BIT(DBUF_S4), + }, + }, + {} +}; + static const struct dbuf_slice_conf_entry adlp_allowed_dbufs[] = { { .active_pipes = BIT(PIPE_A), @@ -4759,12 +4870,19 @@ static u32 adlp_compute_dbuf_slices(enum pipe pipe, u32 active_pipes) return compute_dbuf_slices(pipe, active_pipes, adlp_allowed_dbufs); }
+static u32 dg2_compute_dbuf_slices(enum pipe pipe, u32 active_pipes) +{ + return compute_dbuf_slices(pipe, active_pipes, dg2_allowed_dbufs); +} + static u8 skl_compute_dbuf_slices(struct intel_crtc *crtc, u8 active_pipes) { struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); enum pipe pipe = crtc->pipe;
- if (IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + return dg2_compute_dbuf_slices(pipe, active_pipes); + else if (IS_ALDERLAKE_P(dev_priv)) return adlp_compute_dbuf_slices(pipe, active_pipes); else if (DISPLAY_VER(dev_priv) == 12) return tgl_compute_dbuf_slices(pipe, active_pipes);
Although the BW_BUDDY registers still exist, they are not used for anything on DG2. This change is expected to hold true for future dgpu's too.
Bspec: 49218 Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display_power.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index c34ff0947b85..df6358638fee 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -5814,6 +5814,10 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv) unsigned long abox_mask = INTEL_INFO(dev_priv)->abox_mask; int config, i;
+ /* BW_BUDDY registers are not used on dgpu's beyond DG1 */ + if (IS_DGFX(dev_priv) && !IS_DG1(dev_priv)) + return; + if (IS_ALDERLAKE_S(dev_priv) || IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) || IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
DG2 does not use system DRAM information for BW_BUDDY programming or watermark workarounds, so there's no need to read this out at startup.
Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_dram.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_dram.c b/drivers/gpu/drm/i915/intel_dram.c index 879b0f007be3..9675bb94b70b 100644 --- a/drivers/gpu/drm/i915/intel_dram.c +++ b/drivers/gpu/drm/i915/intel_dram.c @@ -494,15 +494,15 @@ void intel_dram_detect(struct drm_i915_private *i915) struct dram_info *dram_info = &i915->dram_info; int ret;
+ if (GRAPHICS_VER(i915) < 9 || IS_DG2(i915) || !HAS_DISPLAY(i915)) + return; + /* * Assume level 0 watermark latency adjustment is needed until proven * otherwise, this w/a is not needed by bxt/glk. */ dram_info->wm_lv_0_adjust_needed = !IS_GEN9_LP(i915);
- if (GRAPHICS_VER(i915) < 9 || !HAS_DISPLAY(i915)) - return; - if (GRAPHICS_VER(i915) >= 12) ret = gen12_get_dram_info(i915); else if (GRAPHICS_VER(i915) >= 11)
DG2 doesn't have a SAGV or QGV points that determine memory bandwidth. Instead it has a constant amount of memory bandwidth available to display that does not need to be reduced based on the number of active planes.
For simplicity, we'll just modify driver initialization to create a single dummy QGV point with the proper amount of memory bandwidth, rather than trying to query the pcode for this information.
Bspec: 64631 Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_bw.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c index bfb398f0432e..4ca83874d0aa 100644 --- a/drivers/gpu/drm/i915/display/intel_bw.c +++ b/drivers/gpu/drm/i915/display/intel_bw.c @@ -234,6 +234,26 @@ static int icl_get_bw_info(struct drm_i915_private *dev_priv, const struct intel return 0; }
+static void dg2_get_bw_info(struct drm_i915_private *i915) +{ + struct intel_bw_info *bi = &i915->max_bw[0]; + + /* + * DG2 doesn't have SAGV or QGV points, just a constant max bandwidth + * that doesn't depend on the number of planes enabled. Create a + * single dummy QGV point to reflect that. DG2-G10 platforms have a + * constant 50 GB/s bandwidth, whereas DG2-G11 platforms have 38 GB/s. + */ + bi->num_planes = 1; + bi->num_qgv_points = 1; + if (IS_DG2_G11(i915)) + bi->deratedbw[0] = 38000; + else + bi->deratedbw[0] = 50000; + + i915->sagv_status = I915_SAGV_NOT_CONTROLLED; +} + static unsigned int icl_max_bw(struct drm_i915_private *dev_priv, int num_planes, int qgv_point) { @@ -267,7 +287,9 @@ void intel_bw_init_hw(struct drm_i915_private *dev_priv) if (!HAS_DISPLAY(dev_priv)) return;
- if (IS_ALDERLAKE_S(dev_priv) || IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + dg2_get_bw_info(dev_priv); + else if (IS_ALDERLAKE_S(dev_priv) || IS_ALDERLAKE_P(dev_priv)) icl_get_bw_info(dev_priv, &adls_sa_info); else if (IS_ROCKETLAKE(dev_priv)) icl_get_bw_info(dev_priv, &rkl_sa_info);
DG2's SNPS PHYs incorporate a dedicated port PLL called MPLLB which takes the place of the shared DPLLs we've used on past platforms. Let's add the MPLLB programming sequences; they'll be plugged into the rest of the code in future patches.
Bspec: 54032 Bspec: 53881 Cc: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Vandita Kulkarni vandita.kulkarni@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Nidhi Gupta nidhi1.gupta@intel.com --- drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/display/intel_display.c | 1 + .../drm/i915/display/intel_display_types.h | 17 +- drivers/gpu/drm/i915/display/intel_dpll.c | 12 +- drivers/gpu/drm/i915/display/intel_snps_phy.c | 517 ++++++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 18 + drivers/gpu/drm/i915/i915_reg.h | 56 ++ 7 files changed, 616 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.c create mode 100644 drivers/gpu/drm/i915/display/intel_snps_phy.h
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 01f28ad5ea57..6b6c1e5a72d5 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -268,6 +268,7 @@ i915-y += \ display/intel_pps.o \ display/intel_qp_tables.o \ display/intel_sdvo.o \ + display/intel_snps_phy.o \ display/intel_tv.o \ display/intel_vdsc.o \ display/intel_vrr.o \ diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index dc2b943a4e72..91f6964ec406 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -59,6 +59,7 @@ #include "display/intel_hdmi.h" #include "display/intel_lvds.h" #include "display/intel_sdvo.h" +#include "display/intel_snps_phy.h" #include "display/intel_tv.h" #include "display/intel_vdsc.h" #include "display/intel_vrr.h" diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index d94f361b548b..29ae1d9b5abc 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -884,6 +884,18 @@ enum intel_output_format { INTEL_OUTPUT_FORMAT_YCBCR444, };
+struct intel_mpllb_state { + u32 clock; /* in KHz */ + u32 ref_control; + u32 mpllb_cp; + u32 mpllb_div; + u32 mpllb_div2; + u32 mpllb_fracn1; + u32 mpllb_fracn2; + u32 mpllb_sscen; + u32 mpllb_sscstep; +}; + struct intel_crtc_state { /* * uapi (drm) state. This is the software state shown to userspace. @@ -1018,7 +1030,10 @@ struct intel_crtc_state { struct intel_shared_dpll *shared_dpll;
/* Actual register state of the dpll, for shared dpll cross-checking. */ - struct intel_dpll_hw_state dpll_hw_state; + union { + struct intel_dpll_hw_state dpll_hw_state; + struct intel_mpllb_state mpllb_state; + };
/* * ICL reserved DPLLs for the CRTC/port. The active PLL is selected by diff --git a/drivers/gpu/drm/i915/display/intel_dpll.c b/drivers/gpu/drm/i915/display/intel_dpll.c index 89635da9f6f6..14515e62c05e 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll.c +++ b/drivers/gpu/drm/i915/display/intel_dpll.c @@ -11,6 +11,7 @@ #include "intel_lvds.h" #include "intel_panel.h" #include "intel_sideband.h" +#include "display/intel_snps_phy.h"
struct intel_limit { struct { @@ -923,12 +924,13 @@ static int hsw_crtc_compute_clock(struct intel_crtc *crtc, struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); struct intel_atomic_state *state = to_intel_atomic_state(crtc_state->uapi.state); + struct intel_encoder *encoder = + intel_get_crtc_new_encoder(state, crtc_state);
- if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) || - DISPLAY_VER(dev_priv) >= 11) { - struct intel_encoder *encoder = - intel_get_crtc_new_encoder(state, crtc_state); - + if (IS_DG2(dev_priv)) { + return intel_mpllb_calc_state(crtc_state, encoder); + } else if (!intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DSI) || + DISPLAY_VER(dev_priv) >= 11) { if (!intel_reserve_shared_dplls(state, crtc, encoder)) { drm_dbg_kms(&dev_priv->drm, "failed to find PLL for pipe %c\n", diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c new file mode 100644 index 000000000000..6d9205906595 --- /dev/null +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -0,0 +1,517 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright © 2019 Intel Corporation + */ + +#include "intel_de.h" +#include "intel_display_types.h" +#include "intel_snps_phy.h" + +/** + * DOC: Synopsis PHY support + * + * Synopsis PHYs are primarily programmed by looking up magic register values + * in tables rather than calculating the necessary values at runtime. + * + * Of special note is that the SNPS PHYs include a dedicated port PLL, known as + * an "MPLLB." The MPLLB replaces the shared DPLL functionality used on other + * platforms and must be programming directly during the modeset sequence + * since it is not handled by the shared DPLL framework as on other platforms. + */ + +/* + * Basic DP link rates with 100 MHz reference clock. + */ + +static const struct intel_mpllb_state dg2_dp_rbr_100 = { + .clock = 162000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 20) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 226), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 39321) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 3), +}; + +static const struct intel_mpllb_state dg2_dp_hbr1_100 = { + .clock = 270000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 20) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 184), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), +}; + +static const struct intel_mpllb_state dg2_dp_hbr2_100 = { + .clock = 540000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 20) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 184), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), +}; + +static const struct intel_mpllb_state dg2_dp_hbr3_100 = { + .clock = 810000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 19) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 292), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), +}; + +static const struct intel_mpllb_state *dg2_dp_100_tables[] = { + &dg2_dp_rbr_100, + &dg2_dp_hbr1_100, + &dg2_dp_hbr2_100, + &dg2_dp_hbr3_100, + NULL, +}; + +/* + * Basic DP link rates with 38.4 MHz reference clock. + */ + +static const struct intel_mpllb_state dg2_dp_rbr_38_4 = { + .clock = 162000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 1), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 25) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 304), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 49152), +}; + +static const struct intel_mpllb_state dg2_dp_hbr1_38_4 = { + .clock = 270000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 1), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 25) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 248), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 40960), +}; + +static const struct intel_mpllb_state dg2_dp_hbr2_38_4 = { + .clock = 540000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 1), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 25) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 248), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 40960), +}; + +static const struct intel_mpllb_state dg2_dp_hbr3_38_4 = { + .clock = 810000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 1), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 6) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 26) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 388), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 61440), +}; + +static const struct intel_mpllb_state *dg2_dp_38_4_tables[] = { + &dg2_dp_rbr_38_4, + &dg2_dp_hbr1_38_4, + &dg2_dp_hbr2_38_4, + &dg2_dp_hbr3_38_4, + NULL, +}; + +/* + * eDP link rates with 100 MHz reference clock. + */ + +static const struct intel_mpllb_state dg2_edp_r216 = { + .clock = 216000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 19) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 312), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 52428) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 4), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_PEAK, 50961), + .mpllb_sscstep = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_STEPSIZE, 65752), +}; + +static const struct intel_mpllb_state dg2_edp_r243 = { + .clock = 243000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 20) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 356), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 26214) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_PEAK, 57331), + .mpllb_sscstep = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_STEPSIZE, 73971), +}; + +static const struct intel_mpllb_state dg2_edp_r324 = { + .clock = 324000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 20) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 226), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 39321) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 3), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_PEAK, 38221), + .mpllb_sscstep = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_STEPSIZE, 49314), +}; + +static const struct intel_mpllb_state dg2_edp_r432 = { + .clock = 432000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 19) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 65) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 127), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 312), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 52428) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 4), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_PEAK, 50961), + .mpllb_sscstep = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_STEPSIZE, 65752), +}; + +static const struct intel_mpllb_state *dg2_edp_tables[] = { + &dg2_dp_rbr_100, + &dg2_edp_r216, + &dg2_edp_r243, + &dg2_dp_hbr1_100, + &dg2_edp_r324, + &dg2_edp_r432, + &dg2_dp_hbr2_100, + &dg2_dp_hbr3_100, + NULL, +}; + +int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, + struct intel_encoder *encoder) +{ + const struct intel_mpllb_state **tables; + int i; + + if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_EDP)) { + tables = dg2_edp_tables; + } else if (intel_crtc_has_dp_encoder(crtc_state)) { + /* + * FIXME: Initially we're just enabling the "combo" outputs on + * port A-D. The MPLLB for those ports takes an input from the + * "Display Filter PLL" which always has an output frequency + * of 100 MHz, hence the use of the _100 tables below. + * + * Once we enable port TC1 it will either use the same 100 MHz + * "Display Filter PLL" (when strapped to support a native + * display connection) or different 38.4 MHz "Filter PLL" when + * strapped to support a USB connection, so we'll need to check + * that to determine which table to use. + */ + if (0) + tables = dg2_dp_38_4_tables; + else + tables = dg2_dp_100_tables; + } else { + /* TODO: Add HDMI support */ + MISSING_CASE(encoder->type); + return -EINVAL; + } + + for (i = 0; tables[i]; i++) { + if (crtc_state->port_clock <= tables[i]->clock) { + crtc_state->mpllb_state = *tables[i]; + return 0; + } + } + + return -EINVAL; +} + +void intel_mpllb_enable(struct intel_encoder *encoder, + const struct intel_crtc_state *crtc_state) +{ + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + const struct intel_mpllb_state *pll_state = &crtc_state->mpllb_state; + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + i915_reg_t enable_reg = (phy <= PHY_D ? + DG2_PLL_ENABLE(phy) : MG_PLL_ENABLE(0)); + + /* + * 3. Software programs the following PLL registers for the desired + * frequency. + */ + intel_de_write(dev_priv, SNPS_PHY_MPLLB_CP(phy), pll_state->mpllb_cp); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_DIV(phy), pll_state->mpllb_div); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_DIV2(phy), pll_state->mpllb_div2); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_SSCEN(phy), pll_state->mpllb_sscen); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_SSCSTEP(phy), pll_state->mpllb_sscstep); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_FRACN1(phy), pll_state->mpllb_fracn1); + intel_de_write(dev_priv, SNPS_PHY_MPLLB_FRACN2(phy), pll_state->mpllb_fracn2); + + /* + * 4. If the frequency will result in a change to the voltage + * requirement, follow the Display Voltage Frequency Switching - + * Sequence Before Frequency Change. + * + * We handle this step in bxt_set_cdclk(). + */ + + /* 5. Software sets DPLL_ENABLE [PLL Enable] to "1". */ + intel_uncore_rmw(&dev_priv->uncore, enable_reg, 0, PLL_ENABLE); + + /* + * 9. Software sets SNPS_PHY_MPLLB_DIV dp_mpllb_force_en to "1". This + * will keep the PLL running during the DDI lane programming and any + * typeC DP cable disconnect. Do not set the force before enabling the + * PLL because that will start the PLL before it has sampled the + * divider values. + */ + intel_de_write(dev_priv, SNPS_PHY_MPLLB_DIV(phy), + pll_state->mpllb_div | SNPS_PHY_MPLLB_FORCE_EN); + + /* + * 10. Software polls on register DPLL_ENABLE [PLL Lock] to confirm PLL + * is locked at new settings. This register bit is sampling PHY + * dp_mpllb_state interface signal. + */ + if (intel_de_wait_for_set(dev_priv, enable_reg, PLL_LOCK, 5)) + DRM_ERROR("Port %c PLL not locked\n", phy_name(phy)); + + /* + * 11. If the frequency will result in a change to the voltage + * requirement, follow the Display Voltage Frequency Switching - + * Sequence After Frequency Change. + * + * We handle this step in bxt_set_cdclk(). + */ +} + +void intel_mpllb_disable(struct intel_encoder *encoder) +{ + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + i915_reg_t enable_reg = (phy <= PHY_D ? + DG2_PLL_ENABLE(phy) : MG_PLL_ENABLE(0)); + + /* + * 1. If the frequency will result in a change to the voltage + * requirement, follow the Display Voltage Frequency Switching - + * Sequence Before Frequency Change. + * + * We handle this step in bxt_set_cdclk(). + */ + + /* 2. Software programs DPLL_ENABLE [PLL Enable] to "0" */ + intel_uncore_rmw(&dev_priv->uncore, enable_reg, PLL_ENABLE, 0); + + /* + * 4. Software programs SNPS_PHY_MPLLB_DIV dp_mpllb_force_en to "0". + * This will allow the PLL to stop running. + */ + intel_uncore_rmw(&dev_priv->uncore, SNPS_PHY_MPLLB_DIV(phy), + SNPS_PHY_MPLLB_FORCE_EN, 0); + + /* + * 5. Software polls DPLL_ENABLE [PLL Lock] for PHY acknowledgment + * (dp_txX_ack) that the new transmitter setting request is completed. + */ + if (intel_de_wait_for_clear(dev_priv, enable_reg, PLL_LOCK, 5)) + DRM_ERROR("Port %c PLL not locked\n", phy_name(phy)); + + /* + * 6. If the frequency will result in a change to the voltage + * requirement, follow the Display Voltage Frequency Switching - + * Sequence After Frequency Change. + * + * We handle this step in bxt_set_cdclk(). + */ +} diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h new file mode 100644 index 000000000000..205ab46f0b67 --- /dev/null +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef __INTEL_SNPS_PHY_H__ +#define __INTEL_SNPS_PHY_H__ + +struct intel_encoder; +struct intel_crtc_state; + +int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, + struct intel_encoder *encoder); +void intel_mpllb_enable(struct intel_encoder *encoder, + const struct intel_crtc_state *crtc_state); +void intel_mpllb_disable(struct intel_encoder *encoder); + +#endif /* __INTEL_SNPS_PHY_H__ */ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index d58864c7adc6..c0417c994b97 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2286,6 +2286,57 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define MG_DP_MODE_CFG_DP_X2_MODE (1 << 7) #define MG_DP_MODE_CFG_DP_X1_MODE (1 << 6)
+/* + * DG2 SNPS PHY registers (TC1 = PHY_E) + */ +#define _SNPS_PHY_A_BASE 0x168000 +#define _SNPS_PHY_B_BASE 0x169000 +#define _SNPS_PHY(phy) _PHY(phy, \ + _SNPS_PHY_A_BASE, \ + _SNPS_PHY_B_BASE) +#define _SNPS2(phy, reg) (_SNPS_PHY(phy) - \ + _SNPS_PHY_A_BASE + (reg)) +#define _MMIO_SNPS(phy, reg) _MMIO(_SNPS2(phy, reg)) +#define _MMIO_SNPS_LN(ln, phy, reg) _MMIO(_SNPS2(phy, \ + (reg) + (ln) * 0x10)) + +#define SNPS_PHY_MPLLB_CP(phy) _MMIO_SNPS(phy, 0x168000) +#define SNPS_PHY_MPLLB_CP_INT REG_GENMASK(31, 25) +#define SNPS_PHY_MPLLB_CP_INT_GS REG_GENMASK(23, 17) +#define SNPS_PHY_MPLLB_CP_PROP REG_GENMASK(15, 9) +#define SNPS_PHY_MPLLB_CP_PROP_GS REG_GENMASK(7, 1) + +#define SNPS_PHY_MPLLB_DIV(phy) _MMIO_SNPS(phy, 0x168004) +#define SNPS_PHY_MPLLB_FORCE_EN REG_BIT(31) +#define SNPS_PHY_MPLLB_DIV5_CLK_EN REG_BIT(29) +#define SNPS_PHY_MPLLB_V2I REG_GENMASK(27, 26) +#define SNPS_PHY_MPLLB_FREQ_VCO REG_GENMASK(25, 24) +#define SNPS_PHY_MPLLB_PMIX_EN REG_BIT(10) +#define SNPS_PHY_MPLLB_TX_CLK_DIV REG_GENMASK(7, 5) + +#define SNPS_PHY_MPLLB_FRACN1(phy) _MMIO_SNPS(phy, 0x168008) +#define SNPS_PHY_MPLLB_FRACN_EN REG_BIT(31) +#define SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN REG_BIT(30) +#define SNPS_PHY_MPLLB_FRACN_DEN REG_GENMASK(15, 0) + +#define SNPS_PHY_MPLLB_FRACN2(phy) _MMIO_SNPS(phy, 0x16800C) +#define SNPS_PHY_MPLLB_FRACN_REM REG_GENMASK(31, 16) +#define SNPS_PHY_MPLLB_FRACN_QUOT REG_GENMASK(15, 0) + +#define SNPS_PHY_MPLLB_SSCEN(phy) _MMIO_SNPS(phy, 0x168014) +#define SNPS_PHY_MPLLB_SSC_EN REG_BIT(31) +#define SNPS_PHY_MPLLB_SSC_PEAK REG_GENMASK(29, 10) + +#define SNPS_PHY_MPLLB_SSCSTEP(phy) _MMIO_SNPS(phy, 0x168018) +#define SNPS_PHY_MPLLB_SSC_STEPSIZE REG_GENMASK(31, 11) + +#define SNPS_PHY_MPLLB_DIV2(phy) _MMIO_SNPS(phy, 0x16801C) +#define SNPS_PHY_MPLLB_REF_CLK_DIV REG_GENMASK(14, 12) +#define SNPS_PHY_MPLLB_MULTIPLIER REG_GENMASK(11, 0) + +#define SNPS_PHY_REF_CONTROL(phy) _MMIO_SNPS(phy, 0x168188) +#define SNPS_PHY_REF_CONTROL_REF_RANGE REG_GENMASK(31, 27) + /* The spec defines this only for BXT PHY0, but lets assume that this * would exist for PHY1 too if it had a second channel. */ @@ -10601,6 +10652,11 @@ enum skl_power_gate { #define CNL_DPLL_ENABLE(pll) _MMIO_PLL3(pll, DPLL0_ENABLE, DPLL1_ENABLE, \ _ADLS_DPLL2_ENABLE, _ADLS_DPLL3_ENABLE)
+#define _DG2_PLL3_ENABLE 0x4601C + +#define DG2_PLL_ENABLE(pll) _MMIO_PLL3(pll, DPLL0_ENABLE, DPLL1_ENABLE, \ + _ADLS_DPLL2_ENABLE, _DG2_PLL3_ENABLE) + #define TBT_PLL_ENABLE _MMIO(0x46020)
#define _MG_PLL1_ENABLE 0x46030
At the moment we don't have a proper algorithm that can be used to calculate PHY settings for arbitrary HDMI link rates. The PHY tables here should support the regular modes of real-world HDMI monitors.
Bspec: 54032 Cc: Matt Atwood matthew.s.atwood@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Vandita Kulkarni vandita.kulkarni@intel.com --- drivers/gpu/drm/i915/display/intel_ddi.c | 14 +- drivers/gpu/drm/i915/display/intel_display.c | 47 +++ drivers/gpu/drm/i915/display/intel_hdmi.c | 11 + drivers/gpu/drm/i915/display/intel_snps_phy.c | 286 +++++++++++++++++- drivers/gpu/drm/i915/display/intel_snps_phy.h | 7 + drivers/gpu/drm/i915/i915_reg.h | 3 + 6 files changed, 355 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 26a3aa73fcc4..929a95ddb316 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -51,6 +51,7 @@ #include "intel_panel.h" #include "intel_pps.h" #include "intel_psr.h" +#include "intel_snps_phy.h" #include "intel_sprite.h" #include "intel_tc.h" #include "intel_vdsc.h" @@ -3745,6 +3746,15 @@ void intel_ddi_get_clock(struct intel_encoder *encoder, &crtc_state->dpll_hw_state); }
+static void dg2_ddi_get_config(struct intel_encoder *encoder, + struct intel_crtc_state *crtc_state) +{ + intel_mpllb_readout_hw_state(encoder, &crtc_state->mpllb_state); + crtc_state->port_clock = intel_mpllb_calc_port_clock(encoder, &crtc_state->mpllb_state); + + intel_ddi_get_config(encoder, crtc_state); +} + static void adls_ddi_get_config(struct intel_encoder *encoder, struct intel_crtc_state *crtc_state) { @@ -4606,7 +4616,9 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->cloneable = 0; encoder->pipe_mask = ~0;
- if (IS_ALDERLAKE_S(dev_priv)) { + if (IS_DG2(dev_priv)) { + encoder->get_config = dg2_ddi_get_config; + } else if (IS_ALDERLAKE_S(dev_priv)) { encoder->enable_clock = adls_ddi_enable_clock; encoder->disable_clock = adls_ddi_disable_clock; encoder->is_clock_enabled = adls_ddi_is_clock_enabled; diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 91f6964ec406..cce520b6dfcf 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9113,6 +9113,52 @@ verify_shared_dpll_state(struct intel_crtc *crtc, } }
+static void +verify_mpllb_state(struct intel_atomic_state *state, + struct intel_crtc_state *new_crtc_state) +{ + struct drm_i915_private *i915 = to_i915(state->base.dev); + struct intel_mpllb_state mpllb_hw_state = { 0 }; + struct intel_mpllb_state *mpllb_sw_state = &new_crtc_state->mpllb_state; + struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->uapi.crtc); + struct intel_encoder *encoder; + + if (!IS_DG2(i915)) + return; + + if (!new_crtc_state->hw.active) + return; + + encoder = intel_get_crtc_new_encoder(state, new_crtc_state); + intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state); + +#define MPLLB_CHECK(name) do { \ + if (mpllb_sw_state->name != mpllb_hw_state.name) { \ + pipe_config_mismatch(false, crtc, "MPLLB:" __stringify(name), \ + "(expected 0x%08x, found 0x%08x)", \ + mpllb_sw_state->name, \ + mpllb_hw_state.name); \ + } \ +} while (0) + + MPLLB_CHECK(mpllb_cp); + MPLLB_CHECK(mpllb_div); + MPLLB_CHECK(mpllb_div2); + MPLLB_CHECK(mpllb_fracn1); + MPLLB_CHECK(mpllb_fracn2); + MPLLB_CHECK(mpllb_sscen); + MPLLB_CHECK(mpllb_sscstep); + + /* + * ref_control is handled by the hardware/firemware and never + * programmed by the software, but the proper values are supplied + * in the bspec for verification purposes. + */ + MPLLB_CHECK(ref_control); + +#undef MPLLB_CHECK +} + static void intel_modeset_verify_crtc(struct intel_crtc *crtc, struct intel_atomic_state *state, @@ -9126,6 +9172,7 @@ intel_modeset_verify_crtc(struct intel_crtc *crtc, verify_connector_state(state, crtc); verify_crtc_state(crtc, old_crtc_state, new_crtc_state); verify_shared_dpll_state(crtc, old_crtc_state, new_crtc_state); + verify_mpllb_state(state, new_crtc_state); }
static void diff --git a/drivers/gpu/drm/i915/display/intel_hdmi.c b/drivers/gpu/drm/i915/display/intel_hdmi.c index 852af2b23540..b04685bb6439 100644 --- a/drivers/gpu/drm/i915/display/intel_hdmi.c +++ b/drivers/gpu/drm/i915/display/intel_hdmi.c @@ -51,6 +51,7 @@ #include "intel_hdmi.h" #include "intel_lspcon.h" #include "intel_panel.h" +#include "intel_snps_phy.h"
static struct drm_device *intel_hdmi_to_dev(struct intel_hdmi *intel_hdmi) { @@ -1850,6 +1851,16 @@ hdmi_port_clock_valid(struct intel_hdmi *hdmi, if (IS_CHERRYVIEW(dev_priv) && clock > 216000 && clock < 240000) return MODE_CLOCK_RANGE;
+ /* + * SNPS PHYs' MPLLB table-based programming can only handle a fixed + * set of link rates. + * + * FIXME: We will hopefully get an algorithmic way of programming + * the MPLLB for HDMI in the future. + */ + if (IS_DG2(dev_priv)) + return intel_snps_phy_check_hdmi_link_rate(clock); + return MODE_OK; }
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 6d9205906595..1317b4e94b50 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -3,6 +3,8 @@ * Copyright © 2019 Intel Corporation */
+#include <linux/util_macros.h> + #include "intel_de.h" #include "intel_display_types.h" #include "intel_snps_phy.h" @@ -375,14 +377,172 @@ static const struct intel_mpllb_state *dg2_edp_tables[] = { NULL, };
-int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, - struct intel_encoder *encoder) -{ - const struct intel_mpllb_state **tables; - int i; +/* + * HDMI link rates with 100 MHz reference clock. + */ + +static const struct intel_mpllb_state dg2_hdmi_25_175 = { + .clock = 25175, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 15) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 128) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 143), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 36663) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 71), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), +}; + +static const struct intel_mpllb_state dg2_hdmi_27_0 = { + .clock = 27000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 15) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 5) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 140) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 26214) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), +}; + +static const struct intel_mpllb_state dg2_hdmi_74_25 = { + .clock = 74250, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 15) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 3) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 86) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 26214) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), +}; + +static const struct intel_mpllb_state dg2_hdmi_148_5 = { + .clock = 148500, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 15) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_TX_CLK_DIV, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 86) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 26214) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), +}; + +static const struct intel_mpllb_state dg2_hdmi_594 = { + .clock = 594000, + .ref_control = + REG_FIELD_PREP(SNPS_PHY_REF_CONTROL_REF_RANGE, 3), + .mpllb_cp = + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT, 4) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP, 15) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_INT_GS, 64) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_CP_PROP_GS, 124), + .mpllb_div = + REG_FIELD_PREP(SNPS_PHY_MPLLB_DIV5_CLK_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_PMIX_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_V2I, 2) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FREQ_VCO, 3), + .mpllb_div2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_REF_CLK_DIV, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_MULTIPLIER, 86) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_HDMI_DIV, 1), + .mpllb_fracn1 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_CGG_UPDATE_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_EN, 1) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 5), + .mpllb_fracn2 = + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 26214) | + REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_REM, 2), + .mpllb_sscen = + REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1), +};
+static const struct intel_mpllb_state *dg2_hdmi_tables[] = { + &dg2_hdmi_25_175, + &dg2_hdmi_27_0, + &dg2_hdmi_74_25, + &dg2_hdmi_148_5, + &dg2_hdmi_594, + NULL, +}; + +static const struct intel_mpllb_state ** +intel_mpllb_tables_get(struct intel_crtc_state *crtc_state, + struct intel_encoder *encoder) +{ if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_EDP)) { - tables = dg2_edp_tables; + return dg2_edp_tables; } else if (intel_crtc_has_dp_encoder(crtc_state)) { /* * FIXME: Initially we're just enabling the "combo" outputs on @@ -397,15 +557,41 @@ int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, * that to determine which table to use. */ if (0) - tables = dg2_dp_38_4_tables; + return dg2_dp_38_4_tables; else - tables = dg2_dp_100_tables; - } else { - /* TODO: Add HDMI support */ - MISSING_CASE(encoder->type); - return -EINVAL; + return dg2_dp_100_tables; + } else if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI)) { + return dg2_hdmi_tables; }
+ MISSING_CASE(encoder->type); + return NULL; +} + +int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, + struct intel_encoder *encoder) +{ + const struct intel_mpllb_state **tables; + int i; + + if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI)) { + if (intel_snps_phy_check_hdmi_link_rate(crtc_state->port_clock) + != MODE_OK) { + /* + * FIXME: Can only support fixed HDMI frequencies + * until we have a proper algorithm under a valid + * license. + */ + DRM_DEBUG_KMS("Can't support HDMI link rate %d\n", + crtc_state->port_clock); + return -EINVAL; + } + } + + tables = intel_mpllb_tables_get(crtc_state, encoder); + if (!tables) + return -EINVAL; + for (i = 0; tables[i]; i++) { if (crtc_state->port_clock <= tables[i]->clock) { crtc_state->mpllb_state = *tables[i]; @@ -515,3 +701,79 @@ void intel_mpllb_disable(struct intel_encoder *encoder) * We handle this step in bxt_set_cdclk(). */ } + +int intel_mpllb_calc_port_clock(struct intel_encoder *encoder, + const struct intel_mpllb_state *pll_state) +{ + unsigned int frac_quot = 0, frac_rem = 0, frac_den = 1; + unsigned int multiplier, tx_clk_div, refclk; + bool frac_en; + + if (0) + refclk = 38400; + else + refclk = 100000; + + refclk >>= REG_FIELD_GET(SNPS_PHY_MPLLB_REF_CLK_DIV, pll_state->mpllb_div2) - 1; + + frac_en = REG_FIELD_GET(SNPS_PHY_MPLLB_FRACN_EN, pll_state->mpllb_fracn1); + + if (frac_en) { + frac_quot = REG_FIELD_GET(SNPS_PHY_MPLLB_FRACN_QUOT, pll_state->mpllb_fracn2); + frac_rem = REG_FIELD_GET(SNPS_PHY_MPLLB_FRACN_REM, pll_state->mpllb_fracn2); + frac_den = REG_FIELD_GET(SNPS_PHY_MPLLB_FRACN_DEN, pll_state->mpllb_fracn1); + } + + multiplier = REG_FIELD_GET(SNPS_PHY_MPLLB_MULTIPLIER, pll_state->mpllb_div2) / 2 + 16; + + tx_clk_div = REG_FIELD_GET(SNPS_PHY_MPLLB_TX_CLK_DIV, pll_state->mpllb_div); + + return DIV_ROUND_CLOSEST_ULL(mul_u32_u32(refclk, (multiplier << 16) + frac_quot) + + DIV_ROUND_CLOSEST(refclk * frac_rem, frac_den), + 10 << (tx_clk_div + 16)); +} + +void intel_mpllb_readout_hw_state(struct intel_encoder *encoder, + struct intel_mpllb_state *pll_state) +{ + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + + pll_state->mpllb_cp = intel_de_read(dev_priv, SNPS_PHY_MPLLB_CP(phy)); + pll_state->mpllb_div = intel_de_read(dev_priv, SNPS_PHY_MPLLB_DIV(phy)); + pll_state->mpllb_div2 = intel_de_read(dev_priv, SNPS_PHY_MPLLB_DIV2(phy)); + pll_state->mpllb_sscen = intel_de_read(dev_priv, SNPS_PHY_MPLLB_SSCEN(phy)); + pll_state->mpllb_sscstep = intel_de_read(dev_priv, SNPS_PHY_MPLLB_SSCSTEP(phy)); + pll_state->mpllb_fracn1 = intel_de_read(dev_priv, SNPS_PHY_MPLLB_FRACN1(phy)); + pll_state->mpllb_fracn2 = intel_de_read(dev_priv, SNPS_PHY_MPLLB_FRACN2(phy)); + + /* + * REF_CONTROL is under firmware control and never programmed by the + * driver; we read it only for sanity checking purposes. The bspec + * only tells us the expected value for one field in this register, + * so we'll only read out those specific bits here. + */ + pll_state->ref_control = intel_de_read(dev_priv, SNPS_PHY_REF_CONTROL(phy)) & + SNPS_PHY_REF_CONTROL_REF_RANGE; + + /* + * MPLLB_DIV is programmed twice, once with the software-computed + * state, then again with the MPLLB_FORCE_EN bit added. Drop that + * extra bit during readout so that we return the actual expected + * software state. + */ + pll_state->mpllb_div &= ~SNPS_PHY_MPLLB_FORCE_EN; +} + +int intel_snps_phy_check_hdmi_link_rate(int clock) +{ + const struct intel_mpllb_state **tables = dg2_hdmi_tables; + int i; + + for (i = 0; tables[i]; i++) { + if (clock == tables[i]->clock) + return MODE_OK; + } + + return MODE_CLOCK_RANGE; +} diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index 205ab46f0b67..ca4c2a25182b 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -8,11 +8,18 @@
struct intel_encoder; struct intel_crtc_state; +struct intel_mpllb_state;
int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, struct intel_encoder *encoder); void intel_mpllb_enable(struct intel_encoder *encoder, const struct intel_crtc_state *crtc_state); void intel_mpllb_disable(struct intel_encoder *encoder); +void intel_mpllb_readout_hw_state(struct intel_encoder *encoder, + struct intel_mpllb_state *pll_state); +int intel_mpllb_calc_port_clock(struct intel_encoder *encoder, + const struct intel_mpllb_state *pll_state); + +int intel_snps_phy_check_hdmi_link_rate(int clock);
#endif /* __INTEL_SNPS_PHY_H__ */ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index c0417c994b97..15465f9cf9ab 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2325,12 +2325,15 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
#define SNPS_PHY_MPLLB_SSCEN(phy) _MMIO_SNPS(phy, 0x168014) #define SNPS_PHY_MPLLB_SSC_EN REG_BIT(31) +#define SNPS_PHY_MPLLB_SSC_UP_SPREAD REG_BIT(30) #define SNPS_PHY_MPLLB_SSC_PEAK REG_GENMASK(29, 10)
#define SNPS_PHY_MPLLB_SSCSTEP(phy) _MMIO_SNPS(phy, 0x168018) #define SNPS_PHY_MPLLB_SSC_STEPSIZE REG_GENMASK(31, 11)
#define SNPS_PHY_MPLLB_DIV2(phy) _MMIO_SNPS(phy, 0x16801C) +#define SNPS_PHY_MPLLB_HDMI_PIXEL_CLK_DIV REG_GENMASK(19, 18) +#define SNPS_PHY_MPLLB_HDMI_DIV REG_GENMASK(17, 15) #define SNPS_PHY_MPLLB_REF_CLK_DIV REG_GENMASK(14, 12) #define SNPS_PHY_MPLLB_MULTIPLIER REG_GENMASK(11, 0)
Vswing programming for SNPS PHYs is just a single step -- look up the value that corresponds to the voltage level from a table and program it into the SNPS_PHY_TX_EQ register.
Bspec: 53920 Cc: Matt Atwood matthew.s.atwood@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com --- drivers/gpu/drm/i915/display/intel_ddi.c | 23 ++++++-- drivers/gpu/drm/i915/display/intel_snps_phy.c | 54 +++++++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 4 ++ drivers/gpu/drm/i915/i915_reg.h | 5 ++ 4 files changed, 83 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 929a95ddb316..ade03cf41caa 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -1496,6 +1496,16 @@ static int intel_ddi_dp_level(struct intel_dp *intel_dp) return translate_signal_level(intel_dp, signal_levels); }
+static void +dg2_set_signal_levels(struct intel_dp *intel_dp, + const struct intel_crtc_state *crtc_state) +{ + struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base; + int level = intel_ddi_dp_level(intel_dp); + + intel_snps_phy_ddi_vswing_sequence(encoder, level); +} + static void tgl_set_signal_levels(struct intel_dp *intel_dp, const struct intel_crtc_state *crtc_state) @@ -2563,7 +2573,10 @@ static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, */
/* 7.e Configure voltage swing and related IO settings */ - tgl_ddi_vswing_sequence(encoder, crtc_state, level); + if (IS_DG2(dev_priv)) + intel_snps_phy_ddi_vswing_sequence(encoder, level); + else + tgl_ddi_vswing_sequence(encoder, crtc_state, level);
/* * 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up @@ -3102,7 +3115,9 @@ static void intel_enable_ddi_hdmi(struct intel_atomic_state *state, "[CONNECTOR:%d:%s] Failed to configure sink scrambling/TMDS bit clock ratio\n", connector->base.id, connector->name);
- if (DISPLAY_VER(dev_priv) >= 12) + if (IS_DG2(dev_priv)) + intel_snps_phy_ddi_vswing_sequence(encoder, U32_MAX); + else if (DISPLAY_VER(dev_priv) >= 12) tgl_ddi_vswing_sequence(encoder, crtc_state, level); else if (DISPLAY_VER(dev_priv) == 11) icl_ddi_vswing_sequence(encoder, crtc_state, level); @@ -4075,7 +4090,9 @@ intel_ddi_init_dp_connector(struct intel_digital_port *dig_port) dig_port->dp.set_link_train = intel_ddi_set_link_train; dig_port->dp.set_idle_link_train = intel_ddi_set_idle_link_train;
- if (DISPLAY_VER(dev_priv) >= 12) + if (IS_DG2(dev_priv)) + dig_port->dp.set_signal_levels = dg2_set_signal_levels; + else if (DISPLAY_VER(dev_priv) >= 12) dig_port->dp.set_signal_levels = tgl_set_signal_levels; else if (DISPLAY_VER(dev_priv) >= 11) dig_port->dp.set_signal_levels = icl_set_signal_levels; diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 1317b4e94b50..77759bda98a4 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -21,6 +21,60 @@ * since it is not handled by the shared DPLL framework as on other platforms. */
+static const u32 dg2_ddi_translations[] = { + /* VS 0, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), + + /* VS 0, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 33) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 6), + + /* VS 0, pre-emph 2 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 38) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 12), + + /* VS 0, pre-emph 3 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 43) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 19), + + /* VS 1, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 39), + + /* VS 1, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 44) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 8), + + /* VS 1, pre-emph 2 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 47) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 15), + + /* VS 2, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 52), + + /* VS 2, pre-emph 1 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 51) | + REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 10), + + /* VS 3, pre-emph 0 */ + REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 62), +}; + +void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder, + u32 level) +{ + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + int n_entries, ln; + + n_entries = ARRAY_SIZE(dg2_ddi_translations); + if (level >= n_entries) + level = n_entries - 1; + + for (ln = 0; ln < 4; ln++) + intel_de_write(dev_priv, SNPS_PHY_TX_EQ(ln, phy), + dg2_ddi_translations[level]); +} + /* * Basic DP link rates with 100 MHz reference clock. */ diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index ca4c2a25182b..3ce92d424f66 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -6,6 +6,8 @@ #ifndef __INTEL_SNPS_PHY_H__ #define __INTEL_SNPS_PHY_H__
+#include <linux/types.h> + struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state; @@ -21,5 +23,7 @@ int intel_mpllb_calc_port_clock(struct intel_encoder *encoder, const struct intel_mpllb_state *pll_state);
int intel_snps_phy_check_hdmi_link_rate(int clock); +void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder, + u32 level);
#endif /* __INTEL_SNPS_PHY_H__ */ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 15465f9cf9ab..203056b9f02c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2340,6 +2340,11 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define SNPS_PHY_REF_CONTROL(phy) _MMIO_SNPS(phy, 0x168188) #define SNPS_PHY_REF_CONTROL_REF_RANGE REG_GENMASK(31, 27)
+#define SNPS_PHY_TX_EQ(ln, phy) _MMIO_SNPS_LN(ln, phy, 0x168300) +#define SNPS_PHY_TX_EQ_MAIN REG_GENMASK(23, 18) +#define SNPS_PHY_TX_EQ_POST REG_GENMASK(15, 10) +#define SNPS_PHY_TX_EQ_PRE REG_GENMASK(7, 2) + /* The spec defines this only for BXT PHY0, but lets assume that this * would exist for PHY1 too if it had a second channel. */
On Thu, 01 Jul 2021, Matt Roper matthew.d.roper@intel.com wrote:
Vswing programming for SNPS PHYs is just a single step -- look up the value that corresponds to the voltage level from a table and program it into the SNPS_PHY_TX_EQ register.
I've got some patches to turn this to the same ddi buf trans mechanism that everything else uses. Seems like this patch tries to bypass it because it's simple, but then we end up using encoder->get_buf_trans() anyway, for example in intel_ddi_dp_voltage_max(), and it returns some tgl buf trans tables... I guess it works by coincidence. :|
Anyway, as I said, I've got the patches, and I can fix this up afterwards.
BR, Jani.
Bspec: 53920 Cc: Matt Atwood matthew.s.atwood@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Jani Nikula jani.nikula@intel.com
drivers/gpu/drm/i915/display/intel_ddi.c | 23 ++++++-- drivers/gpu/drm/i915/display/intel_snps_phy.c | 54 +++++++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 4 ++ drivers/gpu/drm/i915/i915_reg.h | 5 ++ 4 files changed, 83 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 929a95ddb316..ade03cf41caa 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -1496,6 +1496,16 @@ static int intel_ddi_dp_level(struct intel_dp *intel_dp) return translate_signal_level(intel_dp, signal_levels); }
+static void +dg2_set_signal_levels(struct intel_dp *intel_dp,
const struct intel_crtc_state *crtc_state)
+{
- struct intel_encoder *encoder = &dp_to_dig_port(intel_dp)->base;
- int level = intel_ddi_dp_level(intel_dp);
- intel_snps_phy_ddi_vswing_sequence(encoder, level);
+}
static void tgl_set_signal_levels(struct intel_dp *intel_dp, const struct intel_crtc_state *crtc_state) @@ -2563,7 +2573,10 @@ static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, */
/* 7.e Configure voltage swing and related IO settings */
- tgl_ddi_vswing_sequence(encoder, crtc_state, level);
if (IS_DG2(dev_priv))
intel_snps_phy_ddi_vswing_sequence(encoder, level);
else
tgl_ddi_vswing_sequence(encoder, crtc_state, level);
/*
- 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up
@@ -3102,7 +3115,9 @@ static void intel_enable_ddi_hdmi(struct intel_atomic_state *state, "[CONNECTOR:%d:%s] Failed to configure sink scrambling/TMDS bit clock ratio\n", connector->base.id, connector->name);
- if (DISPLAY_VER(dev_priv) >= 12)
- if (IS_DG2(dev_priv))
intel_snps_phy_ddi_vswing_sequence(encoder, U32_MAX);
- else if (DISPLAY_VER(dev_priv) >= 12) tgl_ddi_vswing_sequence(encoder, crtc_state, level); else if (DISPLAY_VER(dev_priv) == 11) icl_ddi_vswing_sequence(encoder, crtc_state, level);
@@ -4075,7 +4090,9 @@ intel_ddi_init_dp_connector(struct intel_digital_port *dig_port) dig_port->dp.set_link_train = intel_ddi_set_link_train; dig_port->dp.set_idle_link_train = intel_ddi_set_idle_link_train;
- if (DISPLAY_VER(dev_priv) >= 12)
- if (IS_DG2(dev_priv))
dig_port->dp.set_signal_levels = dg2_set_signal_levels;
- else if (DISPLAY_VER(dev_priv) >= 12) dig_port->dp.set_signal_levels = tgl_set_signal_levels; else if (DISPLAY_VER(dev_priv) >= 11) dig_port->dp.set_signal_levels = icl_set_signal_levels;
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 1317b4e94b50..77759bda98a4 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -21,6 +21,60 @@
- since it is not handled by the shared DPLL framework as on other platforms.
*/
+static const u32 dg2_ddi_translations[] = {
- /* VS 0, pre-emph 0 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26),
- /* VS 0, pre-emph 1 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 33) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 6),
- /* VS 0, pre-emph 2 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 38) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 12),
- /* VS 0, pre-emph 3 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 43) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 19),
- /* VS 1, pre-emph 0 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 39),
- /* VS 1, pre-emph 1 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 44) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 8),
- /* VS 1, pre-emph 2 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 47) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 15),
- /* VS 2, pre-emph 0 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 52),
- /* VS 2, pre-emph 1 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 51) |
REG_FIELD_PREP(SNPS_PHY_TX_EQ_POST, 10),
- /* VS 3, pre-emph 0 */
- REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 62),
+};
+void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder,
u32 level)
+{
- struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
- enum phy phy = intel_port_to_phy(dev_priv, encoder->port);
- int n_entries, ln;
- n_entries = ARRAY_SIZE(dg2_ddi_translations);
- if (level >= n_entries)
level = n_entries - 1;
- for (ln = 0; ln < 4; ln++)
intel_de_write(dev_priv, SNPS_PHY_TX_EQ(ln, phy),
dg2_ddi_translations[level]);
+}
/*
- Basic DP link rates with 100 MHz reference clock.
*/ diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index ca4c2a25182b..3ce92d424f66 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -6,6 +6,8 @@ #ifndef __INTEL_SNPS_PHY_H__ #define __INTEL_SNPS_PHY_H__
+#include <linux/types.h>
struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state; @@ -21,5 +23,7 @@ int intel_mpllb_calc_port_clock(struct intel_encoder *encoder, const struct intel_mpllb_state *pll_state);
int intel_snps_phy_check_hdmi_link_rate(int clock); +void intel_snps_phy_ddi_vswing_sequence(struct intel_encoder *encoder,
u32 level);
#endif /* __INTEL_SNPS_PHY_H__ */ diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 15465f9cf9ab..203056b9f02c 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2340,6 +2340,11 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define SNPS_PHY_REF_CONTROL(phy) _MMIO_SNPS(phy, 0x168188) #define SNPS_PHY_REF_CONTROL_REF_RANGE REG_GENMASK(31, 27)
+#define SNPS_PHY_TX_EQ(ln, phy) _MMIO_SNPS_LN(ln, phy, 0x168300) +#define SNPS_PHY_TX_EQ_MAIN REG_GENMASK(23, 18) +#define SNPS_PHY_TX_EQ_POST REG_GENMASK(15, 10) +#define SNPS_PHY_TX_EQ_PRE REG_GENMASK(7, 2)
/* The spec defines this only for BXT PHY0, but lets assume that this
- would exist for PHY1 too if it had a second channel.
*/
DG2 has some changes to the expected modesetting sequences when compared to gen12. Adjust our driver logic accordingly. Although the DP sequence is pretty similar to TGL's, there are some steps that change, so let's split the handling for that out into a separate function.
Bspec: 54128 Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_ddi.c | 135 +++++++++++++++++++++-- 1 file changed, 127 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index ade03cf41caa..5499a2975a0e 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,14 +172,22 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, enum port port) { + int ret; + /* Wait > 518 usecs for DDI_BUF_CTL to be non idle */ if (DISPLAY_VER(dev_priv) < 10) { usleep_range(518, 1000); return; }
- if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & - DDI_BUF_IS_IDLE), 500)) + if (IS_DG2(dev_priv)) + ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + DDI_BUF_IS_IDLE), 1200); + else + ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) & + DDI_BUF_IS_IDLE), 500); + + if (ret) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get active\n", port_name(port)); } @@ -2207,7 +2215,7 @@ void intel_ddi_sanitize_encoder_pll_mapping(struct intel_encoder *encoder) ddi_clk_needed = false; }
- if (ddi_clk_needed || !encoder->disable_clock || + if (ddi_clk_needed || !encoder->is_clock_enabled || !encoder->is_clock_enabled(encoder)) return;
@@ -2488,6 +2496,116 @@ static void intel_ddi_mso_configure(const struct intel_crtc_state *crtc_state) OVERLAP_PIXELS_MASK, dss1); }
+static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state, + struct intel_encoder *encoder, + const struct intel_crtc_state *crtc_state, + const struct drm_connector_state *conn_state) +{ + struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + struct drm_i915_private *dev_priv = to_i915(encoder->base.dev); + enum phy phy = intel_port_to_phy(dev_priv, encoder->port); + struct intel_digital_port *dig_port = enc_to_dig_port(encoder); + bool is_mst = intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DP_MST); + int level = intel_ddi_dp_level(intel_dp); + + intel_dp_set_link_params(intel_dp, crtc_state->port_clock, + crtc_state->lane_count); + + /* + * 1. Enable Power Wells + * + * This was handled at the beginning of intel_atomic_commit_tail(), + * before we called down into this function. + */ + + /* 2. Enable Panel Power if PPS is required */ + intel_pps_on(intel_dp); + + /* + * 3. Enable the port PLL. + */ + intel_ddi_enable_clock(encoder, crtc_state); + + /* 4. Enable IO power */ + if (!intel_phy_is_tc(dev_priv, phy) || + dig_port->tc_mode != TC_PORT_TBT_ALT) + dig_port->ddi_io_wakeref = intel_display_power_get(dev_priv, + dig_port->ddi_io_power_domain); + + /* + * 5. The rest of the below are substeps under the bspec's "Enable and + * Train Display Port" step. Note that steps that are specific to + * MST will be handled by intel_mst_pre_enable_dp() before/after it + * calls into this function. Also intel_mst_pre_enable_dp() only calls + * us when active_mst_links==0, so any steps designated for "single + * stream or multi-stream master transcoder" can just be performed + * unconditionally here. + */ + + /* + * 5.a Configure Transcoder Clock Select to direct the Port clock to the + * Transcoder. + */ + intel_ddi_enable_pipe_clock(encoder, crtc_state); + + /* 5.b Not relevant to i915 for now */ + + /* + * 5.c Configure TRANS_DDI_FUNC_CTL DDI Select, DDI Mode Select & MST + * Transport Select + */ + intel_ddi_config_transcoder_func(encoder, crtc_state); + + /* + * 5.d Configure & enable DP_TP_CTL with link training pattern 1 + * selected + * + * This will be handled by the intel_dp_start_link_train() farther + * down this function. + */ + + /* 5.e Configure voltage swing and related IO settings */ + intel_snps_phy_ddi_vswing_sequence(encoder, level); + + /* + * 5.f Configure and enable DDI_BUF_CTL + * 5.g Wait for DDI_BUF_CTL DDI Idle Status = 0b (Not Idle), timeout + * after 1200 us. + * + * We only configure what the register value will be here. Actual + * enabling happens during link training farther down. + */ + intel_ddi_init_dp_buf_reg(encoder, crtc_state); + + if (!is_mst) + intel_dp_set_power(intel_dp, DP_SET_POWER_D0); + + intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true); + /* + * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit + * in the FEC_CONFIGURATION register to 1 before initiating link + * training + */ + intel_dp_sink_set_fec_ready(intel_dp, crtc_state); + + /* + * 5.h Follow DisplayPort specification training sequence (see notes for + * failure handling) + * 5.i If DisplayPort multi-stream - Set DP_TP_CTL link training to Idle + * Pattern, wait for 5 idle patterns (DP_TP_STATUS Min_Idles_Sent) + * (timeout after 800 us) + */ + intel_dp_start_link_train(intel_dp, crtc_state); + + /* 5.j Set DP_TP_CTL link training to Normal */ + if (!is_trans_port_sync_mode(crtc_state)) + intel_dp_stop_link_train(intel_dp, crtc_state); + + /* 5.k Configure and enable FEC if needed */ + intel_ddi_enable_fec(encoder, crtc_state); + intel_dsc_enable(encoder, crtc_state); +} + static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, struct intel_encoder *encoder, const struct intel_crtc_state *crtc_state, @@ -2573,10 +2691,7 @@ static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, */
/* 7.e Configure voltage swing and related IO settings */ - if (IS_DG2(dev_priv)) - intel_snps_phy_ddi_vswing_sequence(encoder, level); - else - tgl_ddi_vswing_sequence(encoder, crtc_state, level); + tgl_ddi_vswing_sequence(encoder, crtc_state, level);
/* * 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up @@ -2708,7 +2823,9 @@ static void intel_ddi_pre_enable_dp(struct intel_atomic_state *state, { struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
- if (DISPLAY_VER(dev_priv) >= 12) + if (IS_DG2(dev_priv)) + dg2_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state); + else if (DISPLAY_VER(dev_priv) >= 12) tgl_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state); else hsw_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state); @@ -4634,6 +4751,8 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->pipe_mask = ~0;
if (IS_DG2(dev_priv)) { + encoder->enable_clock = intel_mpllb_enable; + encoder->disable_clock = intel_mpllb_disable; encoder->get_config = dg2_ddi_get_config; } else if (IS_ALDERLAKE_S(dev_priv)) { encoder->enable_clock = adls_ddi_enable_clock;
On Thu, 01 Jul 2021, Matt Roper matthew.d.roper@intel.com wrote:
DG2 has some changes to the expected modesetting sequences when compared to gen12. Adjust our driver logic accordingly. Although the DP sequence is pretty similar to TGL's, there are some steps that change, so let's split the handling for that out into a separate function.
Bspec: 54128 Cc: Lucas De Marchi lucas.demarchi@intel.com Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/display/intel_ddi.c | 135 +++++++++++++++++++++-- 1 file changed, 127 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index ade03cf41caa..5499a2975a0e 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -172,14 +172,22 @@ void intel_wait_ddi_buf_idle(struct drm_i915_private *dev_priv, static void intel_wait_ddi_buf_active(struct drm_i915_private *dev_priv, enum port port) {
- int ret;
- /* Wait > 518 usecs for DDI_BUF_CTL to be non idle */ if (DISPLAY_VER(dev_priv) < 10) { usleep_range(518, 1000); return; }
- if (wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
DDI_BUF_IS_IDLE), 500))
- if (IS_DG2(dev_priv))
ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
DDI_BUF_IS_IDLE), 1200);
- else
ret = wait_for_us(!(intel_de_read(dev_priv, DDI_BUF_CTL(port)) &
DDI_BUF_IS_IDLE), 500);
Nitpick, this should really parametetrized the delay, 500 vs. 1200, not add duplicate the code.
BR, Jani.
- if (ret) drm_err(&dev_priv->drm, "Timeout waiting for DDI BUF %c to get active\n", port_name(port));
} @@ -2207,7 +2215,7 @@ void intel_ddi_sanitize_encoder_pll_mapping(struct intel_encoder *encoder) ddi_clk_needed = false; }
- if (ddi_clk_needed || !encoder->disable_clock ||
- if (ddi_clk_needed || !encoder->is_clock_enabled || !encoder->is_clock_enabled(encoder)) return;
@@ -2488,6 +2496,116 @@ static void intel_ddi_mso_configure(const struct intel_crtc_state *crtc_state) OVERLAP_PIXELS_MASK, dss1); }
+static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state,
struct intel_encoder *encoder,
const struct intel_crtc_state *crtc_state,
const struct drm_connector_state *conn_state)
+{
- struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
- struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
- enum phy phy = intel_port_to_phy(dev_priv, encoder->port);
- struct intel_digital_port *dig_port = enc_to_dig_port(encoder);
- bool is_mst = intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DP_MST);
- int level = intel_ddi_dp_level(intel_dp);
- intel_dp_set_link_params(intel_dp, crtc_state->port_clock,
crtc_state->lane_count);
- /*
* 1. Enable Power Wells
*
* This was handled at the beginning of intel_atomic_commit_tail(),
* before we called down into this function.
*/
- /* 2. Enable Panel Power if PPS is required */
- intel_pps_on(intel_dp);
- /*
* 3. Enable the port PLL.
*/
- intel_ddi_enable_clock(encoder, crtc_state);
- /* 4. Enable IO power */
- if (!intel_phy_is_tc(dev_priv, phy) ||
dig_port->tc_mode != TC_PORT_TBT_ALT)
dig_port->ddi_io_wakeref = intel_display_power_get(dev_priv,
dig_port->ddi_io_power_domain);
- /*
* 5. The rest of the below are substeps under the bspec's "Enable and
* Train Display Port" step. Note that steps that are specific to
* MST will be handled by intel_mst_pre_enable_dp() before/after it
* calls into this function. Also intel_mst_pre_enable_dp() only calls
* us when active_mst_links==0, so any steps designated for "single
* stream or multi-stream master transcoder" can just be performed
* unconditionally here.
*/
- /*
* 5.a Configure Transcoder Clock Select to direct the Port clock to the
* Transcoder.
*/
- intel_ddi_enable_pipe_clock(encoder, crtc_state);
- /* 5.b Not relevant to i915 for now */
- /*
* 5.c Configure TRANS_DDI_FUNC_CTL DDI Select, DDI Mode Select & MST
* Transport Select
*/
- intel_ddi_config_transcoder_func(encoder, crtc_state);
- /*
* 5.d Configure & enable DP_TP_CTL with link training pattern 1
* selected
*
* This will be handled by the intel_dp_start_link_train() farther
* down this function.
*/
- /* 5.e Configure voltage swing and related IO settings */
- intel_snps_phy_ddi_vswing_sequence(encoder, level);
- /*
* 5.f Configure and enable DDI_BUF_CTL
* 5.g Wait for DDI_BUF_CTL DDI Idle Status = 0b (Not Idle), timeout
* after 1200 us.
*
* We only configure what the register value will be here. Actual
* enabling happens during link training farther down.
*/
- intel_ddi_init_dp_buf_reg(encoder, crtc_state);
- if (!is_mst)
intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
- intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true);
- /*
* DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit
* in the FEC_CONFIGURATION register to 1 before initiating link
* training
*/
- intel_dp_sink_set_fec_ready(intel_dp, crtc_state);
- /*
* 5.h Follow DisplayPort specification training sequence (see notes for
* failure handling)
* 5.i If DisplayPort multi-stream - Set DP_TP_CTL link training to Idle
* Pattern, wait for 5 idle patterns (DP_TP_STATUS Min_Idles_Sent)
* (timeout after 800 us)
*/
- intel_dp_start_link_train(intel_dp, crtc_state);
- /* 5.j Set DP_TP_CTL link training to Normal */
- if (!is_trans_port_sync_mode(crtc_state))
intel_dp_stop_link_train(intel_dp, crtc_state);
- /* 5.k Configure and enable FEC if needed */
- intel_ddi_enable_fec(encoder, crtc_state);
- intel_dsc_enable(encoder, crtc_state);
+}
static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, struct intel_encoder *encoder, const struct intel_crtc_state *crtc_state, @@ -2573,10 +2691,7 @@ static void tgl_ddi_pre_enable_dp(struct intel_atomic_state *state, */
/* 7.e Configure voltage swing and related IO settings */
- if (IS_DG2(dev_priv))
intel_snps_phy_ddi_vswing_sequence(encoder, level);
- else
tgl_ddi_vswing_sequence(encoder, crtc_state, level);
tgl_ddi_vswing_sequence(encoder, crtc_state, level);
/*
- 7.f Combo PHY: Configure PORT_CL_DW10 Static Power Down to power up
@@ -2708,7 +2823,9 @@ static void intel_ddi_pre_enable_dp(struct intel_atomic_state *state, { struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
- if (DISPLAY_VER(dev_priv) >= 12)
- if (IS_DG2(dev_priv))
dg2_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state);
- else if (DISPLAY_VER(dev_priv) >= 12) tgl_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state); else hsw_ddi_pre_enable_dp(state, encoder, crtc_state, conn_state);
@@ -4634,6 +4751,8 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port) encoder->pipe_mask = ~0;
if (IS_DG2(dev_priv)) {
encoder->enable_clock = intel_mpllb_enable;
encoder->get_config = dg2_ddi_get_config; } else if (IS_ALDERLAKE_S(dev_priv)) { encoder->enable_clock = adls_ddi_enable_clock;encoder->disable_clock = intel_mpllb_disable;
Although the bspec labels four of DG2's outputs as "combo PHY," the underlying PHYs in both cases are actually Synopsys PHYs that are programmed completely differently than the traditional Intel "combo" PHY units. As such, we don't want intel_phy_is_combo to take us down legacy programming paths, so just return false from it on DG2. Instead add a new intel_phy_is_snps() that will return true for all DG2 PHYs.
Cc: Anusha Srivatsa anusha.srivatsa@intel.com Cc: Matt Atwood matthew.s.atwood@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display.c | 26 +++++++++++++++++++- drivers/gpu/drm/i915/display/intel_display.h | 1 + 2 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index cce520b6dfcf..9655f1b1b41b 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3698,6 +3698,13 @@ bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy) { if (phy == PHY_NONE) return false; + else if (IS_DG2(dev_priv)) + /* + * DG2 outputs labelled as "combo PHY" in the bspec use + * SNPS PHYs with completely different programming, + * hence we always return false here. + */ + return false; else if (IS_ALDERLAKE_S(dev_priv)) return phy <= PHY_E; else if (IS_DG1(dev_priv) || IS_ROCKETLAKE(dev_priv)) @@ -3712,7 +3719,10 @@ bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy)
bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy) { - if (IS_ALDERLAKE_P(dev_priv)) + if (IS_DG2(dev_priv)) + /* DG2's "TC1" output uses a SNPS PHY */ + return false; + else if (IS_ALDERLAKE_P(dev_priv)) return phy >= PHY_F && phy <= PHY_I; else if (IS_TIGERLAKE(dev_priv)) return phy >= PHY_D && phy <= PHY_I; @@ -3722,6 +3732,20 @@ bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy) return false; }
+bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy) +{ + if (phy == PHY_NONE) + return false; + else if (IS_DG2(dev_priv)) + /* + * All four "combo" ports and the TC1 port (PHY E) use + * Synopsis PHYs. + */ + return phy <= PHY_E; + + return false; +} + enum phy intel_port_to_phy(struct drm_i915_private *i915, enum port port) { if (DISPLAY_VER(i915) >= 13 && port >= PORT_D_XELPD) diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h index c9dbaf074d77..284936f0ddab 100644 --- a/drivers/gpu/drm/i915/display/intel_display.h +++ b/drivers/gpu/drm/i915/display/intel_display.h @@ -561,6 +561,7 @@ struct drm_display_mode * intel_encoder_current_mode(struct intel_encoder *encoder); bool intel_phy_is_combo(struct drm_i915_private *dev_priv, enum phy phy); bool intel_phy_is_tc(struct drm_i915_private *dev_priv, enum phy phy); +bool intel_phy_is_snps(struct drm_i915_private *dev_priv, enum phy phy); enum tc_port intel_port_to_tc(struct drm_i915_private *dev_priv, enum port port); int intel_get_pipe_from_crtc_id_ioctl(struct drm_device *dev, void *data,
Initialization of the PHY is handled by the hardware/firmware, but the driver should wait up to 25ms for the PHY to report that its calibration has completed.
Bspec: 49189 Bspec: 50107 Cc: Matt Atwood matthew.s.atwood@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../gpu/drm/i915/display/intel_display_power.c | 5 +++++ drivers/gpu/drm/i915/display/intel_snps_phy.c | 15 +++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 3 +++ drivers/gpu/drm/i915/i915_reg.h | 1 + 4 files changed, 24 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index df6358638fee..83bc2e691560 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -18,6 +18,7 @@ #include "intel_pm.h" #include "intel_pps.h" #include "intel_sideband.h" +#include "intel_snps_phy.h" #include "intel_tc.h" #include "intel_vga.h"
@@ -5899,6 +5900,10 @@ static void icl_display_core_init(struct drm_i915_private *dev_priv, if (DISPLAY_VER(dev_priv) >= 12) tgl_bw_buddy_init(dev_priv);
+ /* 8. Ensure PHYs have completed calibration and adaptation */ + if (IS_DG2(dev_priv)) + intel_snps_phy_wait_for_calibration(dev_priv); + if (resume && intel_dmc_has_payload(dev_priv)) intel_dmc_load_program(dev_priv);
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index 77759bda98a4..f0c30d3d2dfb 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -21,6 +21,21 @@ * since it is not handled by the shared DPLL framework as on other platforms. */
+void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv) +{ + enum phy phy; + + for_each_phy_masked(phy, ~0) { + if (!intel_phy_is_snps(dev_priv, phy)) + continue; + + if (intel_de_wait_for_clear(dev_priv, ICL_PHY_MISC(phy), + DG2_PHY_DP_TX_ACK_MASK, 25)) + DRM_ERROR("SNPS PHY %c failed to calibrate after 25ms.\n", + phy); + } +} + static const u32 dg2_ddi_translations[] = { /* VS 0, pre-emph 0 */ REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index 3ce92d424f66..6aa33ff729ec 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -8,10 +8,13 @@
#include <linux/types.h>
+struct drm_i915_private; struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state;
+void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv); + int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, struct intel_encoder *encoder); void intel_mpllb_enable(struct intel_encoder *encoder, diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 203056b9f02c..e3a165eb4fb6 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -12442,6 +12442,7 @@ enum skl_power_gate { _ICL_PHY_MISC_B) #define ICL_PHY_MISC_MUX_DDID (1 << 28) #define ICL_PHY_MISC_DE_IO_COMP_PWR_DOWN (1 << 23) +#define DG2_PHY_DP_TX_ACK_MASK REG_GENMASK(23, 20)
/* Icelake Display Stream Compression Registers */ #define DSCA_PICTURE_PARAMETER_SET_0 _MMIO(0x6B200)
From: Gwan-gyeong Mun gwan-gyeong.mun@intel.com
The PSR enable/disable sequences now require that we program an extra register in the PHY to adjust the lane disable power setting.
Bspec: 49274 Bspec: 53885 Cc: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Gwan-gyeong Mun gwan-gyeong.mun@intel.com --- drivers/gpu/drm/i915/display/intel_psr.c | 7 +++++++ drivers/gpu/drm/i915/display/intel_snps_phy.c | 14 ++++++++++++++ drivers/gpu/drm/i915/display/intel_snps_phy.h | 3 +++ drivers/gpu/drm/i915/i915_reg.h | 3 +++ 4 files changed, 27 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 9643624fe160..4ba5337064ea 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -32,6 +32,7 @@ #include "intel_dp_aux.h" #include "intel_hdmi.h" #include "intel_psr.h" +#include "intel_snps_phy.h" #include "intel_sprite.h" #include "skl_universal_plane.h"
@@ -1206,6 +1207,7 @@ static void intel_psr_enable_locked(struct intel_dp *intel_dp, { struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp); struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + enum phy phy = intel_port_to_phy(dev_priv, dig_port->base.port); struct intel_encoder *encoder = &dig_port->base; u32 val;
@@ -1231,6 +1233,7 @@ static void intel_psr_enable_locked(struct intel_dp *intel_dp, intel_dp_compute_psr_vsc_sdp(intel_dp, crtc_state, conn_state, &intel_dp->psr.vsc); intel_write_dp_vsc_sdp(encoder, crtc_state, &intel_dp->psr.vsc); + intel_snps_phy_update_psr_power_state(dev_priv, phy, true); intel_psr_enable_sink(intel_dp); intel_psr_enable_source(intel_dp); intel_dp->psr.enabled = true; @@ -1327,6 +1330,8 @@ static void intel_psr_wait_exit_locked(struct intel_dp *intel_dp) static void intel_psr_disable_locked(struct intel_dp *intel_dp) { struct drm_i915_private *dev_priv = dp_to_i915(intel_dp); + enum phy phy = intel_port_to_phy(dev_priv, + dp_to_dig_port(intel_dp)->base.port);
lockdep_assert_held(&intel_dp->psr.lock);
@@ -1353,6 +1358,8 @@ static void intel_psr_disable_locked(struct intel_dp *intel_dp) TRANS_SET_CONTEXT_LATENCY(intel_dp->psr.transcoder), TRANS_SET_CONTEXT_LATENCY_MASK, 0);
+ intel_snps_phy_update_psr_power_state(dev_priv, phy, false); + /* Disable PSR on Sink */ drm_dp_dpcd_writeb(&intel_dp->aux, DP_PSR_EN_CFG, 0);
diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c b/drivers/gpu/drm/i915/display/intel_snps_phy.c index f0c30d3d2dfb..18b52b64af95 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c @@ -36,6 +36,20 @@ void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv) } }
+void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv, + enum phy phy, bool enable) +{ + u32 val; + + if (!intel_phy_is_snps(dev_priv, phy)) + return; + + val = REG_FIELD_PREP(SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR, + enable ? 2 : 3); + intel_uncore_rmw(&dev_priv->uncore, SNPS_PHY_TX_REQ(phy), + SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR, val); +} + static const u32 dg2_ddi_translations[] = { /* VS 0, pre-emph 0 */ REG_FIELD_PREP(SNPS_PHY_TX_EQ_MAIN, 26), diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.h b/drivers/gpu/drm/i915/display/intel_snps_phy.h index 6aa33ff729ec..6261ff88ef5c 100644 --- a/drivers/gpu/drm/i915/display/intel_snps_phy.h +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.h @@ -12,8 +12,11 @@ struct drm_i915_private; struct intel_encoder; struct intel_crtc_state; struct intel_mpllb_state; +enum phy;
void intel_snps_phy_wait_for_calibration(struct drm_i915_private *dev_priv); +void intel_snps_phy_update_psr_power_state(struct drm_i915_private *dev_priv, + enum phy phy, bool enable);
int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state, struct intel_encoder *encoder); diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index e3a165eb4fb6..9c5426aeddff 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2340,6 +2340,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg) #define SNPS_PHY_REF_CONTROL(phy) _MMIO_SNPS(phy, 0x168188) #define SNPS_PHY_REF_CONTROL_REF_RANGE REG_GENMASK(31, 27)
+#define SNPS_PHY_TX_REQ(phy) _MMIO_SNPS(phy, 0x168200) +#define SNPS_PHY_TX_REQ_LN_DIS_PWR_STATE_PSR REG_GENMASK(31, 30) + #define SNPS_PHY_TX_EQ(ln, phy) _MMIO_SNPS_LN(ln, phy, 0x168300) #define SNPS_PHY_TX_EQ_MAIN REG_GENMASK(23, 18) #define SNPS_PHY_TX_EQ_POST REG_GENMASK(15, 10)
From: José Roberto de Souza jose.souza@intel.com
PSR2 is not supported on DG2.
Cc: Caz Yokoyama Caz.Yokoyama@intel.com Cc: Gwan-gyeong Mun gwan-gyeong.mun@intel.com Signed-off-by: José Roberto de Souza jose.souza@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_psr.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 4ba5337064ea..422e48927b5b 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -866,7 +866,8 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp, }
/* Wa_16011181250 */ - if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv)) { + if (IS_ROCKETLAKE(dev_priv) || IS_ALDERLAKE_S(dev_priv) || + IS_DG2(dev_priv)) { drm_dbg_kms(&dev_priv->drm, "PSR2 is defeatured for this platform\n"); return false; }
From: Anusha Srivatsa anusha.srivatsa@intel.com
DSC can be supported per DP connector. This patch creates a per connector debugfs node to expose the Input and Compressed BPP.
The same node can be used from userspace to force DSC to a certain BPP.
force_dsc_bpp is written through this debugfs node to force DSC BPP to all accepted values
Cc: Vandita Kulkarni vandita.kulkarni@intel.com Cc: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Patnana Venkata Sai venkata.sai.patnana@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- .../drm/i915/display/intel_display_debugfs.c | 103 +++++++++++++++++- .../drm/i915/display/intel_display_types.h | 1 + 2 files changed, 103 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c index af9e58619667..1805d70ea817 100644 --- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c +++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c @@ -2389,6 +2389,100 @@ static const struct file_operations i915_dsc_fec_support_fops = { .write = i915_dsc_fec_support_write };
+static int i915_dsc_bpp_support_show(struct seq_file *m, void *data) +{ + struct drm_connector *connector = m->private; + struct drm_device *dev = connector->dev; + struct drm_crtc *crtc; + struct intel_dp *intel_dp; + struct drm_modeset_acquire_ctx ctx; + struct intel_crtc_state *crtc_state = NULL; + int ret = 0; + bool try_again = false; + + drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE); + + do { + try_again = false; + ret = drm_modeset_lock(&dev->mode_config.connection_mutex, + &ctx); + if (ret) { + ret = -EINTR; + break; + } + crtc = connector->state->crtc; + if (connector->status != connector_status_connected || !crtc) { + ret = -ENODEV; + break; + } + ret = drm_modeset_lock(&crtc->mutex, &ctx); + if (ret == -EDEADLK) { + ret = drm_modeset_backoff(&ctx); + if (!ret) { + try_again = true; + continue; + } + break; + } else if (ret) { + break; + } + intel_dp = intel_attached_dp(to_intel_connector(connector)); + crtc_state = to_intel_crtc_state(crtc->state); + seq_printf(m, "Input_BPP: %d\n", crtc_state->pipe_bpp); + seq_printf(m, "Compressed_BPP: %d\n", + crtc_state->dsc.compressed_bpp); + } while (try_again); + + drm_modeset_drop_locks(&ctx); + drm_modeset_acquire_fini(&ctx); + + return ret; +} + +static ssize_t i915_dsc_bpp_support_write(struct file *file, + const char __user *ubuf, + size_t len, loff_t *offp) +{ + int dsc_bpp = 0; + int ret; + struct drm_connector *connector = + ((struct seq_file *)file->private_data)->private; + struct intel_encoder *encoder = intel_attached_encoder(to_intel_connector(connector)); + struct drm_i915_private *i915 = to_i915(encoder->base.dev); + struct intel_dp *intel_dp = enc_to_intel_dp(encoder); + + if (len == 0) + return 0; + + drm_dbg(&i915->drm, + "Copied %zu bytes from user to force BPP\n", len); + + ret = kstrtoint_from_user(ubuf, len, 0, &dsc_bpp); + + intel_dp->force_dsc_bpp = dsc_bpp; + if (ret < 0) + return ret; + + *offp += len; + return len; +} + +static int i915_dsc_bpp_support_open(struct inode *inode, + struct file *file) +{ + return single_open(file, i915_dsc_bpp_support_show, + inode->i_private); +} + +static const struct file_operations i915_dsc_bpp_support_fops = { + .owner = THIS_MODULE, + .open = i915_dsc_bpp_support_open, + .read = seq_read, + .llseek = seq_lseek, + .release = single_release, + .write = i915_dsc_bpp_support_write +}; + /** * intel_connector_debugfs_add - add i915 specific connector debugfs files * @connector: pointer to a registered drm_connector @@ -2427,9 +2521,16 @@ int intel_connector_debugfs_add(struct drm_connector *connector) connector, &i915_hdcp_sink_capability_fops); }
- if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && !to_intel_connector(connector)->mst_port) || connector->connector_type == DRM_MODE_CONNECTOR_eDP)) + if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && + ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && + !to_intel_connector(connector)->mst_port) || + connector->connector_type == DRM_MODE_CONNECTOR_eDP)) { debugfs_create_file("i915_dsc_fec_support", S_IRUGO, root, connector, &i915_dsc_fec_support_fops); + debugfs_create_file("i915_dsc_bpp_support", 0444, + root, connector, + &i915_dsc_bpp_support_fops); + }
/* Legacy panels doesn't lpsp on any platform */ if ((DISPLAY_VER(dev_priv) >= 9 || IS_HASWELL(dev_priv) || diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index 29ae1d9b5abc..00320d89d266 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -1627,6 +1627,7 @@ struct intel_dp {
/* Display stream compression testing */ bool force_dsc_en; + int force_dsc_bpp;
bool hobl_failed; bool hobl_active;
On Thu, 01 Jul 2021, Matt Roper matthew.d.roper@intel.com wrote:
From: Anusha Srivatsa anusha.srivatsa@intel.com
DSC can be supported per DP connector. This patch creates a per connector debugfs node to expose the Input and Compressed BPP.
The same node can be used from userspace to force DSC to a certain BPP.
force_dsc_bpp is written through this debugfs node to force DSC BPP to all accepted values
I think this patch needs rework, and it's independent of the rest of the series. Please just drop this one and the next.
BR, Jani.
Cc: Vandita Kulkarni vandita.kulkarni@intel.com Cc: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Patnana Venkata Sai venkata.sai.patnana@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
.../drm/i915/display/intel_display_debugfs.c | 103 +++++++++++++++++- .../drm/i915/display/intel_display_types.h | 1 + 2 files changed, 103 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c index af9e58619667..1805d70ea817 100644 --- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c +++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c @@ -2389,6 +2389,100 @@ static const struct file_operations i915_dsc_fec_support_fops = { .write = i915_dsc_fec_support_write };
+static int i915_dsc_bpp_support_show(struct seq_file *m, void *data) +{
- struct drm_connector *connector = m->private;
- struct drm_device *dev = connector->dev;
- struct drm_crtc *crtc;
- struct intel_dp *intel_dp;
- struct drm_modeset_acquire_ctx ctx;
- struct intel_crtc_state *crtc_state = NULL;
- int ret = 0;
- bool try_again = false;
- drm_modeset_acquire_init(&ctx, DRM_MODESET_ACQUIRE_INTERRUPTIBLE);
- do {
try_again = false;
ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
&ctx);
if (ret) {
ret = -EINTR;
break;
}
crtc = connector->state->crtc;
if (connector->status != connector_status_connected || !crtc) {
ret = -ENODEV;
break;
}
ret = drm_modeset_lock(&crtc->mutex, &ctx);
if (ret == -EDEADLK) {
ret = drm_modeset_backoff(&ctx);
if (!ret) {
try_again = true;
continue;
}
break;
} else if (ret) {
break;
}
intel_dp = intel_attached_dp(to_intel_connector(connector));
crtc_state = to_intel_crtc_state(crtc->state);
seq_printf(m, "Input_BPP: %d\n", crtc_state->pipe_bpp);
seq_printf(m, "Compressed_BPP: %d\n",
crtc_state->dsc.compressed_bpp);
- } while (try_again);
- drm_modeset_drop_locks(&ctx);
- drm_modeset_acquire_fini(&ctx);
- return ret;
+}
+static ssize_t i915_dsc_bpp_support_write(struct file *file,
const char __user *ubuf,
size_t len, loff_t *offp)
+{
- int dsc_bpp = 0;
- int ret;
- struct drm_connector *connector =
((struct seq_file *)file->private_data)->private;
- struct intel_encoder *encoder = intel_attached_encoder(to_intel_connector(connector));
- struct drm_i915_private *i915 = to_i915(encoder->base.dev);
- struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
- if (len == 0)
return 0;
- drm_dbg(&i915->drm,
"Copied %zu bytes from user to force BPP\n", len);
- ret = kstrtoint_from_user(ubuf, len, 0, &dsc_bpp);
- intel_dp->force_dsc_bpp = dsc_bpp;
- if (ret < 0)
return ret;
- *offp += len;
- return len;
+}
+static int i915_dsc_bpp_support_open(struct inode *inode,
struct file *file)
+{
- return single_open(file, i915_dsc_bpp_support_show,
inode->i_private);
+}
+static const struct file_operations i915_dsc_bpp_support_fops = {
- .owner = THIS_MODULE,
- .open = i915_dsc_bpp_support_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
- .write = i915_dsc_bpp_support_write
+};
/**
- intel_connector_debugfs_add - add i915 specific connector debugfs files
- @connector: pointer to a registered drm_connector
@@ -2427,9 +2521,16 @@ int intel_connector_debugfs_add(struct drm_connector *connector) connector, &i915_hdcp_sink_capability_fops); }
- if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) && ((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && !to_intel_connector(connector)->mst_port) || connector->connector_type == DRM_MODE_CONNECTOR_eDP))
if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) &&
((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort &&
!to_intel_connector(connector)->mst_port) ||
connector->connector_type == DRM_MODE_CONNECTOR_eDP)) {
debugfs_create_file("i915_dsc_fec_support", S_IRUGO, root, connector, &i915_dsc_fec_support_fops);
debugfs_create_file("i915_dsc_bpp_support", 0444,
root, connector,
&i915_dsc_bpp_support_fops);
}
/* Legacy panels doesn't lpsp on any platform */ if ((DISPLAY_VER(dev_priv) >= 9 || IS_HASWELL(dev_priv) ||
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index 29ae1d9b5abc..00320d89d266 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -1627,6 +1627,7 @@ struct intel_dp {
/* Display stream compression testing */ bool force_dsc_en;
int force_dsc_bpp;
bool hobl_failed; bool hobl_active;
Now, this change and changes from patch 51 of this series is taken care as part of the below series https://patchwork.freedesktop.org/series/92750/ and merged.
Thanks, Vandita
-----Original Message----- From: Jani Nikula jani.nikula@linux.intel.com Sent: Friday, July 2, 2021 1:50 PM To: Roper, Matthew D matthew.d.roper@intel.com; intel- gfx@lists.freedesktop.org Cc: Patnana, Venkata Sai venkata.sai.patnana@intel.com; Srivatsa, Anusha anusha.srivatsa@intel.com; dri-devel@lists.freedesktop.org; Navare, Manasi D manasi.d.navare@intel.com; Kulkarni, Vandita vandita.kulkarni@intel.com Subject: Re: [PATCH 50/53] drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable
On Thu, 01 Jul 2021, Matt Roper matthew.d.roper@intel.com wrote:
From: Anusha Srivatsa anusha.srivatsa@intel.com
DSC can be supported per DP connector. This patch creates a per connector debugfs node to expose the Input and Compressed BPP.
The same node can be used from userspace to force DSC to a certain BPP.
force_dsc_bpp is written through this debugfs node to force DSC BPP to all accepted values
I think this patch needs rework, and it's independent of the rest of the series. Please just drop this one and the next.
BR, Jani.
Cc: Vandita Kulkarni vandita.kulkarni@intel.com Cc: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Patnana Venkata Sai venkata.sai.patnana@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
.../drm/i915/display/intel_display_debugfs.c | 103 +++++++++++++++++- .../drm/i915/display/intel_display_types.h | 1 + 2 files changed, 103 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c b/drivers/gpu/drm/i915/display/intel_display_debugfs.c index af9e58619667..1805d70ea817 100644 --- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c +++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c @@ -2389,6 +2389,100 @@ static const struct file_operations
i915_dsc_fec_support_fops = {
.write = i915_dsc_fec_support_write };
+static int i915_dsc_bpp_support_show(struct seq_file *m, void *data) +{
- struct drm_connector *connector = m->private;
- struct drm_device *dev = connector->dev;
- struct drm_crtc *crtc;
- struct intel_dp *intel_dp;
- struct drm_modeset_acquire_ctx ctx;
- struct intel_crtc_state *crtc_state = NULL;
- int ret = 0;
- bool try_again = false;
- drm_modeset_acquire_init(&ctx,
DRM_MODESET_ACQUIRE_INTERRUPTIBLE);
- do {
try_again = false;
ret = drm_modeset_lock(&dev-
mode_config.connection_mutex,
&ctx);
if (ret) {
ret = -EINTR;
break;
}
crtc = connector->state->crtc;
if (connector->status != connector_status_connected ||
!crtc) {
ret = -ENODEV;
break;
}
ret = drm_modeset_lock(&crtc->mutex, &ctx);
if (ret == -EDEADLK) {
ret = drm_modeset_backoff(&ctx);
if (!ret) {
try_again = true;
continue;
}
break;
} else if (ret) {
break;
}
intel_dp =
intel_attached_dp(to_intel_connector(connector));
crtc_state = to_intel_crtc_state(crtc->state);
seq_printf(m, "Input_BPP: %d\n", crtc_state->pipe_bpp);
seq_printf(m, "Compressed_BPP: %d\n",
crtc_state->dsc.compressed_bpp);
- } while (try_again);
- drm_modeset_drop_locks(&ctx);
- drm_modeset_acquire_fini(&ctx);
- return ret;
+}
+static ssize_t i915_dsc_bpp_support_write(struct file *file,
const char __user *ubuf,
size_t len, loff_t *offp)
+{
- int dsc_bpp = 0;
- int ret;
- struct drm_connector *connector =
((struct seq_file *)file->private_data)->private;
- struct intel_encoder *encoder =
intel_attached_encoder(to_intel_connector(connector));
- struct drm_i915_private *i915 = to_i915(encoder->base.dev);
- struct intel_dp *intel_dp = enc_to_intel_dp(encoder);
- if (len == 0)
return 0;
- drm_dbg(&i915->drm,
"Copied %zu bytes from user to force BPP\n", len);
- ret = kstrtoint_from_user(ubuf, len, 0, &dsc_bpp);
- intel_dp->force_dsc_bpp = dsc_bpp;
- if (ret < 0)
return ret;
- *offp += len;
- return len;
+}
+static int i915_dsc_bpp_support_open(struct inode *inode,
struct file *file)
+{
- return single_open(file, i915_dsc_bpp_support_show,
inode->i_private);
+}
+static const struct file_operations i915_dsc_bpp_support_fops = {
- .owner = THIS_MODULE,
- .open = i915_dsc_bpp_support_open,
- .read = seq_read,
- .llseek = seq_lseek,
- .release = single_release,
- .write = i915_dsc_bpp_support_write
+};
/**
- intel_connector_debugfs_add - add i915 specific connector debugfs files
- @connector: pointer to a registered drm_connector @@ -2427,9
+2521,16 @@ int intel_connector_debugfs_add(struct drm_connector
*connector)
connector,
&i915_hdcp_sink_capability_fops);
}
- if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) &&
((connector->connector_type == DRM_MODE_CONNECTOR_DisplayPort && !to_intel_connector(connector)->mst_port) || connector->connector_type == DRM_MODE_CONNECTOR_eDP))
- if ((DISPLAY_VER(dev_priv) >= 11 || IS_CANNONLAKE(dev_priv)) &&
((connector->connector_type ==
DRM_MODE_CONNECTOR_DisplayPort &&
!to_intel_connector(connector)->mst_port) ||
connector->connector_type == DRM_MODE_CONNECTOR_eDP))
{
debugfs_create_file("i915_dsc_fec_support", S_IRUGO,
root,
connector, &i915_dsc_fec_support_fops);
debugfs_create_file("i915_dsc_bpp_support", 0444,
root, connector,
&i915_dsc_bpp_support_fops);
}
/* Legacy panels doesn't lpsp on any platform */ if ((DISPLAY_VER(dev_priv) >= 9 || IS_HASWELL(dev_priv) || diff
--git a/drivers/gpu/drm/i915/display/intel_display_types.h b/drivers/gpu/drm/i915/display/intel_display_types.h index 29ae1d9b5abc..00320d89d266 100644 --- a/drivers/gpu/drm/i915/display/intel_display_types.h +++ b/drivers/gpu/drm/i915/display/intel_display_types.h @@ -1627,6 +1627,7 @@ struct intel_dp {
/* Display stream compression testing */ bool force_dsc_en;
int force_dsc_bpp;
bool hobl_failed; bool hobl_active;
-- Jani Nikula, Intel Open Source Graphics Center
From: Anusha Srivatsa anusha.srivatsa@intel.com
Set compress BPP in kernel while connector DP or eDP
Cc: Vandita Kulkarni vandita.kulkarni@intel.com Cc: Navare Manasi D manasi.d.navare@intel.com Signed-off-by: Anusha Srivatsa anusha.srivatsa@intel.com Signed-off-by: Patnana Venkata Sai venkata.sai.patnana@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_dp.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 5b52beaddada..57aadee69d8b 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -1241,9 +1241,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp, pipe_config->lane_count = limits->max_lane_count;
if (intel_dp_is_edp(intel_dp)) { - pipe_config->dsc.compressed_bpp = - min_t(u16, drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, - pipe_config->pipe_bpp); + if (intel_dp->force_dsc_bpp) { + drm_dbg_kms(&dev_priv->drm, + "DSC BPC forced to %d", intel_dp->force_dsc_bpp); + pipe_config->dsc.compressed_bpp = intel_dp->force_dsc_bpp; + } else { + pipe_config->dsc.compressed_bpp = + min_t(u16, drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, + pipe_config->pipe_bpp); + } pipe_config->dsc.slice_count = drm_dp_dsc_sink_max_slice_count(intel_dp->dsc_dpcd, true); @@ -1269,9 +1275,15 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp, "Compressed BPP/Slice Count not supported\n"); return -EINVAL; } - pipe_config->dsc.compressed_bpp = min_t(u16, + if (intel_dp->force_dsc_bpp) { + drm_dbg_kms(&dev_priv->drm, + "DSC BPC forced to %d\n", intel_dp->force_dsc_bpp); + pipe_config->dsc.compressed_bpp = intel_dp->force_dsc_bpp; + } else { + pipe_config->dsc.compressed_bpp = min_t(u16, dsc_max_output_bpp >> 4, pipe_config->pipe_bpp); + } pipe_config->dsc.slice_count = dsc_dp_slice_count; } /* @@ -1374,7 +1386,8 @@ intel_dp_compute_link_config(struct intel_encoder *encoder, * Pipe joiner needs compression upto display12 due to BW limitation. DG2 * onwards pipe joiner can be enabled without compression. */ - drm_dbg_kms(&i915->drm, "Force DSC en = %d\n", intel_dp->force_dsc_en); + drm_dbg_kms(&i915->drm, "Force DSC en = %d\n Force DSC BPP = %d\n", + intel_dp->force_dsc_en, intel_dp->force_dsc_bpp); if (ret || intel_dp->force_dsc_en || (DISPLAY_VER(i915) < 13 && pipe_config->bigjoiner)) { ret = intel_dp_dsc_compute_config(intel_dp, pipe_config,
From: Animesh Manna animesh.manna@intel.com
In verify_mpllb_state() encoder is retrieved from best_encoder of connector_state. As there will be only one connector_state for bigjoiner and checking encoder may not be needed for bigjoiner-slave. This code path related to mpll is done on dg2 and need this fix to avoid null pointer dereference issue.
Cc: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Animesh Manna animesh.manna@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_display.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 9655f1b1b41b..3f4e811145b6 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state, if (!new_crtc_state->hw.active) return;
+ if (new_crtc_state->bigjoiner_slave) + return; + encoder = intel_get_crtc_new_encoder(state, new_crtc_state); intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state);
On Thu, Jul 01, 2021 at 01:24:26PM -0700, Matt Roper wrote:
From: Animesh Manna animesh.manna@intel.com
In verify_mpllb_state() encoder is retrieved from best_encoder of connector_state. As there will be only one connector_state for bigjoiner and checking encoder may not be needed for bigjoiner-slave. This code path related to mpll is done on dg2 and need this fix to avoid null pointer dereference issue.
Cc: Manasi Navare manasi.d.navare@intel.com Signed-off-by: Animesh Manna animesh.manna@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
Reviewed-by: Manasi Navare manasi.d.navare@intel.com
Manasi
drivers/gpu/drm/i915/display/intel_display.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 9655f1b1b41b..3f4e811145b6 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state, if (!new_crtc_state->hw.active) return;
- if (new_crtc_state->bigjoiner_slave)
return;
- encoder = intel_get_crtc_new_encoder(state, new_crtc_state); intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state);
-- 2.25.4
From: Ankit Nautiyal ankit.k.nautiyal@intel.com
Add the functions to configure HDMI2.1 pcon for DG2, before DP link training.
Signed-off-by: Ankit Nautiyal ankit.k.nautiyal@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/display/intel_ddi.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c index 5499a2975a0e..77f79f3269a1 100644 --- a/drivers/gpu/drm/i915/display/intel_ddi.c +++ b/drivers/gpu/drm/i915/display/intel_ddi.c @@ -2580,6 +2580,7 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state, if (!is_mst) intel_dp_set_power(intel_dp, DP_SET_POWER_D0);
+ intel_dp_configure_protocol_converter(intel_dp, crtc_state); intel_dp_sink_set_decompression_state(intel_dp, crtc_state, true); /* * DDI FEC: "anticipates enabling FEC encoding sets the FEC_READY bit @@ -2587,6 +2588,8 @@ static void dg2_ddi_pre_enable_dp(struct intel_atomic_state *state, * training */ intel_dp_sink_set_fec_ready(intel_dp, crtc_state); + intel_dp_check_frl_training(intel_dp); + intel_dp_pcon_dsc_configure(intel_dp, crtc_state);
/* * 5.h Follow DisplayPort specification training sequence (see notes for
dri-devel@lists.freedesktop.org