Our uncore MMIO functions for reading/writing registers have become very complicated over time. There's significant macro magic used to generate several nearly-identical functions that only really differ in terms of which platform-specific shadow register table they should check on write operations. We can significantly simplify our MMIO handlers by storing a reference to the current platform's shadow table within the 'struct intel_uncore' the same way we already do for forcewake; this allows us to consolidate the multiple variants of each 'write' function down to just a single 'fwtable' version that gets the shadow table out of the uncore struct rather than hardcoding the name of a specific platform's table. We can do similar consolidation on the MMIO read side by creating a single-entry forcewake table to replace the open-coded range check they had been using previously.
The final patch of the series adds a new shadow table for DG2; this becomes quite clean and simple now, given the refactoring in the first five patches.
Aside from simplifying the code signficantly, this series reduces the size of the generated .ko in exchange for adding an extra pointer indirection to access the tables. The size deltas (for just the first five patches, before we add an additional table in the final patch) are:
Old: $ size drivers/gpu/drm/i915/i915.ko text data bss dec hex filename 2865921 88972 2912 2957805 2d21ed drivers/gpu/drm/i915/i915.ko
New: $ size drivers/gpu/drm/i915/i915.ko text data bss dec hex filename 2854181 88236 2912 2945329 2cf131 drivers/gpu/drm/i915/i915.ko
The code size deltas will become larger as we add more platforms; we already add one new platform table in the final patch of this series and our next few platforms are all expected to bring new shadow tables as well.
I don't think the impact of the indirect table reference for shadow tables should be a concern for a few reasons: * The stored table + indirect lookup design is already deemed good enough for forcewake, which is used more frequently (both reads and writes, compared to shadow tables which are only used for writes) and operates on much larger tables. * Performance-critical sections of the code or those read/writing lots of registers in a batch usually do an explicit grab of the relevant forcewake domains and then perform their MMIO operations via *_fw() functions without considering shadowed registers and bypassing all of the table lookups. * In v2 of the series, we still apply NEEDS_FORCE_WAKE() checks that will bypass all of the forcewake and shadow logic for display register writes.
v2: - Drop orphaned definition of __gen6_reg_read_fw_domains. (Tvrtko) - Restore NEEDS_FORCE_WAKE() check to __fwtable_reg_{read,write}_fw_domains, but update the definition of NEEDS_FORCE_WAKE to also return 'true' on offsets above GEN11_BSD_RING_BASE for compatibility with gen11+ platforms. (Chris, Tvrtko).
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk
Matt Roper (6): drm/i915/uncore: Convert gen6/gen7 read operations to fwtable drm/i915/uncore: Associate shadow table with uncore drm/i915/uncore: Replace gen8 write functions with general fwtable drm/i915/uncore: Drop gen11/gen12 mmio write handlers drm/i915/uncore: Drop gen11 mmio read handlers drm/i915/dg2: Add DG2-specific shadow register table
drivers/gpu/drm/i915/intel_uncore.c | 200 ++++++++++-------- drivers/gpu/drm/i915/intel_uncore.h | 7 + drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + 3 files changed, 115 insertions(+), 93 deletions(-)
On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we simply check whether the register offset is < 0x40000, and return FORCEWAKE_RENDER if it is. To prepare for upcoming refactoring, let's define a single-entry forcewake table from [0x0, 0x3ffff] and switch these platforms over to use the fwtable reader functions.
v2: - Drop __gen6_reg_read_fw_domains which is no longer used. (Tvrtko)
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index f9767054dbdf..8c09af1e9f7a 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -853,16 +853,6 @@ void assert_forcewakes_active(struct intel_uncore *uncore, /* We give fast paths for the really cool registers */ #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
-#define __gen6_reg_read_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd; \ - if (NEEDS_FORCE_WAKE(offset)) \ - __fwd = FORCEWAKE_RENDER; \ - else \ - __fwd = 0; \ - __fwd; \ -}) - static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { if (offset < entry->start) @@ -1064,6 +1054,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) __fwd; \ })
+static const struct intel_forcewake_range __gen6_fw_ranges[] = { + GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER), +}; + /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */ static const struct intel_forcewake_range __chv_fw_ranges[] = { GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER), @@ -1623,7 +1617,6 @@ __gen_read(func, 64)
__gen_reg_read_funcs(gen11_fwtable); __gen_reg_read_funcs(fwtable); -__gen_reg_read_funcs(gen6);
#undef __gen_reg_read_funcs #undef GEN6_READ_FOOTER @@ -2111,15 +2104,17 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 6, 7)) { + ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen6); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); }
uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier;
On 10/09/2021 21:10, Matt Roper wrote:
On gen6-gen8 (except vlv/chv) we don't use a forcewake lookup table; we simply check whether the register offset is < 0x40000, and return FORCEWAKE_RENDER if it is. To prepare for upcoming refactoring, let's define a single-entry forcewake table from [0x0, 0x3ffff] and switch these platforms over to use the fwtable reader functions.
v2:
- Drop __gen6_reg_read_fw_domains which is no longer used. (Tvrtko)
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index f9767054dbdf..8c09af1e9f7a 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -853,16 +853,6 @@ void assert_forcewakes_active(struct intel_uncore *uncore, /* We give fast paths for the really cool registers */ #define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000)
-#define __gen6_reg_read_fw_domains(uncore, offset) \ -({ \
- enum forcewake_domains __fwd; \
- if (NEEDS_FORCE_WAKE(offset)) \
__fwd = FORCEWAKE_RENDER; \
- else \
__fwd = 0; \
- __fwd; \
-})
- static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { if (offset < entry->start)
@@ -1064,6 +1054,10 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) __fwd; \ })
+static const struct intel_forcewake_range __gen6_fw_ranges[] = {
- GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER),
+};
- /* *Must* be sorted by offset ranges! See intel_fw_table_check(). */ static const struct intel_forcewake_range __chv_fw_ranges[] = { GEN_FW_RANGE(0x2000, 0x3fff, FORCEWAKE_RENDER),
@@ -1623,7 +1617,6 @@ __gen_read(func, 64)
__gen_reg_read_funcs(gen11_fwtable); __gen_reg_read_funcs(fwtable); -__gen_reg_read_funcs(gen6);
#undef __gen_reg_read_funcs #undef GEN6_READ_FOOTER @@ -2111,15 +2104,17 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) {
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
} else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 6, 7)) {ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen6);ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen6);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
}
uncore->pmic_bus_access_nb.notifier_call = i915_pmic_bus_access_notifier;
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
Store a reference to a platform's shadow table inside the uncore, the same as we do with the forcewake table. This will allow us to use a single set of functions that operate on the shadow table reference rather than generating lots of nearly-identical functions via macros that differ only in terms of the table that they reference.
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 40 ++++++++++++++++++++--------- drivers/gpu/drm/i915/intel_uncore.h | 7 +++++ 2 files changed, 35 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8c09af1e9f7a..5fa2bf26a948 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1026,17 +1026,19 @@ static int mmio_range_cmp(u32 key, const struct i915_range *range) return 0; }
-#define __is_X_shadowed(x) \ -static bool is_##x##_shadowed(u32 offset) \ -{ \ - const struct i915_range *regs = x##_shadowed_regs; \ - return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \ +static bool +is_shadowed(struct intel_uncore *uncore, u32 offset) +{ + if (drm_WARN_ON(&uncore->i915->drm, !uncore->shadowed_reg_table)) + return false; + + return BSEARCH(offset, + uncore->shadowed_reg_table, + uncore->shadowed_reg_table_entries, mmio_range_cmp); \ }
-__is_X_shadowed(gen8) -__is_X_shadowed(gen11) -__is_X_shadowed(gen12) +
static enum forcewake_domains gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) @@ -1047,7 +1049,7 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) #define __gen8_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd; \ - if (NEEDS_FORCE_WAKE(offset) && !is_gen8_shadowed(offset)) \ + if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \ __fwd = FORCEWAKE_RENDER; \ else \ __fwd = 0; \ @@ -1081,7 +1083,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { #define __fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd = 0; \ - if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \ + if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \ __fwd = find_fw_domain(uncore, offset); \ __fwd; \ }) @@ -1090,7 +1092,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \ - if (!is_gen11_shadowed(__offset)) \ + if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1099,7 +1101,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \ - if (!is_gen12_shadowed(__offset)) \ + if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1705,6 +1707,7 @@ __gen_write(func, 8) \ __gen_write(func, 16) \ __gen_write(func, 32)
+ __gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); @@ -1969,6 +1972,12 @@ static int intel_uncore_fw_domains_init(struct intel_uncore *uncore) (uncore)->fw_domains_table_entries = ARRAY_SIZE((d)); \ }
+#define ASSIGN_SHADOW_TABLE(uncore, d) \ +{ \ + (uncore)->shadowed_reg_table = d; \ + (uncore)->shadowed_reg_table_entries = ARRAY_SIZE((d)); \ +} + static int i915_pmic_bus_access_notifier(struct notifier_block *nb, unsigned long action, void *data) { @@ -2081,30 +2090,37 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_CHERRYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __chv_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); + ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h index 531665b08039..2f31c50eeae2 100644 --- a/drivers/gpu/drm/i915/intel_uncore.h +++ b/drivers/gpu/drm/i915/intel_uncore.h @@ -142,6 +142,13 @@ struct intel_uncore { const struct intel_forcewake_range *fw_domains_table; unsigned int fw_domains_table_entries;
+ /* + * Shadowed registers are special cases where we can safely write + * to the register *without* grabbing forcewake. + */ + const struct i915_range *shadowed_reg_table; + unsigned int shadowed_reg_table_entries; + struct notifier_block pmic_bus_access_nb; struct intel_uncore_funcs funcs;
On 10/09/2021 21:10, Matt Roper wrote:
Store a reference to a platform's shadow table inside the uncore, the same as we do with the forcewake table. This will allow us to use a single set of functions that operate on the shadow table reference rather than generating lots of nearly-identical functions via macros that differ only in terms of the table that they reference.
Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 40 ++++++++++++++++++++--------- drivers/gpu/drm/i915/intel_uncore.h | 7 +++++ 2 files changed, 35 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 8c09af1e9f7a..5fa2bf26a948 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1026,17 +1026,19 @@ static int mmio_range_cmp(u32 key, const struct i915_range *range) return 0; }
-#define __is_X_shadowed(x) \ -static bool is_##x##_shadowed(u32 offset) \ -{ \
- const struct i915_range *regs = x##_shadowed_regs; \
- return BSEARCH(offset, regs, ARRAY_SIZE(x##_shadowed_regs), \
+static bool +is_shadowed(struct intel_uncore *uncore, u32 offset)
Fits in one line if you want.
+{
- if (drm_WARN_ON(&uncore->i915->drm, !uncore->shadowed_reg_table))
return false;
- return BSEARCH(offset,
uncore->shadowed_reg_table,
}uncore->shadowed_reg_table_entries, mmio_range_cmp); \
-__is_X_shadowed(gen8) -__is_X_shadowed(gen11) -__is_X_shadowed(gen12)
If you want to tidy.
static enum forcewake_domains gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) @@ -1047,7 +1049,7 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) #define __gen8_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd; \
- if (NEEDS_FORCE_WAKE(offset) && !is_gen8_shadowed(offset)) \
- if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \ __fwd = FORCEWAKE_RENDER; \ else \ __fwd = 0; \
@@ -1081,7 +1083,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { #define __fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd = 0; \
- if (NEEDS_FORCE_WAKE((offset)) && !is_gen8_shadowed(offset)) \
- if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \ __fwd = find_fw_domain(uncore, offset); \ __fwd; \ })
@@ -1090,7 +1092,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \
- if (!is_gen11_shadowed(__offset)) \
- if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ })
@@ -1099,7 +1101,7 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \
- if (!is_gen12_shadowed(__offset)) \
- if (!is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ })
@@ -1705,6 +1707,7 @@ __gen_write(func, 8) \ __gen_write(func, 16) \ __gen_write(func, 32)
Ditto.
__gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); @@ -1969,6 +1972,12 @@ static int intel_uncore_fw_domains_init(struct intel_uncore *uncore) (uncore)->fw_domains_table_entries = ARRAY_SIZE((d)); \ }
+#define ASSIGN_SHADOW_TABLE(uncore, d) \ +{ \
- (uncore)->shadowed_reg_table = d; \
- (uncore)->shadowed_reg_table_entries = ARRAY_SIZE((d)); \
+}
- static int i915_pmic_bus_access_notifier(struct notifier_block *nb, unsigned long action, void *data) {
@@ -2081,30 +2090,37 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_CHERRYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __chv_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 8) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges);ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
Note for later - I think by the end of the series all "assign read" in this block become fwtable so you could consolidate them if wanted.
} else if (IS_VALLEYVIEW(i915)) { diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h index 531665b08039..2f31c50eeae2 100644 --- a/drivers/gpu/drm/i915/intel_uncore.h +++ b/drivers/gpu/drm/i915/intel_uncore.h @@ -142,6 +142,13 @@ struct intel_uncore { const struct intel_forcewake_range *fw_domains_table; unsigned int fw_domains_table_entries;
- /*
* Shadowed registers are special cases where we can safely write
* to the register *without* grabbing forcewake.
*/
- const struct i915_range *shadowed_reg_table;
- unsigned int shadowed_reg_table_entries;
- struct notifier_block pmic_bus_access_nb; struct intel_uncore_funcs funcs;
Looks fine and all tidies are optional.
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
Now that we have both a standard forcewake table (albeit a single-entry table) and the shadow table stored in the uncore, we can drop the gen8-specific write handlers in favor of the general fwtable version.
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 5fa2bf26a948..4c6898746d10 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1046,16 +1046,6 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) return FORCEWAKE_RENDER; }
-#define __gen8_reg_write_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd; \ - if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \ - __fwd = FORCEWAKE_RENDER; \ - else \ - __fwd = 0; \ - __fwd; \ -}) - static const struct intel_forcewake_range __gen6_fw_ranges[] = { GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER), }; @@ -1711,7 +1701,6 @@ __gen_write(func, 32) __gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); -__gen_reg_write_funcs(gen8);
#undef __gen_reg_write_funcs #undef GEN6_WRITE_FOOTER @@ -2121,7 +2110,7 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) } else if (GRAPHICS_VER(i915) == 8) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);
On 10/09/2021 21:10, Matt Roper wrote:
Now that we have both a standard forcewake table (albeit a single-entry table) and the shadow table stored in the uncore, we can drop the gen8-specific write handlers in favor of the general fwtable version.
Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 13 +------------ 1 file changed, 1 insertion(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 5fa2bf26a948..4c6898746d10 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1046,16 +1046,6 @@ gen6_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) return FORCEWAKE_RENDER; }
-#define __gen8_reg_write_fw_domains(uncore, offset) \ -({ \
- enum forcewake_domains __fwd; \
- if (NEEDS_FORCE_WAKE(offset) && !is_shadowed(uncore, offset)) \
__fwd = FORCEWAKE_RENDER; \
- else \
__fwd = 0; \
- __fwd; \
-})
- static const struct intel_forcewake_range __gen6_fw_ranges[] = { GEN_FW_RANGE(0x0, 0x3ffff, FORCEWAKE_RENDER), };
@@ -1711,7 +1701,6 @@ __gen_write(func, 32) __gen_reg_write_funcs(gen12_fwtable); __gen_reg_write_funcs(gen11_fwtable); __gen_reg_write_funcs(fwtable); -__gen_reg_write_funcs(gen8);
#undef __gen_reg_write_funcs #undef GEN6_WRITE_FOOTER @@ -2121,7 +2110,7 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) } else if (GRAPHICS_VER(i915) == 8) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen6_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen8);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_VALLEYVIEW(i915)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __vlv_fw_ranges);ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
Now that the reference to the shadow table is stored within the uncore, we don't need to generate separate fwtable, gen11_fwtable, and gen12_fwtable variants of the register write functions; a single 'fwtable' implementation will work for all of those platforms now.
While consolidating the functions, gen11/gen12 pick up a NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to bypass a lot of forcewake/shadow checking for non-GT registers (e.g., display). However since these later platforms also introduce media engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is extended to also consider register offsets above GEN11_BSD_RING_BASE.
v2: - Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE. (Chris, Tvrtko)
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 61 ++++++++++------------------- 1 file changed, 21 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 4c6898746d10..bfb2a6337f9d 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -851,7 +851,10 @@ void assert_forcewakes_active(struct intel_uncore *uncore, }
/* We give fast paths for the really cool registers */ -#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000) +#define NEEDS_FORCE_WAKE(reg) ({ \ + u32 __reg = (reg); \ + __reg < 0x40000 || __reg >= GEN11_BSD_RING_BASE; \ +})
static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { @@ -1071,27 +1074,10 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { };
#define __fwtable_reg_write_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd = 0; \ - if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \ - __fwd = find_fw_domain(uncore, offset); \ - __fwd; \ -}) - -#define __gen11_fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \ - if (!is_shadowed(uncore, __offset)) \ - __fwd = find_fw_domain(uncore, __offset); \ - __fwd; \ -}) - -#define __gen12_fwtable_reg_write_fw_domains(uncore, offset) \ -({ \ - enum forcewake_domains __fwd = 0; \ - const u32 __offset = (offset); \ - if (!is_shadowed(uncore, __offset)) \ + if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ }) @@ -1675,34 +1661,29 @@ __gen6_write(8) __gen6_write(16) __gen6_write(32)
-#define __gen_write(func, x) \ +#define __gen_fwtable_write(x) \ static void \ -func##_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ +fwtable_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ enum forcewake_domains fw_engine; \ GEN6_WRITE_HEADER; \ - fw_engine = __##func##_reg_write_fw_domains(uncore, offset); \ + fw_engine = __fwtable_reg_write_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ __raw_uncore_write##x(uncore, reg, val); \ GEN6_WRITE_FOOTER; \ }
-#define __gen_reg_write_funcs(func) \ -static enum forcewake_domains \ -func##_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \ - return __##func##_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg)); \ -} \ -\ -__gen_write(func, 8) \ -__gen_write(func, 16) \ -__gen_write(func, 32) - +static enum forcewake_domains +fwtable_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) +{ + return __fwtable_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg)); +}
-__gen_reg_write_funcs(gen12_fwtable); -__gen_reg_write_funcs(gen11_fwtable); -__gen_reg_write_funcs(fwtable); +__gen_fwtable_write(8) +__gen_fwtable_write(16) +__gen_fwtable_write(32)
-#undef __gen_reg_write_funcs +#undef __gen_fwtable_write #undef GEN6_WRITE_FOOTER #undef GEN6_WRITE_HEADER
@@ -2080,22 +2061,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); - ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);
On 10/09/2021 21:10, Matt Roper wrote:
Now that the reference to the shadow table is stored within the uncore, we don't need to generate separate fwtable, gen11_fwtable, and gen12_fwtable variants of the register write functions; a single 'fwtable' implementation will work for all of those platforms now.
While consolidating the functions, gen11/gen12 pick up a NEEDS_FORCE_WAKE() check that they didn't have before, allowing them to bypass a lot of forcewake/shadow checking for non-GT registers (e.g., display). However since these later platforms also introduce media engines at higher MMIO offsets, the definition of NEEDS_FORCE_WAKE() is extended to also consider register offsets above GEN11_BSD_RING_BASE.
v2:
- Restore NEEDS_FORCE_WAKE(), but extend it for compatibility with the gen11+ platforms by also passing offsets above GEN11_BSD_RING_BASE. (Chris, Tvrtko)
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 61 ++++++++++------------------- 1 file changed, 21 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 4c6898746d10..bfb2a6337f9d 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -851,7 +851,10 @@ void assert_forcewakes_active(struct intel_uncore *uncore, }
/* We give fast paths for the really cool registers */ -#define NEEDS_FORCE_WAKE(reg) ((reg) < 0x40000) +#define NEEDS_FORCE_WAKE(reg) ({ \
- u32 __reg = (reg); \
- __reg < 0x40000 || __reg >= GEN11_BSD_RING_BASE; \
+})
static int fw_range_cmp(u32 offset, const struct intel_forcewake_range *entry) { @@ -1071,27 +1074,10 @@ static const struct intel_forcewake_range __chv_fw_ranges[] = { };
#define __fwtable_reg_write_fw_domains(uncore, offset) \ -({ \
- enum forcewake_domains __fwd = 0; \
- if (NEEDS_FORCE_WAKE((offset)) && !is_shadowed(uncore, offset)) \
__fwd = find_fw_domain(uncore, offset); \
- __fwd; \
-})
-#define __gen11_fwtable_reg_write_fw_domains(uncore, offset) \ ({ \ enum forcewake_domains __fwd = 0; \ const u32 __offset = (offset); \
- if (!is_shadowed(uncore, __offset)) \
__fwd = find_fw_domain(uncore, __offset); \
- __fwd; \
-})
-#define __gen12_fwtable_reg_write_fw_domains(uncore, offset) \ -({ \
- enum forcewake_domains __fwd = 0; \
- const u32 __offset = (offset); \
- if (!is_shadowed(uncore, __offset)) \
- if (NEEDS_FORCE_WAKE((__offset)) && !is_shadowed(uncore, __offset)) \ __fwd = find_fw_domain(uncore, __offset); \ __fwd; \ })
@@ -1675,34 +1661,29 @@ __gen6_write(8) __gen6_write(16) __gen6_write(32)
-#define __gen_write(func, x) \ +#define __gen_fwtable_write(x) \ static void \ -func##_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ +fwtable_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { \ enum forcewake_domains fw_engine; \ GEN6_WRITE_HEADER; \
- fw_engine = __##func##_reg_write_fw_domains(uncore, offset); \
- fw_engine = __fwtable_reg_write_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ __raw_uncore_write##x(uncore, reg, val); \ GEN6_WRITE_FOOTER; \ }
-#define __gen_reg_write_funcs(func) \ -static enum forcewake_domains \ -func##_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
- return __##func##_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg)); \
-} \ -\ -__gen_write(func, 8) \ -__gen_write(func, 16) \ -__gen_write(func, 32)
+static enum forcewake_domains +fwtable_reg_write_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) +{
- return __fwtable_reg_write_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
-__gen_reg_write_funcs(gen12_fwtable); -__gen_reg_write_funcs(gen11_fwtable); -__gen_reg_write_funcs(fwtable); +__gen_fwtable_write(8) +__gen_fwtable_write(16) +__gen_fwtable_write(32)
-#undef __gen_reg_write_funcs +#undef __gen_fwtable_write #undef GEN6_WRITE_FOOTER #undef GEN6_WRITE_HEADER
@@ -2080,22 +2061,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs);ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen12_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs);ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_WRITE_MMIO_VFUNCS(uncore, gen11_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges);ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
Consolidate down to just a single 'fwtable' implementation. For reads we don't need to worry about shadow tables. Also, the NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation can be dropped --- if a register is outside that range on one of the old platforms, then it won't belong to any forcewake range and 0 will be returned anyway.
v2: - Restore NEEDS_FORCE_WAKE() check. (Chris, Tvrtko)
Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 40 ++++++++++++----------------- 1 file changed, 17 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index bfb2a6337f9d..10f124297e7c 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = { __fwd; \ })
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \ - find_fw_domain(uncore, offset) - /* *Must* be sorted by offset! See intel_shadow_table_check(). */ static const struct i915_range gen8_shadowed_regs[] = { { .start = 0x2030, .end = 0x2030 }, @@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct intel_uncore *uncore, ___force_wake_auto(uncore, fw_domains); }
-#define __gen_read(func, x) \ +#define __gen_fwtable_read(x) \ static u##x \ -func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \ +fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \ +{ \ enum forcewake_domains fw_engine; \ GEN6_READ_HEADER(x); \ - fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \ + fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ val = __raw_uncore_read##x(uncore, reg); \ GEN6_READ_FOOTER; \ }
-#define __gen_reg_read_funcs(func) \ -static enum forcewake_domains \ -func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \ - return __##func##_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); \ -} \ -\ -__gen_read(func, 8) \ -__gen_read(func, 16) \ -__gen_read(func, 32) \ -__gen_read(func, 64) +static enum forcewake_domains +fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { + return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); +}
-__gen_reg_read_funcs(gen11_fwtable); -__gen_reg_read_funcs(fwtable); +__gen_fwtable_read(8) +__gen_fwtable_read(16) +__gen_fwtable_read(32) +__gen_fwtable_read(64)
-#undef __gen_reg_read_funcs +#undef __gen_fwtable_read #undef GEN6_READ_FOOTER #undef GEN6_READ_HEADER
@@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); - ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable); + ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);
On 10/09/2021 21:10, Matt Roper wrote:
Consolidate down to just a single 'fwtable' implementation. For reads we don't need to worry about shadow tables. Also, the NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation can be dropped --- if a register is outside that range on one of the old platforms, then it won't belong to any forcewake range and 0 will be returned anyway.
v2:
- Restore NEEDS_FORCE_WAKE() check. (Chris, Tvrtko)
I started liking rewording the commit message when the revision log starts contradicting it, but it is just a suggestion.
Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 40 ++++++++++++----------------- 1 file changed, 17 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index bfb2a6337f9d..10f124297e7c 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = { __fwd; \ })
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
- find_fw_domain(uncore, offset)
- /* *Must* be sorted by offset! See intel_shadow_table_check(). */ static const struct i915_range gen8_shadowed_regs[] = { { .start = 0x2030, .end = 0x2030 },
@@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct intel_uncore *uncore, ___force_wake_auto(uncore, fw_domains); }
-#define __gen_read(func, x) \ +#define __gen_fwtable_read(x) \ static u##x \ -func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \ +fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \ +{ \ enum forcewake_domains fw_engine; \ GEN6_READ_HEADER(x); \
- fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
- fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ val = __raw_uncore_read##x(uncore, reg); \ GEN6_READ_FOOTER; \ }
-#define __gen_reg_read_funcs(func) \ -static enum forcewake_domains \ -func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
- return __##func##_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); \
-} \ -\ -__gen_read(func, 8) \ -__gen_read(func, 16) \ -__gen_read(func, 32) \ -__gen_read(func, 64) +static enum forcewake_domains +fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
- return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
-__gen_reg_read_funcs(gen11_fwtable); -__gen_reg_read_funcs(fwtable); +__gen_fwtable_read(8) +__gen_fwtable_read(16) +__gen_fwtable_read(32) +__gen_fwtable_read(64)
-#undef __gen_reg_read_funcs +#undef __gen_fwtable_read #undef GEN6_READ_FOOTER #undef GEN6_READ_HEADER
@@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
On Fri, Sep 10, 2021 at 01:10:29PM -0700, Matt Roper wrote:
Consolidate down to just a single 'fwtable' implementation. For reads we don't need to worry about shadow tables. Also, the NEEDS_FORCE_WAKE() check we previously had in the fwtable implementation can be dropped --- if a register is outside that range on one of the old platforms, then it won't belong to any forcewake range and 0 will be returned anyway.
v2:
- Restore NEEDS_FORCE_WAKE() check. (Chris, Tvrtko)
Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_uncore.c | 40 ++++++++++++----------------- 1 file changed, 17 insertions(+), 23 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index bfb2a6337f9d..10f124297e7c 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -935,9 +935,6 @@ static const struct intel_forcewake_range __vlv_fw_ranges[] = { __fwd; \ })
-#define __gen11_fwtable_reg_read_fw_domains(uncore, offset) \
- find_fw_domain(uncore, offset)
/* *Must* be sorted by offset! See intel_shadow_table_check(). */ static const struct i915_range gen8_shadowed_regs[] = { { .start = 0x2030, .end = 0x2030 }, @@ -1570,33 +1567,30 @@ static inline void __force_wake_auto(struct intel_uncore *uncore, ___force_wake_auto(uncore, fw_domains); }
-#define __gen_read(func, x) \ +#define __gen_fwtable_read(x) \ static u##x \ -func##_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { \ +fwtable_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) \ +{ \ enum forcewake_domains fw_engine; \ GEN6_READ_HEADER(x); \
- fw_engine = __##func##_reg_read_fw_domains(uncore, offset); \
- fw_engine = __fwtable_reg_read_fw_domains(uncore, offset); \ if (fw_engine) \ __force_wake_auto(uncore, fw_engine); \ val = __raw_uncore_read##x(uncore, reg); \ GEN6_READ_FOOTER; \
}
-#define __gen_reg_read_funcs(func) \ -static enum forcewake_domains \ -func##_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) { \
- return __##func##_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg)); \
-} \ -\ -__gen_read(func, 8) \ -__gen_read(func, 16) \ -__gen_read(func, 32) \ -__gen_read(func, 64) +static enum forcewake_domains +fwtable_reg_read_fw_domains(struct intel_uncore *uncore, i915_reg_t reg) {
- return __fwtable_reg_read_fw_domains(uncore, i915_mmio_reg_offset(reg));
+}
-__gen_reg_read_funcs(gen11_fwtable); -__gen_reg_read_funcs(fwtable); +__gen_fwtable_read(8) +__gen_fwtable_read(16) +__gen_fwtable_read(32) +__gen_fwtable_read(64)
-#undef __gen_reg_read_funcs +#undef __gen_fwtable_read #undef GEN6_READ_FOOTER #undef GEN6_READ_HEADER
@@ -2062,22 +2056,22 @@ static int uncore_forcewake_init(struct intel_uncore *uncore) ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __xehp_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) >= 12) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen12_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
} else if (GRAPHICS_VER(i915) == 11) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen11_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen11_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable);ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, gen11_fwtable);
ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable);
these together with patch 1 make the fwtable variant be the only one used for reads.... so, should we pull these assignments out of if/else chain? gen < 6 don't have forcewake so afaics we cover all the platforms.
this can be done as a separate patch though.
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
} else if (IS_GRAPHICS_VER(i915, 9, 10)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __gen9_fw_ranges); ASSIGN_SHADOW_TABLE(uncore, gen8_shadowed_regs); -- 2.25.4
We thought the DG2 table of shadowed registers would be the same as the gen12/xehp table, but it turns out that there are a few minor differences that require us to define a new DG2-specific table: * One register is removed (0xC4D4) * One register is added (0xC4E0)
Signed-off-by: Matt Roper matthew.d.roper@intel.com --- drivers/gpu/drm/i915/intel_uncore.c | 41 ++++++++++++++++++- drivers/gpu/drm/i915/selftests/intel_uncore.c | 1 + 2 files changed, 41 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 10f124297e7c..b3ba710d4310 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -1016,6 +1016,45 @@ static const struct i915_range gen12_shadowed_regs[] = { { .start = 0x1F8510, .end = 0x1F8550 }, };
+static const struct i915_range dg2_shadowed_regs[] = { + { .start = 0x2030, .end = 0x2030 }, + { .start = 0x2510, .end = 0x2550 }, + { .start = 0xA008, .end = 0xA00C }, + { .start = 0xA188, .end = 0xA188 }, + { .start = 0xA278, .end = 0xA278 }, + { .start = 0xA540, .end = 0xA56C }, + { .start = 0xC4C8, .end = 0xC4C8 }, + { .start = 0xC4E0, .end = 0xC4E0 }, + { .start = 0xC600, .end = 0xC600 }, + { .start = 0xC658, .end = 0xC658 }, + { .start = 0x22030, .end = 0x22030 }, + { .start = 0x22510, .end = 0x22550 }, + { .start = 0x1C0030, .end = 0x1C0030 }, + { .start = 0x1C0510, .end = 0x1C0550 }, + { .start = 0x1C4030, .end = 0x1C4030 }, + { .start = 0x1C4510, .end = 0x1C4550 }, + { .start = 0x1C8030, .end = 0x1C8030 }, + { .start = 0x1C8510, .end = 0x1C8550 }, + { .start = 0x1D0030, .end = 0x1D0030 }, + { .start = 0x1D0510, .end = 0x1D0550 }, + { .start = 0x1D4030, .end = 0x1D4030 }, + { .start = 0x1D4510, .end = 0x1D4550 }, + { .start = 0x1D8030, .end = 0x1D8030 }, + { .start = 0x1D8510, .end = 0x1D8550 }, + { .start = 0x1E0030, .end = 0x1E0030 }, + { .start = 0x1E0510, .end = 0x1E0550 }, + { .start = 0x1E4030, .end = 0x1E4030 }, + { .start = 0x1E4510, .end = 0x1E4550 }, + { .start = 0x1E8030, .end = 0x1E8030 }, + { .start = 0x1E8510, .end = 0x1E8550 }, + { .start = 0x1F0030, .end = 0x1F0030 }, + { .start = 0x1F0510, .end = 0x1F0550 }, + { .start = 0x1F4030, .end = 0x1F4030 }, + { .start = 0x1F4510, .end = 0x1F4550 }, + { .start = 0x1F8030, .end = 0x1F8030 }, + { .start = 0x1F8510, .end = 0x1F8550 }, +}; + static int mmio_range_cmp(u32 key, const struct i915_range *range) { if (key < range->start) @@ -2054,7 +2093,7 @@ static int uncore_forcewake_init(struct intel_uncore *uncore)
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) { ASSIGN_FW_DOMAINS_TABLE(uncore, __dg2_fw_ranges); - ASSIGN_SHADOW_TABLE(uncore, gen12_shadowed_regs); + ASSIGN_SHADOW_TABLE(uncore, dg2_shadowed_regs); ASSIGN_WRITE_MMIO_VFUNCS(uncore, fwtable); ASSIGN_READ_MMIO_VFUNCS(uncore, fwtable); } else if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { diff --git a/drivers/gpu/drm/i915/selftests/intel_uncore.c b/drivers/gpu/drm/i915/selftests/intel_uncore.c index 22ef2c87df1a..bc8128170a99 100644 --- a/drivers/gpu/drm/i915/selftests/intel_uncore.c +++ b/drivers/gpu/drm/i915/selftests/intel_uncore.c @@ -68,6 +68,7 @@ static int intel_shadow_table_check(void) { gen8_shadowed_regs, ARRAY_SIZE(gen8_shadowed_regs) }, { gen11_shadowed_regs, ARRAY_SIZE(gen11_shadowed_regs) }, { gen12_shadowed_regs, ARRAY_SIZE(gen12_shadowed_regs) }, + { dg2_shadowed_regs, ARRAY_SIZE(dg2_shadowed_regs) }, }; const struct i915_range *range; unsigned int i, j;
On Fri, Sep 10, 2021 at 01:10:30PM -0700, Matt Roper wrote:
We thought the DG2 table of shadowed registers would be the same as the gen12/xehp table, but it turns out that there are a few minor differences that require us to define a new DG2-specific table:
- One register is removed (0xC4D4)
- One register is added (0xC4E0)
Signed-off-by: Matt Roper matthew.d.roper@intel.com
I did the conversion from the table to this array and arrived at the same result.
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Lucas De Marchi
dri-devel@lists.freedesktop.org