Latest version of https://patchwork.freedesktop.org/patch/102581/ .
Lyude (5):
  drm/i915/skl: Add support for the SAGV, fix underrun hangs
  drm/i915/skl: Update plane watermarks atomically during plane updates
  drm/i915/skl: Ensure pipes with changed wms get added to the state
  drm/i915: Move CRTC updating in atomic_commit into its own hook
  drm/i915/skl: Update DDB values atomically with wms/plane attrs

Matt Roper (1):
  drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
 drivers/gpu/drm/i915/i915_drv.h      |   4 +
 drivers/gpu/drm/i915/i915_reg.h      |   5 +
 drivers/gpu/drm/i915/intel_display.c | 197 ++++++++++++---
 drivers/gpu/drm/i915/intel_drv.h     |  17 ++
 drivers/gpu/drm/i915/intel_pm.c      | 467 +++++++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_sprite.c  |   6 +
 6 files changed, 501 insertions(+), 195 deletions(-)
Since the watermark calculations for Skylake are still broken, we're apt to hit underruns very easily under multi-monitor configurations. While it would be lovely if this were fixed, it's not. Another problem stemming from this, however, is the mysterious issue of underruns causing full system hangs. An easy way to reproduce this with a Skylake system:
- Get a laptop with a Skylake GPU, and hook up two external monitors to it
- Move the cursor from the built-in LCD to one of the external displays as quickly as you can
- You'll get a few pipe underruns, and eventually the entire system will just freeze.
After doing a lot of investigation and reading through the bspec, I found the existence of the SAGV, which is responsible for adjusting the system agent voltage and clock frequencies depending on how much power we need. According to the bspec:
"The display engine access to system memory is blocked during the adjustment time. SAGV defaults to enabled. Software must use the GT-driver pcode mailbox to disable SAGV when the display engine is not able to tolerate the blocking time."
The rest of the bspec goes on to explain that software can simply leave the SAGV enabled, and disable it when we use interlaced pipes or have more than one pipe active.
Sure enough, with this patchset the system hangs resulting from pipe underruns on Skylake have completely vanished on my T460s. Additionally, the bspec mentions turning off the SAGV with more than one pipe enabled as a workaround for display underruns. While this patch doesn't entirely fix that, it does seem to improve the situation a little, so it's likely this is going to be required to make watermarks on Skylake fully functional.
Changes since v5:
- Don't use is_power_of_2. Makes things confusing
- Don't use the old state to figure out whether or not to enable/disable the sagv, use the new one
- Split the loop in skl_disable_sagv into its own function
- Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail()

Changes since v4:
- Use is_power_of_2 against active_crtcs to check whether we have > 1 pipe enabled
- Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0 enabled
- Call skl_sagv_enable/disable() from pre/post-plane updates

Changes since v3:
- Use time_before() to compare timeout to jiffies

Changes since v2:
- Really apply minor style nitpicks to patch this time

Changes since v1:
- Added comments about this probably being one of the requirements to fixing Skylake's watermark issues
- Minor style nitpicks from Matt Roper
- Disable these functions on Broxton, since it doesn't have an SAGV
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h      |   2 +
 drivers/gpu/drm/i915/i915_reg.h      |   5 ++
 drivers/gpu/drm/i915/intel_display.c |  11 ++++
 drivers/gpu/drm/i915/intel_drv.h     |   2 +
 drivers/gpu/drm/i915/intel_pm.c      | 112 +++++++++++++++++++++++++++++++++++
 5 files changed, 132 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 65ada5d..87018d3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1962,6 +1962,8 @@ struct drm_i915_private {
 	struct i915_suspend_saved_registers regfile;
 	struct vlv_s0ix_state vlv_s0ix_state;

+	bool skl_sagv_enabled;
+
 	struct {
 		/*
 		 * Raw watermark latency values:
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 2f93d4a..5fb1c63 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -7170,6 +7170,11 @@ enum {
 #define   HSW_PCODE_DE_WRITE_FREQ_REQ		0x17
 #define   DISPLAY_IPS_CONTROL			0x19
 #define   HSW_PCODE_DYNAMIC_DUTY_CYCLE_CONTROL	0x1A
+#define   GEN9_PCODE_SAGV_CONTROL		0x21
+#define     GEN9_SAGV_DISABLE			0x0
+#define     GEN9_SAGV_LOW_FREQ			0x1
+#define     GEN9_SAGV_HIGH_FREQ			0x2
+#define     GEN9_SAGV_DYNAMIC_FREQ		0x3
 #define GEN6_PCODE_DATA				_MMIO(0x138128)
 #define   GEN6_PCODE_FREQ_IA_RATIO_SHIFT	8
 #define   GEN6_PCODE_FREQ_RING_RATIO_SHIFT	16
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a8e8cc8..001c885 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13692,6 +13692,14 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 			     intel_state->cdclk_pll_vco != dev_priv->cdclk_pll.vco))
 			dev_priv->display.modeset_commit_cdclk(state);

+		/*
+		 * SKL workaround: bspec recommends we disable the SAGV when we
+		 * have more then one pipe enabled
+		 */
+		if (IS_SKYLAKE(dev_priv) &&
+		    hweight32(intel_state->active_crtcs) > 1)
+			skl_disable_sagv(dev_priv);
+
 		intel_modeset_verify_disabled(dev);
 	}

@@ -13765,6 +13773,9 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 		intel_modeset_verify_crtc(crtc, old_crtc_state, crtc->state);
 	}

+	if (hweight32(intel_state->active_crtcs) <= 1)
+		skl_enable_sagv(dev_priv);
+
 	drm_atomic_helper_commit_hw_done(state);

 	if (intel_state->modeset)
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 50cdc89..6b0532a 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1709,6 +1709,8 @@ void ilk_wm_get_hw_state(struct drm_device *dev);
 void skl_wm_get_hw_state(struct drm_device *dev);
 void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
 			  struct skl_ddb_allocation *ddb /* out */);
+int skl_enable_sagv(struct drm_i915_private *dev_priv);
+int skl_disable_sagv(struct drm_i915_private *dev_priv);
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index f610b71..68721a5 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2884,6 +2884,116 @@ skl_wm_plane_id(const struct intel_plane *plane)
 }

 static void
+skl_sagv_get_hw_state(struct drm_i915_private *dev_priv)
+{
+	u32 temp;
+	int ret;
+
+	if (IS_BROXTON(dev_priv))
+		return;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL, &temp);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+
+	if (!ret) {
+		dev_priv->skl_sagv_enabled = !(temp & 0x1);
+	} else {
+		/*
+		 * If for some reason we can't access the SAGV state, follow
+		 * the bspec and assume it's enabled
+		 */
+		DRM_ERROR("Failed to get SAGV state, assuming enabled\n");
+		dev_priv->skl_sagv_enabled = true;
+	}
+}
+
+/*
+ * SAGV dynamically adjusts the system agent voltage and clock frequencies
+ * depending on power and performance requirements. The display engine access
+ * to system memory is blocked during the adjustment time. Having this enabled
+ * in multi-pipe configurations can cause issues (such as underruns causing
+ * full system hangs), and the bspec also suggests that software disable it
+ * when more then one pipe is enabled.
+ */
+int
+skl_enable_sagv(struct drm_i915_private *dev_priv)
+{
+	int ret;
+
+	if (IS_BROXTON(dev_priv))
+		return 0;
+	if (dev_priv->skl_sagv_enabled)
+		return 0;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	DRM_DEBUG_KMS("Enabling the SAGV\n");
+
+	ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+				      GEN9_SAGV_DYNAMIC_FREQ);
+	if (!ret)
+		dev_priv->skl_sagv_enabled = true;
+	else
+		DRM_ERROR("Failed to enable the SAGV\n");
+
+	/* We don't need to wait for SAGV when enabling */
+	mutex_unlock(&dev_priv->rps.hw_lock);
+	return ret;
+}
+
+static int
+skl_do_sagv_disable(struct drm_i915_private *dev_priv)
+{
+	int ret;
+	uint32_t temp;
+
+	ret = sandybridge_pcode_write(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+				      GEN9_SAGV_DISABLE);
+	if (ret) {
+		DRM_ERROR("Failed to disable the SAGV\n");
+		return ret;
+	}
+
+	ret = sandybridge_pcode_read(dev_priv, GEN9_PCODE_SAGV_CONTROL,
+				     &temp);
+	if (ret) {
+		DRM_ERROR("Failed to check the status of the SAGV\n");
+		return ret;
+	}
+
+	return temp & 0x1;
+}
+
+int
+skl_disable_sagv(struct drm_i915_private *dev_priv)
+{
+	int ret, result;
+
+	if (IS_BROXTON(dev_priv))
+		return 0;
+	if (!dev_priv->skl_sagv_enabled)
+		return 0;
+
+	mutex_lock(&dev_priv->rps.hw_lock);
+	DRM_DEBUG_KMS("Disabling the SAGV\n");
+
+	/* bspec says to keep retrying for at least 1 ms */
+	ret = wait_for(result = skl_do_sagv_disable(dev_priv), 1);
+	mutex_unlock(&dev_priv->rps.hw_lock);
+
+	if (ret == -ETIMEDOUT)
+		DRM_ERROR("Request to disable SAGV timed out\n");
+	else {
+		if (result == 1)
+			dev_priv->skl_sagv_enabled = false;
+
+		ret = result;
+	}
+
+	return ret;
+}
+
+static void
 skl_ddb_get_pipe_allocation_limits(struct drm_device *dev,
 				   const struct intel_crtc_state *cstate,
 				   struct skl_ddb_entry *alloc, /* out */
@@ -4236,6 +4346,8 @@ void skl_wm_get_hw_state(struct drm_device *dev)
 		/* Easy/common case; just sanitize DDB now if everything off */
 		memset(ddb, 0, sizeof(*ddb));
 	}
+
+	skl_sagv_get_hw_state(dev_priv);
 }

 static void ilk_pipe_wm_get_hw_state(struct drm_crtc *crtc)
On 03-08-16 at 00:37, Lyude wrote:
+	if (hweight32(intel_state->active_crtcs) <= 1)
+		skl_enable_sagv(dev_priv);
This should be guarded with an if (intel_state->modeset && ...) check.
Looks ok otherwise. :-)
With that fixed:
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
From: Matt Roper <matthew.d.roper@intel.com>
When we write watermark values to the hardware, those values are stored in dev_priv->wm.skl_hw. However with recent watermark changes, the results structure we're copying from only contains valid watermark and DDB values for the pipes that are actually changing; the values for other pipes remain 0. Thus a blind copy of the entire skl_wm_values structure will clobber the values for unchanged pipes...we need to be more selective and only copy over the values for the changing pipes.
This mistake was hidden until recently due to another bug that caused us to erroneously re-calculate watermarks for all active pipes rather than changing pipes. Only when that bug was fixed was the impact of this bug discovered (e.g., modesets failing with "Requested display configuration exceeds system watermark limitations" messages and leaving watermarks non-functional, even ones initiated by intel_fbdev_restore_mode).
Changes since v1:
- Add a function for copying a pipe's wm values (skl_copy_wm_for_pipe()) so we can reuse this later
Fixes: 734fa01f3a17 ("drm/i915/gen9: Calculate watermarks during atomic 'check' (v2)")
Fixes: 9b6130227495 ("drm/i915/gen9: Re-allocate DDB only for changed pipes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 68721a5..7fd299e 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4064,6 +4064,24 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	return 0;
 }

+static void
+skl_copy_wm_for_pipe(struct skl_wm_values *dst,
+		     struct skl_wm_values *src,
+		     enum pipe pipe)
+{
+	dst->wm_linetime[pipe] = src->wm_linetime[pipe];
+	memcpy(dst->plane[pipe], src->plane[pipe],
+	       sizeof(dst->plane[pipe]));
+	memcpy(dst->plane_trans[pipe], src->plane_trans[pipe],
+	       sizeof(dst->plane_trans[pipe]));
+
+	dst->ddb.pipe[pipe] = src->ddb.pipe[pipe];
+	memcpy(dst->ddb.y_plane[pipe], src->ddb.y_plane[pipe],
+	       sizeof(dst->ddb.y_plane[pipe]));
+	memcpy(dst->ddb.plane[pipe], src->ddb.plane[pipe],
+	       sizeof(dst->ddb.plane[pipe]));
+}
+
 static int
 skl_compute_wm(struct drm_atomic_state *state)
 {
@@ -4136,8 +4154,10 @@ static void skl_update_wm(struct drm_crtc *crtc)
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct skl_wm_values *results = &dev_priv->wm.skl_results;
+	struct skl_wm_values *hw_vals = &dev_priv->wm.skl_hw;
 	struct intel_crtc_state *cstate = to_intel_crtc_state(crtc->state);
 	struct skl_pipe_wm *pipe_wm = &cstate->wm.skl.optimal;
+	int pipe;

 	if ((results->dirty_pipes & drm_crtc_mask(crtc)) == 0)
 		return;
@@ -4149,8 +4169,12 @@ static void skl_update_wm(struct drm_crtc *crtc)
 	skl_write_wm_values(dev_priv, results);
 	skl_flush_wm_values(dev_priv, results);

-	/* store the new configuration */
-	dev_priv->wm.skl_hw = *results;
+	/*
+	 * Store the new configuration (but only for the pipes that have
+	 * changed; the other values weren't recomputed).
+	 */
+	for_each_pipe_masked(dev_priv, pipe, results->dirty_pipes)
+		skl_copy_wm_for_pipe(hw_vals, results, pipe);

 	mutex_unlock(&dev_priv->wm.wm_mutex);
 }
Thanks to Ville for suggesting this as a potential solution to pipe underruns on Skylake.
On Skylake all of the registers for configuring planes, including the registers for configuring their watermarks, are double buffered. New values written to them won't take effect until said registers are "armed", which is done by writing to the PLANE_SURF register (or, in the case of cursor planes, the CURBASE register).
With this in mind, up until now we've been updating watermarks on skl like this:
non-modeset {
	- calculate (during atomic check phase)
	- finish_atomic_commit:
		- intel_pre_plane_update:
			- intel_update_watermarks()
		- {vblank happens; new watermarks + old plane values => underrun }
		- drm_atomic_helper_commit_planes_on_crtc:
			- start vblank evasion
			- write new plane registers
			- end vblank evasion
}
or
modeset {
	- calculate (during atomic check phase)
	- finish_atomic_commit:
		- crtc_enable:
			- intel_update_watermarks()
		- {vblank happens; new watermarks + old plane values => underrun }
		- drm_atomic_helper_commit_planes_on_crtc:
			- start vblank evasion
			- write new plane registers
			- end vblank evasion
}
Now we update watermarks atomically like this:
non-modeset {
	- calculate (during atomic check phase)
	- finish_atomic_commit:
		- intel_pre_plane_update:
			- intel_update_watermarks() (wm values aren't written yet)
		- drm_atomic_helper_commit_planes_on_crtc:
			- start vblank evasion
			- write new plane registers
			- write new wm values
			- end vblank evasion
}
modeset {
	- calculate (during atomic check phase)
	- finish_atomic_commit:
		- crtc_enable:
			- intel_update_watermarks() (actual wm values aren't written yet)
		- drm_atomic_helper_commit_planes_on_crtc:
			- start vblank evasion
			- write new plane registers
			- write new wm values
			- end vblank evasion
}
So this patch moves all of the watermark writes into the right place: inside the vblank evasion where we update all of the registers for each plane. While this patch doesn't fix everything, it does allow us to update the watermark values in the way the hardware expects us to.
Changes since original patch series:
- Remove mutex_lock/mutex_unlock since they don't do anything and we're not touching global state
- Move skl_write_cursor_wm/skl_write_plane_wm functions into intel_pm.c, make externally visible
- Add skl_write_plane_wm calls to skl_update_plane
- Fix conditional for for loop in skl_write_plane_wm (level < max_level should be level <= max_level)
- Make diagram in commit more accurate to what's actually happening
- Add Fixes:

Changes since v1:
- Use IS_GEN9() instead of IS_SKYLAKE() since these fixes apply to more than just Skylake
- Update description to make it clear this patch doesn't fix everything
- Check if pipes were actually changed before writing watermarks

Changes since v2:
- Write PIPE_WM_LINETIME during vblank evasion

Changes since v3:
- Rebase against new SAGV patch changes

Changes since v4:
- Add a parameter to choose what skl_wm_values struct to use when writing new plane watermarks

Changes since v5:
- Remove cursor ddb entry write in skl_write_cursor_wm(), defer until patch 6
- Write WM_LINETIME in intel_begin_crtc_commit()
Fixes: 2d41c0b59afc ("drm/i915/skl: SKL Watermark Computation")
Signed-off-by: Lyude <cpaul@redhat.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 16 ++++++++++-
 drivers/gpu/drm/i915/intel_drv.h     |  5 ++++
 drivers/gpu/drm/i915/intel_pm.c      | 53 +++++++++++++++++++++++++-----------
 drivers/gpu/drm/i915/intel_sprite.c  |  6 ++++
 4 files changed, 63 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 001c885..76efe53 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2980,6 +2980,7 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc_state->base.crtc);
 	struct drm_framebuffer *fb = plane_state->base.fb;
 	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
+	struct skl_wm_values *wm = &dev_priv->wm.skl_results;
 	int pipe = intel_crtc->pipe;
 	u32 plane_ctl, stride_div, stride;
 	u32 tile_height, plane_offset, plane_size;
@@ -3031,6 +3032,9 @@ static void skylake_update_primary_plane(struct drm_plane *plane,
 	intel_crtc->adjusted_x = x_offset;
 	intel_crtc->adjusted_y = y_offset;

+	if (wm->dirty_pipes & drm_crtc_mask(&intel_crtc->base))
+		skl_write_plane_wm(intel_crtc, wm, 0);
+
 	I915_WRITE(PLANE_CTL(pipe, 0), plane_ctl);
 	I915_WRITE(PLANE_OFFSET(pipe, 0), plane_offset);
 	I915_WRITE(PLANE_SIZE(pipe, 0), plane_size);
@@ -10231,9 +10235,13 @@ static void i9xx_update_cursor(struct drm_crtc *crtc, u32 base,
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct skl_wm_values *wm = &dev_priv->wm.skl_results;
 	int pipe = intel_crtc->pipe;
 	uint32_t cntl = 0;

+	if (IS_GEN9(dev_priv) && wm->dirty_pipes & drm_crtc_mask(crtc))
+		skl_write_cursor_wm(intel_crtc, wm);
+
 	if (plane_state && plane_state->visible) {
 		cntl = MCURSOR_GAMMA_ENABLE;
 		switch (plane_state->base.crtc_w) {
@@ -14156,10 +14164,12 @@ static void intel_begin_crtc_commit(struct drm_crtc *crtc,
 				    struct drm_crtc_state *old_crtc_state)
 {
 	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct intel_crtc_state *old_intel_state =
 		to_intel_crtc_state(old_crtc_state);
 	bool modeset = needs_modeset(crtc->state);
+	enum pipe pipe = intel_crtc->pipe;

 	/* Perform vblank evasion around commit operation */
 	intel_pipe_update_start(intel_crtc);
@@ -14174,8 +14184,12 @@ static void intel_begin_crtc_commit(struct drm_crtc *crtc,

 	if (to_intel_crtc_state(crtc->state)->update_pipe)
 		intel_update_pipe_config(intel_crtc, old_intel_state);
-	else if (INTEL_INFO(dev)->gen >= 9)
+	else if (INTEL_INFO(dev)->gen >= 9) {
 		skl_detach_scalers(intel_crtc);
+
+		I915_WRITE(PIPE_WM_LINETIME(pipe),
+			   dev_priv->wm.skl_hw.wm_linetime[pipe]);
+	}
 }

 static void intel_finish_crtc_commit(struct drm_crtc *crtc,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6b0532a..1b444d3 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1711,6 +1711,11 @@ void skl_ddb_get_hw_state(struct drm_i915_private *dev_priv,
 			  struct skl_ddb_allocation *ddb /* out */);
 int skl_enable_sagv(struct drm_i915_private *dev_priv);
 int skl_disable_sagv(struct drm_i915_private *dev_priv);
+void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
+			 const struct skl_wm_values *wm);
+void skl_write_plane_wm(struct intel_crtc *intel_crtc,
+			const struct skl_wm_values *wm,
+			int plane);
 uint32_t ilk_pipe_pixel_rate(const struct intel_crtc_state *pipe_config);
 bool ilk_disable_lp_wm(struct drm_device *dev);
 int sanitize_rc6_option(struct drm_i915_private *dev_priv, int enable_rc6);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 7fd299e..2c12b66 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3798,6 +3798,42 @@ static void skl_ddb_entry_write(struct drm_i915_private *dev_priv,
 		I915_WRITE(reg, 0);
 }

+void skl_write_plane_wm(struct intel_crtc *intel_crtc,
+			const struct skl_wm_values *wm,
+			int plane)
+{
+	struct drm_crtc *crtc = &intel_crtc->base;
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	int level, max_level = ilk_wm_max_level(dev);
+	enum pipe pipe = intel_crtc->pipe;
+
+	if (!(wm->dirty_pipes & drm_crtc_mask(crtc)))
+		return;
+
+	for (level = 0; level <= max_level; level++) {
+		I915_WRITE(PLANE_WM(pipe, plane, level),
+			   wm->plane[pipe][plane][level]);
+	}
+	I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
+}
+
+void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
+			 const struct skl_wm_values *wm)
+{
+	struct drm_crtc *crtc = &intel_crtc->base;
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	int level, max_level = ilk_wm_max_level(dev);
+	enum pipe pipe = intel_crtc->pipe;
+
+	for (level = 0; level <= max_level; level++) {
+		I915_WRITE(CUR_WM(pipe, level),
+			   wm->plane[pipe][PLANE_CURSOR][level]);
+	}
+	I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]);
+}
+
 static void skl_write_wm_values(struct drm_i915_private *dev_priv,
 				const struct skl_wm_values *new)
 {
@@ -3805,7 +3841,7 @@ static void skl_write_wm_values(struct drm_i915_private *dev_priv,
 	struct intel_crtc *crtc;

 	for_each_intel_crtc(dev, crtc) {
-		int i, level, max_level = ilk_wm_max_level(dev);
+		int i;
 		enum pipe pipe = crtc->pipe;

 		if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
@@ -3813,21 +3849,6 @@ static void skl_write_wm_values(struct drm_i915_private *dev_priv,
 		if (!crtc->active)
 			continue;

-		I915_WRITE(PIPE_WM_LINETIME(pipe), new->wm_linetime[pipe]);
-
-		for (level = 0; level <= max_level; level++) {
-			for (i = 0; i < intel_num_planes(crtc); i++)
-				I915_WRITE(PLANE_WM(pipe, i, level),
-					   new->plane[pipe][i][level]);
-			I915_WRITE(CUR_WM(pipe, level),
-				   new->plane[pipe][PLANE_CURSOR][level]);
-		}
-		for (i = 0; i < intel_num_planes(crtc); i++)
-			I915_WRITE(PLANE_WM_TRANS(pipe, i),
-				   new->plane_trans[pipe][i]);
-		I915_WRITE(CUR_WM_TRANS(pipe),
-			   new->plane_trans[pipe][PLANE_CURSOR]);
-
 		for (i = 0; i < intel_num_planes(crtc); i++) {
 			skl_ddb_entry_write(dev_priv,
 					    PLANE_BUF_CFG(pipe, i),
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 0de935a..55d173f 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -203,6 +203,9 @@ skl_update_plane(struct drm_plane *drm_plane,
 	struct intel_plane *intel_plane = to_intel_plane(drm_plane);
 	struct drm_framebuffer *fb = plane_state->base.fb;
 	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
+	struct skl_wm_values *wm = &dev_priv->wm.skl_results;
+	struct drm_crtc *crtc = crtc_state->base.crtc;
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	const int pipe = intel_plane->pipe;
 	const int plane = intel_plane->plane + 1;
 	u32 plane_ctl, stride_div, stride;
@@ -238,6 +241,9 @@ skl_update_plane(struct drm_plane *drm_plane,
 	crtc_w--;
 	crtc_h--;

+	if (wm->dirty_pipes & drm_crtc_mask(crtc))
+		skl_write_plane_wm(intel_crtc, wm, plane);
+
 	if (key->flags) {
 		I915_WRITE(PLANE_KEYVAL(pipe, plane), key->min_value);
 		I915_WRITE(PLANE_KEYMAX(pipe, plane), key->max_value);
If we're enabling a pipe, we'll need to modify the watermarks on all active planes. Since those planes won't be added to the state on their own, we need to add them ourselves.
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 2c12b66..6f5beb3 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -4080,6 +4080,10 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		ret = skl_allocate_pipe_ddb(cstate, ddb);
 		if (ret)
 			return ret;
+
+		ret = drm_atomic_add_affected_planes(state, &intel_crtc->base);
+		if (ret)
+			return ret;
 	}

 	return 0;
Since we have to write ddb allocations at the same time as we do other plane updates, we're going to need to be able to control the order in which we execute modesets on each pipe. The easiest way to do this is to just factor this section of intel_atomic_commit_tail() (intel_atomic_commit() for stable branches) into its own function, and add an appropriate display function hook for it.
Based off of Matt Roper's suggestions

Changes since v1:
- Drop pipe_config->base.active check in intel_update_crtcs() since we check that before calling the function
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
[omitting CC for stable, since this patch will need to be changed for such backports first]
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
---
 drivers/gpu/drm/i915/i915_drv.h      |  2 +
 drivers/gpu/drm/i915/intel_display.c | 74 +++++++++++++++++++++++++-----------
 2 files changed, 54 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 87018d3..4964e3f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -632,6 +632,8 @@ struct drm_i915_display_funcs { struct intel_crtc_state *crtc_state); void (*crtc_enable)(struct drm_crtc *crtc); void (*crtc_disable)(struct drm_crtc *crtc); + void (*update_crtcs)(struct drm_atomic_state *state, + unsigned int *crtc_vblank_mask); void (*audio_codec_enable)(struct drm_connector *connector, struct intel_encoder *encoder, const struct drm_display_mode *adjusted_mode); diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 76efe53..59cf513 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13612,6 +13612,52 @@ static bool needs_vblank_wait(struct intel_crtc_state *crtc_state) return false; }
+static void intel_update_crtc(struct drm_crtc *crtc,
+			      struct drm_atomic_state *state,
+			      struct drm_crtc_state *old_crtc_state,
+			      unsigned int *crtc_vblank_mask)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_i915_private *dev_priv = to_i915(dev);
+	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct intel_crtc_state *pipe_config = to_intel_crtc_state(crtc->state);
+	bool modeset = needs_modeset(crtc->state);
+
+	if (modeset) {
+		update_scanline_offset(intel_crtc);
+		dev_priv->display.crtc_enable(crtc);
+	} else {
+		intel_pre_plane_update(to_intel_crtc_state(old_crtc_state));
+	}
+
+	if (drm_atomic_get_existing_plane_state(state, crtc->primary)) {
+		intel_fbc_enable(
+		    intel_crtc, pipe_config,
+		    to_intel_plane_state(crtc->primary->state));
+	}
+
+	drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
+
+	if (needs_vblank_wait(pipe_config))
+		*crtc_vblank_mask |= drm_crtc_mask(crtc);
+}
+
+static void intel_update_crtcs(struct drm_atomic_state *state,
+			       unsigned int *crtc_vblank_mask)
+{
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *old_crtc_state;
+	int i;
+
+	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
+		if (!crtc->state->active)
+			continue;
+
+		intel_update_crtc(crtc, state, old_crtc_state,
+				  crtc_vblank_mask);
+	}
+}
+
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
@@ -13711,17 +13757,9 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 		intel_modeset_verify_disabled(dev);
 	}
-	/* Now enable the clocks, plane, pipe, and connectors that we set up. */
+	/* Complete the events for pipes that have now been disabled */
 	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
-		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 		bool modeset = needs_modeset(crtc->state);
-		struct intel_crtc_state *pipe_config =
-			to_intel_crtc_state(crtc->state);
-
-		if (modeset && crtc->state->active) {
-			update_scanline_offset(to_intel_crtc(crtc));
-			dev_priv->display.crtc_enable(crtc);
-		}

 		/* Complete events for now disable pipes here. */
 		if (modeset && !crtc->state->active && crtc->state->event) {
@@ -13731,21 +13769,11 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)

 			crtc->state->event = NULL;
 		}
-
-		if (!modeset)
-			intel_pre_plane_update(to_intel_crtc_state(old_crtc_state));
-
-		if (crtc->state->active &&
-		    drm_atomic_get_existing_plane_state(state, crtc->primary))
-			intel_fbc_enable(intel_crtc, pipe_config, to_intel_plane_state(crtc->primary->state));
-
-		if (crtc->state->active)
-			drm_atomic_helper_commit_planes_on_crtc(old_crtc_state);
-
-		if (pipe_config->base.active && needs_vblank_wait(pipe_config))
-			crtc_vblank_mask |= 1 << i;
 	}

+	/* Now enable the clocks, plane, pipe, and connectors that we set up. */
+	dev_priv->display.update_crtcs(state, &crtc_vblank_mask);
+
 	/* FIXME: We should call drm_atomic_helper_commit_hw_done() here
 	 * already, but still need the state for the delayed optimization. To
 	 * fix this:
@@ -15207,6 +15235,8 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.crtc_disable = i9xx_crtc_disable;
 	}

+	dev_priv->display.update_crtcs = intel_update_crtcs;
+
 	/* Returns the core display clock speed */
 	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv))
 		dev_priv->display.get_display_clock_speed =
Now that we can hook into update_crtcs and control the order in which we update CRTCs at each modeset, we can finish the final step of fixing Skylake's watermark handling by performing DDB updates at the same time as plane updates and watermark updates.
The first major change in this patch is skl_update_crtcs(), which handles ensuring that we order each CRTC update in our atomic commits properly so that they honor the DDB flush order.
The second major change in this patch is the order in which we flush the pipes. While the previous order may have worked, it can't be used in this approach since it will no longer do the right thing. For example, using the old ddb flush order:
We have pipes A, B, and C enabled, and we're disabling C. Initial ddb allocation looks like this:
| A | B |xxxxxxx|
Since we're performing the ddb updates after performing any CRTC disablements in intel_atomic_commit_tail(), the space to the right of pipe B is unallocated.
1. Flush pipes with new allocation contained into old space. None apply, so we skip this.
2. Flush pipes having their allocation reduced, but overlapping with a previous allocation. None apply, so we also skip this.
3. Flush pipes that got more space allocated. This applies to A and B, giving us the following update order: A, B
This is wrong, since updating pipe A first will cause it to overlap with B and potentially burst into flames. Our new order (see the code comments for details) would update the pipes in the proper order: B, A.
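The safe ordering constraint above can be modeled in isolation: a pipe may only move to its new DDB range once that range no longer overlaps any *other* pipe's current range. The following standalone sketch (hypothetical names and block numbers, not the kernel code) greedily computes such an order for the A/B example, where A grows rightward into B's old space and B moves into the space freed by disabling C:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a per-pipe DDB allocation as a half-open block range.
 * end == 0 means the pipe has no allocation (disabled). */
struct ddb_entry { int start, end; };

static bool entries_overlap(struct ddb_entry a, struct ddb_entry b)
{
	return a.end && b.end && a.start < b.end && b.start < a.end;
}

/*
 * Greedily pick a safe commit order: a pipe is committed only when its
 * target range no longer overlaps any other pipe's current range.
 * Committing updates the pipe's current range (in hardware, this is
 * where we would wait for a vblank before moving the next pipe).
 * Returns how many pipes were committed, filling `order` with indices.
 */
static int safe_order(struct ddb_entry *cur, struct ddb_entry *want,
		      int n, int *order)
{
	bool committed[8] = { false };
	bool progress = true;
	int done = 0;

	while (progress) {
		progress = false;
		for (int i = 0; i < n; i++) {
			bool blocked = false;

			if (committed[i] || !want[i].end)
				continue;
			for (int j = 0; j < n; j++)
				if (j != i && entries_overlap(want[i], cur[j]))
					blocked = true;
			if (blocked)
				continue;
			cur[i] = want[i];	/* commit pipe i */
			committed[i] = true;
			order[done++] = i;
			progress = true;
		}
	}
	return done;
}
```

With A currently at blocks [0,4), B at [4,8), C already disabled, and targets A=[0,6), B=[6,12), this yields the order B then A: A is initially blocked because its target still overlaps B's current allocation.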
As well, we calculate the order for each DDB update during the check phase, and reference it later in the commit phase when we hit skl_update_crtcs().
This long overdue patch fixes the rest of the underruns on Skylake.
Changes since v1:
- Add skl_ddb_entry_write() for cursor into skl_write_cursor_wm()
Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration")
Fixes: 8211bd5bdf5e ("drm/i915/skl: Program the DDB allocation")
Signed-off-by: Lyude <cpaul@redhat.com>
[omitting CC for stable, since this patch will need to be changed for such backports first]
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
---
 drivers/gpu/drm/i915/intel_display.c | 100 ++++++++++--
 drivers/gpu/drm/i915/intel_drv.h     |  10 ++
 drivers/gpu/drm/i915/intel_pm.c      | 288 ++++++++++++++++-------------------
 3 files changed, 233 insertions(+), 165 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 59cf513..06295f7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12897,16 +12897,23 @@ static void verify_wm_state(struct drm_crtc *crtc,
 			  hw_entry->start, hw_entry->end);
 	}
-	/* cursor */
-	hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
-	sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
-
-	if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
-		DRM_ERROR("mismatch in DDB state pipe %c cursor "
-			  "(expected (%u,%u), found (%u,%u))\n",
-			  pipe_name(pipe),
-			  sw_entry->start, sw_entry->end,
-			  hw_entry->start, hw_entry->end);
+	/*
+	 * cursor
+	 * If the cursor plane isn't active, we may not have updated its ddb
+	 * allocation. In that case since the ddb allocation will be updated
+	 * once the plane becomes visible, we can skip this check
+	 */
+	if (intel_crtc->cursor_addr) {
+		hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
+		sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
+
+		if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
+			DRM_ERROR("mismatch in DDB state pipe %c cursor "
+				  "(expected (%u,%u), found (%u,%u))\n",
+				  pipe_name(pipe),
+				  sw_entry->start, sw_entry->end,
+				  hw_entry->start, hw_entry->end);
+		}
 	}
 }
@@ -13658,6 +13665,72 @@ static void intel_update_crtcs(struct drm_atomic_state *state,
 	}
 }
+static inline void
+skl_do_ddb_step(struct drm_atomic_state *state,
+		enum skl_ddb_step step)
+{
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *old_crtc_state;
+	unsigned int crtc_vblank_mask; /* unused */
+	int i;
+
+	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
+		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+		struct intel_crtc_state *cstate =
+			to_intel_crtc_state(crtc->state);
+		bool vblank_wait = false;
+
+		if (cstate->wm.skl.ddb_realloc != step || !crtc->state->active)
+			continue;
+
+		/*
+		 * If we're changing the ddb allocation of this pipe to make
+		 * room for another pipe, we have to wait for the pipe's ddb
+		 * allocations to actually update by waiting for a vblank.
+		 * Otherwise we risk the next pipe updating before this pipe
+		 * finishes, resulting in the pipe fetching from ddb space for
+		 * the wrong pipe.
+		 *
+		 * However, if we know we don't have any more pipes to move
+		 * around, we can skip this wait and the new ddb allocation
+		 * will take effect at the start of the next vblank.
+		 */
+		switch (step) {
+		case SKL_DDB_STEP_NO_OVERLAP:
+		case SKL_DDB_STEP_OVERLAP:
+			if (step != intel_state->last_ddb_step)
+				vblank_wait = true;
+
+			/* fall through */
+		case SKL_DDB_STEP_FINAL:
+			DRM_DEBUG_KMS(
+			    "Updating [CRTC:%d:pipe %c] for DDB step %d\n",
+			    crtc->base.id, pipe_name(intel_crtc->pipe),
+			    step);
+
+		case SKL_DDB_STEP_NONE:
+			break;
+		}
+
+		intel_update_crtc(crtc, state, old_crtc_state,
+				  &crtc_vblank_mask);
+
+		if (vblank_wait)
+			intel_wait_for_vblank(state->dev, intel_crtc->pipe);
+	}
+}
+
+static void skl_update_crtcs(struct drm_atomic_state *state,
+			     unsigned int *crtc_vblank_mask)
+{
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	enum skl_ddb_step step;
+
+	for (step = 0; step <= intel_state->last_ddb_step; step++)
+		skl_do_ddb_step(state, step);
+}
+
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
@@ -15235,8 +15308,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.crtc_disable = i9xx_crtc_disable;
 	}
-	dev_priv->display.update_crtcs = intel_update_crtcs;
-
 	/* Returns the core display clock speed */
 	if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv))
 		dev_priv->display.get_display_clock_speed =
@@ -15326,6 +15397,11 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 			skl_modeset_calc_cdclk;
 	}

+	if (dev_priv->info.gen >= 9)
+		dev_priv->display.update_crtcs = skl_update_crtcs;
+	else
+		dev_priv->display.update_crtcs = intel_update_crtcs;
+
 	switch (INTEL_INFO(dev_priv)->gen) {
 	case 2:
 		dev_priv->display.queue_flip = intel_gen2_queue_flip;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 1b444d3..cf5da83 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -334,6 +334,7 @@ struct intel_atomic_state {
 	/* Gen9+ only */
 	struct skl_wm_values wm_results;
+	int last_ddb_step;
 };

 struct intel_plane_state {
@@ -437,6 +438,13 @@ struct skl_pipe_wm {
 	uint32_t linetime;
 };

+enum skl_ddb_step {
+	SKL_DDB_STEP_NONE = 0,
+	SKL_DDB_STEP_NO_OVERLAP,
+	SKL_DDB_STEP_OVERLAP,
+	SKL_DDB_STEP_FINAL
+};
+
 struct intel_crtc_wm_state {
 	union {
 		struct {
@@ -467,6 +475,8 @@ struct intel_crtc_wm_state {
 			/* minimum block allocation */
 			uint16_t minimum_blocks[I915_MAX_PLANES];
 			uint16_t minimum_y_blocks[I915_MAX_PLANES];
+
+			enum skl_ddb_step ddb_realloc;
 		} skl;
 	};
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 6f5beb3..636c90a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3816,6 +3816,11 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc,
 			   wm->plane[pipe][plane][level]);
 	}
 	I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
+
+	skl_ddb_entry_write(dev_priv, PLANE_BUF_CFG(pipe, plane),
+			    &wm->ddb.plane[pipe][plane]);
+	skl_ddb_entry_write(dev_priv, PLANE_NV12_BUF_CFG(pipe, plane),
+			    &wm->ddb.y_plane[pipe][plane]);
 }
 void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
@@ -3832,170 +3837,51 @@ void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
 			   wm->plane[pipe][PLANE_CURSOR][level]);
 	}
 	I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]);
-}
-
-static void skl_write_wm_values(struct drm_i915_private *dev_priv,
-				const struct skl_wm_values *new)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc;
-
-	for_each_intel_crtc(dev, crtc) {
-		int i;
-		enum pipe pipe = crtc->pipe;
-
-		if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
-			continue;
-		if (!crtc->active)
-			continue;
-
-		for (i = 0; i < intel_num_planes(crtc); i++) {
-			skl_ddb_entry_write(dev_priv,
-					    PLANE_BUF_CFG(pipe, i),
-					    &new->ddb.plane[pipe][i]);
-			skl_ddb_entry_write(dev_priv,
-					    PLANE_NV12_BUF_CFG(pipe, i),
-					    &new->ddb.y_plane[pipe][i]);
-		}
-
-		skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
-				    &new->ddb.plane[pipe][PLANE_CURSOR]);
-	}
+	skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
+			    &wm->ddb.plane[pipe][PLANE_CURSOR]);
 }

-/*
- * When setting up a new DDB allocation arrangement, we need to correctly
- * sequence the times at which the new allocations for the pipes are taken into
- * account or we'll have pipes fetching from space previously allocated to
- * another pipe.
- *
- * Roughly the sequence looks like:
- *  1. re-allocate the pipe(s) with the allocation being reduced and not
- *     overlapping with a previous light-up pipe (another way to put it is:
- *     pipes with their new allocation strickly included into their old ones).
- *  2. re-allocate the other pipes that get their allocation reduced
- *  3. allocate the pipes having their allocation increased
- *
- * Steps 1. and 2. are here to take care of the following case:
- * - Initially DDB looks like this:
- *     |   B    |   C    |
- * - enable pipe A.
- * - pipe B has a reduced DDB allocation that overlaps with the old pipe C
- *   allocation
- *     |  A  |  B  |  C  |
- *
- * We need to sequence the re-allocation: C, B, A (and not B, C, A).
- */
-
-static void
-skl_wm_flush_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, int pass)
+static bool
+skl_ddb_allocation_equals(const struct skl_ddb_allocation *old,
+			  const struct skl_ddb_allocation *new,
+			  enum pipe pipe)
 {
-	int plane;
-
-	DRM_DEBUG_KMS("flush pipe %c (pass %d)\n", pipe_name(pipe), pass);
-
-	for_each_plane(dev_priv, pipe, plane) {
-		I915_WRITE(PLANE_SURF(pipe, plane),
-			   I915_READ(PLANE_SURF(pipe, plane)));
-	}
-	I915_WRITE(CURBASE(pipe), I915_READ(CURBASE(pipe)));
+	return new->pipe[pipe].start == old->pipe[pipe].start &&
+	       new->pipe[pipe].end == old->pipe[pipe].end;
 }

 static bool
-skl_ddb_allocation_included(const struct skl_ddb_allocation *old,
+skl_ddb_allocation_overlaps(struct drm_atomic_state *state,
+			    const struct skl_ddb_allocation *old,
 			    const struct skl_ddb_allocation *new,
 			    enum pipe pipe)
 {
-	uint16_t old_size, new_size;
-
-	old_size = skl_ddb_entry_size(&old->pipe[pipe]);
-	new_size = skl_ddb_entry_size(&new->pipe[pipe]);
-
-	return old_size != new_size &&
-	       new->pipe[pipe].start >= old->pipe[pipe].start &&
-	       new->pipe[pipe].end <= old->pipe[pipe].end;
-}
-
-static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
-				struct skl_wm_values *new_values)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct skl_ddb_allocation *cur_ddb, *new_ddb;
-	bool reallocated[I915_MAX_PIPES] = {};
-	struct intel_crtc *crtc;
-	enum pipe pipe;
-
-	new_ddb = &new_values->ddb;
-	cur_ddb = &dev_priv->wm.skl_hw.ddb;
-
-	/*
-	 * First pass: flush the pipes with the new allocation contained into
-	 * the old space.
-	 *
-	 * We'll wait for the vblank on those pipes to ensure we can safely
-	 * re-allocate the freed space without this pipe fetching from it.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
-
-		pipe = crtc->pipe;
-
-		if (!skl_ddb_allocation_included(cur_ddb, new_ddb, pipe))
-			continue;
-
-		skl_wm_flush_pipe(dev_priv, pipe, 1);
-		intel_wait_for_vblank(dev, pipe);
-
-		reallocated[pipe] = true;
-	}
-
-	/*
-	 * Second pass: flush the pipes that are having their allocation
-	 * reduced, but overlapping with a previous allocation.
-	 *
-	 * Here as well we need to wait for the vblank to make sure the freed
-	 * space is not used anymore.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
-
-		pipe = crtc->pipe;
-
-		if (reallocated[pipe])
-			continue;
-
-		if (skl_ddb_entry_size(&new_ddb->pipe[pipe]) <
-		    skl_ddb_entry_size(&cur_ddb->pipe[pipe])) {
-			skl_wm_flush_pipe(dev_priv, pipe, 2);
-			intel_wait_for_vblank(dev, pipe);
-			reallocated[pipe] = true;
-		}
-	}
-
-	/*
-	 * Third pass: flush the pipes that got more space allocated.
-	 *
-	 * We don't need to actively wait for the update here, next vblank
-	 * will just get more DDB space with the correct WM values.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
+	struct drm_device *dev = state->dev;
+	struct intel_crtc *intel_crtc;
+	enum pipe otherp;

-		pipe = crtc->pipe;
+	for_each_intel_crtc(dev, intel_crtc) {
+		otherp = intel_crtc->pipe;

 		/*
-		 * At this point, only the pipes more space than before are
-		 * left to re-allocate.
+		 * When checking for overlaps, we don't want to:
+		 * - Compare against ourselves
+		 * - Compare against pipes that will be disabled in step 0
+		 * - Compare against pipes that won't be enabled until step 3
 		 */
-		if (reallocated[pipe])
+		if (otherp == pipe || !new->pipe[otherp].end ||
+		    !old->pipe[otherp].end)
 			continue;

-		skl_wm_flush_pipe(dev_priv, pipe, 3);
+		if ((new->pipe[pipe].start >= old->pipe[otherp].start &&
+		     new->pipe[pipe].start < old->pipe[otherp].end) ||
+		    (old->pipe[otherp].start >= new->pipe[pipe].start &&
+		     old->pipe[otherp].start < new->pipe[pipe].end))
+			return true;
 	}
+
+	return false;
 }
 static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
@@ -4038,8 +3924,10 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct intel_crtc_state *cstate;
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
+	struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;

@@ -4071,13 +3959,11 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	}

 	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
-		struct intel_crtc_state *cstate;
-
 		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);

-		ret = skl_allocate_pipe_ddb(cstate, ddb);
+		ret = skl_allocate_pipe_ddb(cstate, new_ddb);
 		if (ret)
 			return ret;

@@ -4086,6 +3972,73 @@ skl_compute_ddb(struct drm_atomic_state *state)
 		return ret;
 	}
+	/*
+	 * When setting up a new DDB allocation arrangement, we need to
+	 * correctly sequence the times at which the new allocations for the
+	 * pipes are taken into account or we'll have pipes fetching from space
+	 * previously allocated to another pipe.
+	 *
+	 * Roughly the final sequence we want looks like this:
+	 * 1. Disable any pipes we're not going to be using anymore
+	 * 2. Reallocate all of the active pipes whose new ddb allocations
+	 *    won't overlap with another active pipe's ddb allocation.
+	 * 3. Reallocate remaining active pipes, if any.
+	 * 4. Enable any new pipes, if any.
+	 *
+	 * Example:
+	 * Initially DDB looks like this:
+	 * |   B   |   C   |
+	 * And the final DDB should look like this:
+	 * | B | C |   A   |
+	 *
+	 * 1. We're not disabling any pipes, so do nothing on this step.
+	 * 2. Pipe B's new allocation wouldn't overlap with pipe C, however
+	 *    pipe C's new allocation does overlap with pipe B's current
+	 *    allocation. Reallocate B first so the DDB looks like this:
+	 * | B |xx|   C   |
+	 * 3. Now we can safely reallocate pipe C to its new location:
+	 * | B | C |xxxxx|
+	 * 4. Enable any remaining pipes, in this case A
+	 * | B | C |   A   |
+	 *
+	 * As well, between every pipe reallocation we have to wait for a
+	 * vblank on the pipe so that we ensure its new allocation has taken
+	 * effect by the time we start moving the next pipe. This can be
+	 * skipped on the last step we need to perform, which is why we keep
+	 * track of that information here. For example, if we've reallocated
+	 * all the pipes that need changing by the time we reach step 3, we
+	 * can finish without waiting for the pipes we changed in step 3 to
+	 * update.
+	 */
+	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
+		enum pipe pipe = intel_crtc->pipe;
+		enum skl_ddb_step step;
+
+		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
+		if (IS_ERR(cstate))
+			return PTR_ERR(cstate);
+
+		/* Step 1: Pipes we're disabling / haven't changed */
+		if (skl_ddb_allocation_equals(old_ddb, new_ddb, pipe) ||
+		    new_ddb->pipe[pipe].end == 0) {
+			step = SKL_DDB_STEP_NONE;
+		/* Step 2-3: Active pipes we're reallocating */
+		} else if (old_ddb->pipe[pipe].end != 0) {
+			if (skl_ddb_allocation_overlaps(state, old_ddb,
+							new_ddb, pipe))
+				step = SKL_DDB_STEP_OVERLAP;
+			else
+				step = SKL_DDB_STEP_NO_OVERLAP;
+		/* Step 4: Pipes we're enabling */
+		} else {
+			step = SKL_DDB_STEP_FINAL;
+		}
+
+		cstate->wm.skl.ddb_realloc = step;
+
+		if (step > intel_state->last_ddb_step)
+			intel_state->last_ddb_step = step;
+	}
+
 	return 0;
 }
@@ -4110,10 +4063,13 @@ skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 static int
 skl_compute_wm(struct drm_atomic_state *state)
 {
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *cstate;
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct skl_wm_values *results = &intel_state->wm_results;
+	struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct skl_ddb_allocation *new_ddb = &results->ddb;
 	struct skl_pipe_wm *pipe_wm;
 	bool changed = false;
 	int ret, i;
@@ -4152,7 +4108,10 @@ skl_compute_wm(struct drm_atomic_state *state)
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 		struct intel_crtc_state *intel_cstate =
 			to_intel_crtc_state(cstate);
+		enum skl_ddb_step step;
+		enum pipe pipe;

+		pipe = intel_crtc->pipe;
 		pipe_wm = &intel_cstate->wm.skl.optimal;
 		ret = skl_update_pipe_wm(cstate, &results->ddb, pipe_wm,
 					 &changed);
@@ -4167,7 +4126,18 @@ skl_compute_wm(struct drm_atomic_state *state)
 			continue;

 		intel_cstate->update_wm_pre = true;
+		step = intel_cstate->wm.skl.ddb_realloc;
 		skl_compute_wm_results(crtc->dev, pipe_wm, results, intel_crtc);
+
+		if (!skl_ddb_entry_equal(&old_ddb->pipe[pipe],
+					 &new_ddb->pipe[pipe])) {
+			DRM_DEBUG_KMS(
+			    "DDB changes for [CRTC:%d:pipe %c]: (%3d - %3d) -> (%3d - %3d) on step %d\n",
+			    intel_crtc->base.base.id, pipe_name(pipe),
+			    old_ddb->pipe[pipe].start, old_ddb->pipe[pipe].end,
+			    new_ddb->pipe[pipe].start, new_ddb->pipe[pipe].end,
+			    step);
+		}
 	}
 	return 0;
@@ -4191,8 +4161,20 @@ static void skl_update_wm(struct drm_crtc *crtc)

 	mutex_lock(&dev_priv->wm.wm_mutex);

-	skl_write_wm_values(dev_priv, results);
-	skl_flush_wm_values(dev_priv, results);
+	/*
+	 * If this pipe isn't active already, we're going to be enabling it
+	 * very soon. Since it's safe to update these while the pipe's shut
+	 * off, just do so here. Already active pipes will have their
+	 * watermarks updated once we update their planes.
+	 */
+	if (!intel_crtc->active) {
+		int plane;
+
+		for (plane = 0; plane < intel_num_planes(intel_crtc); plane++)
+			skl_write_plane_wm(intel_crtc, results, plane);
+
+		skl_write_cursor_wm(intel_crtc, results);
+	}

 	/*
 	 * Store the new configuration (but only for the pipes that have
On Tue, Aug 02, 2016 at 06:37:37PM -0400, Lyude wrote:
> +static inline void
> +skl_do_ddb_step(struct drm_atomic_state *state,
> +		enum skl_ddb_step step)
> +{
> +	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
> +	struct drm_crtc *crtc;
> +	struct drm_crtc_state *old_crtc_state;
> +	unsigned int crtc_vblank_mask; /* unused */
> +	int i;
> +
> +	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
> +		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +		struct intel_crtc_state *cstate =
> +			to_intel_crtc_state(crtc->state);
> +		bool vblank_wait = false;
> +
> +		if (cstate->wm.skl.ddb_realloc != step || !crtc->state->active)
> +			continue;
> +
> +		/*
> +		 * If we're changing the ddb allocation of this pipe to make
> +		 * room for another pipe, we have to wait for the pipe's ddb
> +		 * allocations to actually update by waiting for a vblank.
> +		 * Otherwise we risk the next pipe updating before this pipe
> +		 * finishes, resulting in the pipe fetching from ddb space for
> +		 * the wrong pipe.
> +		 *
> +		 * However, if we know we don't have any more pipes to move
> +		 * around, we can skip this wait and the new ddb allocation
> +		 * will take effect at the start of the next vblank.
> +		 */
> +		switch (step) {
> +		case SKL_DDB_STEP_NO_OVERLAP:
> +		case SKL_DDB_STEP_OVERLAP:
> +			if (step != intel_state->last_ddb_step)
> +				vblank_wait = true;
> +
> +			/* drop through */
> +		case SKL_DDB_STEP_FINAL:
> +			DRM_DEBUG_KMS(
> +			    "Updating [CRTC:%d:pipe %c] for DDB step %d\n",
> +			    crtc->base.id, pipe_name(intel_crtc->pipe),
> +			    step);
> +
> +		case SKL_DDB_STEP_NONE:
> +			break;
> +		}
Not sure we really need this step stuff. How about?
for_each_crtc
	if (crtc_needs_disabling)
		disable_crtc();

do {
	progress = false;
	wait_vbl_pipes = 0;

	for_each_crtc() {
		if (!active || needs_modeset)
			continue;
		if (!ddb_changed)
			continue;
		if (new_ddb_overlaps_with_any_other_pipes_current_ddb)
			continue;
		commit;
		wait_vbl_pipes |= pipe;
		progress = true;
	}
	wait_vbls(wait_vbl_pipes);
} while (progress);

for_each_crtc
	if (crtc_needs_enabling) {
		enable_crtc();
		commit;
	}
Or if we're paranoid, we could also have an upper bound on the loop and assert that we never reach it.
Though one thing I don't particularly like about this commit-while-changing-the-ddb approach is that it's going to make the update appear even less atomic. What I'd rather like to do for the normal commit path is this:
for_each_crtc if (crtc_needs_disabling) disable_planes for_each_crtc if (crtc_needs_disabling) disable_crtc for_each_crtc if (crtc_needs_enabling) enable_crtc for_each_crtc if (active) commit_planes;
That way everything would pop in and out as close together as possible. Hmm. Actually, I wonder... I'm thinking we should be able to enable all crtcs prior to entering the ddb commit loop, on account of no planes being enabled on those crtcs until we commit them. And if no planes are enabled, running the pipe w/o allocated ddb should be fine. So with that approach, I think we should be able to commit all planes within a few iterations of the loop, and hence within a few vblanks.
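The progress-loop structure sketched above can be modeled in a standalone way. The following toy program (all names, field layouts, and block numbers are hypothetical, not the kernel's) disables pipes first, then commits active pipes in waves; within a wave every pipe checks against the pre-wave state, since the hardware only latches the new allocation at the next vblank, and the vblank waits are batched per wave:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-pipe state: current and target ddb ranges (end == 0: none). */
struct pipe_alloc {
	bool enabled_before, enabled_after, committed;
	int cur_start, cur_end;
	int new_start, new_end;
};

static bool ranges_overlap(int s1, int e1, int s2, int e2)
{
	return e1 && e2 && s1 < e2 && s2 < e1;
}

/* Returns how many waves (batched vblank waits) were needed. */
static int commit_in_waves(struct pipe_alloc *p, int n)
{
	int snap_s[8], snap_e[8];
	int waves = 0;
	bool progress = true;

	for (int i = 0; i < n; i++)	/* disable unused pipes first */
		if (p[i].enabled_before && !p[i].enabled_after)
			p[i].cur_end = 0;

	while (progress) {
		progress = false;
		/* Snapshot pre-wave state: commits within one wave must
		 * not assume earlier commits have reached the hardware. */
		for (int i = 0; i < n; i++) {
			snap_s[i] = p[i].cur_start;
			snap_e[i] = p[i].cur_end;
		}
		for (int i = 0; i < n; i++) {
			bool blocked = false;

			if (p[i].committed || !p[i].enabled_after)
				continue;
			for (int j = 0; j < n; j++)
				if (j != i &&
				    ranges_overlap(p[i].new_start,
						   p[i].new_end,
						   snap_s[j], snap_e[j]))
					blocked = true;
			if (blocked)
				continue;
			p[i].cur_start = p[i].new_start;
			p[i].cur_end = p[i].new_end;
			p[i].committed = true;
			progress = true;
		}
		if (progress)
			waves++;	/* one batched wait_vbls() here */
	}
	return waves;
}
```

For the | B | C | to | B | C | A | example this resolves in three waves (B, then C, then A), and, per the observation above, a pipe being enabled with no planes yet could in principle be committed even earlier since it fetches nothing until its planes are committed.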
intel_update_crtc(crtc, state, old_crtc_state,
&crtc_vblank_mask);
if (vblank_wait)
intel_wait_for_vblank(state->dev, intel_crtc->pipe);
- }
+}
+static void skl_update_crtcs(struct drm_atomic_state *state,
unsigned int *crtc_vblank_mask)
+{
- struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
- enum skl_ddb_step step;
- for (step = 0; step <= intel_state->last_ddb_step; step++)
skl_do_ddb_step(state, step);
+}
static void intel_atomic_commit_tail(struct drm_atomic_state *state) { struct drm_device *dev = state->dev; @@ -15235,8 +15308,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) dev_priv->display.crtc_disable = i9xx_crtc_disable; }
- dev_priv->display.update_crtcs = intel_update_crtcs;
- /* Returns the core display clock speed */ if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) dev_priv->display.get_display_clock_speed =
@@ -15326,6 +15397,11 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) skl_modeset_calc_cdclk; }
- if (dev_priv->info.gen >= 9)
dev_priv->display.update_crtcs = skl_update_crtcs;
- else
dev_priv->display.update_crtcs = intel_update_crtcs;
- switch (INTEL_INFO(dev_priv)->gen) { case 2: dev_priv->display.queue_flip = intel_gen2_queue_flip;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1b444d3..cf5da83 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -334,6 +334,7 @@ struct intel_atomic_state {
/* Gen9+ only */ struct skl_wm_values wm_results;
- int last_ddb_step;
};
struct intel_plane_state { @@ -437,6 +438,13 @@ struct skl_pipe_wm { uint32_t linetime; };
+enum skl_ddb_step {
- SKL_DDB_STEP_NONE = 0,
- SKL_DDB_STEP_NO_OVERLAP,
- SKL_DDB_STEP_OVERLAP,
- SKL_DDB_STEP_FINAL
+};
struct intel_crtc_wm_state { union { struct { @@ -467,6 +475,8 @@ struct intel_crtc_wm_state { /* minimum block allocation */ uint16_t minimum_blocks[I915_MAX_PLANES]; uint16_t minimum_y_blocks[I915_MAX_PLANES];
} skl; };enum skl_ddb_step ddb_realloc;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6f5beb3..636c90a 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3816,6 +3816,11 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][plane][level]); } I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_BUF_CFG(pipe, plane),
&wm->ddb.plane[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_NV12_BUF_CFG(pipe, plane),
&wm->ddb.y_plane[pipe][plane]);
}
void skl_write_cursor_wm(struct intel_crtc *intel_crtc, @@ -3832,170 +3837,51 @@ void skl_write_cursor_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][PLANE_CURSOR][level]); } I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]); -}
-static void skl_write_wm_values(struct drm_i915_private *dev_priv,
const struct skl_wm_values *new)
-{
struct drm_device *dev = &dev_priv->drm;
struct intel_crtc *crtc;
for_each_intel_crtc(dev, crtc) {
int i;
enum pipe pipe = crtc->pipe;
if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
continue;
if (!crtc->active)
continue;
for (i = 0; i < intel_num_planes(crtc); i++) {
skl_ddb_entry_write(dev_priv,
PLANE_BUF_CFG(pipe, i),
&new->ddb.plane[pipe][i]);
skl_ddb_entry_write(dev_priv,
PLANE_NV12_BUF_CFG(pipe, i),
&new->ddb.y_plane[pipe][i]);
}
skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
&new->ddb.plane[pipe][PLANE_CURSOR]);
}
- skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
&wm->ddb.plane[pipe][PLANE_CURSOR]);
}
-/*
- When setting up a new DDB allocation arrangement, we need to correctly
- sequence the times at which the new allocations for the pipes are taken into
- account or we'll have pipes fetching from space previously allocated to
- another pipe.
- Roughly the sequence looks like:
- re-allocate the pipe(s) with the allocation being reduced and not
overlapping with a previous light-up pipe (another way to put it is:
pipes with their new allocation strictly included within their old ones).
- re-allocate the other pipes that get their allocation reduced
- allocate the pipes having their allocation increased
- Steps 1. and 2. are here to take care of the following case:
- Initially DDB looks like this:
| B | C |
- enable pipe A.
- pipe B has a reduced DDB allocation that overlaps with the old pipe C
- allocation
| A | B | C |
- We need to sequence the re-allocation: C, B, A (and not B, C, A).
- */
-static void -skl_wm_flush_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, int pass) +static bool +skl_ddb_allocation_equals(const struct skl_ddb_allocation *old,
const struct skl_ddb_allocation *new,
enum pipe pipe)
{
- int plane;
- DRM_DEBUG_KMS("flush pipe %c (pass %d)\n", pipe_name(pipe), pass);
- for_each_plane(dev_priv, pipe, plane) {
I915_WRITE(PLANE_SURF(pipe, plane),
I915_READ(PLANE_SURF(pipe, plane)));
- }
- I915_WRITE(CURBASE(pipe), I915_READ(CURBASE(pipe)));
- return new->pipe[pipe].start == old->pipe[pipe].start &&
new->pipe[pipe].end == old->pipe[pipe].end;
}
static bool -skl_ddb_allocation_included(const struct skl_ddb_allocation *old, +skl_ddb_allocation_overlaps(struct drm_atomic_state *state,
const struct skl_ddb_allocation *old, const struct skl_ddb_allocation *new, enum pipe pipe)
{
- uint16_t old_size, new_size;
- old_size = skl_ddb_entry_size(&old->pipe[pipe]);
- new_size = skl_ddb_entry_size(&new->pipe[pipe]);
- return old_size != new_size &&
new->pipe[pipe].start >= old->pipe[pipe].start &&
new->pipe[pipe].end <= old->pipe[pipe].end;
-}
-static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
struct skl_wm_values *new_values)
-{
- struct drm_device *dev = &dev_priv->drm;
- struct skl_ddb_allocation *cur_ddb, *new_ddb;
- bool reallocated[I915_MAX_PIPES] = {};
- struct intel_crtc *crtc;
- enum pipe pipe;
- new_ddb = &new_values->ddb;
- cur_ddb = &dev_priv->wm.skl_hw.ddb;
- /*
* First pass: flush the pipes with the new allocation contained into
* the old space.
*
* We'll wait for the vblank on those pipes to ensure we can safely
* re-allocate the freed space without this pipe fetching from it.
*/
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (!skl_ddb_allocation_included(cur_ddb, new_ddb, pipe))
continue;
skl_wm_flush_pipe(dev_priv, pipe, 1);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
- }
- /*
* Second pass: flush the pipes that are having their allocation
* reduced, but overlapping with a previous allocation.
*
* Here as well we need to wait for the vblank to make sure the freed
* space is not used anymore.
*/
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (reallocated[pipe])
continue;
if (skl_ddb_entry_size(&new_ddb->pipe[pipe]) <
skl_ddb_entry_size(&cur_ddb->pipe[pipe])) {
skl_wm_flush_pipe(dev_priv, pipe, 2);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
}
- }
- /*
* Third pass: flush the pipes that got more space allocated.
*
* We don't need to actively wait for the update here, next vblank
* will just get more DDB space with the correct WM values.
*/
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
- struct drm_device *dev = state->dev;
- struct intel_crtc *intel_crtc;
- enum pipe otherp;
pipe = crtc->pipe;
for_each_intel_crtc(dev, intel_crtc) {
otherp = intel_crtc->pipe;
/*
* At this point, only the pipes more space than before are
* left to re-allocate.
* When checking for overlaps, we don't want to:
* - Compare against ourselves
* - Compare against pipes that will be disabled in step 0
		 * - Compare against pipes that won't be enabled until step 3
		 */
if (reallocated[pipe])
if (otherp == pipe || !new->pipe[otherp].end ||
!old->pipe[otherp].end) continue;
skl_wm_flush_pipe(dev_priv, pipe, 3);
if ((new->pipe[pipe].start >= old->pipe[otherp].start &&
new->pipe[pipe].start < old->pipe[otherp].end) ||
(old->pipe[otherp].start >= new->pipe[pipe].start &&
old->pipe[otherp].start < new->pipe[pipe].end))
			return true;
	}

	return false;
}
static int skl_update_pipe_wm(struct drm_crtc_state *cstate, @@ -4038,8 +3924,10 @@ skl_compute_ddb(struct drm_atomic_state *state) struct drm_device *dev = state->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
- struct intel_crtc_state *cstate; struct intel_crtc *intel_crtc;
- struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
- struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
- struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb; uint32_t realloc_pipes = pipes_modified(state); int ret;
@@ -4071,13 +3959,11 @@ skl_compute_ddb(struct drm_atomic_state *state) }
for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
struct intel_crtc_state *cstate;
cstate = intel_atomic_get_crtc_state(state, intel_crtc); if (IS_ERR(cstate)) return PTR_ERR(cstate);
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
+		ret = skl_allocate_pipe_ddb(cstate, new_ddb);
		if (ret)
			return ret;
@@ -4086,6 +3972,73 @@ skl_compute_ddb(struct drm_atomic_state *state) return ret; }
- /*
* When setting up a new DDB allocation arrangement, we need to
* correctly sequence the times at which the new allocations for the
* pipes are taken into account or we'll have pipes fetching from space
* previously allocated to another pipe.
*
* Roughly the final sequence we want looks like this:
* 1. Disable any pipes we're not going to be using anymore
* 2. Reallocate all of the active pipes whose new ddb allocations
* won't overlap with another active pipe's ddb allocation.
* 3. Reallocate remaining active pipes, if any.
* 4. Enable any new pipes, if any.
*
* Example:
* Initially DDB looks like this:
* | B | C |
* And the final DDB should look like this:
* | B | C | A |
*
* 1. We're not disabling any pipes, so do nothing on this step.
* 2. Pipe B's new allocation wouldn't overlap with pipe C, however
* pipe C's new allocation does overlap with pipe B's current
* allocation. Reallocate B first so the DDB looks like this:
* | B |xx| C |
* 3. Now we can safely reallocate pipe C to its new location:
* | B | C |xxxxx|
* 4. Enable any remaining pipes, in this case A
* | B | C | A |
*
* As well, between every pipe reallocation we have to wait for a
* vblank on the pipe so that we ensure its new allocation has taken
* effect by the time we start moving the next pipe. This can be
* skipped on the last step we need to perform, which is why we keep
* track of that information here. For example, if we've reallocated
* all the pipes that need changing by the time we reach step 3, we can
* finish without waiting for the pipes we changed in step 3 to update.
*/
- for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
enum pipe pipe = intel_crtc->pipe;
enum skl_ddb_step step;
cstate = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(cstate))
return PTR_ERR(cstate);
/* Step 1: Pipes we're disabling / haven't changed */
if (skl_ddb_allocation_equals(old_ddb, new_ddb, pipe) ||
new_ddb->pipe[pipe].end == 0) {
step = SKL_DDB_STEP_NONE;
/* Step 2-3: Active pipes we're reallocating */
} else if (old_ddb->pipe[pipe].end != 0) {
if (skl_ddb_allocation_overlaps(state, old_ddb, new_ddb,
pipe))
step = SKL_DDB_STEP_OVERLAP;
else
step = SKL_DDB_STEP_NO_OVERLAP;
/* Step 4: Pipes we're enabling */
} else {
step = SKL_DDB_STEP_FINAL;
}
cstate->wm.skl.ddb_realloc = step;
if (step > intel_state->last_ddb_step)
intel_state->last_ddb_step = step;
- }
- return 0;
}
@@ -4110,10 +4063,13 @@ skl_copy_wm_for_pipe(struct skl_wm_values *dst, static int skl_compute_wm(struct drm_atomic_state *state) {
- struct drm_i915_private *dev_priv = to_i915(state->dev); struct drm_crtc *crtc; struct drm_crtc_state *cstate; struct intel_atomic_state *intel_state = to_intel_atomic_state(state); struct skl_wm_values *results = &intel_state->wm_results;
- struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
- struct skl_ddb_allocation *new_ddb = &results->ddb; struct skl_pipe_wm *pipe_wm; bool changed = false; int ret, i;
@@ -4152,7 +4108,10 @@ skl_compute_wm(struct drm_atomic_state *state) struct intel_crtc *intel_crtc = to_intel_crtc(crtc); struct intel_crtc_state *intel_cstate = to_intel_crtc_state(cstate);
enum skl_ddb_step step;
enum pipe pipe;
pipe = intel_crtc->pipe;
pipe_wm = &intel_cstate->wm.skl.optimal; ret = skl_update_pipe_wm(cstate, &results->ddb, pipe_wm, &changed);
@@ -4167,7 +4126,18 @@ skl_compute_wm(struct drm_atomic_state *state) continue;
intel_cstate->update_wm_pre = true;
step = intel_cstate->wm.skl.ddb_realloc;
skl_compute_wm_results(crtc->dev, pipe_wm, results, intel_crtc);
if (!skl_ddb_entry_equal(&old_ddb->pipe[pipe],
&new_ddb->pipe[pipe])) {
DRM_DEBUG_KMS(
"DDB changes for [CRTC:%d:pipe %c]: (%3d - %3d) -> (%3d - %3d) on step %d\n",
intel_crtc->base.base.id, pipe_name(pipe),
old_ddb->pipe[pipe].start, old_ddb->pipe[pipe].end,
new_ddb->pipe[pipe].start, new_ddb->pipe[pipe].end,
step);
}
}
return 0;
@@ -4191,8 +4161,20 @@ static void skl_update_wm(struct drm_crtc *crtc)
mutex_lock(&dev_priv->wm.wm_mutex);
- skl_write_wm_values(dev_priv, results);
- skl_flush_wm_values(dev_priv, results);
/*
* If this pipe isn't active already, we're going to be enabling it
* very soon. Since it's safe to update these while the pipe's shut off,
* just do so here. Already active pipes will have their watermarks
* updated once we update their planes.
*/
if (!intel_crtc->active) {
int plane;
for (plane = 0; plane < intel_num_planes(intel_crtc); plane++)
skl_write_plane_wm(intel_crtc, results, plane);
skl_write_cursor_wm(intel_crtc, results);
}
/*
- Store the new configuration (but only for the pipes that have
-- 2.7.4
On Wed, 2016-08-03 at 18:00 +0300, Ville Syrjälä wrote:
On Tue, Aug 02, 2016 at 06:37:37PM -0400, Lyude wrote:
Now that we can hook into update_crtcs and control the order in which we update CRTCs at each modeset, we can finish the final step of fixing Skylake's watermark handling by performing DDB updates at the same time as plane updates and watermark updates.
The first major change in this patch is skl_update_crtcs(), which handles ensuring that we order each CRTC update in our atomic commits properly so that they honor the DDB flush order.
The second major change in this patch is the order in which we flush the pipes. While the previous order may have worked, it can't be used in this approach since it will no longer do the right thing. For example, using the old ddb flush order:
We have pipes A, B, and C enabled, and we're disabling C. Initial ddb allocation looks like this:
| A | B |xxxxxxx|
Since we're performing the ddb updates after performing any CRTC disablements in intel_atomic_commit_tail(), the space to the right of pipe B is unallocated.
1. Flush pipes with new allocation contained into old space. None apply, so we skip this
2. Flush pipes having their allocation reduced, but overlapping with a previous allocation. None apply, so we also skip this
3. Flush pipes that got more space allocated. This applies to A and B, giving us the following update order: A, B
This is wrong, since updating pipe A first will cause it to overlap with B and potentially burst into flames. Our new order (see the code comments for details) would update the pipes in the proper order: B, A.
As well, we calculate the order for each DDB update during the check phase, and reference it later in the commit phase when we hit skl_update_crtcs().
This long overdue patch fixes the rest of the underruns on Skylake.
Changes since v1: - Add skl_ddb_entry_write() for cursor into skl_write_cursor_wm()
Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration") Fixes: 8211bd5bdf5e ("drm/i915/skl: Program the DDB allocation") Signed-off-by: Lyude cpaul@redhat.com [omitting CC for stable, since this patch will need to be changed for such backports first] Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Radhakrishna Sripada radhakrishna.sripada@intel.com Cc: Hans de Goede hdegoede@redhat.com Cc: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_display.c | 100 ++++++++++-- drivers/gpu/drm/i915/intel_drv.h | 10 ++ drivers/gpu/drm/i915/intel_pm.c | 288 ++++++++++++++++--------------
3 files changed, 233 insertions(+), 165 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 59cf513..06295f7 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12897,16 +12897,23 @@ static void verify_wm_state(struct drm_crtc *crtc, hw_entry->start, hw_entry->end); }
- /* cursor */
- hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
- sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
- if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
DRM_ERROR("mismatch in DDB state pipe %c cursor "
"(expected (%u,%u), found (%u,%u))\n",
pipe_name(pipe),
sw_entry->start, sw_entry->end,
hw_entry->start, hw_entry->end);
- /*
- * cursor
- * If the cursor plane isn't active, we may not have updated its ddb
- * allocation. In that case since the ddb allocation will be updated
- * once the plane becomes visible, we can skip this check
- */
- if (intel_crtc->cursor_addr) {
hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
DRM_ERROR("mismatch in DDB state pipe %c cursor "
"(expected (%u,%u), found (%u,%u))\n",
pipe_name(pipe),
sw_entry->start, sw_entry->end,
hw_entry->start, hw_entry->end);
}
} } @@ -13658,6 +13665,72 @@ static void intel_update_crtcs(struct drm_atomic_state *state, } } +static inline void +skl_do_ddb_step(struct drm_atomic_state *state,
enum skl_ddb_step step)
+{
- struct intel_atomic_state *intel_state =
to_intel_atomic_state(state);
- struct drm_crtc *crtc;
- struct drm_crtc_state *old_crtc_state;
- unsigned int crtc_vblank_mask; /* unused */
- int i;
- for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
struct intel_crtc_state *cstate =
to_intel_crtc_state(crtc->state);
bool vblank_wait = false;
		if (cstate->wm.skl.ddb_realloc != step || !crtc->state->active)
continue;
/*
		 * If we're changing the ddb allocation of this pipe to make
		 * room for another pipe, we have to wait for the pipe's ddb
		 * allocations to actually update by waiting for a vblank.
		 * Otherwise we risk the next pipe updating before this pipe
		 * finishes, resulting in the pipe fetching from ddb space for
		 * the wrong pipe.
*
* However, if we know we don't have any more pipes to move
* around, we can skip this wait and the new ddb allocation
* will take effect at the start of the next vblank.
*/
switch (step) {
case SKL_DDB_STEP_NO_OVERLAP:
case SKL_DDB_STEP_OVERLAP:
if (step != intel_state->last_ddb_step)
vblank_wait = true;
/* drop through */
case SKL_DDB_STEP_FINAL:
DRM_DEBUG_KMS(
"Updating [CRTC:%d:pipe %c] for DDB step %d\n",
crtc->base.id, pipe_name(intel_crtc->pipe),
step);
case SKL_DDB_STEP_NONE:
break;
}
Not sure we really need this step stuff. How about?
for_each_crtc if (crtc_needs_disabling) disable_crtc();
do { progress = false; wait_vbl_pipes=0; for_each_crtc() { if (!active || needs_modeset) continue; if (!ddb_changed) continue; if (new_ddb_overlaps_with_any_other_pipes_current_ddb) continue; commit; wait_vbl_pipes |= pipe; progress = true; } wait_vbls(wait_vbl_pipes); } while (progress);
for_each_crtc if (crtc_needs_enabling) enable_crtc(); commit; }
I'm fine with this, it might make this logic a little easier to read.
Or if we're paranoid, we could also have an upper bound on the loop and assert that we never reach it.
Though one thing I don't particularly like about this commit while changing the ddb approach is that it's going to make the update appear even less atomic. What I'd rather like to do for the normal commit path is this:
for_each_crtc if (crtc_needs_disabling) disable_planes for_each_crtc if (crtc_needs_disabling) disable_crtc for_each_crtc if (crtc_needs_enabling) enable_crtc for_each_crtc if (active) commit_planes;
That way everything would pop in and out as close together as possible. Hmm. Actually, I wonder... I'm thinking we should be able to enable all crtcs prior to entering the ddb commit loop, on account of no planes being enabled on those crtcs until we commit them. And if no planes are enabled, running the pipe w/o allocated ddb should be fine. So with that approach, I think we should be able to commit all planes within a few iterations of the loop, and hence within a few vblanks.
I can't see any issues with this, and this would definitely make the code a lot cleaner. I'm alright with going this route if matt doesn't see any issues with it as well.
Cheers, Lyude
intel_update_crtc(crtc, state, old_crtc_state,
&crtc_vblank_mask);
if (vblank_wait)
			intel_wait_for_vblank(state->dev, intel_crtc->pipe);
- }
+}
+static void skl_update_crtcs(struct drm_atomic_state *state,
unsigned int *crtc_vblank_mask)
+{
- struct intel_atomic_state *intel_state =
to_intel_atomic_state(state);
- enum skl_ddb_step step;
- for (step = 0; step <= intel_state->last_ddb_step; step++)
skl_do_ddb_step(state, step);
+}
static void intel_atomic_commit_tail(struct drm_atomic_state *state) { struct drm_device *dev = state->dev; @@ -15235,8 +15308,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) dev_priv->display.crtc_disable = i9xx_crtc_disable; }
- dev_priv->display.update_crtcs = intel_update_crtcs;
/* Returns the core display clock speed */ if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) dev_priv->display.get_display_clock_speed = @@ -15326,6 +15397,11 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) skl_modeset_calc_cdclk; }
- if (dev_priv->info.gen >= 9)
dev_priv->display.update_crtcs = skl_update_crtcs;
- else
dev_priv->display.update_crtcs = intel_update_crtcs;
switch (INTEL_INFO(dev_priv)->gen) { case 2: dev_priv->display.queue_flip = intel_gen2_queue_flip; diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1b444d3..cf5da83 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -334,6 +334,7 @@ struct intel_atomic_state { /* Gen9+ only */ struct skl_wm_values wm_results;
- int last_ddb_step;
}; struct intel_plane_state { @@ -437,6 +438,13 @@ struct skl_pipe_wm { uint32_t linetime; }; +enum skl_ddb_step {
- SKL_DDB_STEP_NONE = 0,
- SKL_DDB_STEP_NO_OVERLAP,
- SKL_DDB_STEP_OVERLAP,
- SKL_DDB_STEP_FINAL
+};
struct intel_crtc_wm_state { union { struct { @@ -467,6 +475,8 @@ struct intel_crtc_wm_state { /* minimum block allocation */ uint16_t minimum_blocks[I915_MAX_PLANES]; uint16_t minimum_y_blocks[I915_MAX_PLANES];
enum skl_ddb_step ddb_realloc;
} skl; }; diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6f5beb3..636c90a 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3816,6 +3816,11 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][plane][level]); } I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_BUF_CFG(pipe, plane),
&wm->ddb.plane[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_NV12_BUF_CFG(pipe, plane),
&wm->ddb.y_plane[pipe][plane]);
} void skl_write_cursor_wm(struct intel_crtc *intel_crtc, @@ -3832,170 +3837,51 @@ void skl_write_cursor_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][PLANE_CURSOR][level]); } I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]);
-}
-static void skl_write_wm_values(struct drm_i915_private *dev_priv,
const struct skl_wm_values *new)
-{
- struct drm_device *dev = &dev_priv->drm;
- struct intel_crtc *crtc;
- for_each_intel_crtc(dev, crtc) {
int i;
enum pipe pipe = crtc->pipe;
if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
continue;
if (!crtc->active)
continue;
for (i = 0; i < intel_num_planes(crtc); i++) {
skl_ddb_entry_write(dev_priv,
PLANE_BUF_CFG(pipe, i),
&new->ddb.plane[pipe][i]);
skl_ddb_entry_write(dev_priv,
PLANE_NV12_BUF_CFG(pipe, i),
&new->ddb.y_plane[pipe][i]);
}
skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
&new->ddb.plane[pipe][PLANE_CURSOR]);
- }
- skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
&wm->ddb.plane[pipe][PLANE_CURSOR]);
} -/*
- When setting up a new DDB allocation arrangement, we need to correctly
- sequence the times at which the new allocations for the pipes are taken into
- account or we'll have pipes fetching from space previously allocated to
- another pipe.
- Roughly the sequence looks like:
- * 1. re-allocate the pipe(s) with the allocation being reduced and not
- * overlapping with a previous light-up pipe (another way to put it is:
- * pipes with their new allocation strictly included within their old ones).
- * 2. re-allocate the other pipes that get their allocation reduced
- * 3. allocate the pipes having their allocation increased
- Steps 1. and 2. are here to take care of the following case:
- Initially DDB looks like this:
- * | B | C |
- enable pipe A.
- pipe B has a reduced DDB allocation that overlaps with the old pipe C
- * allocation
- * | A | B | C |
- We need to sequence the re-allocation: C, B, A (and not B, C, A).
- */
-static void -skl_wm_flush_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, int pass) +static bool +skl_ddb_allocation_equals(const struct skl_ddb_allocation *old,
const struct skl_ddb_allocation *new,
enum pipe pipe)
{
- int plane;
- DRM_DEBUG_KMS("flush pipe %c (pass %d)\n", pipe_name(pipe), pass);
- for_each_plane(dev_priv, pipe, plane) {
I915_WRITE(PLANE_SURF(pipe, plane),
I915_READ(PLANE_SURF(pipe, plane)));
- }
- I915_WRITE(CURBASE(pipe), I915_READ(CURBASE(pipe)));
- return new->pipe[pipe].start == old->pipe[pipe].start &&
- new->pipe[pipe].end == old->pipe[pipe].end;
} static bool -skl_ddb_allocation_included(const struct skl_ddb_allocation *old, +skl_ddb_allocation_overlaps(struct drm_atomic_state *state,
const struct skl_ddb_allocation *old,
const struct skl_ddb_allocation *new, enum pipe pipe) {
- uint16_t old_size, new_size;
- old_size = skl_ddb_entry_size(&old->pipe[pipe]);
- new_size = skl_ddb_entry_size(&new->pipe[pipe]);
- return old_size != new_size &&
- new->pipe[pipe].start >= old->pipe[pipe].start &&
- new->pipe[pipe].end <= old->pipe[pipe].end;
-}
-static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
struct skl_wm_values *new_values)
-{
- struct drm_device *dev = &dev_priv->drm;
- struct skl_ddb_allocation *cur_ddb, *new_ddb;
- bool reallocated[I915_MAX_PIPES] = {};
- struct intel_crtc *crtc;
- enum pipe pipe;
- new_ddb = &new_values->ddb;
- cur_ddb = &dev_priv->wm.skl_hw.ddb;
- /*
- * First pass: flush the pipes with the new allocation contained into
- * the old space.
- *
- * We'll wait for the vblank on those pipes to ensure we can safely
- * re-allocate the freed space without this pipe fetching from it.
- */
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (!skl_ddb_allocation_included(cur_ddb, new_ddb, pipe))
continue;
skl_wm_flush_pipe(dev_priv, pipe, 1);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
- }
- /*
- * Second pass: flush the pipes that are having their allocation
- * reduced, but overlapping with a previous allocation.
- *
- * Here as well we need to wait for the vblank to make sure the freed
- * space is not used anymore.
- */
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (reallocated[pipe])
continue;
if (skl_ddb_entry_size(&new_ddb->pipe[pipe]) <
skl_ddb_entry_size(&cur_ddb->pipe[pipe])) {
skl_wm_flush_pipe(dev_priv, pipe, 2);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
}
- }
- /*
- * Third pass: flush the pipes that got more space allocated.
- *
- * We don't need to actively wait for the update here, next vblank
- * will just get more DDB space with the correct WM values.
- */
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
- struct drm_device *dev = state->dev;
- struct intel_crtc *intel_crtc;
- enum pipe otherp;
pipe = crtc->pipe;
- for_each_intel_crtc(dev, intel_crtc) {
otherp = intel_crtc->pipe;
/*
* At this point, only the pipes more space than before are
* left to re-allocate.
* When checking for overlaps, we don't want to:
* - Compare against ourselves
* - Compare against pipes that will be disabled in step 0
- * - Compare against pipes that won't be enabled until step 3
- */
if (reallocated[pipe])
if (otherp == pipe || !new->pipe[otherp].end ||
!old->pipe[otherp].end)
continue;
skl_wm_flush_pipe(dev_priv, pipe, 3);
if ((new->pipe[pipe].start >= old->pipe[otherp].start &&
new->pipe[pipe].start < old->pipe[otherp].end) ||
(old->pipe[otherp].start >= new->pipe[pipe].start &&
old->pipe[otherp].start < new->pipe[pipe].end))
return true;
}
- return false;
} static int skl_update_pipe_wm(struct drm_crtc_state *cstate, @@ -4038,8 +3924,10 @@ skl_compute_ddb(struct drm_atomic_state *state) struct drm_device *dev = state->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
- struct intel_crtc_state *cstate;
struct intel_crtc *intel_crtc;
- struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
- struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
- struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
uint32_t realloc_pipes = pipes_modified(state); int ret; @@ -4071,13 +3959,11 @@ skl_compute_ddb(struct drm_atomic_state *state) } for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
struct intel_crtc_state *cstate;
cstate = intel_atomic_get_crtc_state(state, intel_crtc); if (IS_ERR(cstate)) return PTR_ERR(cstate);
ret = skl_allocate_pipe_ddb(cstate, ddb);
ret = skl_allocate_pipe_ddb(cstate, new_ddb);
if (ret) return ret; @@ -4086,6 +3972,73 @@ skl_compute_ddb(struct drm_atomic_state *state) return ret; }
- /*
- * When setting up a new DDB allocation arrangement, we need to
- * correctly sequence the times at which the new allocations for the
- * pipes are taken into account or we'll have pipes fetching from space
- * previously allocated to another pipe.
- *
- * Roughly the final sequence we want looks like this:
- * 1. Disable any pipes we're not going to be using anymore
- * 2. Reallocate all of the active pipes whose new ddb allocations
- * won't overlap with another active pipe's ddb allocation.
- * 3. Reallocate remaining active pipes, if any.
- * 4. Enable any new pipes, if any.
- *
- * Example:
- * Initially DDB looks like this:
- * | B | C |
- * And the final DDB should look like this:
- * | B | C | A |
- *
- * 1. We're not disabling any pipes, so do nothing on this step.
- * 2. Pipe B's new allocation wouldn't overlap with pipe C, however
- * pipe C's new allocation does overlap with pipe B's current
- * allocation. Reallocate B first so the DDB looks like this:
- * | B |xx| C |
- * 3. Now we can safely reallocate pipe C to its new location:
- * | B | C |xxxxx|
- * 4. Enable any remaining pipes, in this case A
- * | B | C | A |
- *
- * As well, between every pipe reallocation we have to wait for a
- * vblank on the pipe so that we ensure its new allocation has taken
- * effect by the time we start moving the next pipe. This can be
- * skipped on the last step we need to perform, which is why we keep
- * track of that information here. For example, if we've reallocated
- * all the pipes that need changing by the time we reach step 3, we can
- * finish without waiting for the pipes we changed in step 3 to update.
- */
- for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
enum pipe pipe = intel_crtc->pipe;
enum skl_ddb_step step;
cstate = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(cstate))
return PTR_ERR(cstate);
/* Step 1: Pipes we're disabling / haven't changed */
if (skl_ddb_allocation_equals(old_ddb, new_ddb, pipe) ||
new_ddb->pipe[pipe].end == 0) {
step = SKL_DDB_STEP_NONE;
/* Step 2-3: Active pipes we're reallocating */
} else if (old_ddb->pipe[pipe].end != 0) {
			if (skl_ddb_allocation_overlaps(state, old_ddb,
							new_ddb, pipe))
step = SKL_DDB_STEP_OVERLAP;
else
step = SKL_DDB_STEP_NO_OVERLAP;
/* Step 4: Pipes we're enabling */
} else {
step = SKL_DDB_STEP_FINAL;
}
cstate->wm.skl.ddb_realloc = step;
if (step > intel_state->last_ddb_step)
intel_state->last_ddb_step = step;
- }
return 0; } @@ -4110,10 +4063,13 @@ skl_copy_wm_for_pipe(struct skl_wm_values *dst, static int skl_compute_wm(struct drm_atomic_state *state) {
- struct drm_i915_private *dev_priv = to_i915(state->dev);
struct drm_crtc *crtc; struct drm_crtc_state *cstate; struct intel_atomic_state *intel_state = to_intel_atomic_state(state); struct skl_wm_values *results = &intel_state->wm_results;
- struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
- struct skl_ddb_allocation *new_ddb = &results->ddb;
struct skl_pipe_wm *pipe_wm; bool changed = false; int ret, i; @@ -4152,7 +4108,10 @@ skl_compute_wm(struct drm_atomic_state *state) struct intel_crtc *intel_crtc = to_intel_crtc(crtc); struct intel_crtc_state *intel_cstate = to_intel_crtc_state(cstate);
enum skl_ddb_step step;
enum pipe pipe;
pipe = intel_crtc->pipe;
pipe_wm = &intel_cstate->wm.skl.optimal; ret = skl_update_pipe_wm(cstate, &results->ddb, pipe_wm, &changed); @@ -4167,7 +4126,18 @@ skl_compute_wm(struct drm_atomic_state *state) continue; intel_cstate->update_wm_pre = true;
step = intel_cstate->wm.skl.ddb_realloc;
skl_compute_wm_results(crtc->dev, pipe_wm, results, intel_crtc);
if (!skl_ddb_entry_equal(&old_ddb->pipe[pipe],
&new_ddb->pipe[pipe])) {
DRM_DEBUG_KMS(
				"DDB changes for [CRTC:%d:pipe %c]: (%3d - %3d) -> (%3d - %3d) on step %d\n",
				intel_crtc->base.base.id, pipe_name(pipe),
				old_ddb->pipe[pipe].start, old_ddb->pipe[pipe].end,
				new_ddb->pipe[pipe].start, new_ddb->pipe[pipe].end,
step);
}
} return 0; @@ -4191,8 +4161,20 @@ static void skl_update_wm(struct drm_crtc *crtc) mutex_lock(&dev_priv->wm.wm_mutex);
- skl_write_wm_values(dev_priv, results);
- skl_flush_wm_values(dev_priv, results);
- /*
- * If this pipe isn't active already, we're going to be enabling it
- * very soon. Since it's safe to update these while the pipe's shut off,
- * just do so here. Already active pipes will have their watermarks
- * updated once we update their planes.
- */
- if (!intel_crtc->active) {
int plane;
		for (plane = 0; plane < intel_num_planes(intel_crtc); plane++)
skl_write_plane_wm(intel_crtc, results, plane);
skl_write_cursor_wm(intel_crtc, results);
- }
	/*
	 * Store the new configuration (but only for the pipes that have

-- 
2.7.4
On Wed, Aug 03, 2016 at 06:00:42PM +0300, Ville Syrjälä wrote:
On Tue, Aug 02, 2016 at 06:37:37PM -0400, Lyude wrote:
Now that we can hook into update_crtcs and control the order in which we update CRTCs at each modeset, we can finish the final step of fixing Skylake's watermark handling by performing DDB updates at the same time as plane updates and watermark updates.
The first major change in this patch is skl_update_crtcs(), which handles ensuring that we order each CRTC update in our atomic commits properly so that they honor the DDB flush order.
The second major change in this patch is the order in which we flush the pipes. While the previous order may have worked, it can't be used in this approach since it no longer will do the right thing. For example, using the old ddb flush order:
We have pipes A, B, and C enabled, and we're disabling C. Initial ddb allocation looks like this:
| A | B |xxxxxxx|
Since we're performing the ddb updates after performing any CRTC disablements in intel_atomic_commit_tail(), the space to the right of pipe B is unallocated.
- Flush pipes with new allocation contained into old space. None apply, so we skip this
- Flush pipes having their allocation reduced, but overlapping with a previous allocation. None apply, so we also skip this
- Flush pipes that got more space allocated. This applies to A and B, giving us the following update order: A, B
This is wrong, since updating pipe A first will cause it to overlap with B and potentially burst into flames. Our new order (see the code comments for details) would update the pipes in the proper order: B, A.
As well, we calculate the order for each DDB update during the check phase, and reference it later in the commit phase when we hit skl_update_crtcs().
This long overdue patch fixes the rest of the underruns on Skylake.
Changes since v1:
- Add skl_ddb_entry_write() for cursor into skl_write_cursor_wm()
Fixes: 0e8fb7ba7ca5 ("drm/i915/skl: Flush the WM configuration") Fixes: 8211bd5bdf5e ("drm/i915/skl: Program the DDB allocation") Signed-off-by: Lyude cpaul@redhat.com [omitting CC for stable, since this patch will need to be changed for such backports first] Cc: Ville Syrjälä ville.syrjala@linux.intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: Radhakrishna Sripada radhakrishna.sripada@intel.com Cc: Hans de Goede hdegoede@redhat.com Cc: Matt Roper matthew.d.roper@intel.com
drivers/gpu/drm/i915/intel_display.c | 100 ++++++++++-- drivers/gpu/drm/i915/intel_drv.h | 10 ++ drivers/gpu/drm/i915/intel_pm.c | 288 ++++++++++++++++------------------- 3 files changed, 233 insertions(+), 165 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 59cf513..06295f7 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -12897,16 +12897,23 @@ static void verify_wm_state(struct drm_crtc *crtc,
			  hw_entry->start, hw_entry->end);
	}
-	/* cursor */
-	hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
-	sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
-	if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
-		DRM_ERROR("mismatch in DDB state pipe %c cursor "
-			  "(expected (%u,%u), found (%u,%u))\n",
-			  pipe_name(pipe),
-			  sw_entry->start, sw_entry->end,
-			  hw_entry->start, hw_entry->end);
-	}
+	/*
+	 * cursor
+	 * If the cursor plane isn't active, we may not have updated it's ddb
+	 * allocation. In that case since the ddb allocation will be updated
+	 * once the plane becomes visible, we can skip this check
+	 */
+	if (intel_crtc->cursor_addr) {
+		hw_entry = &hw_ddb.plane[pipe][PLANE_CURSOR];
+		sw_entry = &sw_ddb->plane[pipe][PLANE_CURSOR];
+		if (!skl_ddb_entry_equal(hw_entry, sw_entry)) {
+			DRM_ERROR("mismatch in DDB state pipe %c cursor "
+				  "(expected (%u,%u), found (%u,%u))\n",
+				  pipe_name(pipe),
+				  sw_entry->start, sw_entry->end,
+				  hw_entry->start, hw_entry->end);
+		}
+	}
 }
@@ -13658,6 +13665,72 @@ static void intel_update_crtcs(struct drm_atomic_state *state, } }
+static inline void
+skl_do_ddb_step(struct drm_atomic_state *state, enum skl_ddb_step step)
+{
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *old_crtc_state;
+	unsigned int crtc_vblank_mask; /* unused */
+	int i;
+
+	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
+		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+		struct intel_crtc_state *cstate =
+			to_intel_crtc_state(crtc->state);
+		bool vblank_wait = false;
+
+		if (cstate->wm.skl.ddb_realloc != step || !crtc->state->active)
+			continue;
+
+		/*
+		 * If we're changing the ddb allocation of this pipe to make
+		 * room for another pipe, we have to wait for the pipe's ddb
+		 * allocations to actually update by waiting for a vblank.
+		 * Otherwise we risk the next pipe updating before this pipe
+		 * finishes, resulting in the pipe fetching from ddb space for
+		 * the wrong pipe.
+		 *
+		 * However, if we know we don't have any more pipes to move
+		 * around, we can skip this wait and the new ddb allocation
+		 * will take effect at the start of the next vblank.
+		 */
+		switch (step) {
+		case SKL_DDB_STEP_NO_OVERLAP:
+		case SKL_DDB_STEP_OVERLAP:
+			if (step != intel_state->last_ddb_step)
+				vblank_wait = true;
+			/* drop through */
+		case SKL_DDB_STEP_FINAL:
+			DRM_DEBUG_KMS("Updating [CRTC:%d:pipe %c] for DDB step %d\n",
+				      crtc->base.id, pipe_name(intel_crtc->pipe),
+				      step);
+		case SKL_DDB_STEP_NONE:
+			break;
+		}
Not sure we really need this step stuff. How about?
for_each_crtc
	if (crtc_needs_disabling)
		disable_crtc();

do {
	progress = false;
	wait_vbl_pipes = 0;

	for_each_crtc() {
		if (!active || needs_modeset)
			continue;
		if (!ddb_changed)
			continue;
		if (new_ddb_overlaps_with_any_other_pipes_current_ddb)
			continue;
		commit;
		wait_vbl_pipes |= pipe;
		progress = true;
	}

	wait_vbls(wait_vbl_pipes);
} while (progress);

for_each_crtc
	if (crtc_needs_enabling) {
		enable_crtc();
		commit;
	}
Yeah, this approach looks nicer. It's a bit simpler to follow code-wise and doesn't require us to precompute any ordering during the check phase so it's a bit more self-contained. It should also scale properly if future platforms decide to add more pipes.
Or if we're paranoid, we could also have an upper bound on the loop and assert that we never reach it.
Though one thing I don't particularly like about this commit while changing the ddb approach is that it's going to make the update appear even less atomic. What I'd rather like to do for the normal commit path is this:
for_each_crtc if (crtc_needs_disabling) disable_planes for_each_crtc if (crtc_needs_disabling) disable_crtc for_each_crtc if (crtc_needs_enabling) enable_crtc for_each_crtc if (active) commit_planes;
That way everything would pop in and out as close together as possible. Hmm. Actually, I wonder... I'm thinking we should be able to enable all crtcs prior to entering the ddb commit loop, on account of no planes being enabled on those crtcs until we commit them. And if no planes are enabled, running the pipe w/o allocated ddb should be fine. So with that approach, I think we should be able to commit all planes within a few iterations of the loop, and hence within a few vblanks.
So this is pretty similar to what we do today, except that we do the enabling/disabling of each CRTC and its planes all together, right? Sounds reasonable to me, although I'm not sure we want to mix that change in with the gen9-specific series Lyude is working on here. Maybe just do the new gen9 handler that way as part of that series and then come back and update the non-gen9 handler to follow the new flow as a separate patch?
Matt
+		intel_update_crtc(crtc, state, old_crtc_state,
+				  &crtc_vblank_mask);
+
+		if (vblank_wait)
+			intel_wait_for_vblank(state->dev, intel_crtc->pipe);
+	}
+}
+static void skl_update_crtcs(struct drm_atomic_state *state,
+			     unsigned int *crtc_vblank_mask)
+{
+	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	enum skl_ddb_step step;
+
+	for (step = 0; step <= intel_state->last_ddb_step; step++)
+		skl_do_ddb_step(state, step);
+}
static void intel_atomic_commit_tail(struct drm_atomic_state *state) { struct drm_device *dev = state->dev; @@ -15235,8 +15308,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) dev_priv->display.crtc_disable = i9xx_crtc_disable; }
- dev_priv->display.update_crtcs = intel_update_crtcs;
- /* Returns the core display clock speed */ if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) dev_priv->display.get_display_clock_speed =
@@ -15326,6 +15397,11 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) skl_modeset_calc_cdclk; }
+	if (dev_priv->info.gen >= 9)
+		dev_priv->display.update_crtcs = skl_update_crtcs;
+	else
+		dev_priv->display.update_crtcs = intel_update_crtcs;
- switch (INTEL_INFO(dev_priv)->gen) { case 2: dev_priv->display.queue_flip = intel_gen2_queue_flip;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1b444d3..cf5da83 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -334,6 +334,7 @@ struct intel_atomic_state {
/* Gen9+ only */ struct skl_wm_values wm_results;
- int last_ddb_step;
};
struct intel_plane_state { @@ -437,6 +438,13 @@ struct skl_pipe_wm { uint32_t linetime; };
+enum skl_ddb_step {
+	SKL_DDB_STEP_NONE = 0,
+	SKL_DDB_STEP_NO_OVERLAP,
+	SKL_DDB_STEP_OVERLAP,
+	SKL_DDB_STEP_FINAL
+};
struct intel_crtc_wm_state { union { struct { @@ -467,6 +475,8 @@ struct intel_crtc_wm_state { /* minimum block allocation */ uint16_t minimum_blocks[I915_MAX_PLANES]; uint16_t minimum_y_blocks[I915_MAX_PLANES];
+		enum skl_ddb_step ddb_realloc;
	} skl;
};
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6f5beb3..636c90a 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3816,6 +3816,11 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][plane][level]); } I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
+	skl_ddb_entry_write(dev_priv, PLANE_BUF_CFG(pipe, plane),
+			    &wm->ddb.plane[pipe][plane]);
+	skl_ddb_entry_write(dev_priv, PLANE_NV12_BUF_CFG(pipe, plane),
+			    &wm->ddb.y_plane[pipe][plane]);
}
void skl_write_cursor_wm(struct intel_crtc *intel_crtc, @@ -3832,170 +3837,51 @@ void skl_write_cursor_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][PLANE_CURSOR][level]); } I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]); -}
-static void skl_write_wm_values(struct drm_i915_private *dev_priv,
const struct skl_wm_values *new)
-{
struct drm_device *dev = &dev_priv->drm;
struct intel_crtc *crtc;
for_each_intel_crtc(dev, crtc) {
int i;
enum pipe pipe = crtc->pipe;
if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
continue;
if (!crtc->active)
continue;
for (i = 0; i < intel_num_planes(crtc); i++) {
skl_ddb_entry_write(dev_priv,
PLANE_BUF_CFG(pipe, i),
&new->ddb.plane[pipe][i]);
skl_ddb_entry_write(dev_priv,
PLANE_NV12_BUF_CFG(pipe, i),
&new->ddb.y_plane[pipe][i]);
}
skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
&new->ddb.plane[pipe][PLANE_CURSOR]);
}
+	skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
+			    &wm->ddb.plane[pipe][PLANE_CURSOR]);
}
-/*
- When setting up a new DDB allocation arrangement, we need to correctly
- sequence the times at which the new allocations for the pipes are taken into
- account or we'll have pipes fetching from space previously allocated to
- another pipe.
- Roughly the sequence looks like:
- re-allocate the pipe(s) with the allocation being reduced and not
overlapping with a previous light-up pipe (another way to put it is:
pipes with their new allocation strickly included into their old ones).
- re-allocate the other pipes that get their allocation reduced
- allocate the pipes having their allocation increased
- Steps 1. and 2. are here to take care of the following case:
- Initially DDB looks like this:
| B | C |
- enable pipe A.
- pipe B has a reduced DDB allocation that overlaps with the old pipe C
- allocation
| A | B | C |
- We need to sequence the re-allocation: C, B, A (and not B, C, A).
- */
-static void
-skl_wm_flush_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, int pass)
+static bool
+skl_ddb_allocation_equals(const struct skl_ddb_allocation *old,
+			  const struct skl_ddb_allocation *new,
+			  enum pipe pipe)
 {
-	int plane;
-
-	DRM_DEBUG_KMS("flush pipe %c (pass %d)\n", pipe_name(pipe), pass);
-
-	for_each_plane(dev_priv, pipe, plane) {
-		I915_WRITE(PLANE_SURF(pipe, plane),
-			   I915_READ(PLANE_SURF(pipe, plane)));
-	}
-	I915_WRITE(CURBASE(pipe), I915_READ(CURBASE(pipe)));
+	return new->pipe[pipe].start == old->pipe[pipe].start &&
+	       new->pipe[pipe].end == old->pipe[pipe].end;
 }
 static bool
-skl_ddb_allocation_included(const struct skl_ddb_allocation *old,
+skl_ddb_allocation_overlaps(struct drm_atomic_state *state,
+			    const struct skl_ddb_allocation *old,
 			    const struct skl_ddb_allocation *new,
 			    enum pipe pipe)
 {
-	uint16_t old_size, new_size;
-
-	old_size = skl_ddb_entry_size(&old->pipe[pipe]);
-	new_size = skl_ddb_entry_size(&new->pipe[pipe]);
-
-	return old_size != new_size &&
-	       new->pipe[pipe].start >= old->pipe[pipe].start &&
-	       new->pipe[pipe].end <= old->pipe[pipe].end;
-}
-static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
struct skl_wm_values *new_values)
-{
- struct drm_device *dev = &dev_priv->drm;
- struct skl_ddb_allocation *cur_ddb, *new_ddb;
- bool reallocated[I915_MAX_PIPES] = {};
- struct intel_crtc *crtc;
- enum pipe pipe;
- new_ddb = &new_values->ddb;
- cur_ddb = &dev_priv->wm.skl_hw.ddb;
- /*
* First pass: flush the pipes with the new allocation contained into
* the old space.
*
* We'll wait for the vblank on those pipes to ensure we can safely
* re-allocate the freed space without this pipe fetching from it.
*/
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (!skl_ddb_allocation_included(cur_ddb, new_ddb, pipe))
continue;
skl_wm_flush_pipe(dev_priv, pipe, 1);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
- }
- /*
* Second pass: flush the pipes that are having their allocation
* reduced, but overlapping with a previous allocation.
*
* Here as well we need to wait for the vblank to make sure the freed
* space is not used anymore.
*/
- for_each_intel_crtc(dev, crtc) {
if (!crtc->active)
continue;
pipe = crtc->pipe;
if (reallocated[pipe])
continue;
if (skl_ddb_entry_size(&new_ddb->pipe[pipe]) <
skl_ddb_entry_size(&cur_ddb->pipe[pipe])) {
skl_wm_flush_pipe(dev_priv, pipe, 2);
intel_wait_for_vblank(dev, pipe);
reallocated[pipe] = true;
}
- }
- /*
* Third pass: flush the pipes that got more space allocated.
*
* We don't need to actively wait for the update here, next vblank
* will just get more DDB space with the correct WM values.
*/
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
-
-		pipe = crtc->pipe;
-
-		/*
-		 * At this point, only the pipes more space than before are
-		 * left to re-allocate.
-		 */
-		if (reallocated[pipe])
-			continue;
-
-		skl_wm_flush_pipe(dev_priv, pipe, 3);
-	}
+	struct drm_device *dev = state->dev;
+	struct intel_crtc *intel_crtc;
+	enum pipe otherp;
+
+	for_each_intel_crtc(dev, intel_crtc) {
+		otherp = intel_crtc->pipe;
+
+		/*
+		 * When checking for overlaps, we don't want to:
+		 * - Compare against ourselves
+		 * - Compare against pipes that will be disabled in step 0
+		 * - Compare against pipes that won't be enabled until step 3
+		 */
+		if (otherp == pipe || !new->pipe[otherp].end ||
+		    !old->pipe[otherp].end)
+			continue;
+
+		if ((new->pipe[pipe].start >= old->pipe[otherp].start &&
+		     new->pipe[pipe].start < old->pipe[otherp].end) ||
+		    (old->pipe[otherp].start >= new->pipe[pipe].start &&
+		     old->pipe[otherp].start < new->pipe[pipe].end))
+			return true;
+	}
+
+	return false;
 }
static int skl_update_pipe_wm(struct drm_crtc_state *cstate, @@ -4038,8 +3924,10 @@ skl_compute_ddb(struct drm_atomic_state *state) struct drm_device *dev = state->dev; struct drm_i915_private *dev_priv = to_i915(dev); struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
- struct intel_crtc_state *cstate; struct intel_crtc *intel_crtc;
- struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
- struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
- struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb; uint32_t realloc_pipes = pipes_modified(state); int ret;
@@ -4071,13 +3959,11 @@ skl_compute_ddb(struct drm_atomic_state *state) }
for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
struct intel_crtc_state *cstate;
cstate = intel_atomic_get_crtc_state(state, intel_crtc); if (IS_ERR(cstate)) return PTR_ERR(cstate);
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
+		ret = skl_allocate_pipe_ddb(cstate, new_ddb);
 		if (ret)
 			return ret;
@@ -4086,6 +3972,73 @@ skl_compute_ddb(struct drm_atomic_state *state) return ret; }
- /*
* When setting up a new DDB allocation arrangement, we need to
* correctly sequence the times at which the new allocations for the
* pipes are taken into account or we'll have pipes fetching from space
* previously allocated to another pipe.
*
* Roughly the final sequence we want looks like this:
* 1. Disable any pipes we're not going to be using anymore
* 2. Reallocate all of the active pipes whose new ddb allocations
* won't overlap with another active pipe's ddb allocation.
* 3. Reallocate remaining active pipes, if any.
* 4. Enable any new pipes, if any.
*
* Example:
* Initially DDB looks like this:
* | B | C |
* And the final DDB should look like this:
* | B | C | A |
*
* 1. We're not disabling any pipes, so do nothing on this step.
* 2. Pipe B's new allocation wouldn't overlap with pipe C, however
* pipe C's new allocation does overlap with pipe B's current
* allocation. Reallocate B first so the DDB looks like this:
* | B |xx| C |
* 3. Now we can safely reallocate pipe C to it's new location:
* | B | C |xxxxx|
* 4. Enable any remaining pipes, in this case A
* | B | C | A |
*
* As well, between every pipe reallocation we have to wait for a
* vblank on the pipe so that we ensure it's new allocation has taken
* effect by the time we start moving the next pipe. This can be
* skipped on the last step we need to perform, which is why we keep
* track of that information here. For example, if we've reallocated
* all the pipes that need changing by the time we reach step 3, we can
* finish without waiting for the pipes we changed in step 3 to update.
*/
- for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
enum pipe pipe = intel_crtc->pipe;
enum skl_ddb_step step;
cstate = intel_atomic_get_crtc_state(state, intel_crtc);
if (IS_ERR(cstate))
return PTR_ERR(cstate);
/* Step 1: Pipes we're disabling / haven't changed */
if (skl_ddb_allocation_equals(old_ddb, new_ddb, pipe) ||
new_ddb->pipe[pipe].end == 0) {
step = SKL_DDB_STEP_NONE;
/* Step 2-3: Active pipes we're reallocating */
} else if (old_ddb->pipe[pipe].end != 0) {
if (skl_ddb_allocation_overlaps(state, old_ddb, new_ddb,
pipe))
step = SKL_DDB_STEP_OVERLAP;
else
step = SKL_DDB_STEP_NO_OVERLAP;
/* Step 4: Pipes we're enabling */
} else {
step = SKL_DDB_STEP_FINAL;
}
cstate->wm.skl.ddb_realloc = step;
if (step > intel_state->last_ddb_step)
intel_state->last_ddb_step = step;
- }
- return 0;
}
@@ -4110,10 +4063,13 @@ skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 static int
 skl_compute_wm(struct drm_atomic_state *state)
 {
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *cstate;
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct skl_wm_values *results = &intel_state->wm_results;
+	struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct skl_ddb_allocation *new_ddb = &results->ddb;
 	struct skl_pipe_wm *pipe_wm;
 	bool changed = false;
 	int ret, i;
@@ -4152,7 +4108,10 @@ skl_compute_wm(struct drm_atomic_state *state) struct intel_crtc *intel_crtc = to_intel_crtc(crtc); struct intel_crtc_state *intel_cstate = to_intel_crtc_state(cstate);
enum skl_ddb_step step;
enum pipe pipe;
pipe = intel_crtc->pipe;
		pipe_wm = &intel_cstate->wm.skl.optimal;
		ret = skl_update_pipe_wm(cstate, &results->ddb, pipe_wm, &changed);
@@ -4167,7 +4126,18 @@ skl_compute_wm(struct drm_atomic_state *state) continue;
intel_cstate->update_wm_pre = true;
step = intel_cstate->wm.skl.ddb_realloc;
skl_compute_wm_results(crtc->dev, pipe_wm, results, intel_crtc);
if (!skl_ddb_entry_equal(&old_ddb->pipe[pipe],
&new_ddb->pipe[pipe])) {
DRM_DEBUG_KMS(
"DDB changes for [CRTC:%d:pipe %c]: (%3d - %3d) -> (%3d - %3d) on step %d\n",
intel_crtc->base.base.id, pipe_name(pipe),
old_ddb->pipe[pipe].start, old_ddb->pipe[pipe].end,
new_ddb->pipe[pipe].start, new_ddb->pipe[pipe].end,
step);
}
}
return 0;
@@ -4191,8 +4161,20 @@ static void skl_update_wm(struct drm_crtc *crtc)
mutex_lock(&dev_priv->wm.wm_mutex);
- skl_write_wm_values(dev_priv, results);
- skl_flush_wm_values(dev_priv, results);
/*
* If this pipe isn't active already, we're going to be enabling it
* very soon. Since it's safe to update these while the pipe's shut off,
* just do so here. Already active pipes will have their watermarks
* updated once we update their planes.
*/
if (!intel_crtc->active) {
int plane;
for (plane = 0; plane < intel_num_planes(intel_crtc); plane++)
skl_write_plane_wm(intel_crtc, results, plane);
skl_write_cursor_wm(intel_crtc, results);
}
	/*
	 * Store the new configuration (but only for the pipes that have

-- 
2.7.4
-- 
Ville Syrjälä
Intel OTC
On Wed, Aug 03, 2016 at 03:19:28PM -0700, Matt Roper wrote:
On Wed, Aug 03, 2016 at 06:00:42PM +0300, Ville Syrjälä wrote:
On Tue, Aug 02, 2016 at 06:37:37PM -0400, Lyude wrote:
Yeah, this approach looks nicer. It's a bit simpler to follow code-wise and doesn't require us to precompute any ordering during the check phase so it's a bit more self-contained. It should also scale properly if future platforms decide to add more pipes.
Yep.
Or if we're paranoid, we could also have an upper bound on the loop and assert that we never reach it.
Though one thing I don't particularly like about this commit while changing the ddb approach is that it's going to make the update appear even less atomic. What I'd rather like to do for the normal commit path is this:
for_each_crtc if (crtc_needs_disabling) disable_planes for_each_crtc if (crtc_needs_disabling) disable_crtc for_each_crtc if (crtc_needs_enabling) enable_crtc for_each_crtc if (active) commit_planes;
That way everything would pop in and out as close together as possible. Hmm. Actually, I wonder... I'm thinking we should be able to enable all crtcs prior to entering the ddb commit loop, on account of no planes being enabled on those crtcs until we commit them. And if no planes are enabled, running the pipe w/o allocated ddb should be fine. So with that approach, I think we should be able to commit all planes within a few iterations of the loop, and hence within a few vblanks.
So this is pretty similar to what we do today, except that we do the enabling/disabling of each CRTC and its planes all together, right?
Yeah. It should provide a better experience in the case of "genlocked" pipes at least, e.g. for those two-part 4k MST monitors.
Sounds reasonable to me, although I'm not sure we want to mix that change in with the gen9-specific series Lyude is working on here. Maybe just do the new gen9 handler that way as part of that series and then come back and update the non-gen9 handler to follow the new flow as a separate patch?
Sounds good. First fix gen9, then make things pretty :)
As far as my idea of enabling the pipes on gen9 before the commit loop, I think that would also avoid having to commit the planes separately on those newly enabled crtcs. My 'progress' loop would take care of those pipes as well (would just have to drop the needs_modeset check).
Matt
intel_update_crtc(crtc, state, old_crtc_state,
&crtc_vblank_mask);
if (vblank_wait)
intel_wait_for_vblank(state->dev, intel_crtc->pipe);
- }
+}
+static void skl_update_crtcs(struct drm_atomic_state *state,
unsigned int *crtc_vblank_mask)
+{
- struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
- enum skl_ddb_step step;
- for (step = 0; step <= intel_state->last_ddb_step; step++)
skl_do_ddb_step(state, step);
+}
static void intel_atomic_commit_tail(struct drm_atomic_state *state) { struct drm_device *dev = state->dev; @@ -15235,8 +15308,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) dev_priv->display.crtc_disable = i9xx_crtc_disable; }
- dev_priv->display.update_crtcs = intel_update_crtcs;
- /* Returns the core display clock speed */ if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) dev_priv->display.get_display_clock_speed =
@@ -15326,6 +15397,11 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv) skl_modeset_calc_cdclk; }
- if (dev_priv->info.gen >= 9)
dev_priv->display.update_crtcs = skl_update_crtcs;
- else
dev_priv->display.update_crtcs = intel_update_crtcs;
- switch (INTEL_INFO(dev_priv)->gen) { case 2: dev_priv->display.queue_flip = intel_gen2_queue_flip;
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index 1b444d3..cf5da83 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -334,6 +334,7 @@ struct intel_atomic_state {
/* Gen9+ only */ struct skl_wm_values wm_results;
- int last_ddb_step;
};
struct intel_plane_state { @@ -437,6 +438,13 @@ struct skl_pipe_wm { uint32_t linetime; };
+enum skl_ddb_step {
- SKL_DDB_STEP_NONE = 0,
- SKL_DDB_STEP_NO_OVERLAP,
- SKL_DDB_STEP_OVERLAP,
- SKL_DDB_STEP_FINAL
+};
struct intel_crtc_wm_state { union { struct { @@ -467,6 +475,8 @@ struct intel_crtc_wm_state { /* minimum block allocation */ uint16_t minimum_blocks[I915_MAX_PLANES]; uint16_t minimum_y_blocks[I915_MAX_PLANES];
} skl; };enum skl_ddb_step ddb_realloc;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 6f5beb3..636c90a 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -3816,6 +3816,11 @@ void skl_write_plane_wm(struct intel_crtc *intel_crtc, wm->plane[pipe][plane][level]); } I915_WRITE(PLANE_WM_TRANS(pipe, plane), wm->plane_trans[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_BUF_CFG(pipe, plane),
&wm->ddb.plane[pipe][plane]);
- skl_ddb_entry_write(dev_priv, PLANE_NV12_BUF_CFG(pipe, plane),
&wm->ddb.y_plane[pipe][plane]);
}
 
 void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
@@ -3832,170 +3837,51 @@ void skl_write_cursor_wm(struct intel_crtc *intel_crtc,
 			   wm->plane[pipe][PLANE_CURSOR][level]);
 	}
 	I915_WRITE(CUR_WM_TRANS(pipe), wm->plane_trans[pipe][PLANE_CURSOR]);
-}
-
-static void skl_write_wm_values(struct drm_i915_private *dev_priv,
-				const struct skl_wm_values *new)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc;
-
-	for_each_intel_crtc(dev, crtc) {
-		int i;
-		enum pipe pipe = crtc->pipe;
-
-		if ((new->dirty_pipes & drm_crtc_mask(&crtc->base)) == 0)
-			continue;
-		if (!crtc->active)
-			continue;
-
-		for (i = 0; i < intel_num_planes(crtc); i++) {
-			skl_ddb_entry_write(dev_priv,
-					    PLANE_BUF_CFG(pipe, i),
-					    &new->ddb.plane[pipe][i]);
-			skl_ddb_entry_write(dev_priv,
-					    PLANE_NV12_BUF_CFG(pipe, i),
-					    &new->ddb.y_plane[pipe][i]);
-		}
-
-		skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
-				    &new->ddb.plane[pipe][PLANE_CURSOR]);
-	}
+
+	skl_ddb_entry_write(dev_priv, CUR_BUF_CFG(pipe),
+			    &wm->ddb.plane[pipe][PLANE_CURSOR]);
 }
-
-/*
- * When setting up a new DDB allocation arrangement, we need to correctly
- * sequence the times at which the new allocations for the pipes are taken into
- * account or we'll have pipes fetching from space previously allocated to
- * another pipe.
- *
- * Roughly the sequence looks like:
- *  1. re-allocate the pipe(s) with the allocation being reduced and not
- *     overlapping with a previous light-up pipe (another way to put it is:
- *     pipes with their new allocation strickly included into their old ones).
- *  2. re-allocate the other pipes that get their allocation reduced
- *  3. allocate the pipes having their allocation increased
- *
- * Steps 1. and 2. are here to take care of the following case:
- * - Initially DDB looks like this:
- *     |   B    |   C    |
- * - enable pipe A.
- * - pipe B has a reduced DDB allocation that overlaps with the old pipe C
- *   allocation
- *     |  A  |  B  |  C  |
- *
- * We need to sequence the re-allocation: C, B, A (and not B, C, A).
- */
-static void
-skl_wm_flush_pipe(struct drm_i915_private *dev_priv, enum pipe pipe, int pass)
+static bool
+skl_ddb_allocation_equals(const struct skl_ddb_allocation *old,
+			  const struct skl_ddb_allocation *new,
+			  enum pipe pipe)
 {
-	int plane;
-
-	DRM_DEBUG_KMS("flush pipe %c (pass %d)\n", pipe_name(pipe), pass);
-
-	for_each_plane(dev_priv, pipe, plane) {
-		I915_WRITE(PLANE_SURF(pipe, plane),
-			   I915_READ(PLANE_SURF(pipe, plane)));
-	}
-	I915_WRITE(CURBASE(pipe), I915_READ(CURBASE(pipe)));
+	return new->pipe[pipe].start == old->pipe[pipe].start &&
+	       new->pipe[pipe].end == old->pipe[pipe].end;
 }
 
 static bool
-skl_ddb_allocation_included(const struct skl_ddb_allocation *old,
-			    const struct skl_ddb_allocation *new,
-			    enum pipe pipe)
+skl_ddb_allocation_overlaps(struct drm_atomic_state *state,
+			    const struct skl_ddb_allocation *old,
+			    const struct skl_ddb_allocation *new,
+			    enum pipe pipe)
 {
-	uint16_t old_size, new_size;
-
-	old_size = skl_ddb_entry_size(&old->pipe[pipe]);
-	new_size = skl_ddb_entry_size(&new->pipe[pipe]);
-
-	return old_size != new_size &&
-	       new->pipe[pipe].start >= old->pipe[pipe].start &&
-	       new->pipe[pipe].end <= old->pipe[pipe].end;
-}
-
-static void skl_flush_wm_values(struct drm_i915_private *dev_priv,
-				struct skl_wm_values *new_values)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct skl_ddb_allocation *cur_ddb, *new_ddb;
-	bool reallocated[I915_MAX_PIPES] = {};
-	struct intel_crtc *crtc;
-	enum pipe pipe;
-
-	new_ddb = &new_values->ddb;
-	cur_ddb = &dev_priv->wm.skl_hw.ddb;
-
-	/*
-	 * First pass: flush the pipes with the new allocation contained into
-	 * the old space.
-	 *
-	 * We'll wait for the vblank on those pipes to ensure we can safely
-	 * re-allocate the freed space without this pipe fetching from it.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
-
-		pipe = crtc->pipe;
-
-		if (!skl_ddb_allocation_included(cur_ddb, new_ddb, pipe))
-			continue;
-
-		skl_wm_flush_pipe(dev_priv, pipe, 1);
-		intel_wait_for_vblank(dev, pipe);
-
-		reallocated[pipe] = true;
-	}
-
-	/*
-	 * Second pass: flush the pipes that are having their allocation
-	 * reduced, but overlapping with a previous allocation.
-	 *
-	 * Here as well we need to wait for the vblank to make sure the freed
-	 * space is not used anymore.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
-
-		pipe = crtc->pipe;
-
-		if (reallocated[pipe])
-			continue;
-
-		if (skl_ddb_entry_size(&new_ddb->pipe[pipe]) <
-		    skl_ddb_entry_size(&cur_ddb->pipe[pipe])) {
-			skl_wm_flush_pipe(dev_priv, pipe, 2);
-			intel_wait_for_vblank(dev, pipe);
-			reallocated[pipe] = true;
-		}
-	}
-
-	/*
-	 * Third pass: flush the pipes that got more space allocated.
-	 *
-	 * We don't need to actively wait for the update here, next vblank
-	 * will just get more DDB space with the correct WM values.
-	 */
-	for_each_intel_crtc(dev, crtc) {
-		if (!crtc->active)
-			continue;
+	struct drm_device *dev = state->dev;
+	struct intel_crtc *intel_crtc;
+	enum pipe otherp;
 
-		pipe = crtc->pipe;
+	for_each_intel_crtc(dev, intel_crtc) {
+		otherp = intel_crtc->pipe;
 
-		/*
-		 * At this point, only the pipes more space than before are
-		 * left to re-allocate.
-		 */
-		if (reallocated[pipe])
-			continue;
+		/*
+		 * When checking for overlaps, we don't want to:
+		 * - Compare against ourselves
+		 * - Compare against pipes that will be disabled in step 0
+		 * - Compare against pipes that won't be enabled until step 3
+		 */
+		if (otherp == pipe || !new->pipe[otherp].end ||
+		    !old->pipe[otherp].end)
+			continue;
 
-		skl_wm_flush_pipe(dev_priv, pipe, 3);
+		if ((new->pipe[pipe].start >= old->pipe[otherp].start &&
+		     new->pipe[pipe].start < old->pipe[otherp].end) ||
+		    (old->pipe[otherp].start >= new->pipe[pipe].start &&
+		     old->pipe[otherp].start < new->pipe[pipe].end))
+			return true;
 	}
+
+	return false;
 }
 
 static int skl_update_pipe_wm(struct drm_crtc_state *cstate,
@@ -4038,8 +3924,10 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	struct drm_device *dev = state->dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
+	struct intel_crtc_state *cstate;
 	struct intel_crtc *intel_crtc;
-	struct skl_ddb_allocation *ddb = &intel_state->wm_results.ddb;
+	struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct skl_ddb_allocation *new_ddb = &intel_state->wm_results.ddb;
 	uint32_t realloc_pipes = pipes_modified(state);
 	int ret;
 
@@ -4071,13 +3959,11 @@ skl_compute_ddb(struct drm_atomic_state *state)
 	}
 
 	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
-		struct intel_crtc_state *cstate;
-
 		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
 		if (IS_ERR(cstate))
 			return PTR_ERR(cstate);
 
-		ret = skl_allocate_pipe_ddb(cstate, ddb);
+		ret = skl_allocate_pipe_ddb(cstate, new_ddb);
 		if (ret)
 			return ret;
 
@@ -4086,6 +3972,73 @@ skl_compute_ddb(struct drm_atomic_state *state)
 			return ret;
 	}
 
+	/*
+	 * When setting up a new DDB allocation arrangement, we need to
+	 * correctly sequence the times at which the new allocations for the
+	 * pipes are taken into account or we'll have pipes fetching from space
+	 * previously allocated to another pipe.
+	 *
+	 * Roughly the final sequence we want looks like this:
+	 *  1. Disable any pipes we're not going to be using anymore
+	 *  2. Reallocate all of the active pipes whose new ddb allocations
+	 *     won't overlap with another active pipe's ddb allocation.
+	 *  3. Reallocate remaining active pipes, if any.
+	 *  4. Enable any new pipes, if any.
+	 *
+	 * Example:
+	 * Initially DDB looks like this:
+	 *     |   B    |   C    |
+	 * And the final DDB should look like this:
+	 *     | B | C |    A    |
+	 *
+	 *  1. We're not disabling any pipes, so do nothing on this step.
+	 *  2. Pipe B's new allocation wouldn't overlap with pipe C, however
+	 *     pipe C's new allocation does overlap with pipe B's current
+	 *     allocation. Reallocate B first so the DDB looks like this:
+	 *     | B |xx|   C    |
+	 *  3. Now we can safely reallocate pipe C to its new location:
+	 *     | B | C |xxxxx|
+	 *  4. Enable any remaining pipes, in this case A:
+	 *     | B | C |    A    |
+	 *
+	 * As well, between every pipe reallocation we have to wait for a
+	 * vblank on the pipe so that we ensure its new allocation has taken
+	 * effect by the time we start moving the next pipe. This can be
+	 * skipped on the last step we need to perform, which is why we keep
+	 * track of that information here. For example, if we've reallocated
+	 * all the pipes that need changing by the time we reach step 3, we can
+	 * finish without waiting for the pipes we changed in step 3 to update.
+	 */
+	for_each_intel_crtc_mask(dev, intel_crtc, realloc_pipes) {
+		enum pipe pipe = intel_crtc->pipe;
+		enum skl_ddb_step step;
+
+		cstate = intel_atomic_get_crtc_state(state, intel_crtc);
+		if (IS_ERR(cstate))
+			return PTR_ERR(cstate);
+
+		/* Step 1: Pipes we're disabling / haven't changed */
+		if (skl_ddb_allocation_equals(old_ddb, new_ddb, pipe) ||
+		    new_ddb->pipe[pipe].end == 0) {
+			step = SKL_DDB_STEP_NONE;
+		/* Step 2-3: Active pipes we're reallocating */
+		} else if (old_ddb->pipe[pipe].end != 0) {
+			if (skl_ddb_allocation_overlaps(state, old_ddb, new_ddb,
+							pipe))
+				step = SKL_DDB_STEP_OVERLAP;
+			else
+				step = SKL_DDB_STEP_NO_OVERLAP;
+		/* Step 4: Pipes we're enabling */
+		} else {
+			step = SKL_DDB_STEP_FINAL;
+		}
+
+		cstate->wm.skl.ddb_realloc = step;
+		if (step > intel_state->last_ddb_step)
+			intel_state->last_ddb_step = step;
+	}
+
+	return 0;
 }
 
@@ -4110,10 +4063,13 @@ skl_copy_wm_for_pipe(struct skl_wm_values *dst,
 static int
 skl_compute_wm(struct drm_atomic_state *state)
 {
+	struct drm_i915_private *dev_priv = to_i915(state->dev);
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *cstate;
 	struct intel_atomic_state *intel_state = to_intel_atomic_state(state);
 	struct skl_wm_values *results = &intel_state->wm_results;
+	struct skl_ddb_allocation *old_ddb = &dev_priv->wm.skl_hw.ddb;
+	struct skl_ddb_allocation *new_ddb = &results->ddb;
 	struct skl_pipe_wm *pipe_wm;
 	bool changed = false;
 	int ret, i;
 
@@ -4152,7 +4108,10 @@ skl_compute_wm(struct drm_atomic_state *state)
 		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 		struct intel_crtc_state *intel_cstate =
 			to_intel_crtc_state(cstate);
+		enum skl_ddb_step step;
+		enum pipe pipe;
 
+		pipe = intel_crtc->pipe;
 		pipe_wm = &intel_cstate->wm.skl.optimal;
 		ret = skl_update_pipe_wm(cstate, &results->ddb,
 					 pipe_wm, &changed);
 
@@ -4167,7 +4126,18 @@ skl_compute_wm(struct drm_atomic_state *state)
 			continue;
 
 		intel_cstate->update_wm_pre = true;
+		step = intel_cstate->wm.skl.ddb_realloc;
 		skl_compute_wm_results(crtc->dev, pipe_wm, results, intel_crtc);
+
+		if (!skl_ddb_entry_equal(&old_ddb->pipe[pipe],
+					 &new_ddb->pipe[pipe])) {
+			DRM_DEBUG_KMS(
+				"DDB changes for [CRTC:%d:pipe %c]: (%3d - %3d) -> (%3d - %3d) on step %d\n",
+				intel_crtc->base.base.id, pipe_name(pipe),
+				old_ddb->pipe[pipe].start, old_ddb->pipe[pipe].end,
+				new_ddb->pipe[pipe].start, new_ddb->pipe[pipe].end,
+				step);
+		}
 	}
 
 	return 0;
 
@@ -4191,8 +4161,20 @@ static void skl_update_wm(struct drm_crtc *crtc)
 
 	mutex_lock(&dev_priv->wm.wm_mutex);
 
-	skl_write_wm_values(dev_priv, results);
-	skl_flush_wm_values(dev_priv, results);
+	/*
+	 * If this pipe isn't active already, we're going to be enabling it
+	 * very soon. Since it's safe to update these while the pipe's shut off,
+	 * just do so here. Already active pipes will have their watermarks
+	 * updated once we update their planes.
+	 */
+	if (!intel_crtc->active) {
+		int plane;
+
+		for (plane = 0; plane < intel_num_planes(intel_crtc); plane++)
+			skl_write_plane_wm(intel_crtc, results, plane);
+
+		skl_write_cursor_wm(intel_crtc, results);
+	}
 
 	/*
 	 * Store the new configuration (but only for the pipes that have
-- 2.7.4
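
For readers following the series, the core test that skl_ddb_allocation_overlaps() applies between a pipe's new allocation and every other pipe's old one can be sketched in isolation. This is a hedged stand-alone illustration, not the driver code: `struct ddb_entry` and `ddb_entries_overlap()` are hypothetical stand-ins for `struct skl_ddb_entry` and the in-loop comparison above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct skl_ddb_entry: a half-open
 * [start, end) range of DDB blocks; end == 0 means "no allocation". */
struct ddb_entry {
	unsigned int start;
	unsigned int end;
};

/* Two ranges conflict if either one begins inside the other -- the
 * same check skl_ddb_allocation_overlaps() performs when deciding
 * whether a pipe's reallocation must be sequenced after others. */
static bool ddb_entries_overlap(const struct ddb_entry *a,
				const struct ddb_entry *b)
{
	return (a->start >= b->start && a->start < b->end) ||
	       (b->start >= a->start && b->start < a->end);
}
```

Note the half-open ranges: two allocations that merely touch (one ending exactly where the other starts) do not overlap, which is why pipes with adjacent DDB blocks can be reprogrammed in the same step.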
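
The step classification performed in skl_compute_ddb() can likewise be sketched as a pure function. Again a hypothetical illustration under stated assumptions: `classify_ddb_step()` mirrors the if/else ladder that assigns `cstate->wm.skl.ddb_realloc`, with the overlap result against other pipes passed in as a precomputed boolean rather than derived from a full atomic state.

```c
#include <stdbool.h>

/* Reallocation steps, mirroring enum skl_ddb_step from the patch. */
enum ddb_step {
	DDB_STEP_NONE = 0,	/* pipe disabled, or allocation unchanged */
	DDB_STEP_NO_OVERLAP,	/* active pipe, new range conflicts with nobody */
	DDB_STEP_OVERLAP,	/* active pipe, must wait for earlier steps */
	DDB_STEP_FINAL,		/* newly enabled pipe, brought up last */
};

/* Hypothetical range type; end == 0 denotes "no allocation". */
struct ddb_range {
	unsigned int start;
	unsigned int end;
};

/* Mirrors the ladder in skl_compute_ddb(): unchanged or disabled pipes
 * need no sequencing, already-active pipes are ordered by whether their
 * new range overlaps another pipe's old one, and freshly enabled pipes
 * go last. */
static enum ddb_step classify_ddb_step(const struct ddb_range *old_r,
				       const struct ddb_range *new_r,
				       bool overlaps_other_pipe)
{
	/* Step 1: pipes we're disabling / haven't changed */
	if ((old_r->start == new_r->start && old_r->end == new_r->end) ||
	    new_r->end == 0)
		return DDB_STEP_NONE;

	/* Steps 2-3: active pipes being reallocated */
	if (old_r->end != 0)
		return overlaps_other_pipe ? DDB_STEP_OVERLAP
					   : DDB_STEP_NO_OVERLAP;

	/* Step 4: pipes we're enabling */
	return DDB_STEP_FINAL;
}
```

The enum's ordering is the point of the design: because the step values are monotonically increasing, the highest step seen across all pipes (`last_ddb_step` in the patch) tells the commit path where it can stop waiting for vblanks.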