This is a follow-up to the original patches I sent to try to fix the issue of drm_dp_mst_topology_mgr_resume() not working due to aux transactions failing temporarily after resuming the machine. Unfortunately I haven't been able to figure out the actual cause of this issue; there don't seem to be any IRQs interrupting the DP aux transactions and everything seems to be normal except for the amount of time this MST dock takes to become available again over the DP aux channel. I have made a few discoveries though:
The reason why calling intel_dp_mst_resume() before calling intel_runtime_pm_enable_interrupts() worked is due to how intel_dp_aux_wait_done() works:
static uint32_t
intel_dp_aux_wait_done(struct intel_dp *intel_dp, bool has_aux_irq)
{
	struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
	struct drm_device *dev = intel_dig_port->base.base.dev;
	struct drm_i915_private *dev_priv = dev->dev_private;
	i915_reg_t ch_ctl = intel_dp->aux_ch_ctl_reg;
	uint32_t status;
	bool done;

#define C (((status = I915_READ_NOTRACE(ch_ctl)) & DP_AUX_CH_CTL_SEND_BUSY) == 0)
	if (has_aux_irq)
		done = wait_event_timeout(dev_priv->gmbus_wait_queue, C,
					  msecs_to_jiffies_timeout(10));
	else
		done = wait_for_atomic(C, 10) == 0;
	if (!done)
		DRM_ERROR("dp aux hw did not signal timeout (has irq: %i)!\n",
			  has_aux_irq);
#undef C

	return status;
}
When calling this function without interrupts enabled, wait_event_timeout() ends up timing out after 10ms, manually checking the DP AUX status register, and discovering that the aux transaction succeeded. This makes the aux transactions take quite a while, but still manage to work. Because of this, there's always a 10ms delay each time we do a transaction, and we end up delaying things long enough for the aux transactions to become functional again which results in intel_dp_mst_resume() working. With interrupts enabled, we get notified of the timeouts within a period of 3ms five times in a row, which doesn't give enough time for the aux transactions to start working again. If we change the timeout to something shorter like 3ms, calling intel_dp_mst_resume() before intel_runtime_pm_enable_interrupts() stops working.
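To make the timing argument above concrete, here is a minimal user-space sketch (not driver code). It assumes a hypothetical dock that starts answering aux transactions a fixed number of milliseconds after resume. The function name aux_retries_succeed() and the dock_ready_after_ms parameter are made up for illustration; only the 10ms, 3ms and five-attempt figures come from the observations above.

```c
#include <stdbool.h>

/*
 * Toy model, not i915 code: the dock ignores aux transactions until
 * dock_ready_after_ms have elapsed since resume. Every attempt takes
 * per_try_ms of wall time, so slower attempts buy the dock more time.
 */
static bool aux_retries_succeed(int per_try_ms, int tries,
				int dock_ready_after_ms)
{
	int elapsed_ms = 0;

	for (int i = 0; i < tries; i++) {
		/* The attempt completes per_try_ms after it was started. */
		elapsed_ms += per_try_ms;
		if (elapsed_ms >= dock_ready_after_ms)
			return true;
	}
	return false;
}
```

With an assumed (not measured) wake-up time of 20ms, aux_retries_succeed(10, 5, 20) returns true while aux_retries_succeed(3, 5, 20) returns false, which matches the behaviour described above: five attempts at 3ms each are over before the dock is listening.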
So, my only possible thought for why this issue occurs is that the MST dock simply needs more time to respond. It's possible this issue has actually always been here, but we just never managed to have a skl machine resume quickly enough to notice it. The T560, the only machine I seem to be able to reproduce this issue on, does happen to be the fastest model out of all of the Skylake production machines I have available here.
It should be noted that for the second patch, I've considered a different workaround: calling intel_dp_check_mst_status() before calling drm_dp_mst_topology_mgr_resume(). This is another viable solution since it causes us to try to read the ESI from the dock using intel_dp_dpcd_read_wake(), which retries aux transactions enough times to give the dock time to resume. If everyone would rather that solution, I'd be happy to post that version of the patch instead.
Lyude (2):
  drm/i915: Call intel_dp_mst_resume() before resuming displays
  drm/i915: Retry after 30ms if we fail to resume DP MST

 drivers/gpu/drm/i915/i915_drv.c |  4 ++--
 drivers/gpu/drm/i915/intel_dp.c | 13 +++++++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)
Since we need MST devices ready before we try to resume displays, calling intel_dp_mst_resume() after intel_display_resume() can result in some issues with various laptop docks where the monitor won't turn back on after suspending the system.

This order was originally changed in

	commit e7d6f7d70829 ("drm/i915: resume MST after reading back hw state")

in order to fix some unclaimed register errors; however, the actual cause of those has since been fixed.
CC: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f357058..08854ae 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -761,12 +761,12 @@ static int i915_drm_resume(struct drm_device *dev)
 		dev_priv->display.hpd_irq_setup(dev);
 	spin_unlock_irq(&dev_priv->irq_lock);
 
+	intel_dp_mst_resume(dev);
+
 	drm_modeset_lock_all(dev);
 	intel_display_resume(dev);
 	drm_modeset_unlock_all(dev);
 
-	intel_dp_mst_resume(dev);
-
 	/*
 	 * ... but also need to make sure that hotplug processing
 	 * doesn't cause havoc. Like in the driver load code we don't
On Fri, Mar 11, 2016 at 10:57:01AM -0500, Lyude wrote:
> Since we need MST devices ready before we try to resume displays, calling this after intel_display_resume() can result in some issues with various laptop docks where the monitor won't turn back on after suspending the system.
>
> This order was originally changed in
>
> commit e7d6f7d70829 ("drm/i915: resume MST after reading back hw state")
>
> In order to fix some unclaimed register errors, however the actual cause of those has since been fixed.
>
> CC: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>

Don't we need to first apply patch 2/2 to avoid breaking systems in-between?
-Daniel
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Sun, 2016-03-13 at 19:45 +0100, Daniel Vetter wrote:
> Don't we need to first apply patch 2/2 to avoid breaking systems in-between?
> -Daniel
AFAICT the warns don't appear even with this patch, so no.
bump
Could we get a reviewed-by for this patch? It's needed in addition to the patch series I sent for removing intel_dp_dpcd_read_wake() for the T560 to have its monitors work properly on resume.
On Tue, Mar 29, 2016 at 10:11:54AM -0400, Lyude Paul wrote:
> bump
>
> Could we get a reviewed-by for this patch? It's needed in addition to the patch series I sent for removing intel_dp_dpcd_read_wake() for the T560 to have its monitors work properly on resume.

Applied, thanks.
-Daniel
> --
> Cheers,
> Lyude
For whatever reason, I've found that some laptops aren't immediately capable of doing aux transactions with their docks when they come out of standby. While I'm still not entirely sure what the cause of this is, sleeping for 30ms and then retrying drm_dp_mst_topology_mgr_resume() should be a sufficient workaround until we find a real fix.
CC: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
 drivers/gpu/drm/i915/intel_dp.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 1d8de43..8cc5f6f 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -6114,6 +6114,19 @@ void intel_dp_mst_resume(struct drm_device *dev)
 			ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
 			if (ret != 0) {
+				/*
+				 * For some reason, some laptops can't bring
+				 * their MST docks back up immediately after
+				 * resume and need to wait a short period of
+				 * time before aux transactions with the dock
+				 * become functional again. Until we find a
+				 * proper fix for this, this workaround should
+				 * suffice
+				 */
+				msleep(30);
+				ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
+			}
+			if (ret != 0) {
 				intel_dp_check_mst_status(&intel_dig_port->dp);
 			}
 		}
On Fri, Mar 11, 2016 at 10:57:02AM -0500, Lyude wrote:
> For whatever reason, I've found that some laptops aren't immediately capable of doing aux transactions with their docks when they come out of standby. While I'm still not entirely sure what the cause of this is, sleeping for 30ms and then retrying drm_dp_mst_topology_mgr_resume() should be a sufficient enough workaround until we find a real fix.
>
> CC: stable@vger.kernel.org
> Signed-off-by: Lyude <cpaul@redhat.com>
> ---
>  drivers/gpu/drm/i915/intel_dp.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
> index 1d8de43..8cc5f6f 100644
> --- a/drivers/gpu/drm/i915/intel_dp.c
> +++ b/drivers/gpu/drm/i915/intel_dp.c
> @@ -6114,6 +6114,19 @@ void intel_dp_mst_resume(struct drm_device *dev)
>  			ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
>  			if (ret != 0) {
> +				/*
> +				 * For some reason, some laptops can't bring
> +				 * their MST docks back up immediately after
> +				 * resume and need to wait a short period of
> +				 * time before aux transactions with the dock
> +				 * become functional again. Until we find a
> +				 * proper fix for this, this workaround should
> +				 * suffice
> +				 */
> +				msleep(30);
> +				ret = drm_dp_mst_topology_mgr_resume(&intel_dig_port->dp.mst_mgr);
> +			}
Hm, since it's the dp aux that fails (and not something higher up apparently) shouldn't we have this massive retry somewhere in the dp aux helpers maybe? DP resume in general is a bit fragile, maybe we're just missing a lot of retries in general?
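One way to picture the "retry in the aux helpers" idea is a small bounded-retry wrapper. What follows is only a user-space sketch under stated assumptions: xfer_fn, xfer_with_retries() and fake_dock_xfer() are hypothetical names invented here, not DRM helpers, and the retry count and delay are placeholders that would have to come from real measurements.

```c
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical transfer callback: 0 on success, negative errno on failure. */
typedef int (*xfer_fn)(void *ctx);

/*
 * Call xfer up to max_tries times, sleeping delay_ms between failed
 * attempts. Returns the result of the last attempt that was made.
 */
static int xfer_with_retries(xfer_fn xfer, void *ctx, int max_tries,
			     long delay_ms)
{
	int ret = -EINVAL;	/* returned as-is if max_tries <= 0 */

	for (int i = 0; i < max_tries; i++) {
		ret = xfer(ctx);
		if (ret == 0)
			break;
		if (i + 1 < max_tries) {
			/* Give the sink some time before the next attempt. */
			struct timespec ts = {
				.tv_sec = delay_ms / 1000,
				.tv_nsec = (delay_ms % 1000) * 1000000L,
			};
			nanosleep(&ts, NULL);
		}
	}
	return ret;
}

/* Fake sink for illustration: fails until the counter in ctx hits zero. */
static int fake_dock_xfer(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -ETIMEDOUT;
	}
	return 0;
}
```

Putting the loop around the lowest-level transfer, rather than around drm_dp_mst_topology_mgr_resume() as the patch does, would mean every caller gets the same tolerance for a slow dock. Whether that is the right layer is exactly the open question here.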
Either way this needs a lot more details. Comment definitely should start out with FIXME, and the commit message should have a protocol of all the experiments you've done thus far. Yes this means a ridiculously long commit message, but in roughly 2 weeks someone else will go wtf on this, and then they must be able to read up the full story. And we need links to bugzillas and mail threads, too.
And please Cc: Art with this one too.
Thanks, Daniel
> +			if (ret != 0) {
>  				intel_dp_check_mst_status(&intel_dig_port->dp);
>  			}
>  		}
> --
> 2.5.0
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel