From: Ville Syrjälä ville.syrjala@linux.intel.com
This is mostly a repost of the earlier series [1]. Most of the patches have been reviewed, but I have added quite a few new ones to the end to fix various issues.
[1] http://lists.freedesktop.org/archives/dri-devel/2014-May/060518.html
Ville Syrjälä (19): drm: Always reject drm_vblank_get() after drm_vblank_off() drm/i915: Warn if drm_vblank_get() still works after drm_vblank_off() drm: Don't clear vblank timestamps when vblank interrupt is disabled drm: Move drm_update_vblank_count() drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off() drm: Avoid random vblank counter jumps if the hardware counter has been reset drm: Reduce the amount of dev->vblank[crtc] in the code drm: Fix deadlock between event_lock and vbl_lock/vblank_time_lock drm: Fix race between drm_vblank_off() and drm_queue_vblank_event() drm: Disable vblank interrupt immediately when drm_vblank_offdelay<0 drm: Add dev->vblank_disable_immediate flag drm/i915: Opt out of vblank disable timer on >gen2 drm: Kick start vblank interrupts at drm_vblank_on() drm: Don't update vblank timestamp when the counter didn't change drm: Update vblank->last in drm_update_vblank_count() drm: Store the vblank timestamp when adjusting the counter during disable drm/i915: Clear .last vblank count before drm_vblank_off() when sanitizing crtc state drm/i915: Update scanline_offset only for active crtcs drm: Fix confusing debug message in drm_update_vblank_count()
Documentation/DocBook/drm.tmpl | 7 + drivers/gpu/drm/drm_drv.c | 4 +- drivers/gpu/drm/drm_irq.c | 344 ++++++++++++++++++++++------------- drivers/gpu/drm/i915/i915_irq.c | 8 + drivers/gpu/drm/i915/intel_display.c | 20 +- include/drm/drmP.h | 12 +- 6 files changed, 259 insertions(+), 136 deletions(-)
From: Ville Syrjälä ville.syrjala@linux.intel.com
Make sure drm_vblank_get() never succeeds when called between drm_vblank_off() and drm_vblank_on(). Borrow a trick from the old drm_vblank_{pre,post}_modeset() functions and just bump the refcount in drm_vblank_off() and drop it in drm_vblank_on().
When drm_vblank_get() encounters a >0 refcount and the vblank interrupt is already disabled it will simply return -EINVAL.
Hopefully the use of inmodeset won't conflict badly with drm_vblank_{pre,post}_modeset().
For i915 there's a window between drm_vblank_off() and marking the crtc as inactive where the current code still allows drm_vblank_get().
v2: Describe what drm_vblank_get() does to explain how a simple refcount bump manages to fix things (Daniel)
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 0de123a..b16a636 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1039,6 +1039,15 @@ void drm_vblank_off(struct drm_device *dev, int crtc) } spin_unlock(&dev->event_lock);
+ /* + * Prevent subsequent drm_vblank_get() from re-enabling + * the vblank interrupt by bumping the refcount. + */ + if (!dev->vblank[crtc].inmodeset) { + atomic_inc(&dev->vblank[crtc].refcount); + dev->vblank[crtc].inmodeset = 1; + } + spin_unlock_irqrestore(&dev->vbl_lock, irqflags); } EXPORT_SYMBOL(drm_vblank_off); @@ -1079,6 +1088,11 @@ void drm_vblank_on(struct drm_device *dev, int crtc) unsigned long irqflags;
spin_lock_irqsave(&dev->vbl_lock, irqflags); + /* Drop our private "prevent drm_vblank_get" refcount */ + if (dev->vblank[crtc].inmodeset) { + atomic_dec(&dev->vblank[crtc].refcount); + dev->vblank[crtc].inmodeset = 0; + } /* re-enable interrupts if there's are users left */ if (atomic_read(&dev->vblank[crtc].refcount) != 0) WARN_ON(drm_vblank_enable(dev, crtc));
From: Ville Syrjälä ville.syrjala@linux.intel.com
v2: Drop the drm_vblank_off() (Daniel) Use drm_crtc_vblank_{get,put}()
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/i915/intel_display.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 1b4cb0b..ae5f20d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -1342,6 +1342,12 @@ static void assert_sprites_disabled(struct drm_i915_private *dev_priv, } }
+static void assert_vblank_disabled(struct drm_crtc *crtc) +{ + if (WARN_ON(drm_crtc_vblank_get(crtc) == 0)) + drm_crtc_vblank_put(crtc); +} + static void ibx_assert_pch_refclk_enabled(struct drm_i915_private *dev_priv) { u32 val; @@ -3909,6 +3915,8 @@ static void intel_crtc_enable_planes(struct drm_crtc *crtc) int pipe = intel_crtc->pipe; int plane = intel_crtc->plane;
+ assert_vblank_disabled(crtc); + drm_vblank_on(dev, pipe);
intel_enable_primary_hw_plane(dev_priv, plane, pipe); @@ -3958,6 +3966,8 @@ static void intel_crtc_disable_planes(struct drm_crtc *crtc) intel_frontbuffer_flip(dev, INTEL_FRONTBUFFER_ALL_MASK(pipe));
drm_vblank_off(dev, pipe); + + assert_vblank_disabled(crtc); }
static void ironlake_crtc_enable(struct drm_crtc *crtc)
From: Ville Syrjälä ville.syrjala@linux.intel.com
Clearing the timestamps causes us to send zeroed timestamps to userspace if they get sent out in response to the drm_vblank_off(). It's better to send the very latest timestamp and count instead.
Testcase: igt/kms_flip/modeset-vs-vblank-race Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 11 ----------- 1 file changed, 11 deletions(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index b16a636..65d2da9 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -56,14 +56,6 @@ #define DRM_REDUNDANT_VBLIRQ_THRESH_NS 1000000
/* - * Clear vblank timestamp buffer for a crtc. - */ -static void clear_vblank_timestamps(struct drm_device *dev, int crtc) -{ - memset(dev->vblank[crtc].time, 0, sizeof(dev->vblank[crtc].time)); -} - -/* * Disable vblank irq's on crtc, make sure that last vblank count * of hardware and corresponding consistent software vblank counter * are preserved, even if there are any spurious vblank irq's after @@ -131,9 +123,6 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) smp_mb__after_atomic(); }
- /* Invalidate all timestamps while vblank irq's are off. */ - clear_vblank_timestamps(dev, crtc); - spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); }
From: Ville Syrjälä ville.syrjala@linux.intel.com
Move drm_update_vblank_count() to avoid forward a declaration. No functional change.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 128 +++++++++++++++++++++++----------------------- 1 file changed, 64 insertions(+), 64 deletions(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 65d2da9..af96517 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -55,6 +55,70 @@ */ #define DRM_REDUNDANT_VBLIRQ_THRESH_NS 1000000
+/** + * drm_update_vblank_count - update the master vblank counter + * @dev: DRM device + * @crtc: counter to update + * + * Call back into the driver to update the appropriate vblank counter + * (specified by @crtc). Deal with wraparound, if it occurred, and + * update the last read value so we can deal with wraparound on the next + * call if necessary. + * + * Only necessary when going from off->on, to account for frames we + * didn't get an interrupt for. + * + * Note: caller must hold dev->vbl_lock since this reads & writes + * device vblank fields. + */ +static void drm_update_vblank_count(struct drm_device *dev, int crtc) +{ + u32 cur_vblank, diff, tslot, rc; + struct timeval t_vblank; + + /* + * Interrupts were disabled prior to this call, so deal with counter + * wrap if needed. + * NOTE! It's possible we lost a full dev->max_vblank_count events + * here if the register is small or we had vblank interrupts off for + * a long time. + * + * We repeat the hardware vblank counter & timestamp query until + * we get consistent results. This to prevent races between gpu + * updating its hardware counter while we are retrieving the + * corresponding vblank timestamp. + */ + do { + cur_vblank = dev->driver->get_vblank_counter(dev, crtc); + rc = drm_get_last_vbltimestamp(dev, crtc, &t_vblank, 0); + } while (cur_vblank != dev->driver->get_vblank_counter(dev, crtc)); + + /* Deal with counter wrap */ + diff = cur_vblank - dev->vblank[crtc].last; + if (cur_vblank < dev->vblank[crtc].last) { + diff += dev->max_vblank_count; + + DRM_DEBUG("last_vblank[%d]=0x%x, cur_vblank=0x%x => diff=0x%x\n", + crtc, dev->vblank[crtc].last, cur_vblank, diff); + } + + DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", + crtc, diff); + + /* Reinitialize corresponding vblank timestamp if high-precision query + * available. Skip this step if query unsupported or failed. Will + * reinitialize delayed at next vblank interrupt in that case. + */ + if (rc) { + tslot = atomic_read(&dev->vblank[crtc].count) + diff; + vblanktimestamp(dev, crtc, tslot) = t_vblank; + } + + smp_mb__before_atomic(); + atomic_add(diff, &dev->vblank[crtc].count); + smp_mb__after_atomic(); +} + /* * Disable vblank irq's on crtc, make sure that last vblank count * of hardware and corresponding consistent software vblank counter @@ -799,70 +863,6 @@ void drm_send_vblank_event(struct drm_device *dev, int crtc, EXPORT_SYMBOL(drm_send_vblank_event);
/** - * drm_update_vblank_count - update the master vblank counter - * @dev: DRM device - * @crtc: counter to update - * - * Call back into the driver to update the appropriate vblank counter - * (specified by @crtc). Deal with wraparound, if it occurred, and - * update the last read value so we can deal with wraparound on the next - * call if necessary. - * - * Only necessary when going from off->on, to account for frames we - * didn't get an interrupt for. - * - * Note: caller must hold dev->vbl_lock since this reads & writes - * device vblank fields. - */ -static void drm_update_vblank_count(struct drm_device *dev, int crtc) -{ - u32 cur_vblank, diff, tslot, rc; - struct timeval t_vblank; - - /* - * Interrupts were disabled prior to this call, so deal with counter - * wrap if needed. - * NOTE! It's possible we lost a full dev->max_vblank_count events - * here if the register is small or we had vblank interrupts off for - * a long time. - * - * We repeat the hardware vblank counter & timestamp query until - * we get consistent results. This to prevent races between gpu - * updating its hardware counter while we are retrieving the - * corresponding vblank timestamp. - */ - do { - cur_vblank = dev->driver->get_vblank_counter(dev, crtc); - rc = drm_get_last_vbltimestamp(dev, crtc, &t_vblank, 0); - } while (cur_vblank != dev->driver->get_vblank_counter(dev, crtc)); - - /* Deal with counter wrap */ - diff = cur_vblank - dev->vblank[crtc].last; - if (cur_vblank < dev->vblank[crtc].last) { - diff += dev->max_vblank_count; - - DRM_DEBUG("last_vblank[%d]=0x%x, cur_vblank=0x%x => diff=0x%x\n", - crtc, dev->vblank[crtc].last, cur_vblank, diff); - } - - DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", - crtc, diff); - - /* Reinitialize corresponding vblank timestamp if high-precision query - * available. Skip this step if query unsupported or failed. Will - * reinitialize delayed at next vblank interrupt in that case. - */ - if (rc) { - tslot = atomic_read(&dev->vblank[crtc].count) + diff; - vblanktimestamp(dev, crtc, tslot) = t_vblank; - } - - smp_mb__before_atomic(); - atomic_add(diff, &dev->vblank[crtc].count); - smp_mb__after_atomic(); -} - -/** * drm_vblank_enable - enable the vblank interrupt on a CRTC * @dev: DRM device * @crtc: CRTC in question
From: Ville Syrjälä ville.syrjala@linux.intel.com
If the vblank irq has already been disabled (via the disable timer) when we call drm_vblank_off() sample the counter and timestamp one last time. This will make the sure that the user space visible counter will account for time between vblank irq disable and drm_vblank_off().
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af96517..1f86f6c 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) */ spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
+ /* + * If the vblank interrupt was already disbled update the count + * and timestamp to maintain the appearance that the counter + * has been ticking all along until this time. This makes the + * count account for the entire time between drm_vblank_on() and + * drm_vblank_off(). + */ + if (!dev->vblank[crtc].enabled) { + drm_update_vblank_count(dev, crtc); + spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); + return; + } + dev->driver->disable_vblank(dev, crtc); dev->vblank[crtc].enabled = false;
Hi Ville,
went through the vblank rework patch set, mostly looks good to me. I couldn't find any bugs in the code. A first quick test-run on my old Intel GMA-950 (Gen 3'ish i think?) also didn't show apparent problems with the OML_sync_control functions. I'll try to test more carefully with that card and maybe with a few more cards in the next days, if i can get my hands on something more recent.
The problematic bits:
Patch 3/19 [Don't clear vblank timestamp...] in combination with [5/19 below]:
I agree that not clearing the timestamps during drm_vblank_off() is probably the better thing to do for userspace. The idea behind clearing the timestamps was that a ust timestamp of zero signals to userspace that the timestamp is invalid/undefined at the moment, so the client should retry the query if it needs a valid timestamp. This worked in practice insofar as a value of zero can't happen normally, unless a client would query a timestamp during the first microsecond since machine powerup. But i guess returning the last valid (msc, ust) pair to a client during vblank off may be better for things like compositors etc. I also wonder if we ever documented this ust == 0 -> -EAGAIN behaviour?
The problem with patch 5/19 is gpus/drivers which don't provide precise instantaneous vblank timestamps - which are afaik everything except intel, radeon and nouveau. On such drivers, the old code would return a zero ust timestamp during queries after the first drm_vblank_get() and until the first vblank irq happens and initializes the timestamps to something valid. The zero ust would signal "please retry" to the client. With patch 5/19 you'd get an updated vblank counter with an outdated vblank timestamp - whatever is stored in the ringbuffer from the past, because drm_update_vblank_count() can't update the timestamp without support for the optional vblank-timestamp driver function. A mismatched msc, ust would be very confusing to clients.
The only way one could get valid msc + ust on such drivers would be to enable vblank irq's and then wait for the next vblank irq to actually update the timestamp, at the cost of a couple of msecs waiting time.
So either have drm_update_vblank_count() itself sleep until next vblank "if (!rc) ..." at the very end, as a rc == 0 would signal an imprecise/wrong vblank timestamp. Or have all callers of it do this, if locking makes it neccessary. Or only care about it for the drm_vblank_off() special case, e.g., if !vblank->enabled there, then drm_vblank_get() -> wait for a valid timestamp and count to show up -> drm_vblank_put() -> vblank_disable_and_save().
For Patch 11/19 [Add dev->vblank_disable_immediate flag]: Can we make it so that a drm_vblank_offdelay module parameter of zero always overrides the flag? The only reason why a user wants to set drm_vblank_offdelay to zero is if that user absolutely needs precise and reliable vblank counts/timestamps and finds out that something is broken with his driver+gpu, so uses this as an override to temporarily fix a broken driver. That doesn't work if the vblank_disable_immediate flag overrides the override from the user - the user couldn't opt out of the trouble.
This might not be such an issue with Intel cards, as you have test suites and a QA team, and i assume/hope you tested every single intel gpu shipped in the last decade or so if the whole vblank off/on logic really is perfectly race-free now? At least it seems to work with that one gen-3 card i quickly tested. But for most other drivers with small teams and no dedicated QA this could end badly quickly for the user without any manual override.
The docs should probably clarify that a hw vblank counter isn't enough for the vblank_disable_immediate flag to be set. Their vblank off/on/hardware counter query implementation must be completely race free. iirc this means the hw counter query must behave as if the vblank counter always increments at the leading edge of vblank. E.g., radeon has hw counter queries, but the counter increments either at the trailing edge, or somewhere in the middle of vblank, so there it wouldn't work without races, causing off-by-one errors sometimes.
For Patch 14/19 [Don't update vblank timestamp when the counter didn't change]
That would go wrong if a driver doesn't implement a proper vblank counter query. E.g., nouveau has precise vblank timestamping since Linux 3.14, but still no functional hw counter query.
Almost all embedded gpu drivers currently implement completely bogus hw vblank counter queries, because that driver hook is mandatory. I think it would make sense if we would make that hook optional, allow a NULL function pointer and adapt to the lack of that query, e.g., by never disabling vblank irq's, except in drm_vblank_off() when a kms-driver insists on disabling its irq during modeset/dpms off/suspend etc.
With these remarks somehow taken into account you have my
Reviewed-by: Mario Kleiner mario.kleiner.de@gmail.com
for the whole series, if you want.
thanks, -mario
On 08/06/2014 01:49 PM, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
If the vblank irq has already been disabled (via the disable timer) when we call drm_vblank_off() sample the counter and timestamp one last time. This will make the sure that the user space visible counter will account for time between vblank irq disable and drm_vblank_off().
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/drm_irq.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af96517..1f86f6c 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) */ spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
- /*
* If the vblank interrupt was already disbled update the count
* and timestamp to maintain the appearance that the counter
* has been ticking all along until this time. This makes the
* count account for the entire time between drm_vblank_on() and
* drm_vblank_off().
*/
- if (!dev->vblank[crtc].enabled) {
drm_update_vblank_count(dev, crtc);
spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
return;
- }
- dev->driver->disable_vblank(dev, crtc); dev->vblank[crtc].enabled = false;
From: Ville Syrjälä ville.syrjala@linux.intel.com
When drm_vblank_on() is called the hardware vblank counter may have been reset, so we can't trust that the old values sampled prior to drm_vblank_off() have anything to do with the new values.
So update the .last count in drm_vblank_on() to make the first drm_vblank_enable() consider that as the reference point. This will correct the user space visible counter to account for the time between drm_vblank_on() and the first drm_vblank_enable() calls.
For extra safety subtract one from the .last count in drm_vblank_on() to make sure that user space will never see the same counter value before and after modeset.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 1f86f6c..fc1525a 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1095,6 +1095,18 @@ void drm_vblank_on(struct drm_device *dev, int crtc) atomic_dec(&dev->vblank[crtc].refcount); dev->vblank[crtc].inmodeset = 0; } + + /* + * sample the current counter to avoid random jumps + * when drm_vblank_enable() applies the diff + * + * -1 to make sure user will never see the same + * vblank counter value before and after a modeset + */ + dev->vblank[crtc].last = + (dev->driver->get_vblank_counter(dev, crtc) - 1) & + dev->max_vblank_count; + /* re-enable interrupts if there's are users left */ if (atomic_read(&dev->vblank[crtc].refcount) != 0) WARN_ON(drm_vblank_enable(dev, crtc));
From: Ville Syrjälä ville.syrjala@linux.intel.com
Declare a local struct drm_vblank_crtc * and use that instead of having to do dig it out via 'dev->vblank[crtc]' everywhere.
Performed with the following coccinelle incantation, and a few manual whitespace cleanups:
@@ identifier func,member; expression num_crtcs; struct drm_device *dev; unsigned int crtc; @@ func (...) { + struct drm_vblank_crtc *vblank; ... if (crtc >= num_crtcs) return ...; + vblank = &dev->vblank[crtc]; <+... ( - dev->vblank[crtc].member + vblank->member | - &(dev->vblank[crtc]) + vblank ) ...+> }
@@ struct drm_device *dev; int crtc; identifier member; expression num_crtcs; @@ for (crtc = 0; crtc < num_crtcs; crtc++) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; + <+... ( - dev->vblank[crtc].member + vblank->member | - &(dev->vblank[crtc]) + vblank ) ...+> }
@@ identifier func,member; @@ func (struct drm_device *dev, int crtc, ...) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; <+... ( - dev->vblank[crtc].member + vblank->member | - &(dev->vblank[crtc]) + vblank ) ...+> }
v2: Rebased
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 134 +++++++++++++++++++++++++++------------------- 1 file changed, 79 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index fc1525a..b4460bf 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -73,6 +73,7 @@ */ static void drm_update_vblank_count(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; u32 cur_vblank, diff, tslot, rc; struct timeval t_vblank;
@@ -94,12 +95,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) } while (cur_vblank != dev->driver->get_vblank_counter(dev, crtc));
/* Deal with counter wrap */ - diff = cur_vblank - dev->vblank[crtc].last; - if (cur_vblank < dev->vblank[crtc].last) { + diff = cur_vblank - vblank->last; + if (cur_vblank < vblank->last) { diff += dev->max_vblank_count;
DRM_DEBUG("last_vblank[%d]=0x%x, cur_vblank=0x%x => diff=0x%x\n", - crtc, dev->vblank[crtc].last, cur_vblank, diff); + crtc, vblank->last, cur_vblank, diff); }
DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", @@ -110,12 +111,12 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) * reinitialize delayed at next vblank interrupt in that case. */ if (rc) { - tslot = atomic_read(&dev->vblank[crtc].count) + diff; + tslot = atomic_read(&vblank->count) + diff; vblanktimestamp(dev, crtc, tslot) = t_vblank; }
smp_mb__before_atomic(); - atomic_add(diff, &dev->vblank[crtc].count); + atomic_add(diff, &vblank->count); smp_mb__after_atomic(); }
@@ -127,6 +128,7 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) */ static void vblank_disable_and_save(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; unsigned long irqflags; u32 vblcount; s64 diff_ns; @@ -147,14 +149,14 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) * count account for the entire time between drm_vblank_on() and * drm_vblank_off(). */ - if (!dev->vblank[crtc].enabled) { + if (!vblank->enabled) { drm_update_vblank_count(dev, crtc); spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); return; }
dev->driver->disable_vblank(dev, crtc); - dev->vblank[crtc].enabled = false; + vblank->enabled = false;
/* No further vblank irq's will be processed after * this point. Get current hardware vblank count and @@ -169,9 +171,9 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) * delayed gpu counter increment. */ do { - dev->vblank[crtc].last = dev->driver->get_vblank_counter(dev, crtc); + vblank->last = dev->driver->get_vblank_counter(dev, crtc); vblrc = drm_get_last_vbltimestamp(dev, crtc, &tvblank, 0); - } while (dev->vblank[crtc].last != dev->driver->get_vblank_counter(dev, crtc) && (--count) && vblrc); + } while (vblank->last != dev->driver->get_vblank_counter(dev, crtc) && (--count) && vblrc);
if (!count) vblrc = 0; @@ -179,7 +181,7 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) /* Compute time difference to stored timestamp of last vblank * as updated by last invocation of drm_handle_vblank() in vblank irq. */ - vblcount = atomic_read(&dev->vblank[crtc].count); + vblcount = atomic_read(&vblank->count); diff_ns = timeval_to_ns(&tvblank) - timeval_to_ns(&vblanktimestamp(dev, crtc, vblcount));
@@ -196,7 +198,7 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) * hope for the best. */ if ((vblrc > 0) && (abs64(diff_ns) > 1000000)) { - atomic_inc(&dev->vblank[crtc].count); + atomic_inc(&vblank->count); smp_mb__after_atomic(); }
@@ -236,8 +238,10 @@ void drm_vblank_cleanup(struct drm_device *dev) return;
for (crtc = 0; crtc < dev->num_crtcs; crtc++) { - del_timer_sync(&dev->vblank[crtc].disable_timer); - vblank_disable_fn((unsigned long)&dev->vblank[crtc]); + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; + + del_timer_sync(&vblank->disable_timer); + vblank_disable_fn((unsigned long)vblank); }
kfree(dev->vblank); @@ -270,11 +274,13 @@ int drm_vblank_init(struct drm_device *dev, int num_crtcs) goto err;
for (i = 0; i < num_crtcs; i++) { - dev->vblank[i].dev = dev; - dev->vblank[i].crtc = i; - init_waitqueue_head(&dev->vblank[i].queue); - setup_timer(&dev->vblank[i].disable_timer, vblank_disable_fn, - (unsigned long)&dev->vblank[i]); + struct drm_vblank_crtc *vblank = &dev->vblank[i]; + + vblank->dev = dev; + vblank->crtc = i; + init_waitqueue_head(&vblank->queue); + setup_timer(&vblank->disable_timer, vblank_disable_fn, + (unsigned long)vblank); }
DRM_INFO("Supports vblank timestamp caching Rev 2 (21.10.2013).\n"); @@ -426,9 +432,11 @@ int drm_irq_uninstall(struct drm_device *dev) if (dev->num_crtcs) { spin_lock_irqsave(&dev->vbl_lock, irqflags); for (i = 0; i < dev->num_crtcs; i++) { - wake_up(&dev->vblank[i].queue); - dev->vblank[i].enabled = false; - dev->vblank[i].last = + struct drm_vblank_crtc *vblank = &dev->vblank[i]; + + wake_up(&vblank->queue); + vblank->enabled = false; + vblank->last = dev->driver->get_vblank_counter(dev, i); } spin_unlock_irqrestore(&dev->vbl_lock, irqflags); @@ -796,7 +804,9 @@ EXPORT_SYMBOL(drm_get_last_vbltimestamp); */ u32 drm_vblank_count(struct drm_device *dev, int crtc) { - return atomic_read(&dev->vblank[crtc].count); + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; + + return atomic_read(&vblank->count); } EXPORT_SYMBOL(drm_vblank_count);
@@ -816,6 +826,7 @@ EXPORT_SYMBOL(drm_vblank_count); u32 drm_vblank_count_and_time(struct drm_device *dev, int crtc, struct timeval *vblanktime) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; u32 cur_vblank;
/* Read timestamp from slot of _vblank_time ringbuffer @@ -824,10 +835,10 @@ u32 drm_vblank_count_and_time(struct drm_device *dev, int crtc, * a seqlock. */ do { - cur_vblank = atomic_read(&dev->vblank[crtc].count); + cur_vblank = atomic_read(&vblank->count); *vblanktime = vblanktimestamp(dev, crtc, cur_vblank); smp_rmb(); - } while (cur_vblank != atomic_read(&dev->vblank[crtc].count)); + } while (cur_vblank != atomic_read(&vblank->count));
return cur_vblank; } @@ -882,13 +893,14 @@ EXPORT_SYMBOL(drm_send_vblank_event); */ static int drm_vblank_enable(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; int ret = 0;
assert_spin_locked(&dev->vbl_lock);
spin_lock(&dev->vblank_time_lock);
- if (!dev->vblank[crtc].enabled) { + if (!vblank->enabled) { /* * Enable vblank irqs under vblank_time_lock protection. * All vblank count & timestamp updates are held off @@ -899,9 +911,9 @@ static int drm_vblank_enable(struct drm_device *dev, int crtc) ret = dev->driver->enable_vblank(dev, crtc); DRM_DEBUG("enabling vblank on crtc %d, ret: %d\n", crtc, ret); if (ret) - atomic_dec(&dev->vblank[crtc].refcount); + atomic_dec(&vblank->refcount); else { - dev->vblank[crtc].enabled = true; + vblank->enabled = true; drm_update_vblank_count(dev, crtc); } } @@ -926,16 +938,17 @@ static int drm_vblank_enable(struct drm_device *dev, int crtc) */ int drm_vblank_get(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; unsigned long irqflags; int ret = 0;
spin_lock_irqsave(&dev->vbl_lock, irqflags); /* Going from 0->1 means we have to enable interrupts again */ - if (atomic_add_return(1, &dev->vblank[crtc].refcount) == 1) { + if (atomic_add_return(1, &vblank->refcount) == 1) { ret = drm_vblank_enable(dev, crtc); } else { - if (!dev->vblank[crtc].enabled) { - atomic_dec(&dev->vblank[crtc].refcount); + if (!vblank->enabled) { + atomic_dec(&vblank->refcount); ret = -EINVAL; } } @@ -975,12 +988,14 @@ EXPORT_SYMBOL(drm_crtc_vblank_get); */ void drm_vblank_put(struct drm_device *dev, int crtc) { - BUG_ON(atomic_read(&dev->vblank[crtc].refcount) == 0); + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; + + BUG_ON(atomic_read(&vblank->refcount) == 0);
/* Last user schedules interrupt disable */ - if (atomic_dec_and_test(&dev->vblank[crtc].refcount) && + if (atomic_dec_and_test(&vblank->refcount) && (drm_vblank_offdelay > 0)) - mod_timer(&dev->vblank[crtc].disable_timer, + mod_timer(&vblank->disable_timer, jiffies + ((drm_vblank_offdelay * HZ)/1000)); } EXPORT_SYMBOL(drm_vblank_put); @@ -1016,6 +1031,7 @@ EXPORT_SYMBOL(drm_crtc_vblank_put); */ void drm_vblank_off(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; struct drm_pending_vblank_event *e, *t; struct timeval now; unsigned long irqflags; @@ -1023,7 +1039,7 @@ void drm_vblank_off(struct drm_device *dev, int crtc)
spin_lock_irqsave(&dev->vbl_lock, irqflags); vblank_disable_and_save(dev, crtc); - wake_up(&dev->vblank[crtc].queue); + wake_up(&vblank->queue);
/* Send any queued vblank events, lest the natives grow disquiet */ seq = drm_vblank_count_and_time(dev, crtc, &now); @@ -1045,9 +1061,9 @@ void drm_vblank_off(struct drm_device *dev, int crtc) * Prevent subsequent drm_vblank_get() from re-enabling * the vblank interrupt by bumping the refcount. */ - if (!dev->vblank[crtc].inmodeset) { - atomic_inc(&dev->vblank[crtc].refcount); - dev->vblank[crtc].inmodeset = 1; + if (!vblank->inmodeset) { + atomic_inc(&vblank->refcount); + vblank->inmodeset = 1; }
spin_unlock_irqrestore(&dev->vbl_lock, irqflags); @@ -1087,13 +1103,14 @@ EXPORT_SYMBOL(drm_crtc_vblank_off); */ void drm_vblank_on(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; unsigned long irqflags;
spin_lock_irqsave(&dev->vbl_lock, irqflags); /* Drop our private "prevent drm_vblank_get" refcount */ - if (dev->vblank[crtc].inmodeset) { - atomic_dec(&dev->vblank[crtc].refcount); - dev->vblank[crtc].inmodeset = 0; + if (vblank->inmodeset) { + atomic_dec(&vblank->refcount); + vblank->inmodeset = 0; }
/* @@ -1103,12 +1120,12 @@ void drm_vblank_on(struct drm_device *dev, int crtc) * -1 to make sure user will never see the same * vblank counter value before and after a modeset */ - dev->vblank[crtc].last = + vblank->last = (dev->driver->get_vblank_counter(dev, crtc) - 1) & dev->max_vblank_count;
/* re-enable interrupts if there's are users left */ - if (atomic_read(&dev->vblank[crtc].refcount) != 0) + if (atomic_read(&vblank->refcount) != 0) WARN_ON(drm_vblank_enable(dev, crtc)); spin_unlock_irqrestore(&dev->vbl_lock, irqflags); } @@ -1156,6 +1173,8 @@ EXPORT_SYMBOL(drm_crtc_vblank_on); */ void drm_vblank_pre_modeset(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; + /* vblank is not initialized (IRQ not installed ?), or has been freed */ if (!dev->num_crtcs) return; @@ -1166,10 +1185,10 @@ void drm_vblank_pre_modeset(struct drm_device *dev, int crtc) * to avoid corrupting the count if multiple, mismatch calls occur), * so that interrupts remain enabled in the interim. */ - if (!dev->vblank[crtc].inmodeset) { - dev->vblank[crtc].inmodeset = 0x1; + if (!vblank->inmodeset) { + vblank->inmodeset = 0x1; if (drm_vblank_get(dev, crtc) == 0) - dev->vblank[crtc].inmodeset |= 0x2; + vblank->inmodeset |= 0x2; } } EXPORT_SYMBOL(drm_vblank_pre_modeset); @@ -1184,21 +1203,22 @@ EXPORT_SYMBOL(drm_vblank_pre_modeset); */ void drm_vblank_post_modeset(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; unsigned long irqflags;
/* vblank is not initialized (IRQ not installed ?), or has been freed */ if (!dev->num_crtcs) return;
- if (dev->vblank[crtc].inmodeset) { + if (vblank->inmodeset) { spin_lock_irqsave(&dev->vbl_lock, irqflags); dev->vblank_disable_allowed = true; spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
- if (dev->vblank[crtc].inmodeset & 0x2) + if (vblank->inmodeset & 0x2) drm_vblank_put(dev, crtc);
- dev->vblank[crtc].inmodeset = 0; + vblank->inmodeset = 0; } } EXPORT_SYMBOL(drm_vblank_post_modeset); @@ -1333,6 +1353,7 @@ err_put: int drm_wait_vblank(struct drm_device *dev, void *data, struct drm_file *file_priv) { + struct drm_vblank_crtc *vblank; union drm_wait_vblank *vblwait = data; int ret; unsigned int flags, seq, crtc, high_crtc; @@ -1362,6 +1383,8 @@ int drm_wait_vblank(struct drm_device *dev, void *data, if (crtc >= dev->num_crtcs) return -EINVAL;
+ vblank = &dev->vblank[crtc]; + ret = drm_vblank_get(dev, crtc); if (ret) { DRM_DEBUG("failed to acquire vblank counter, %d\n", ret); @@ -1394,11 +1417,11 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
DRM_DEBUG("waiting on vblank count %d, crtc %d\n", vblwait->request.sequence, crtc); - dev->vblank[crtc].last_wait = vblwait->request.sequence; - DRM_WAIT_ON(ret, dev->vblank[crtc].queue, 3 * HZ, + vblank->last_wait = vblwait->request.sequence; + DRM_WAIT_ON(ret, vblank->queue, 3 * HZ, (((drm_vblank_count(dev, crtc) - vblwait->request.sequence) <= (1 << 23)) || - !dev->vblank[crtc].enabled || + !vblank->enabled || !dev->irq_enabled));
if (ret != -EINTR) { @@ -1459,6 +1482,7 @@ static void drm_handle_vblank_events(struct drm_device *dev, int crtc) */ bool drm_handle_vblank(struct drm_device *dev, int crtc) { + struct drm_vblank_crtc *vblank = &dev->vblank[crtc]; u32 vblcount; s64 diff_ns; struct timeval tvblank; @@ -1474,7 +1498,7 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
/* Vblank irq handling disabled. Nothing to do. */ - if (!dev->vblank[crtc].enabled) { + if (!vblank->enabled) { spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); return false; } @@ -1484,7 +1508,7 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) */
/* Get current timestamp and count. */ - vblcount = atomic_read(&dev->vblank[crtc].count); + vblcount = atomic_read(&vblank->count); drm_get_last_vbltimestamp(dev, crtc, &tvblank, DRM_CALLED_FROM_VBLIRQ);
/* Compute time difference to timestamp of last vblank */ @@ -1508,14 +1532,14 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) * the timestamp computed above. */ smp_mb__before_atomic(); - atomic_inc(&dev->vblank[crtc].count); + atomic_inc(&vblank->count); smp_mb__after_atomic(); } else { DRM_DEBUG("crtc %d: Redundant vblirq ignored. diff_ns = %d\n", crtc, (int) diff_ns); }
- wake_up(&dev->vblank[crtc].queue); + wake_up(&vblank->queue); drm_handle_vblank_events(dev, crtc);
spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently both drm_irq.c and several drivers call drm_vblank_put() while holding event_lock. Now that drm_vblank_put() can disable the vblank interrupt directly it may need to grab vbl_lock and vblank_time_lock. That causes deadlocks since we take the locks in the opposite order in two places in drm_irq.c. So let's make sure the locking order is always event_lock->vbl_lock->vblank_time_lock.
In drm_vblank_off() pull up event_lock from underneath vbl_lock. Hold the event_lock across the whole operation to make sure we only send out the events that were on the queue when we disabled the interrupt, and not ones that got added just after (assuming drm_vblank_on() already managed to get called somewhere between).
To sort the other deadlock pull the event_lock out from drm_handle_vblank_events() into drm_handle_vblank() to be taken outside vblank_time_lock. Add the appropriate assert_spin_locked() to drm_handle_vblank_events().
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 47 +++++++++++++++++++++++++---------------------- 1 file changed, 25 insertions(+), 22 deletions(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index b4460bf..9353609 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1037,14 +1037,25 @@ void drm_vblank_off(struct drm_device *dev, int crtc) unsigned long irqflags; unsigned int seq;
- spin_lock_irqsave(&dev->vbl_lock, irqflags); + spin_lock_irqsave(&dev->event_lock, irqflags); + + spin_lock(&dev->vbl_lock); vblank_disable_and_save(dev, crtc); wake_up(&vblank->queue);
+ /* + * Prevent subsequent drm_vblank_get() from re-enabling + * the vblank interrupt by bumping the refcount. + */ + if (!vblank->inmodeset) { + atomic_inc(&vblank->refcount); + vblank->inmodeset = 1; + } + spin_unlock(&dev->vbl_lock); + /* Send any queued vblank events, lest the natives grow disquiet */ seq = drm_vblank_count_and_time(dev, crtc, &now);
- spin_lock(&dev->event_lock); list_for_each_entry_safe(e, t, &dev->vblank_event_list, base.link) { if (e->pipe != crtc) continue; @@ -1055,18 +1066,7 @@ void drm_vblank_off(struct drm_device *dev, int crtc) drm_vblank_put(dev, e->pipe); send_vblank_event(dev, e, seq, &now); } - spin_unlock(&dev->event_lock); - - /* - * Prevent subsequent drm_vblank_get() from re-enabling - * the vblank interrupt by bumping the refcount. - */ - if (!vblank->inmodeset) { - atomic_inc(&vblank->refcount); - vblank->inmodeset = 1; - } - - spin_unlock_irqrestore(&dev->vbl_lock, irqflags); + spin_unlock_irqrestore(&dev->event_lock, irqflags); } EXPORT_SYMBOL(drm_vblank_off);
@@ -1446,12 +1446,11 @@ static void drm_handle_vblank_events(struct drm_device *dev, int crtc) { struct drm_pending_vblank_event *e, *t; struct timeval now; - unsigned long flags; unsigned int seq;
- seq = drm_vblank_count_and_time(dev, crtc, &now); + assert_spin_locked(&dev->event_lock);
- spin_lock_irqsave(&dev->event_lock, flags); + seq = drm_vblank_count_and_time(dev, crtc, &now);
list_for_each_entry_safe(e, t, &dev->vblank_event_list, base.link) { if (e->pipe != crtc) @@ -1467,8 +1466,6 @@ static void drm_handle_vblank_events(struct drm_device *dev, int crtc) send_vblank_event(dev, e, seq, &now); }
- spin_unlock_irqrestore(&dev->event_lock, flags); - trace_drm_vblank_event(crtc, seq); }
@@ -1491,15 +1488,18 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) if (!dev->num_crtcs) return false;
+ spin_lock_irqsave(&dev->event_lock, irqflags); + /* Need timestamp lock to prevent concurrent execution with * vblank enable/disable, as this would cause inconsistent * or corrupted timestamps and vblank counts. */ - spin_lock_irqsave(&dev->vblank_time_lock, irqflags); + spin_lock(&dev->vblank_time_lock);
/* Vblank irq handling disabled. Nothing to do. */ if (!vblank->enabled) { - spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); + spin_unlock(&dev->vblank_time_lock); + spin_unlock_irqrestore(&dev->event_lock, irqflags); return false; }
@@ -1539,10 +1539,13 @@ bool drm_handle_vblank(struct drm_device *dev, int crtc) crtc, (int) diff_ns); }
+ spin_unlock(&dev->vblank_time_lock); + wake_up(&vblank->queue); drm_handle_vblank_events(dev, crtc);
- spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags); + spin_unlock_irqrestore(&dev->event_lock, irqflags); + return true; } EXPORT_SYMBOL(drm_handle_vblank);
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently it's possible that the following will happen: 1. drm_wait_vblank() calls drm_vblank_get() 2. drm_vblank_off() gets called 3. drm_wait_vblank() calls drm_queue_vblank_event() which adds the event to the queue event though vblank interrupts are currently disabled (and may not be re-enabled ever again).
To fix the problem, add another vblank->enabled check into drm_queue_vblank_event().
drm_vblank_off() holds event_lock around the vblank disable, so no further locking needs to be added to drm_queue_vblank_event(). vblank disable from another source is not possible since drm_wait_vblank() already holds a vblank reference.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 9353609..b2428cb 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1270,6 +1270,7 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe, union drm_wait_vblank *vblwait, struct drm_file *file_priv) { + struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; struct drm_pending_vblank_event *e; struct timeval now; unsigned long flags; @@ -1293,6 +1294,18 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe,
spin_lock_irqsave(&dev->event_lock, flags);
+ /* + * drm_vblank_off() might have been called after we called + * drm_vblank_get(). drm_vblank_off() holds event_lock + * around the vblank disable, so no need for further locking. + * The reference from drm_vblank_get() protects against + * vblank disable from another source. + */ + if (!vblank->enabled) { + ret = -EINVAL; + goto err_unlock; + } + if (file_priv->event_space < sizeof e->event) { ret = -EBUSY; goto err_unlock;
On Wed, Aug 06, 2014 at 02:49:52PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently it's possible that the following will happen:
- drm_wait_vblank() calls drm_vblank_get()
- drm_vblank_off() gets called
- drm_wait_vblank() calls drm_queue_vblank_event() which adds the event to the queue event though vblank interrupts are currently disabled (and may not be re-enabled ever again).
To fix the problem, add another vblank->enabled check into drm_queue_vblank_event().
drm_vblank_off() holds event_lock around the vblank disable, so no further locking needs to be added to drm_queue_vblank_event(). vblank disable from another source is not possible since drm_wait_vblank() already holds a vblank reference.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
I guess the window is too small here to make this reproducible in a test? Especially since each attempt will take a few hundred ms ... -Daniel
drivers/gpu/drm/drm_irq.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 9353609..b2428cb 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1270,6 +1270,7 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe, union drm_wait_vblank *vblwait, struct drm_file *file_priv) {
- struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; struct drm_pending_vblank_event *e; struct timeval now; unsigned long flags;
@@ -1293,6 +1294,18 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe,
spin_lock_irqsave(&dev->event_lock, flags);
- /*
* drm_vblank_off() might have been called after we called
* drm_vblank_get(). drm_vblank_off() holds event_lock
* around the vblank disable, so no need for further locking.
* The reference from drm_vblank_get() protects against
* vblank disable from another source.
*/
- if (!vblank->enabled) {
ret = -EINVAL;
goto err_unlock;
- }
- if (file_priv->event_space < sizeof e->event) { ret = -EBUSY; goto err_unlock;
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Wed, Aug 06, 2014 at 03:23:01PM +0200, Daniel Vetter wrote:
On Wed, Aug 06, 2014 at 02:49:52PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently it's possible that the following will happen:
- drm_wait_vblank() calls drm_vblank_get()
- drm_vblank_off() gets called
- drm_wait_vblank() calls drm_queue_vblank_event() which adds the event to the queue event though vblank interrupts are currently disabled (and may not be re-enabled ever again).
To fix the problem, add another vblank->enabled check into drm_queue_vblank_event().
drm_vblank_off() holds event_lock around the vblank disable, so no further locking needs to be added to drm_queue_vblank_event(). vblank disable from another source is not possible since drm_wait_vblank() already holds a vblank reference.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
I guess the window is too small here to make this reproducible in a test?
I must admit that I didn't even try. I supposed I could try it now.
Especially since each attempt will take a few hundred ms ... -Daniel
drivers/gpu/drm/drm_irq.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 9353609..b2428cb 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1270,6 +1270,7 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe, union drm_wait_vblank *vblwait, struct drm_file *file_priv) {
- struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; struct drm_pending_vblank_event *e; struct timeval now; unsigned long flags;
@@ -1293,6 +1294,18 @@ static int drm_queue_vblank_event(struct drm_device *dev, int pipe,
spin_lock_irqsave(&dev->event_lock, flags);
- /*
* drm_vblank_off() might have been called after we called
* drm_vblank_get(). drm_vblank_off() holds event_lock
* around the vblank disable, so no need for further locking.
* The reference from drm_vblank_get() protects against
* vblank disable from another source.
*/
- if (!vblank->enabled) {
ret = -EINVAL;
goto err_unlock;
- }
- if (file_priv->event_space < sizeof e->event) { ret = -EBUSY; goto err_unlock;
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
From: Ville Syrjälä ville.syrjala@linux.intel.com
Make drm_vblank_put() disable the vblank interrupt immediately when the refcount drops to zero and drm_vblank_offdelay<0.
v2: Preserve the current drm_vblank_offdelay==0 'never disable' behaviur
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- Documentation/DocBook/drm.tmpl | 1 + drivers/gpu/drm/drm_drv.c | 4 ++-- drivers/gpu/drm/drm_irq.c | 11 +++++++---- include/drm/drmP.h | 2 +- 4 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 9727594..1cbe2a0 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -3386,6 +3386,7 @@ void (*disable_vblank) (struct drm_device *dev, int crtc);</synopsis> by scheduling a timer. The delay is accessible through the vblankoffdelay module parameter or the <varname>drm_vblank_offdelay</varname> global variable and expressed in milliseconds. Its default value is 5000 ms. + Zero means never disable, and a negative value means disable immediately. </para> <para> When a vertical blanking interrupt occurs drivers only need to call the diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 92bc6b1..db03e16 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -39,7 +39,7 @@ unsigned int drm_debug = 0; /* 1 to enable debug output */ EXPORT_SYMBOL(drm_debug);
-unsigned int drm_vblank_offdelay = 5000; /* Default to 5000 msecs. */ +int drm_vblank_offdelay = 5000; /* Default to 5000 msecs. */
unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */
@@ -53,7 +53,7 @@ MODULE_AUTHOR(CORE_AUTHOR); MODULE_DESCRIPTION(CORE_DESC); MODULE_LICENSE("GPL and additional rights"); MODULE_PARM_DESC(debug, "Enable debug output"); -MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]"); +MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs] (0: never disable, <0: disable immediately)"); MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]"); MODULE_PARM_DESC(timestamp_monotonic, "Use monotonic timestamps");
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index b2428cb..99145c4 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -993,10 +993,13 @@ void drm_vblank_put(struct drm_device *dev, int crtc) BUG_ON(atomic_read(&vblank->refcount) == 0);
/* Last user schedules interrupt disable */ - if (atomic_dec_and_test(&vblank->refcount) && - (drm_vblank_offdelay > 0)) - mod_timer(&vblank->disable_timer, - jiffies + ((drm_vblank_offdelay * HZ)/1000)); + if (atomic_dec_and_test(&vblank->refcount)) { + if (drm_vblank_offdelay < 0) + vblank_disable_fn((unsigned long)vblank); + else if (drm_vblank_offdelay > 0) + mod_timer(&vblank->disable_timer, + jiffies + ((drm_vblank_offdelay * HZ)/1000)); + } } EXPORT_SYMBOL(drm_vblank_put);
diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 895e166..cdcc60b 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1359,7 +1359,7 @@ extern void drm_put_dev(struct drm_device *dev); extern void drm_unplug_dev(struct drm_device *dev); extern unsigned int drm_debug;
-extern unsigned int drm_vblank_offdelay; +extern int drm_vblank_offdelay; extern unsigned int drm_timestamp_precision; extern unsigned int drm_timestamp_monotonic;
From: Ville Syrjälä ville.syrjala@linux.intel.com
Add a flag to drm_device which will cause the vblank code to bypass the disable timer and always disable the vblank interrupt immediately when the last reference is dropped.
v2: Add some notes about the flag to the kernel doc
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- Documentation/DocBook/drm.tmpl | 6 ++++++ drivers/gpu/drm/drm_irq.c | 2 +- include/drm/drmP.h | 10 ++++++++++ 3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/Documentation/DocBook/drm.tmpl b/Documentation/DocBook/drm.tmpl index 1cbe2a0..3ec4b2b 100644 --- a/Documentation/DocBook/drm.tmpl +++ b/Documentation/DocBook/drm.tmpl @@ -3387,6 +3387,12 @@ void (*disable_vblank) (struct drm_device *dev, int crtc);</synopsis> module parameter or the <varname>drm_vblank_offdelay</varname> global variable and expressed in milliseconds. Its default value is 5000 ms. Zero means never disable, and a negative value means disable immediately. + Drivers may override the behaviour by setting the + <structname>drm_device</structname> + <structfield>vblank_disable_immediate</structfield> flag, which when set + causes vblank interrupts to be disabled immediately regardless of the + drm_vblank_offdelay value. The flag should only be set if there's a + properly working hardware vblank counter present. </para> <para> When a vertical blanking interrupt occurs drivers only need to call the diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 99145c4..8dbcc3f 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -994,7 +994,7 @@ void drm_vblank_put(struct drm_device *dev, int crtc)
/* Last user schedules interrupt disable */ if (atomic_dec_and_test(&vblank->refcount)) { - if (drm_vblank_offdelay < 0) + if (dev->vblank_disable_immediate || drm_vblank_offdelay < 0) vblank_disable_fn((unsigned long)vblank); else if (drm_vblank_offdelay > 0) mod_timer(&vblank->disable_timer, diff --git a/include/drm/drmP.h b/include/drm/drmP.h index cdcc60b..3c22c35 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1077,6 +1077,16 @@ struct drm_device { */ bool vblank_disable_allowed;
+ /* + * If true, vblank interrupt will be disabled immediately when the + * refcount drops to zero, as opposed to via the vblank disable + * timer. + * This can be set to true it the hardware has a working vblank + * counter and the driver uses drm_vblank_on() and drm_vblank_off() + * appropriately. + */ + bool vblank_disable_immediate; + /* array of size num_crtcs */ struct drm_vblank_crtc *vblank;
From: Ville Syrjälä ville.syrjala@linux.intel.com
Now that the vblank races are plugged, we can opt out of using the vblank disable timer and just let vblank interrupts get disabled immediately when the last reference is dropped.
Gen2 is the exception since it has no hardware frame counter.
Reviewed-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/i915/i915_irq.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 87abe86..9fdf738 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -4682,6 +4682,14 @@ void intel_irq_init(struct drm_device *dev) dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */ }
+ /* + * Opt out of the vblank disable timer on everything except gen2. + * Gen2 doesn't have a hardware frame counter and so depends on + * vblank interrupts to produce sane vblank seuquence numbers. + */ + if (!IS_GEN2(dev)) + dev->vblank_disable_immediate = true; + if (drm_core_check_feature(dev, DRIVER_MODESET)) { dev->driver->get_vblank_timestamp = i915_get_vblank_timestamp; dev->driver->get_scanout_position = i915_get_crtc_scanoutpos;
From: Ville Syrjälä ville.syrjala@linux.intel.com
If the user is interested in getting accurate vblank sequence numbers all the time they may disable the vblank disable timer entirely. In that case it seems appropriate to kick start the vblank interrupts already from drm_vblank_on().
v2: Adapt to the drm_vblank_offdelay ==0 vs <0 changes
Reviewed-by: Matt Roper matthew.d.roper@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 8dbcc3f..af33df1 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -1126,9 +1126,12 @@ void drm_vblank_on(struct drm_device *dev, int crtc) vblank->last = (dev->driver->get_vblank_counter(dev, crtc) - 1) & dev->max_vblank_count; - - /* re-enable interrupts if there's are users left */ - if (atomic_read(&vblank->refcount) != 0) + /* + * re-enable interrupts if there are users left, or the + * user wishes vblank interrupts to be enabled all the time. + */ + if (atomic_read(&vblank->refcount) != 0 || + (!dev->vblank_disable_immediate && drm_vblank_offdelay == 0)) WARN_ON(drm_vblank_enable(dev, crtc)); spin_unlock_irqrestore(&dev->vbl_lock, irqflags); }
From: Ville Syrjälä ville.syrjala@linux.intel.com
If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice.
This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable<->enable transition during the same frame which would all attempt to update the timestamp with the latest estimate.
Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", crtc, diff);
+ if (diff == 0) + return; + /* Reinitialize corresponding vblank timestamp if high-precision query * available. Skip this step if query unsupported or failed. Will * reinitialize delayed at next vblank interrupt in that case.
On Wed, Aug 06, 2014 at 02:49:57PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice.
This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable<->enable transition during the same frame which would all attempt to update the timestamp with the latest estimate.
Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch
drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", crtc, diff);
- if (diff == 0)
return;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Wed, Aug 06, 2014 at 02:56:14PM +0200, Daniel Vetter wrote:
On Wed, Aug 06, 2014 at 02:49:57PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice.
This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable<->enable transition during the same frame which would all attempt to update the timestamp with the latest estimate.
Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch
On second consideration this seems to just paper over drivers that enable vblanks a few too many times. Or a bug in the vblank_get handling in drm_irq.c. If it's just that I think we should just tighten up the checks to catch that and drop this patch here. -Daniel
drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", crtc, diff);
- if (diff == 0)
return;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
I thought about this one again and opposed to my previous comment now think it's fine, also for drivers without hw vblank counter queries.
-mario
On Wed, Aug 6, 2014 at 1:49 PM, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice.
This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable<->enable transition during the same frame which would all attempt to update the timestamp with the latest estimate.
Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", crtc, diff);
if (diff == 0)
return;
/* Reinitialize corresponding vblank timestamp if high-precision
query * available. Skip this step if query unsupported or failed. Will * reinitialize delayed at next vblank interrupt in that case. -- 1.8.5.5
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those "email hates me" weeks.
-mario
On 08/06/2014 01:49 PM, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
If we already have a timestamp for the current vblank counter, don't update it with a new timestmap. Small errors can creep in between two timestamp queries for the same vblank count, which could be confusing to userspace when it queries the timestamp for the same vblank sequence number twice.
This problem gets exposed when the vblank disable timer is not used (or is set to expire quickly) and thus we can get multiple vblank disable<->enable transition during the same frame which would all attempt to update the timestamp with the latest estimate.
Testcase: igt/kms_flip/flip-vs-expired-vblank Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/drm_irq.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index af33df1..0523f5b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -106,6 +106,9 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", crtc, diff);
- if (diff == 0)
return;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those "email hates me" weeks.
Oh dear, I've made a decent mess of all of this really. Picked up to make sure it doesn't get lost again. -Daniel
On Mon, 15 Sep 2014, Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those "email hates me" weeks.
Oh dear, I've made a decent mess of all of this really. Picked up to make sure it doesn't get lost again.
After all this nice ping pong our QA has reported a bisected regression on this commit: https://bugs.freedesktop.org/show_bug.cgi?id=84161
BR, Jani.
On Tue, Sep 23, 2014 at 03:48:25PM +0300, Jani Nikula wrote:
On Mon, 15 Sep 2014, Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those "email hates me" weeks.
Oh dear, I've made a decent mess of all of this really. Picked up to make sure it doesn't get lost again.
After all this nice ping pong our QA has reported a bisected regression on this commit: https://bugs.freedesktop.org/show_bug.cgi?id=84161
Looks like a minuscule timing change which resulted in us detecting a fifo underrun. Or at least I don't see any other related information that would indicate otherwise ... -Daniel
On 23/09/14 15:51, Daniel Vetter wrote:
On Tue, Sep 23, 2014 at 03:48:25PM +0300, Jani Nikula wrote:
On Mon, 15 Sep 2014, Daniel Vetter daniel@ffwll.ch wrote:
On Sat, Sep 13, 2014 at 06:25:54PM +0200, Mario Kleiner wrote:
The current drm-next misses Ville's original Patch 14/19, the one i first objected, then objected to my objection. It is needed to avoid actual regressions. Attached a trivially rebased (v2) of Ville's patch to go on top of drm-next, also as tgz in case my e-mail client mangles the patch again, because it's one of those "email hates me" weeks.
Oh dear, I've made a decent mess of all of this really. Picked up to make sure it doesn't get lost again.
After all this nice ping pong our QA has reported a bisected regression on this commit: https://bugs.freedesktop.org/show_bug.cgi?id=84161
Looks like a minuscule timing change which resulted in us detecting a fifo underrun. Or at least I don't see any other related information that would indicate otherwise ... -Daniel
There's nothing in that code path which could cause this - except for altered execution timing. I've seen that warning as well on my Intel HD Ironlake Mobile (MBP 2010), but only spuriously when plugging/unplugging an external display into the laptop iirc, so i thought it would be unrelated.
-mario
From: Ville Syrjälä ville.syrjala@linux.intel.com
We should update the last in drm_update_vblank_count() to avoid applying the diff more than once. This could occur eg. if drm_vblank_off() gets called multiple times for the crtc.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 0523f5b..67507a4 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -109,6 +109,8 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return;
+ vblank->last = cur_vblank; + /* Reinitialize corresponding vblank timestamp if high-precision query * available. Skip this step if query unsupported or failed. Will * reinitialize delayed at next vblank interrupt in that case.
On Wed, Aug 06, 2014 at 02:49:58PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
We should update the last in drm_update_vblank_count() to avoid applying the diff more than once. This could occur eg. if drm_vblank_off() gets called multiple times for the crtc.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we update ->last when disabling the vblank and use it when re-enabling it. Those calls should be symmetric, except for driver bugs. Imo would be better to tighten up the checks for that.
Or do I completely misunderstand what's going on here? -Daniel
drivers/gpu/drm/drm_irq.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 0523f5b..67507a4 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -109,6 +109,8 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return;
- vblank->last = cur_vblank;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Wed, Aug 06, 2014 at 03:08:25PM +0200, Daniel Vetter wrote:
On Wed, Aug 06, 2014 at 02:49:58PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
We should update the last in drm_update_vblank_count() to avoid applying the diff more than once. This could occur eg. if drm_vblank_off() gets called multiple times for the crtc.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we update ->last when disabling the vblank and use it when re-enabling it. Those calls should be symmetric, except for driver bugs. Imo would be better to tighten up the checks for that.
Or do I completely misunderstand what's going on here?
The issue is that we want to call drm_vblank_off() in intel_sanitize_crtc() for already disabled crtcs in order to to bump the refcount and set inmodeset=1 so that drm_vblank_get() won't try to enable vblank interrupts if someone calls drm_vblank_get() on it. So during suspend+resume we can get two drm_vblank_off() calls for the same crtc w/o a drm_vblank_on() in between.
Although this patch doesn't actually fix the problem really since vblank counter query during sanitize will just return 0 because the pipe isn't active, so there's a later patch to just override .last=0 in intel_sanitize_crtc(). Another approach might be to set .last=0 in drm_vblank_off() itself (after vblank_disable_and_save()), mainly just to hide the details from the caller.
-Daniel
drivers/gpu/drm/drm_irq.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 0523f5b..67507a4 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -109,6 +109,8 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return;
- vblank->last = cur_vblank;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Wed, Aug 06, 2014 at 04:30:29PM +0300, Ville Syrjälä wrote:
On Wed, Aug 06, 2014 at 03:08:25PM +0200, Daniel Vetter wrote:
On Wed, Aug 06, 2014 at 02:49:58PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
We should update the last in drm_update_vblank_count() to avoid applying the diff more than once. This could occur eg. if drm_vblank_off() gets called multiple times for the crtc.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we update ->last when disabling the vblank and use it when re-enabling it. Those calls should be symmetric, except for driver bugs. Imo would be better to tighten up the checks for that.
Or do I completely misunderstand what's going on here?
The issue is that we want to call drm_vblank_off() in intel_sanitize_crtc() for already disabled crtcs in order to to bump the refcount and set inmodeset=1 so that drm_vblank_get() won't try to enable vblank interrupts if someone calls drm_vblank_get() on it. So during suspend+resume we can get two drm_vblank_off() calls for the same crtc w/o a drm_vblank_on() in between.
Although this patch doesn't actually fix the problem really since vblank counter query during sanitize will just return 0 because the pipe isn't active, so there's a later patch to just override .last=0 in intel_sanitize_crtc(). Another approach might be to set .last=0 in drm_vblank_off() itself (after vblank_disable_and_save()), mainly just to hide the details from the caller.
Resetting ->last (if it works, need to be careful with drivers calling _off multiple times perhaps) sounds like the much simpler option, and probably does what everyone expects anyway. -Daniel
-Daniel
drivers/gpu/drm/drm_irq.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 0523f5b..67507a4 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -109,6 +109,8 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) if (diff == 0) return;
- vblank->last = cur_vblank;
- /* Reinitialize corresponding vblank timestamp if high-precision query
- available. Skip this step if query unsupported or failed. Will
- reinitialize delayed at next vblank interrupt in that case.
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
-- Ville Syrjälä Intel OTC
From: Ville Syrjälä ville.syrjala@linux.intel.com
During vblank disable the code tries to guess based on the timestamps whether we just missed one vblank or not. And if so it increments the counter. However it forgets to store the new timestamp to the approriate slot in our timestamp ring buffer. So anyone querying the timestamp for the resulting sequence number would get a stale timestamp. Fix it up by storing the new timestamp.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 67507a4..e927e5f 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -203,6 +203,13 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) * hope for the best. */ if ((vblrc > 0) && (abs64(diff_ns) > 1000000)) { + /* Store new timestamp in ringbuffer. */ + vblanktimestamp(dev, crtc, vblcount + 1) = tvblank; + + /* Increment cooked vblank count. This also atomically commits + * the timestamp computed above. + */ + smp_mb__before_atomic(); atomic_inc(&vblank->count); smp_mb__after_atomic(); }
On Wed, Aug 06, 2014 at 02:49:59PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
During vblank disable the code tries to guess based on the timestamps whether we just missed one vblank or not. And if so it increments the counter. However it forgets to store the new timestamp to the approriate slot in our timestamp ring buffer. So anyone querying the timestamp for the resulting sequence number would get a stale timestamp. Fix it up by storing the new timestamp.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/drm_irq.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index 67507a4..e927e5f 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -203,6 +203,13 @@ static void vblank_disable_and_save(struct drm_device *dev, int crtc) * hope for the best. */ if ((vblrc > 0) && (abs64(diff_ns) > 1000000)) {
We should use DRM_REDUNDANT_VBLIRQ_THRESH_NS here for symmtry. With that addressed this is Reviewed-by: Daniel Vetter daniel.vetter@ffwll.ch
/* Store new timestamp in ringbuffer. */
vblanktimestamp(dev, crtc, vblcount + 1) = tvblank;
/* Increment cooked vblank count. This also atomically commits
* the timestamp computed above.
*/
atomic_inc(&vblank->count); smp_mb__after_atomic(); }smp_mb__before_atomic();
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
From: Ville Syrjälä ville.syrjala@linux.intel.com
We call drm_vblank_off() during crtc sanitation to make sure the software and hardware states agree. However drm_vblank_off() will try to update the vblank timestamp and sequence number which lands us in some trouble.
As the pipe is disabled the hardware frame counter query will return 0. If the .last doesn't match the code will try to add the difference to the user visible sequence number. During driver init that's OK as .last == 0, but during resume .last may be anything. So we should make sure we don't try to apply the diff here by zeroing .last beforehand. Otherwise there maybe be a random jump in the user visible sequence number across suspend.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/i915/intel_display.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ae5f20d..c00bcd0 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12918,8 +12918,11 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc) /* restore vblank interrupts to correct state */ if (crtc->active) drm_vblank_on(dev, crtc->pipe); - else + else { + /* avoid random jumps in seq/ts */ + dev->vblank[crtc->pipe].last = 0; drm_vblank_off(dev, crtc->pipe); + }
/* We need to sanitize the plane -> pipe mapping first because this will * disable the crtc (and hence change the state) if it is wrong. Note
On Wed, Aug 06, 2014 at 02:50:00PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
We call drm_vblank_off() during crtc sanitation to make sure the software and hardware states agree. However drm_vblank_off() will try to update the vblank timestamp and sequence number which lands us in some trouble.
As the pipe is disabled the hardware frame counter query will return 0. If the .last doesn't match the code will try to add the difference to the user visible sequence number. During driver init that's OK as .last == 0, but during resume .last may be anything. So we should make sure we don't try to apply the diff here by zeroing .last beforehand. Otherwise there maybe be a random jump in the user visible sequence number across suspend.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/i915/intel_display.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ae5f20d..c00bcd0 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12918,8 +12918,11 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc) /* restore vblank interrupts to correct state */ if (crtc->active) drm_vblank_on(dev, crtc->pipe);
- else
- else {
/* avoid random jumps in seq/ts */
dev->vblank[crtc->pipe].last = 0;
Should we have a drm_vblank_reset for this? I guess other drivers will have similar issues with "sorry, just lost it all" kind of transitions.
Also this case should only happen when we enter the system resume with the pipes in dpms off state. In that case we should have sampled something in drm_vblank_off already and resampling doesn't look like a good idea.
Do we instead need some protection in drm_vblank_off to avoid re-sampling? -Daniel
drm_vblank_off(dev, crtc->pipe);
}
/* We need to sanitize the plane -> pipe mapping first because this will
- disable the crtc (and hence change the state) if it is wrong. Note
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Wed, Aug 06, 2014 at 03:30:17PM +0200, Daniel Vetter wrote:
On Wed, Aug 06, 2014 at 02:50:00PM +0300, ville.syrjala@linux.intel.com wrote:
From: Ville Syrjälä ville.syrjala@linux.intel.com
We call drm_vblank_off() during crtc sanitation to make sure the software and hardware states agree. However drm_vblank_off() will try to update the vblank timestamp and sequence number which lands us in some trouble.
As the pipe is disabled the hardware frame counter query will return 0. If the .last doesn't match the code will try to add the difference to the user visible sequence number. During driver init that's OK as .last == 0, but during resume .last may be anything. So we should make sure we don't try to apply the diff here by zeroing .last beforehand. Otherwise there maybe be a random jump in the user visible sequence number across suspend.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/i915/intel_display.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ae5f20d..c00bcd0 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12918,8 +12918,11 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc) /* restore vblank interrupts to correct state */ if (crtc->active) drm_vblank_on(dev, crtc->pipe);
- else
- else {
/* avoid random jumps in seq/ts */
dev->vblank[crtc->pipe].last = 0;
Should we have a drm_vblank_reset for this? I guess other drivers will have similar issues with "sorry, just lost it all" kind of transitions.
Also this case should only happen when we enter the system resume with the pipes in dpms off state. In that case we should have sampled something in drm_vblank_off already and resampling doesn't look like a good idea.
Do we instead need some protection in drm_vblank_off to avoid re-sampling?
Yeah, I suppose I could try to fix it in drm vblank code somehow. Just resetting .last=0 in drm_vblank_off() would avoid the seq jump but would still leave the vblank counter query in place. Admittedly the resulting debug spam about vblank counter queries on disabled crtcs is rather annoying.
-Daniel
drm_vblank_off(dev, crtc->pipe);
}
/* We need to sanitize the plane -> pipe mapping first because this will
- disable the crtc (and hence change the state) if it is wrong. Note
-- 1.8.5.5
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
From: Ville Syrjälä ville.syrjala@linux.intel.com
update_scanline_offset() in intel_sanitize_crtc() was supposed to be called only for active crtcs. But due to some underrun patches it now gets updated for all crtcs on gmch platforms.
Move the update_scanline_offset() to the very beginning of intel_sanitize_crtc() where we update the vblank state. This seems like a better place anyway since the scanline offset ought to be up to date before we might need to consult it. So before any vblanky stuff happens.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/i915/intel_display.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index c00bcd0..9403b2f 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -12916,9 +12916,10 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc) I915_WRITE(reg, I915_READ(reg) & ~PIPECONF_FRAME_START_DELAY_MASK);
/* restore vblank interrupts to correct state */ - if (crtc->active) + if (crtc->active) { + update_scanline_offset(crtc); drm_vblank_on(dev, crtc->pipe); - else { + } else { /* avoid random jumps in seq/ts */ dev->vblank[crtc->pipe].last = 0; drm_vblank_off(dev, crtc->pipe); @@ -13020,8 +13021,6 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc) */ crtc->cpu_fifo_underrun_disabled = true; crtc->pch_fifo_underrun_disabled = true; - - update_scanline_offset(crtc); } }
From: Ville Syrjälä ville.syrjala@linux.intel.com
Now that drm_update_vblank_count() can be called even when we're not about to enable the vblank interrupts we shouldn't print debug messages stating otherwise.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/drm_irq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index e927e5f..87abca3 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -103,7 +103,7 @@ static void drm_update_vblank_count(struct drm_device *dev, int crtc) crtc, vblank->last, cur_vblank, diff); }
- DRM_DEBUG("enabling vblank interrupts on crtc %d, missed %d\n", + DRM_DEBUG("updating vblank count on crtc %d, missed %d\n", crtc, diff);
if (diff == 0)
From: Ville Syrjälä ville.syrjala@linux.intel.com
The kernel might mistakenly send out a zeroed vblank timestamp when the vblank wait gets terminated early due to crtc disable. Add an assertion to catch that.
Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- tests/kms_flip.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/tests/kms_flip.c b/tests/kms_flip.c index 92f4eb5..e8a0b39 100644 --- a/tests/kms_flip.c +++ b/tests/kms_flip.c @@ -396,6 +396,9 @@ static int __wait_for_vblank(unsigned int flags, int crtc_idx, ret = drmWaitVBlank(drm_fd, &wait_vbl);
if (ret == 0) { + igt_assert_f(wait_vbl.reply.tval_sec != 0 || + wait_vbl.reply.tval_usec != 0, + "zeroed vblank timestamp\n"); reply->ts.tv_sec = wait_vbl.reply.tval_sec; reply->ts.tv_usec = wait_vbl.reply.tval_usec; reply->sequence = wait_vbl.reply.sequence;
dri-devel@lists.freedesktop.org