[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there.
I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
/* Reset the chip */
/* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL);
/* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold: 1. Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks. 2. -RT still needs to prevent scheduling there. 3. the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
Does it have something to do with this dump?
Not sure [but I didn't realize the query was regarding 3.0].
[ 3.932060] ------------[ cut here ]------------ [ 3.936809] WARNING: at /home/rostedt/work/git/linux-rt.git/drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x4d/0x50 [i915]() [ 3.949229] Hardware name: HP Compaq Pro 6300 SFF [ 3.953988] Modules linked in: i915(+) video i2c_algo_bit drm_kms_helper drm i2c_core [ 3.961943] Pid: 220, comm: udevd Not tainted 3.0.89-test-rt117 #117 [ 3.968343] Call Trace: [ 3.970851] [<ffffffff810671bf>] warn_slowpath_common+0x7f/0xc0 [ 3.970852] [<ffffffff8106721a>] warn_slowpath_null+0x1a/0x20 [ 3.970857] [<ffffffffa006569d>] gen6_gt_force_wake_get+0x4d/0x50 [i915] [ 3.970865] [<ffffffffa00962ab>] ivybridge_init_clock_gating+0xcb/0x2f0 [i915] [ 3.970870] [<ffffffffa0090209>] ? intel_crtc_disable+0x29/0x60 [i915] [ 3.970877] [<ffffffffa009f741>] intel_modeset_init+0x751/0x10e0 [i915] [ 3.970882] [<ffffffffa006a158>] i915_driver_load+0xfc8/0x17f0 [i915] [ 3.970885] [<ffffffff8137909e>] ? device_register+0x1e/0x30 [ 3.970892] [<ffffffffa001f596>] ? drm_sysfs_device_add+0x86/0xb0 [drm] [ 3.970896] [<ffffffffa001b8b3>] ? drm_get_minor+0x233/0x300 [drm] [ 3.970900] [<ffffffffa001dd49>] drm_get_pci_dev+0x189/0x2a0 [drm] [ 3.970902] [<ffffffff8105e31b>] ? migrate_enable+0x8b/0x1c0 [ 3.970910] [<ffffffffa00c5f27>] i915_pci_probe+0x1b/0x1d [i915] [ 3.970913] [<ffffffff812c5b6c>] local_pci_probe+0x5c/0xd0 [ 3.970915] [<ffffffff812c73f9>] pci_device_probe+0x109/0x130 [ 3.970917] [<ffffffff8137bc6c>] driver_probe_device+0x9c/0x2b0 [ 3.970918] [<ffffffff8137bf2b>] __driver_attach+0xab/0xb0 [ 3.970919] [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0 [ 3.970920] [<ffffffff8137be80>] ? driver_probe_device+0x2b0/0x2b0 [ 3.970921] [<ffffffff8137aa34>] bus_for_each_dev+0x64/0xa0 [ 3.970923] [<ffffffff8137b87e>] driver_attach+0x1e/0x20 [ 3.970924] [<ffffffff8137b490>] bus_add_driver+0x1b0/0x290 [ 3.970925] [<ffffffff8137c4a6>] driver_register+0x76/0x140 [ 3.970927] [<ffffffff812c70a2>] __pci_register_driver+0x82/0x100 [ 3.970930] [<ffffffff815b1f6d>] ? notifier_call_chain+0x4d/0x70 [ 3.970934] [<ffffffffa001df7a>] drm_pci_init+0x11a/0x130 [drm] [ 3.970935] [<ffffffffa00f4000>] ? 0xffffffffa00f3fff [ 3.970940] [<ffffffffa00f408b>] i915_init+0x8b/0x8d [i915] [ 3.970943] [<ffffffff81002040>] do_one_initcall+0x40/0x180 [ 3.970946] [<ffffffff810aa4ae>] sys_init_module+0xbe/0x230 [ 3.970948] [<ffffffff815b5a82>] system_call_fastpath+0x16/0x1b [ 3.970949] ---[ end trace 0000000000000002 ]--- [ 4.164458] ------------[ cut here ]------------
-- Steve
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there.
I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there.
I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
On 09/17/2013 04:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there.
I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
Ouch. But thanks for clarifying that.
Ok, so register access needs to be serialized. And a separate but related concern is that gen6+ resets also need to hold-off register access where forcewake is required.
While I was reviewing the registers that require forcewake handling, I saw this:
from i915_reg.h: #define _DPLL_A (dev_priv->info->display_mmio_offset + 0x6014) #define _DPLL_B (dev_priv->info->display_mmio_offset + 0x6018)
from i915_drv.c: static const struct intel_device_info intel_valleyview_m_info = { GEN7_FEATURES, .is_mobile = 1, .num_pipes = 2, .is_valleyview = 1, .display_mmio_offset = VLV_DISPLAY_BASE, <<<------- .has_llc = 0, /* legal, last one wins */ };
from intel_uncore.c: #define NEEDS_FORCE_WAKE(dev_priv, reg) \ ((HAS_FORCE_WAKE((dev_priv)->dev)) && \ ((reg) < 0x40000) && \ ((reg) != FORCEWAKE))
Is this is a mistake or do the valleyview PLLs not require the same forcewake handling as the other intel gpus?
Regards, Peter Hurley
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
On Wed, Sep 18, 2013 at 6:52 PM, Peter Hurley peter@hurleysoftware.com wrote:
Ouch. But thanks for clarifying that.
Ok, so register access needs to be serialized. And a separate but related concern is that gen6+ resets also need to hold-off register access where forcewake is required.
While I was reviewing the registers that require forcewake handling, I saw this:
from i915_reg.h: #define _DPLL_A (dev_priv->info->display_mmio_offset + 0x6014) #define _DPLL_B (dev_priv->info->display_mmio_offset + 0x6018)
from i915_drv.c: static const struct intel_device_info intel_valleyview_m_info = { GEN7_FEATURES, .is_mobile = 1, .num_pipes = 2, .is_valleyview = 1, .display_mmio_offset = VLV_DISPLAY_BASE, <<<------- .has_llc = 0, /* legal, last one wins */ };
from intel_uncore.c: #define NEEDS_FORCE_WAKE(dev_priv, reg) \ ((HAS_FORCE_WAKE((dev_priv)->dev)) && \ ((reg) < 0x40000) && \ ((reg) != FORCEWAKE))
Is this is a mistake or do the valleyview PLLs not require the same forcewake handling as the other intel gpus?
The DPLL offset from the macro at 0x6000 is from older platforms which lacked forcewake and where the display block started already on 0x6000.
On recent big core platforms we have the north display block at 0x40000 (i.e. forcewake is only used for the rendering side of things). For those platforms the DPLL macro is called PCH_DPLL (and it's in the south display range starting at 0xc0000. VLV itself inherited the old display register blocks (mostly) but moved them all by the vlv display base offset.
We have quite a bit of fun with hw engineers moving display blocks around ;-)
Cheers, Daniel
On Wed, Sep 18, 2013 at 12:52:07PM -0400, Peter Hurley wrote:
On 09/17/2013 04:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
> The funny part is, there's a comment there that shows that this was > done even for "PREEMPT_RT". Unfortunately, the call to > "get_scanout_position()" can call functions that use the rt-mutex > "sleeping spin locks" and it breaks there. > > I guess we need to ask the authors of the mainline patch exactly why > that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
Ouch. But thanks for clarifying that.
Ok, so register access needs to be serialized. And a separate but related concern is that gen6+ resets also need to hold-off register access where forcewake is required.
While I was reviewing the registers that require forcewake handling, I saw this:
from i915_reg.h: #define _DPLL_A (dev_priv->info->display_mmio_offset + 0x6014) #define _DPLL_B (dev_priv->info->display_mmio_offset + 0x6018)
from i915_drv.c: static const struct intel_device_info intel_valleyview_m_info = { GEN7_FEATURES, .is_mobile = 1, .num_pipes = 2, .is_valleyview = 1, .display_mmio_offset = VLV_DISPLAY_BASE, <<<------- .has_llc = 0, /* legal, last one wins */ };
from intel_uncore.c: #define NEEDS_FORCE_WAKE(dev_priv, reg) \ ((HAS_FORCE_WAKE((dev_priv)->dev)) && \ ((reg) < 0x40000) && \ ((reg) != FORCEWAKE))
Is this is a mistake or do the valleyview PLLs not require the same forcewake handling as the other intel gpus?
Display registers shouldn't need forcewake on any platform. I guess our NEEDS_FORCE_WAKE() check is a bit too coarse and includes a bunch of stuff doesn't need to be there. So sort of by accident we do the right thing on VLV, and the "wrong" thing on other platforms.
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
The funny part is, there's a comment there that shows that this was done even for "PREEMPT_RT". Unfortunately, the call to "get_scanout_position()" can call functions that use the rt-mutex "sleeping spin locks" and it breaks there.
I guess we need to ask the authors of the mainline patch exactly why that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
Ok, so register access must be serialized, at least within a register block, no way around it. Thanks for explaining this. That makes me a bit depressed. Is this also true for older hw generations than gen6?
Daniel, could we move access to the display register block to a separate lock, so the I915_READ to PIPEDSL in i915_get_crtc_scanoutpos() can avoid the uncore.lock? If the display registers don't need forcewake then i assume we could have much shorter lock hold times by avoiding the large up to 4 msecs waits in forcewake handling. Probably also much less contention in normal operation when no modesetting happens? I think i can get rid of the other register reads in that function. Those settings are already cached and accessible from the intel_crtc_config->adjusted_mode.
Steven, would it then be acceptable to convert that "faster" lock into a raw_spinlock_t or is this unacceptable? If so, the preempt_disable() could stay, right?
I really would like to keep those preempt_disable()/enable() calls where they are if somehow possible - it makes usage of the timestamping under any kernel nicely plug & play.
If that's not possible i think i could work around it for at least my application under rt kernel, but it would be somewhat ugly and high maintenance, and wouldn't help other apps with similar needs, which don't know what knobs to tune - or that any tuning is suddenly needed on the latest kernels.
In my case i could set the drm.vblankoffdelay module parameter to zero to keep vblank irq's running all the time. That would leave the irq kernel thread for the gpu interrupt as the only caller of the scanout timestamping. Then one could set the priority of that irq thread to something very high and hope for the best. How to locate the proper irq thread(s) for any given gpu(s), what priority(s) to choose etc. and how to do this portably across various versions of various distros, i don't know. Doesn't look very user friendly, compared to the current solution, but not impossible.
In any case, maybe the simple retry 3x loop in the DRM for measuring the timestamp is not good enough anymore to keep this reliable and accurate. Maybe we would need some loop that retries until an accurate measurement or a user configurable timeout is reached. Then users like mine could set a very high timeout which rather totally degrades the system and causes severe stuttering of graphics updates than silently failing with inaccurate timestamps. The default could be something safe for normal desktop use.
Can i ask somebody with the hardware to give me an idea how big the error there is? At a high drm.debug setting, you should see this logged at each vblank interrupt, as seen in this DRM_DEBUG statement:
http://lxr.free-electrons.com/source/drivers/gpu/drm/drm_irq.c#L717
e.g., ....[e 4 us, 1 rep] ... for 1 repetitions with the lowest error being 4 usecs.
Especially while rendering a lot, so the uncore.lock gets something to do.
Thanks, -mario
On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
> The funny part is, there's a comment there that shows that this was > done even for "PREEMPT_RT". Unfortunately, the call to > "get_scanout_position()" can call functions that use the rt-mutex > "sleeping spin locks" and it breaks there. > > I guess we need to ask the authors of the mainline patch exactly why > that preempt_disable() is needed?
The drm core associates a timestamp with each vertical blank frame #. Drm drivers can optionally support a 'high resolution' hw timestamp. The vblank frame #/timestamp tuple is user-space visible.
The i915 drm driver supports a hw timestamp via this drm helper function which computes the timestamp from the crtc scan position (based on the pixel clock).
For mainline, the preempt_disable/_enable() isn't actually necessary because every call tree that leads here already has preemption disabled.
For -RT, the maybe i915 register spinlock (uncore.lock) should be raw?
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
Ok, so register access must be serialized, at least within a register block, no way around it. Thanks for explaining this. That makes me a bit depressed. Is this also true for older hw generations than gen6?
Daniel, could we move access to the display register block to a separate lock, so the I915_READ to PIPEDSL in i915_get_crtc_scanoutpos() can avoid the uncore.lock? If the display registers don't need forcewake then i assume we could have much shorter lock hold times by avoiding the large up to 4 msecs waits in forcewake handling. Probably also much less contention in normal operation when no modesetting happens? I think i can get rid of the other register reads in that function. Those settings are already cached and accessible from the intel_crtc_config->adjusted_mode.
I actually have a patch to kill most of the registers reads in get_scanout_position on i915 somewhere. Let me dig that out and send it to the list...
<snip>
In any case, maybe the simple retry 3x loop in the DRM for measuring the timestamp is not good enough anymore to keep this reliable and accurate. Maybe we would need some loop that retries until an accurate measurement or a user configurable timeout is reached. Then users like mine could set a very high timeout which rather totally degrades the system and causes severe stuttering of graphics updates than silently failing with inaccurate timestamps. The default could be something safe for normal desktop use.
I never really understood the need for the preempt disabled retry loop in drm core. What I would do is just something like this:
get_scanoutpos_and_timestamp() { local_irq_disable(); get_scanoutpos(); get_timestamp(); local_irq_enable(); }
If that has a lot of error, then it seems to me that the system is just crap and we shouldn't care. Or if you're really worried about SMIs and such then you could still do a retry loop.
On 23.09.13 10:38, Ville Syrjälä wrote:
On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote:
On Wed, 11 Sep 2013 11:16:43 -0400 Peter Hurley peter@hurleysoftware.com wrote:
>> The funny part is, there's a comment there that shows that this was >> done even for "PREEMPT_RT". Unfortunately, the call to >> "get_scanout_position()" can call functions that use the rt-mutex >> "sleeping spin locks" and it breaks there. >> >> I guess we need to ask the authors of the mainline patch exactly why >> that preempt_disable() is needed? > > > The drm core associates a timestamp with each vertical blank frame #. > Drm drivers can optionally support a 'high resolution' hw timestamp. > The vblank frame #/timestamp tuple is user-space visible. > > The i915 drm driver supports a hw timestamp via this drm helper function > which computes the timestamp from the crtc scan position (based on the > pixel clock). > > For mainline, the preempt_disable/_enable() isn't actually necessary > because every call tree that leads here already has preemption disabled. > > For -RT, the maybe i915 register spinlock (uncore.lock) should be raw? >
No, it should not. Note, any other lock that can be held when it is held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
And by taking a quick audit of the code, I see this:
spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); /* Reset the chip */ /* GEN6_GDRST is not in the gt power well, no need to check * for fifo space for the write or forcewake the chip for * the read */ __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); /* Spin waiting for the device to ack the reset request */ ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) &
GEN6_GRDOM_FULL) == 0, 500);
That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
Ok, so register access must be serialized, at least within a register block, no way around it. Thanks for explaining this. That makes me a bit depressed. Is this also true for older hw generations than gen6?
Daniel, could we move access to the display register block to a separate lock, so the I915_READ to PIPEDSL in i915_get_crtc_scanoutpos() can avoid the uncore.lock? If the display registers don't need forcewake then i assume we could have much shorter lock hold times by avoiding the large up to 4 msecs waits in forcewake handling. Probably also much less contention in normal operation when no modesetting happens? I think i can get rid of the other register reads in that function. Those settings are already cached and accessible from the intel_crtc_config->adjusted_mode.
I actually have a patch to kill most of the registers reads in get_scanout_position on i915 somewhere. Let me dig that out and send it to the list...
<snip> > In any case, maybe the simple retry 3x loop in the DRM for measuring the > timestamp is not good enough anymore to keep this reliable and accurate. > Maybe we would need some loop that retries until an accurate measurement > or a user configurable timeout is reached. Then users like mine could > set a very high timeout which rather totally degrades the system and > causes severe stuttering of graphics updates than silently failing with > inaccurate timestamps. The default could be something safe for normal > desktop use.
I never really understood the need for the preempt disabled retry loop in drm core. What I would do is just something like this:
get_scanoutpos_and_timestamp() { local_irq_disable(); get_scanoutpos(); get_timestamp(); local_irq_enable(); }
The preempt_disable serves the same purpose for PREEMPT_RT kernels as your local_irq_disable in your example - get rid of any preventable interruption, so a retry is unlikely to be ever needed. At least that was the idea.
I assume if a spin_lock_irqsave doesn't really disable interrupts on a RT kernel with normal spinlocks then local_irq_disable won't really disable interrupts either?
If that has a lot of error, then it seems to me that the system is just crap and we shouldn't care. Or if you're really worried about SMIs and such then you could still do a retry loop.
I didn't expect a lot of error with preemption and interrupts disabled, essentially only such infrequent things as smi's, that's why the retry loop only tries 3x max, once for real, once in case the 1st one got spoiled by a smi or such, and once because three times a charm ;-) - In practice i didn't ever observe more than 3-4 usecs of delay, well below the 20 usecs retry threshold. But i didn't expect i915_crtc_get_scanoutpos() to ever take any locks or other stuff that could induce long waits.
But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.
E.g., for intel-kms:
i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime) { ... spin_lock_irqsave(...uncore.lock); preempt_disable(); *stime = ktime_get(); position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)); *etime = ktime_get(); preempt_enable(); spin_unlock_irqrestore(...uncore.lock) ... }
With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.
Unless ktime_get is also a bad thing to do in a preempt disabled section?
-mario
On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:
On 23.09.13 10:38, Ville Syrjälä wrote:
On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote:
[+cc dri-devel]
On 09/11/2013 11:38 AM, Steven Rostedt wrote: > > On Wed, 11 Sep 2013 11:16:43 -0400 > Peter Hurley peter@hurleysoftware.com wrote: > >>> The funny part is, there's a comment there that shows that this was >>> done even for "PREEMPT_RT". Unfortunately, the call to >>> "get_scanout_position()" can call functions that use the rt-mutex >>> "sleeping spin locks" and it breaks there. >>> >>> I guess we need to ask the authors of the mainline patch exactly why >>> that preempt_disable() is needed? >> >> >> The drm core associates a timestamp with each vertical blank frame #. >> Drm drivers can optionally support a 'high resolution' hw timestamp. >> The vblank frame #/timestamp tuple is user-space visible. >> >> The i915 drm driver supports a hw timestamp via this drm helper function >> which computes the timestamp from the crtc scan position (based on the >> pixel clock). >> >> For mainline, the preempt_disable/_enable() isn't actually necessary >> because every call tree that leads here already has preemption disabled. >> >> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw? >> > > No, it should not. Note, any other lock that can be held when it is > held would also need to be raw.
By that, you mean "any other lock" that might be claimed "would also need to be raw"? Hopefully not "any other lock" already held?
> And by taking a quick audit of the code, I see this: > > spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); > > /* Reset the chip */ > > /* GEN6_GDRST is not in the gt power well, no need to check > * for fifo space for the write or forcewake the chip for > * the read > */ > __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); > > /* Spin waiting for the device to ack the reset request */ > ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & > GEN6_GRDOM_FULL) == 0, 500); > > That spin is unacceptable in RT with preemption and interrupts disabled.
Yep. That would be bad.
AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included in the force-wake set, so raw reads of the registers would probably be acceptable (thus obviating the need for claiming the uncore.lock).
Except that _ALL_ register access is disabled with the uncore.lock during a gpu reset. Not sure if that's meant to include crtc registers or not, or what other synchronization/serialization issues are being handled/hidden by forcing all register accesses to wait during a gpu reset.
Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
> What's the real issue here?
That the vblank timestamp needs to be an accurate measurement of a realtime event. Sleeping/servicing interrupts while reading the registers necessary to compute the timestamp would be bad too.
(edit: which hopefully Mario Kleiner clarified in his reply)
My point earlier was three-fold:
- Don't need the preempt_disable() for mainline: all callers are already holding interrupt-disabling spinlocks.
- -RT still needs to prevent scheduling there.
- the problem is i915-specific.
[update: the radeon driver should also BUG like the i915 driver but probably should have mmio_idx_lock spinlock as raw]
Ok, so register access must be serialized, at least within a register block, no way around it. Thanks for explaining this. That makes me a bit depressed. Is this also true for older hw generations than gen6?
Daniel, could we move access to the display register block to a separate lock, so the I915_READ to PIPEDSL in i915_get_crtc_scanoutpos() can avoid the uncore.lock? If the display registers don't need forcewake then i assume we could have much shorter lock hold times by avoiding the large up to 4 msecs waits in forcewake handling. Probably also much less contention in normal operation when no modesetting happens? I think i can get rid of the other register reads in that function. Those settings are already cached and accessible from the intel_crtc_config->adjusted_mode.
I actually have a patch to kill most of the registers reads in get_scanout_position on i915 somewhere. Let me dig that out and send it to the list...
<snip> > In any case, maybe the simple retry 3x loop in the DRM for measuring the > timestamp is not good enough anymore to keep this reliable and accurate. > Maybe we would need some loop that retries until an accurate measurement > or a user configurable timeout is reached. Then users like mine could > set a very high timeout which rather totally degrades the system and > causes severe stuttering of graphics updates than silently failing with > inaccurate timestamps. The default could be something safe for normal > desktop use.
I never really understood the need for the preempt disabled retry loop in drm core. What I would do is just something like this:
get_scanoutpos_and_timestamp() { local_irq_disable(); get_scanoutpos(); get_timestamp(); local_irq_enable(); }
The preempt_disable serves the same purpose for PREEMPT_RT kernels as your local_irq_disable in your example - get rid of any preventable interruption, so a retry is unlikely to be ever needed. At least that was the idea.
I assume if a spin_lock_irqsave doesn't really disable interrupts on a RT kernel with normal spinlocks then local_irq_disable won't really disable interrupts either?
If that has a lot of error, then it seems to me that the system is just crap and we shouldn't care. Or if you're really worried about SMIs and such then you could still do a retry loop.
I didn't expect a lot of error with preemption and interrupts disabled, essentially only such infrequent things as smi's, that's why the retry loop only tries 3x max, once for real, once in case the 1st one got spoiled by a smi or such, and once because three times a charm ;-) - In practice i didn't ever observe more than 3-4 usecs of delay, well below the 20 usecs retry threshold. But i didn't expect i915_crtc_get_scanoutpos() to ever take any locks or other stuff that could induce long waits.
But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.
E.g., for intel-kms:
i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime) { ... spin_lock_irqsave(...uncore.lock); preempt_disable(); *stime = ktime_get(); position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)); *etime = ktime_get(); preempt_enable(); spin_unlock_irqrestore(...uncore.lock) ... }
With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.
Unless ktime_get is also a bad thing to do in a preempt disabled section?
The preempt_disable/enable is not needed. The spinlock serves the same purpose already.
As far as ktime_get(), I've used it from spinlocked/irq disabled sections and so far haven't seen it do bad things. But would be nice to get some official statement to that effect.
To minimize the chance of breakage, I guess we could add a new func ->get_scanout_pos_and_time or something like that, fill it by default with the code from the current drm_calc_vbltimestamp_from_scanoutpos().
Or we could just shovel the delta_ns handling from drm_calc_vbltimestamp_from_scanoutpos() into some small helper function that we could call from i915_get_vblank_timestamp(), and then we can call i915_get_crtc_scanoutpos() directly from there as well.
On Wed, 25 Sep 2013 10:49:36 +0300 Ville Syrjälä ville.syrjala@linux.intel.com wrote:
The preempt_disable/enable is not needed. The spinlock serves the same purpose already.
As stated, that was only for the -rt patch, as spin_lock_irqsave does not disable preemption nor does it even disable interrupts.
But I agree, the added preempt_disable() should be sent to us to keep in the -rt patch itself. We really appreciate that you are thinking about us :-) But something like this will just confuse the mainline folks. Having a "preempt_disable_rt()" would make a lot more sense (which exists in the -rt patch).
As far as ktime_get(), I've used it from spinlocked/irq disabled sections and so far haven't seen it do bad things. But would be nice to get some official statement to that effect.
It's just a read seqlock. It may do a few loops to get the correct time, but it's fine to have in a preempt/irq disabled section.
-- Steve
On 25.09.13 09:49, Ville Syrjälä wrote:
On Wed, Sep 25, 2013 at 06:32:10AM +0200, Mario Kleiner wrote:
On 23.09.13 10:38, Ville Syrjälä wrote:
On Sat, Sep 21, 2013 at 12:07:36AM +0200, Mario Kleiner wrote:
On 09/17/2013 10:55 PM, Daniel Vetter wrote:
On Tue, Sep 17, 2013 at 9:50 PM, Peter Hurley peter@hurleysoftware.com wrote:
On 09/11/2013 03:31 PM, Peter Hurley wrote: > > [+cc dri-devel] > > On 09/11/2013 11:38 AM, Steven Rostedt wrote: >> >> On Wed, 11 Sep 2013 11:16:43 -0400 >> Peter Hurley peter@hurleysoftware.com wrote: >> >>>> The funny part is, there's a comment there that shows that this was >>>> done even for "PREEMPT_RT". Unfortunately, the call to >>>> "get_scanout_position()" can call functions that use the rt-mutex >>>> "sleeping spin locks" and it breaks there. >>>> >>>> I guess we need to ask the authors of the mainline patch exactly why >>>> that preempt_disable() is needed? >>> >>> >>> The drm core associates a timestamp with each vertical blank frame #. >>> Drm drivers can optionally support a 'high resolution' hw timestamp. >>> The vblank frame #/timestamp tuple is user-space visible. >>> >>> The i915 drm driver supports a hw timestamp via this drm helper function >>> which computes the timestamp from the crtc scan position (based on the >>> pixel clock). >>> >>> For mainline, the preempt_disable/_enable() isn't actually necessary >>> because every call tree that leads here already has preemption disabled. >>> >>> For -RT, the maybe i915 register spinlock (uncore.lock) should be raw? >>> >> >> No, it should not. Note, any other lock that can be held when it is >> held would also need to be raw. > > > By that, you mean "any other lock" that might be claimed "would also need > to be raw"? Hopefully not "any other lock" already held? > >> And by taking a quick audit of the code, I see this: >> >> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); >> >> /* Reset the chip */ >> >> /* GEN6_GDRST is not in the gt power well, no need to check >> * for fifo space for the write or forcewake the chip for >> * the read >> */ >> __raw_i915_write32(dev_priv, GEN6_GDRST, GEN6_GRDOM_FULL); >> >> /* Spin waiting for the device to ack the reset request */ >> ret = wait_for((__raw_i915_read32(dev_priv, GEN6_GDRST) & >> GEN6_GRDOM_FULL) == 0, 500); >> >> That spin is unacceptable in RT with preemption and interrupts disabled. > > > Yep. That would be bad. > > AFAICT the registers read in i915_get_crtc_scanoutpos() aren't included > in the force-wake set, so raw reads of the registers would > probably be acceptable (thus obviating the need for claiming the > uncore.lock). > > Except that _ALL_ register access is disabled with the uncore.lock > during a gpu reset. Not sure if that's meant to include crtc registers > or not, or what other synchronization/serialization issues are being > handled/hidden by forcing all register accesses to wait during a gpu > reset. > > Hopefully an i915 expert can weigh in here?
Daniel,
Can you shed some light on whether the i915+ crtc registers (specifically those in i915_get_crtc_scanoutpos() and i915_/gm45_get_vblank_counter()) read as part of the vblank counter/timestamp handling need to be prevented during gpu reset?
The depency here in the locking is a recent addition:
commit a7cd1b8fea2f341b626b255d9898a5ca5fabbf0a Author: Chris Wilson chris@chris-wilson.co.uk Date: Fri Jul 19 20:36:51 2013 +0100
drm/i915: Serialize almost all register access
It's a (slightly) oversized hammer to work around a hardware issue - we could break it down to register blocks, which can be accessed concurrently, but that tends to be more fragile. But the chip really dies if you access (even just reads) the same block concurrently :(
We could try break the spinlock protected section a bit in the reset handler - register access on a hung gpu tends to be ill-defined anyway.
The implied wait with preemption and interrupts disabled is causing grief in -RT, but also a 4ms wait inside an irq handler seems like a bad idea.
Oops, the magic code in wait_for which is just there to make the imo totally misguided kgdb support work papered over the aweful long wait in atomic context ever since we've added this in
commit b6e45f866465f42b53d803b0c574da0fc508a0e9 Author: Keith Packard keithp@keithp.com Date: Fri Jan 6 11:34:04 2012 -0800
drm/i915: Move reset forcewake processing to gen6_do_reset
Reverting this change should be enough (code moved obviously a bit).
Cheers, Daniel
Regards, Peter Hurley
>> What's the real issue here? > > > That the vblank timestamp needs to be an accurate measurement of a > realtime event. Sleeping/servicing interrupts while reading > the registers necessary to compute the timestamp would be bad too. > > (edit: which hopefully Mario Kleiner clarified in his reply) > > My point earlier was three-fold: > 1. Don't need the preempt_disable() for mainline: all callers are already > holding interrupt-disabling spinlocks. > 2. -RT still needs to prevent scheduling there. > 3. the problem is i915-specific. > > [update: the radeon driver should also BUG like the i915 driver but > probably > should have mmio_idx_lock spinlock as raw]
Ok, so register access must be serialized, at least within a register block, no way around it. Thanks for explaining this. That makes me a bit depressed. Is this also true for older hw generations than gen6?
Daniel, could we move access to the display register block to a separate lock, so the I915_READ to PIPEDSL in i915_get_crtc_scanoutpos() can avoid the uncore.lock? If the display registers don't need forcewake then i assume we could have much shorter lock hold times by avoiding the large up to 4 msecs waits in forcewake handling. Probably also much less contention in normal operation when no modesetting happens? I think i can get rid of the other register reads in that function. Those settings are already cached and accessible from the intel_crtc_config->adjusted_mode.
I actually have a patch to kill most of the registers reads in get_scanout_position on i915 somewhere. Let me dig that out and send it to the list...
<snip> > In any case, maybe the simple retry 3x loop in the DRM for measuring the > timestamp is not good enough anymore to keep this reliable and accurate. > Maybe we would need some loop that retries until an accurate measurement > or a user configurable timeout is reached. Then users like mine could > set a very high timeout which rather totally degrades the system and > causes severe stuttering of graphics updates than silently failing with > inaccurate timestamps. The default could be something safe for normal > desktop use.
I never really understood the need for the preempt disabled retry loop in drm core. What I would do is just something like this:
get_scanoutpos_and_timestamp() { local_irq_disable(); get_scanoutpos(); get_timestamp(); local_irq_enable(); }
The preempt_disable serves the same purpose for PREEMPT_RT kernels as your local_irq_disable in your example - get rid of any preventable interruption, so a retry is unlikely to be ever needed. At least that was the idea.
I assume if a spin_lock_irqsave doesn't really disable interrupts on a RT kernel with normal spinlocks then local_irq_disable won't really disable interrupts either?
If that has a lot of error, then it seems to me that the system is just crap and we shouldn't care. Or if you're really worried about SMIs and such then you could still do a retry loop.
I didn't expect a lot of error with preemption and interrupts disabled, essentially only such infrequent things as smi's, that's why the retry loop only tries 3x max, once for real, once in case the 1st one got spoiled by a smi or such, and once because three times a charm ;-) - In practice i didn't ever observe more than 3-4 usecs of delay, well below the 20 usecs retry threshold. But i didn't expect i915_crtc_get_scanoutpos() to ever take any locks or other stuff that could induce long waits.
But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.
E.g., for intel-kms:
i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime) { ... spin_lock_irqsave(...uncore.lock); preempt_disable(); *stime = ktime_get(); position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)); *etime = ktime_get(); preempt_enable(); spin_unlock_irqrestore(...uncore.lock) ... }
With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.
Unless ktime_get is also a bad thing to do in a preempt disabled section?
The preempt_disable/enable is not needed. The spinlock serves the same purpose already.
As far as ktime_get(), I've used it from spinlocked/irq disabled sections and so far haven't seen it do bad things. But would be nice to get some official statement to that effect.
To minimize the chance of breakage, I guess we could add a new func ->get_scanout_pos_and_time or something like that, fill it by default with the code from the current drm_calc_vbltimestamp_from_scanoutpos().
Or we could just shovel the delta_ns handling from drm_calc_vbltimestamp_from_scanoutpos() into some small helper function that we could call from i915_get_vblank_timestamp(), and then we can call i915_get_crtc_scanoutpos() directly from there as well.
I would like to keep most of the logic for scanoutpos timestamping in the the current drm_calc_vbltimestamp_from_scanoutpos() and only have the inner part of the retry loop in i915_get_crtc_scanoutpos(), so we have as much shared code as possible between drivers there. Makes it easier to keep track of what changes when, fix stuff, and to apply the module parameters related to timestamping.
We could add a new get_scanout_pos_with_time(), but afaik only i915 and radeon-kms currently use/implement that function, so maybe we can just convert those bits in one go, as this is only kernel internal api?
-mario
On Wed, 25 Sep 2013 06:32:10 +0200 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
I assume if a spin_lock_irqsave doesn't really disable interrupts on a RT kernel with normal spinlocks then local_irq_disable won't really disable interrupts either?
That is incorrect. On PREEMPT_RT, you are right about spin_lock_irqsave() not disabling interrupts, but local_irq_disable() does indeed disable interrupts.
Open coded local_irq_disable() usually ends up being a bug as it does nothing to synchronize interrupts from other CPUs. But most of those bugs have been removed, and there are some very legit reasons for using local_irq_disable(). PREEMPT_RT honors those.
The reason PREEMPT_RT does not disable interrupts for spin_lock_irqsave(), is because that's usually used for when a interrupt uses the same spinlock. You need the irqsave() part in order to prevent a deadlock, if the code that has the spinlock gets preempted by the interrupt and that interrupt tries to grab the same lock.
Because PREEMPT_RT runs interrupts as threads, we don't have that issue, because if the interrupt preempts the holder of the lock and it tries to grab the same lock, it will just block like any other thread trying to grab that lock. That is, spinlocks turn into mutexes on PREEMPT_RT.
-- Steve
On Wed, 25 Sep 2013 06:32:10 +0200 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.
E.g., for intel-kms:
i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime) { ... spin_lock_irqsave(...uncore.lock); preempt_disable(); *stime = ktime_get(); position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)); *etime = ktime_get(); preempt_enable(); spin_unlock_irqrestore(...uncore.lock) ... }
With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.
Unless ktime_get is also a bad thing to do in a preempt disabled section?
ktime_get() works fine in preempt_disable sections, although it may add some latencies, but you shouldn't need to worry about it.
I like this solution the best too, but if it does go in, I would ask to send us the patch for adding the preempt_disable() and we can add the preempt_disable_rt() to it. Why make mainline have a little more overhead?
-- Steve
On 25.09.13 16:13, Steven Rostedt wrote:
On Wed, 25 Sep 2013 06:32:10 +0200 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
But given the new situation, your proposal is great! If we push the clock readouts into the get_scanoutpos routine, we can make this robust without causing grief for the rt people and without the need for a new separate lock for display regs in intel-kms.
E.g., for intel-kms:
i915_get_crtc_scanoutpos(..., ktime_t *stime, ktime_t *etime) { ... spin_lock_irqsave(...uncore.lock); preempt_disable(); *stime = ktime_get(); position = __raw_i915_read32(dev_priv, PIPEDSL(pipe)); *etime = ktime_get(); preempt_enable(); spin_unlock_irqrestore(...uncore.lock) ... }
With your patchset to reduce the amount of register reads needed in that function, and given that forcewake handling isn't needed for these registers, this should make it robust again and wouldn't need new locks.
Unless ktime_get is also a bad thing to do in a preempt disabled section?
ktime_get() works fine in preempt_disable sections, although it may add some latencies, but you shouldn't need to worry about it.
I like this solution the best too, but if it does go in, I would ask to send us the patch for adding the preempt_disable() and we can add the preempt_disable_rt() to it. Why make mainline have a little more overhead?
-- Steve
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
-mario
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
Are there any suggestions for "now"? preempt_disable_nort() like Luis suggesed?
-mario
Sebastian
On Fri, 11 Oct 2013 12:18:00 +0200 Sebastian Andrzej Siewior bigeasy@linutronix.de wrote:
- Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
Are there any suggestions for "now"? preempt_disable_nort() like Luis suggesed?
The preempt_disable_nort() is rather pointless, because the preempt_disable() was added specifically for -rt. When PREEMPT_RT is not enabled, preemption is disabled there already by the previous calls to spin_lock().
-- Steve
On 10/11/2013 02:37 PM, Steven Rostedt wrote:
On Fri, 11 Oct 2013 12:18:00 +0200 Sebastian Andrzej Siewior bigeasy@linutronix.de wrote:
- Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
Are there any suggestions for "now"? preempt_disable_nort() like Luis suggesed?
The preempt_disable_nort() is rather pointless, because the preempt_disable() was added specifically for -rt. When PREEMPT_RT is not enabled, preemption is disabled there already by the previous calls to spin_lock().
Either way. Then I remove the preempt_enable/disable call. Any objections?
-- Steve
Sebastian
On 10/11/2013 03:30 PM, Sebastian Andrzej Siewior wrote:
On 10/11/2013 02:37 PM, Steven Rostedt wrote:
On Fri, 11 Oct 2013 12:18:00 +0200 Sebastian Andrzej Siewior bigeasy@linutronix.de wrote:
- Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
Are there any suggestions for "now"? preempt_disable_nort() like Luis suggesed?
The preempt_disable_nort() is rather pointless, because the preempt_disable() was added specifically for -rt. When PREEMPT_RT is not enabled, preemption is disabled there already by the previous calls to spin_lock().
Either way. Then I remove the preempt_enable/disable call. Any objections?
Good with me. I'm currently working on a replacement. -mario
-- Steve
Sebastian
On Fri, 11 Oct 2013 15:30:22 +0200 Sebastian Andrzej Siewior bigeasy@linutronix.de wrote:
On 10/11/2013 02:37 PM, Steven Rostedt wrote:
On Fri, 11 Oct 2013 12:18:00 +0200 Sebastian Andrzej Siewior bigeasy@linutronix.de wrote:
- Mario Kleiner | 2013-09-26 18:16:47 [+0200]:
Good! I will do that. Thanks for clarifying the irq and constraints on raw locks in the other thread.
Are there any suggestions for "now"? preempt_disable_nort() like Luis suggesed?
The preempt_disable_nort() is rather pointless, because the preempt_disable() was added specifically for -rt. When PREEMPT_RT is not enabled, preemption is disabled there already by the previous calls to spin_lock().
Either way. Then I remove the preempt_enable/disable call. Any objections?
I have no issues with it, but it may cause issues with timings for the device. But I see Mario is looking into that :-)
-- Steve
Sorry for the late reply, I was at Linux Plumbers, and had a bunch of stuff to catch up on when I returned.
On Sat, 21 Sep 2013 00:07:36 +0200 Mario Kleiner mario.kleiner@tuebingen.mpg.de wrote:
Steven, would it then be acceptable to convert that "faster" lock into a raw_spinlock_t or is this unacceptable? If so, the preempt_disable() could stay, right?
If a spinlock is tight (not held for more than 2us on todays processors), and has little contention, than I would be fine with converting it to raw. And if that's the only lock held you could do the preempt_disable() call.
In fact, if you want, you can leave the preempt_disable() out of mainline, and send a patch to us that uses "preempt_disable_rt()" and add a comment to it. In the -rt patch, preempt_disable_rt() is a nop when PREEMPT_RT is not set, and is preempt_disable() when it is.
-- Steve
dri-devel@lists.freedesktop.org