Hi,
the following patches are from the PREEMPT_RT queue. They mostly deal with sections that disable interrupts/preemption, which leads to problems on PREEMPT_RT. Unfortunately DRM_I915_LOW_LEVEL_TRACEPOINTS had to be disabled because it acquires locks from within trace points. I tested it on a SandyBridge with the built-in i915 by using X, OpenGL and playing videos without noticing any warnings. However, some code paths were not entered.
Sebastian
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Mario Kleiner suggested in commit ad3543ede630f ("drm/intel: Push get_scanout_position() timestamping into kms driver.")
spots where preemption should be disabled on PREEMPT_RT. The difference is that on PREEMPT_RT the intel_uncore::lock disables neither preemption nor interrupts and so the region remains preemptible.
The area covers only register reads and writes. The part that worries me is:
- __intel_get_crtc_scanline(): the worst case is 100us if no match is found.
- intel_crtc_scanlines_since_frame_timestamp(): not sure how long this may take in the worst case.
It was in the RT queue for a while and nobody complained. Disable preemption on PREEMPT_RT during timestamping.
[bigeasy: patch description.]
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/i915_irq.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 9bc4f4a8e12ec..547347241a47c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -886,7 +886,8 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 	 */
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
-	/* preempt_disable_rt() should go right here in PREEMPT_RT patchset. */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 
 	/* Get optional system timestamp before query. */
 	if (stime)
@@ -950,7 +951,8 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 	if (etime)
 		*etime = ktime_get();
 
-	/* preempt_enable_rt() should go right here in PREEMPT_RT patchset. */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable();
 
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
From: Mike Galbraith <umgwanakikbuti@gmail.com>
Commit 8d7849db3eab7 ("drm/i915: Make sprite updates atomic")
started disabling interrupts across atomic updates. This breaks on PREEMPT_RT because within this section the code attempts to acquire spinlock_t locks, which are sleeping locks on PREEMPT_RT.
According to the comment, the interrupts are disabled to avoid random delays and are not required for protection or synchronisation. If this needs to happen with disabled interrupts on PREEMPT_RT, and the whole section is restricted to register access, then all sleeping locks need to be acquired before interrupts are disabled and some functions may need to be moved after interrupts are enabled again. This includes:
- prepare_to_wait() + finish_wait() due to its wake queue.
- drm_crtc_vblank_put() -> vblank_disable_fn() drm_device::vbl_lock.
- skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo() and maybe others due to intel_uncore::lock.
- drm_crtc_arm_vblank_event() due to drm_device::event_lock and drm_device::vblank_time_lock.
Don't disable interrupts on PREEMPT_RT during atomic updates.
[bigeasy: drop local locks, commit message]
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/display/intel_crtc.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c b/drivers/gpu/drm/i915/display/intel_crtc.c
index 254e67141a776..7a39029b083f4 100644
--- a/drivers/gpu/drm/i915/display/intel_crtc.c
+++ b/drivers/gpu/drm/i915/display/intel_crtc.c
@@ -425,7 +425,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 	 */
 	intel_psr_wait_for_idle(new_crtc_state);
 
-	local_irq_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
 
 	crtc->debug.min_vbl = min;
 	crtc->debug.max_vbl = max;
@@ -450,11 +451,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 			break;
 		}
 
-		local_irq_enable();
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_enable();
 
 		timeout = schedule_timeout(timeout);
 
-		local_irq_disable();
+		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+			local_irq_disable();
 	}
 
 	finish_wait(wq, &wait);
@@ -487,7 +490,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
 	return;
 
 irq_disable:
-	local_irq_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_disable();
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_VBLANK_EVADE)
@@ -566,7 +570,8 @@ void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state)
 		new_crtc_state->uapi.event = NULL;
 	}
 
-	local_irq_enable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_irq_enable();
 
 	/* Send VRR Push to terminate Vblank */
 	intel_vrr_send_push(new_crtc_state);
On Tue, Oct 05, 2021 at 05:00:40PM +0200, Sebastian Andrzej Siewior wrote:
> From: Mike Galbraith <umgwanakikbuti@gmail.com>
> 
> Commit 8d7849db3eab7 ("drm/i915: Make sprite updates atomic") started
> disabling interrupts across atomic updates. This breaks on PREEMPT_RT
> because within this section the code attempts to acquire spinlock_t
> locks, which are sleeping locks on PREEMPT_RT.
> 
> According to the comment, the interrupts are disabled to avoid random
> delays and are not required for protection or synchronisation. If this
> needs to happen with disabled interrupts on PREEMPT_RT, and the whole
> section is restricted to register access, then all sleeping locks need
> to be acquired before interrupts are disabled and some functions may
> need to be moved after interrupts are enabled again. This includes:
> - prepare_to_wait() + finish_wait() due to its wake queue.
> - drm_crtc_vblank_put() -> vblank_disable_fn() drm_device::vbl_lock.
> - skl_pfit_enable(), intel_update_plane(), vlv_atomic_update_fifo()
>   and maybe others due to intel_uncore::lock.
> - drm_crtc_arm_vblank_event() due to drm_device::event_lock and
>   drm_device::vblank_time_lock.
> 
> Don't disable interrupts on PREEMPT_RT during atomic updates.
> 
> [bigeasy: drop local locks, commit message]
I have my doubts about meeting the deadlines here. CONFIG_DRM_I915_DEBUG_VBLANK_EVADE should scream a bunch if it looks like we're missing it.
That said, we already miss the deadline sometimes, esp. with lockdep and whatnot enabled which makes the locking very expensive.
Also some ideas how to reduce the overhead:
- Try to make the mmio accesses lockless as much as possible
- Reduce the amount of work we do in the critical section
Anyways, RT hasn't really been on anyone's radar so no one has yet spent a significant amount of brain cells on this.
> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  drivers/gpu/drm/i915/display/intel_crtc.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> diff --git a/drivers/gpu/drm/i915/display/intel_crtc.c b/drivers/gpu/drm/i915/display/intel_crtc.c
> index 254e67141a776..7a39029b083f4 100644
> --- a/drivers/gpu/drm/i915/display/intel_crtc.c
> +++ b/drivers/gpu/drm/i915/display/intel_crtc.c
> @@ -425,7 +425,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  	 */
>  	intel_psr_wait_for_idle(new_crtc_state);
>  
> -	local_irq_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		local_irq_disable();
>  
>  	crtc->debug.min_vbl = min;
>  	crtc->debug.max_vbl = max;
> @@ -450,11 +451,13 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  			break;
>  		}
>  
> -		local_irq_enable();
> +		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +			local_irq_enable();
>  
>  		timeout = schedule_timeout(timeout);
>  
> -		local_irq_disable();
> +		if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +			local_irq_disable();
>  	}
>  
>  	finish_wait(wq, &wait);
> @@ -487,7 +490,8 @@ void intel_pipe_update_start(const struct intel_crtc_state *new_crtc_state)
>  	return;
>  
>  irq_disable:
> -	local_irq_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		local_irq_disable();
>  }
>  
>  #if IS_ENABLED(CONFIG_DRM_I915_DEBUG_VBLANK_EVADE)
> @@ -566,7 +570,8 @@ void intel_pipe_update_end(struct intel_crtc_state *new_crtc_state)
>  		new_crtc_state->uapi.event = NULL;
>  	}
>  
> -	local_irq_enable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		local_irq_enable();
>  
>  	/* Send VRR Push to terminate Vblank */
>  	intel_vrr_send_push(new_crtc_state);
> -- 
> 2.33.0
Luca Abeni reported this:
| BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
| CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
| Call Trace:
|  rt_spin_lock+0x3f/0x50
|  gen6_read32+0x45/0x1d0 [i915]
|  g4x_get_vblank_counter+0x36/0x40 [i915]
|  trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]
The tracing events, trace_i915_pipe_update_start() among others, use functions that acquire spinlock_t locks, which are transformed into sleeping locks on PREEMPT_RT. A few trace points use intel_get_crtc_scanline(), others use ->get_vblank_counter(), which also might acquire a sleeping lock on PREEMPT_RT. At the time the arguments are evaluated within the trace point, preemption is disabled, so the locks must not be acquired on PREEMPT_RT.
Based on this I don't see any other way than to disable the trace points on PREEMPT_RT.
Reported-by: Luca Abeni <lucabe72@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/gpu/drm/i915/i915_trace.h | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 806ad688274bf..773e7362c4442 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -2,6 +2,10 @@
 #if !defined(_I915_TRACE_H_) || defined(TRACE_HEADER_MULTI_READ)
 #define _I915_TRACE_H_
 
+#ifdef CONFIG_PREEMPT_RT
+#define NOTRACE
+#endif
+
 #include <linux/stringify.h>
 #include <linux/types.h>
 #include <linux/tracepoint.h>
On Tue, Oct 05, 2021 at 05:00:41PM +0200, Sebastian Andrzej Siewior wrote:
> Luca Abeni reported this:
> | BUG: scheduling while atomic: kworker/u8:2/15203/0x00000003
> | CPU: 1 PID: 15203 Comm: kworker/u8:2 Not tainted 4.19.1-rt3 #10
> | Call Trace:
> |  rt_spin_lock+0x3f/0x50
> |  gen6_read32+0x45/0x1d0 [i915]
> |  g4x_get_vblank_counter+0x36/0x40 [i915]
> |  trace_event_raw_event_i915_pipe_update_start+0x7d/0xf0 [i915]
> 
> The tracing events, trace_i915_pipe_update_start() among others, use
> functions that acquire spinlock_t locks, which are transformed into
> sleeping locks on PREEMPT_RT. A few trace points use
> intel_get_crtc_scanline(), others use ->get_vblank_counter(), which
> also might acquire a sleeping lock on PREEMPT_RT. At the time the
> arguments are evaluated within the trace point, preemption is
> disabled, so the locks must not be acquired on PREEMPT_RT.
> 
> Based on this I don't see any other way than to disable the trace
> points on PREEMPT_RT.
I think the correct answer is to make uncore.lock a raw_spinlock. Without the tracepoints, debugging any of this stuff is pretty much impossible. We also take that lock a lot.
The horrible truth is that some of the hardware just keels over if two CPUs access the same mmio cacheline simultaneously, so pretty much all mmio accesses need to be performed under uncore.lock.
The vblank locking situation is a lot more self-inflicted. As in, there are something like three different spinlocks in there for some reason. Not sure what to do about that. I guess one option would be to skip the vblank-timestamp-based stuff in the tracepoints for the time being. Most hardware should have a hardware vblank counter, so the fallback isn't super common.
> Reported-by: Luca Abeni <lucabe72@gmail.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  drivers/gpu/drm/i915/i915_trace.h | 4 ++++
>  1 file changed, 4 insertions(+)
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 806ad688274bf..773e7362c4442 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -2,6 +2,10 @@
>  #if !defined(_I915_TRACE_H_) || defined(TRACE_HEADER_MULTI_READ)
>  #define _I915_TRACE_H_
>  
> +#ifdef CONFIG_PREEMPT_RT
> +#define NOTRACE
> +#endif
> +
>  #include <linux/stringify.h>
>  #include <linux/types.h>
>  #include <linux/tracepoint.h>
> -- 
> 2.33.0
On 2021-10-06 12:34:19 [+0300], Ville Syrjälä wrote:
> I think the correct answer is to make uncore.lock a raw_spinlock.
> Without the tracepoints, debugging any of this stuff is pretty much
> impossible. We also take that lock a lot.
Let me check if that works.
Sebastian
On 2021-10-06 12:15:21 [+0200], To Ville Syrjälä wrote:
> On 2021-10-06 12:34:19 [+0300], Ville Syrjälä wrote:
> > I think the correct answer is to make uncore.lock a raw_spinlock.
> > Without the tracepoints, debugging any of this stuff is pretty much
> > impossible. We also take that lock a lot.
> Let me check if that works.
Turned the uncore.lock into a raw_spinlock_t, the debug.lock as well because it nests inside the former. Also intel_uncore_forcewake_domain::timer now fires in hardirq context since it almost only acquires the uncore.lock. What worries me a little is fw_domain_wait_ack_clear() / wait_ack_clear(), which spin-waits for up to a whole ms. __gen6_gt_wait_for_thread_c0() has a 5ms limit and wait_ack_clear() has 50ms as an upper limit. I know that it does not usually take long, and if it takes that long then it is most likely an error, but still... The full patch is at the end of the email.
Now. Before that patch:
| T: 0 ( 1179) P:90 I:250 C: 182450 Min: 2 Act: 6 Avg: 5 Max: 21
| T: 1 ( 1180) P:90 I:250 C: 182437 Min: 2 Act: 6 Avg: 5 Max: 22
| T: 2 ( 1181) P:90 I:250 C: 182423 Min: 2 Act: 6 Avg: 5 Max: 23
| T: 3 ( 1182) P:90 I:250 C: 182401 Min: 2 Act: 6 Avg: 5 Max: 21
| T: 4 ( 1183) P:90 I:250 C: 182394 Min: 2 Act: 7 Avg: 5 Max: 27
| T: 5 ( 1184) P:90 I:250 C: 182381 Min: 2 Act: 6 Avg: 5 Max: 22
| T: 6 ( 1185) P:90 I:250 C: 182368 Min: 3 Act: 6 Avg: 5 Max: 23
| T: 7 ( 1186) P:90 I:250 C: 182356 Min: 2 Act: 8 Avg: 5 Max: 22
after:
| T: 0 ( 1183) P:90 I:250 C:1022143 Min: 3 Act: 3 Avg: 4 Max: 47
| T: 1 ( 1184) P:90 I:250 C:1022125 Min: 2 Act: 3 Avg: 4 Max: 51
| T: 2 ( 1185) P:90 I:250 C:1022110 Min: 2 Act: 5 Avg: 4 Max: 52
| T: 3 ( 1186) P:90 I:250 C:1022089 Min: 2 Act: 3 Avg: 4 Max: 52
| T: 4 ( 1187) P:90 I:250 C:1022083 Min: 2 Act: 3 Avg: 4 Max: 51
| T: 5 ( 1188) P:90 I:250 C:1022070 Min: 3 Act: 3 Avg: 4 Max: 50
| T: 6 ( 1189) P:90 I:250 C:1022058 Min: 3 Act: 5 Avg: 4 Max: 46
| T: 7 ( 1190) P:90 I:250 C:1022045 Min: 2 Act: 3 Avg: 4 Max: 51
This is cyclictest output. A little explanation: T: is the thread number. There is one thread pinned to each CPU, so I have 8 CPUs. This is an i7 SandyBridge, so 4 cores + hyperthreading, with a built-in i915 (INTEL_SNB_D_GT1_IDS).
Max: is the maximal observed latency in us. The program fires a timer every 250us and measures the latency between the programmed time and now. Since it is the thread with the highest priority in the system, it should be scheduled right away, unless there is a preempt-disable/IRQ-off section somewhere.
I did the same test in both cases: started a video, hoping for some HW acceleration, and skipped forward/backwards in the clip a few times. As soon as I start jumping in the video, the latencies rise; I don't observe that without the patch. The system is idle otherwise. If you have other tests in mind which put more / different load on the system, I'm all yours. I'm mostly curious how bad it can get.
Sebastian
diff --git a/drivers/gpu/drm/i915/display/i9xx_plane.c b/drivers/gpu/drm/i915/display/i9xx_plane.c
index b1439ba78f67b..6f2cab95d8646 100644
--- a/drivers/gpu/drm/i915/display/i9xx_plane.c
+++ b/drivers/gpu/drm/i915/display/i9xx_plane.c
@@ -444,7 +444,7 @@ static void i9xx_update_plane(struct intel_plane *plane,
 	else
 		dspaddr_offset = linear_offset;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, DSPSTRIDE(i9xx_plane),
 			  plane_state->view.color_plane[0].stride);
@@ -490,7 +490,7 @@ static void i9xx_update_plane(struct intel_plane *plane,
 	intel_de_write_fw(dev_priv, DSPADDR(i9xx_plane),
 			  intel_plane_ggtt_offset(plane_state) + dspaddr_offset);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void i9xx_disable_plane(struct intel_plane *plane,
@@ -513,7 +513,7 @@ static void i9xx_disable_plane(struct intel_plane *plane,
 	 */
 	dspcntr = i9xx_plane_ctl_crtc(crtc_state);
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, DSPCNTR(i9xx_plane), dspcntr);
 	if (DISPLAY_VER(dev_priv) >= 4)
@@ -521,7 +521,7 @@ static void i9xx_disable_plane(struct intel_plane *plane,
 	else
 		intel_de_write_fw(dev_priv, DSPADDR(i9xx_plane), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -539,11 +539,11 @@ g4x_primary_async_flip(struct intel_plane *plane,
 	if (async_flip)
 		dspcntr |= DISPPLANE_ASYNC_FLIP;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 	intel_de_write_fw(dev_priv, DSPCNTR(i9xx_plane), dspcntr);
 	intel_de_write_fw(dev_priv, DSPSURF(i9xx_plane),
 			  intel_plane_ggtt_offset(plane_state) + dspaddr_offset);
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -557,10 +557,10 @@ vlv_primary_async_flip(struct intel_plane *plane,
 	enum i9xx_plane_id i9xx_plane = plane->i9xx_plane;
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 	intel_de_write_fw(dev_priv, DSPADDR_VLV(i9xx_plane),
 			  intel_plane_ggtt_offset(plane_state) + dspaddr_offset);
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c
index c7618fef01439..0ce998dbe1a27 100644
--- a/drivers/gpu/drm/i915/display/intel_cursor.c
+++ b/drivers/gpu/drm/i915/display/intel_cursor.c
@@ -274,7 +274,7 @@ static void i845_update_cursor(struct intel_plane *plane,
 		pos = intel_cursor_position(plane_state);
 	}
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	/* On these chipsets we can only modify the base/size/stride
 	 * whilst the cursor is disabled.
@@ -295,7 +295,7 @@ static void i845_update_cursor(struct intel_plane *plane,
 		intel_de_write_fw(dev_priv, CURPOS(PIPE_A), pos);
 	}
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void i845_disable_cursor(struct intel_plane *plane,
@@ -511,7 +511,7 @@ static void i9xx_update_cursor(struct intel_plane *plane,
 		pos = intel_cursor_position(plane_state);
 	}
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	/*
	 * On some platforms writing CURCNTR first will also
@@ -557,7 +557,7 @@ static void i9xx_update_cursor(struct intel_plane *plane,
 		intel_de_write_fw(dev_priv, CURBASE(pipe), base);
 	}
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void i9xx_disable_cursor(struct intel_plane *plane,
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index ddfc17e21668a..682493f336451 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -212,10 +212,10 @@ static void i8xx_fbc_recompress(struct drm_i915_private *dev_priv)
 	struct intel_fbc_reg_params *params = &dev_priv->fbc.params;
 	enum i9xx_plane_id i9xx_plane = params->crtc.i9xx_plane;
 
-	spin_lock_irq(&dev_priv->uncore.lock);
+	raw_spin_lock_irq(&dev_priv->uncore.lock);
 	intel_de_write_fw(dev_priv, DSPADDR(i9xx_plane),
 			  intel_de_read_fw(dev_priv, DSPADDR(i9xx_plane)));
-	spin_unlock_irq(&dev_priv->uncore.lock);
+	raw_spin_unlock_irq(&dev_priv->uncore.lock);
 }
 
 static void i965_fbc_recompress(struct drm_i915_private *dev_priv)
@@ -223,10 +223,10 @@ static void i965_fbc_recompress(struct drm_i915_private *dev_priv)
 	struct intel_fbc_reg_params *params = &dev_priv->fbc.params;
 	enum i9xx_plane_id i9xx_plane = params->crtc.i9xx_plane;
 
-	spin_lock_irq(&dev_priv->uncore.lock);
+	raw_spin_lock_irq(&dev_priv->uncore.lock);
 	intel_de_write_fw(dev_priv, DSPSURF(i9xx_plane),
 			  intel_de_read_fw(dev_priv, DSPSURF(i9xx_plane)));
-	spin_unlock_irq(&dev_priv->uncore.lock);
+	raw_spin_unlock_irq(&dev_priv->uncore.lock);
 }
 
 /* This function forces a CFB recompression through the nuke operation. */
diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c
index 08116f41da26a..ab98ce1b05e22 100644
--- a/drivers/gpu/drm/i915/display/intel_sprite.c
+++ b/drivers/gpu/drm/i915/display/intel_sprite.c
@@ -444,7 +444,7 @@ vlv_update_plane(struct intel_plane *plane,
 
 	linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0);
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, SPSTRIDE(pipe, plane_id),
 			  plane_state->view.color_plane[0].stride);
@@ -481,7 +481,7 @@ vlv_update_plane(struct intel_plane *plane,
 	vlv_update_clrc(plane_state);
 	vlv_update_gamma(plane_state);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -493,12 +493,12 @@ vlv_disable_plane(struct intel_plane *plane,
 	enum plane_id plane_id = plane->id;
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, SPCNTR(pipe, plane_id), 0);
 	intel_de_write_fw(dev_priv, SPSURF(pipe, plane_id), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static bool
@@ -868,7 +868,7 @@ ivb_update_plane(struct intel_plane *plane,
 
 	linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0);
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, SPRSTRIDE(pipe),
 			  plane_state->view.color_plane[0].stride);
@@ -904,7 +904,7 @@ ivb_update_plane(struct intel_plane *plane,
 
 	ivb_update_gamma(plane_state);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -915,7 +915,7 @@ ivb_disable_plane(struct intel_plane *plane,
 	enum pipe pipe = plane->pipe;
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, SPRCTL(pipe), 0);
 	/* Disable the scaler */
@@ -923,7 +923,7 @@ ivb_disable_plane(struct intel_plane *plane,
 	intel_de_write_fw(dev_priv, SPRSCALE(pipe), 0);
 	intel_de_write_fw(dev_priv, SPRSURF(pipe), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static bool
@@ -1196,7 +1196,7 @@ g4x_update_plane(struct intel_plane *plane,
 
 	linear_offset = intel_fb_xy_to_linear(x, y, plane_state, 0);
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, DVSSTRIDE(pipe),
 			  plane_state->view.color_plane[0].stride);
@@ -1228,7 +1228,7 @@ g4x_update_plane(struct intel_plane *plane,
 	else
 		ilk_update_gamma(plane_state);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -1239,14 +1239,14 @@ g4x_disable_plane(struct intel_plane *plane,
 	enum pipe pipe = plane->pipe;
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, DVSCNTR(pipe), 0);
 	/* Disable the scaler */
 	intel_de_write_fw(dev_priv, DVSSCALE(pipe), 0);
 	intel_de_write_fw(dev_priv, DVSSURF(pipe), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static bool
diff --git a/drivers/gpu/drm/i915/display/skl_scaler.c b/drivers/gpu/drm/i915/display/skl_scaler.c
index 37eabeff8197f..9c6923d19665d 100644
--- a/drivers/gpu/drm/i915/display/skl_scaler.c
+++ b/drivers/gpu/drm/i915/display/skl_scaler.c
@@ -433,7 +433,7 @@ void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
 	ps_ctrl = skl_scaler_get_filter_select(crtc_state->hw.scaling_filter, 0);
 	ps_ctrl |= PS_SCALER_EN | scaler_state->scalers[id].mode;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	skl_scaler_setup_filter(dev_priv, pipe, id, 0,
 				crtc_state->hw.scaling_filter);
@@ -449,7 +449,7 @@ void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
 	intel_de_write_fw(dev_priv, SKL_PS_WIN_SZ(pipe, id),
 			  width << 16 | height);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 void
@@ -520,13 +520,13 @@ static void skl_detach_scaler(struct intel_crtc *crtc, int id)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, SKL_PS_CTRL(crtc->pipe, id), 0);
 	intel_de_write_fw(dev_priv, SKL_PS_WIN_POS(crtc->pipe, id), 0);
 	intel_de_write_fw(dev_priv, SKL_PS_WIN_SZ(crtc->pipe, id), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 /*
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c
index 724e7b04f3b63..81243bca715e5 100644
--- a/drivers/gpu/drm/i915/display/skl_universal_plane.c
+++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c
@@ -649,7 +649,7 @@ skl_disable_plane(struct intel_plane *plane,
 	enum pipe pipe = plane->pipe;
 	unsigned long irqflags;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	if (icl_is_hdr_plane(dev_priv, plane_id))
 		intel_de_write_fw(dev_priv, PLANE_CUS_CTL(pipe, plane_id), 0);
@@ -659,7 +659,7 @@ skl_disable_plane(struct intel_plane *plane,
 	intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), 0);
 	intel_de_write_fw(dev_priv, PLANE_SURF(pipe, plane_id), 0);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static bool
@@ -1056,7 +1056,7 @@ skl_program_plane(struct intel_plane *plane,
 		aux_dist |= skl_plane_stride(plane_state, aux_plane);
 	}
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, PLANE_STRIDE(pipe, plane_id), stride);
 	intel_de_write_fw(dev_priv, PLANE_POS(pipe, plane_id),
@@ -1116,7 +1116,7 @@ skl_program_plane(struct intel_plane *plane,
 	intel_de_write_fw(dev_priv, PLANE_SURF(pipe, plane_id),
 			  intel_plane_ggtt_offset(plane_state) + surf_addr);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
@@ -1137,13 +1137,13 @@ skl_plane_async_flip(struct intel_plane *plane,
 	if (async_flip)
 		plane_ctl |= PLANE_CTL_ASYNC_FLIP;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), plane_ctl);
 	intel_de_write_fw(dev_priv, PLANE_SURF(pipe, plane_id),
 			  intel_plane_ggtt_offset(plane_state) + surf_addr);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index de3ac58fceec3..7527937c9b131 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -152,10 +152,10 @@ void gen6_ggtt_invalidate(struct i915_ggtt *ggtt)
 {
 	struct intel_uncore *uncore = ggtt->vm.gt->uncore;
 
-	spin_lock_irq(&uncore->lock);
+	raw_spin_lock_irq(&uncore->lock);
 	intel_uncore_write_fw(uncore, GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
 	intel_uncore_read_fw(uncore, GFX_FLSH_CNTL_GEN6);
-	spin_unlock_irq(&uncore->lock);
+	raw_spin_unlock_irq(&uncore->lock);
 }
 
 static void gen8_ggtt_invalidate(struct i915_ggtt *ggtt)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 62d40c9866427..a4d93e7acbded 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -416,10 +416,10 @@ void intel_gt_flush_ggtt_writes(struct intel_gt *gt)
 	with_intel_runtime_pm_if_in_use(uncore->rpm, wakeref) {
 		unsigned long flags;
 
-		spin_lock_irqsave(&uncore->lock, flags);
+		raw_spin_lock_irqsave(&uncore->lock, flags);
 		intel_uncore_posting_read_fw(uncore,
 					     RING_HEAD(RENDER_RING_BASE));
-		spin_unlock_irqrestore(&uncore->lock, flags);
+		raw_spin_unlock_irqrestore(&uncore->lock, flags);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gt/intel_rc6.c b/drivers/gpu/drm/i915/gt/intel_rc6.c
index 799d382eea799..f27b44ea56be1 100644
--- a/drivers/gpu/drm/i915/gt/intel_rc6.c
+++ b/drivers/gpu/drm/i915/gt/intel_rc6.c
@@ -751,7 +751,7 @@ u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, const i915_reg_t reg)
 
 	fw_domains = intel_uncore_forcewake_for_reg(uncore, reg, FW_REG_READ);
 
-	spin_lock_irqsave(&uncore->lock, flags);
+	raw_spin_lock_irqsave(&uncore->lock, flags);
 	intel_uncore_forcewake_get__locked(uncore, fw_domains);
 
 	/* On VLV and CHV, residency time is in CZ units rather than 1.28us */
@@ -794,7 +794,7 @@ u64 intel_rc6_residency_ns(struct intel_rc6 *rc6, const i915_reg_t reg)
 	rc6->cur_residency[i] = time_hw;
 
 	intel_uncore_forcewake_put__locked(uncore, fw_domains);
-	spin_unlock_irqrestore(&uncore->lock, flags);
+	raw_spin_unlock_irqrestore(&uncore->lock, flags);
 
 	return mul_u64_u32_div(time_hw, mul, div);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index aae609d7d85dd..ced21d60a6153 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1255,7 +1255,7 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 
 	fw = wal_get_fw_for_rmw(uncore, wal);
 
-	spin_lock_irqsave(&uncore->lock, flags);
+	raw_spin_lock_irqsave(&uncore->lock, flags);
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
@@ -1273,7 +1273,7 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
 	}
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
-	spin_unlock_irqrestore(&uncore->lock, flags);
+	raw_spin_unlock_irqrestore(&uncore->lock, flags);
 }
 
 void intel_gt_apply_workarounds(struct intel_gt *gt)
@@ -1294,7 +1294,7 @@ static bool wa_list_verify(struct intel_gt *gt,
 
 	fw = wal_get_fw_for_rmw(uncore, wal);
 
-	spin_lock_irqsave(&uncore->lock, flags);
+	raw_spin_lock_irqsave(&uncore->lock, flags);
 	intel_uncore_forcewake_get__locked(uncore, fw);
 
 	for (i = 0, wa = wal->list; i < wal->count; i++, wa++)
@@ -1303,7 +1303,7 @@ static bool wa_list_verify(struct intel_gt *gt,
 			    wal->name, from);
 
 	intel_uncore_forcewake_put__locked(uncore, fw);
-	spin_unlock_irqrestore(&uncore->lock, flags);
+	raw_spin_unlock_irqrestore(&uncore->lock, flags);
 
 	return ok;
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 547347241a47c..5833f42eef953 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -678,7 +678,7 @@ u32 i915_get_vblank_counter(struct drm_crtc *crtc)
 	high_frame = PIPEFRAME(pipe);
 	low_frame = PIPEFRAMEPIXEL(pipe);
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	/*
	 * High & low register fields aren't synchronized, so make sure
@@ -691,7 +691,7 @@ u32 i915_get_vblank_counter(struct drm_crtc *crtc)
 		high2 = intel_de_read_fw(dev_priv, high_frame) & PIPE_FRAME_HIGH_MASK;
 	} while (high1 != high2);
 
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 
 	high1 >>= PIPE_FRAME_HIGH_SHIFT;
 	pixel = low & PIPE_PIXEL_MASK;
@@ -884,10 +884,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 	 * register reads, potentially with preemption disabled, so the
	 * following code must not block on uncore.lock.
	 */
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
-
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		preempt_disable();
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 	/* Get optional system timestamp before query. */
 	if (stime)
@@ -951,10 +948,7 @@ static bool i915_get_crtc_scanoutpos(struct drm_crtc *_crtc,
 	if (etime)
 		*etime = ktime_get();
 
-	if (IS_ENABLED(CONFIG_PREEMPT_RT))
-		preempt_enable();
-
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 
 	/*
	 * While in vblank, position will be negative
@@ -992,9 +986,9 @@ int intel_get_crtc_scanline(struct intel_crtc *crtc)
 	unsigned long irqflags;
 	int position;
 
-	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
+	raw_spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 	position = __intel_get_crtc_scanline(crtc);
-	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
+	raw_spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 
 	return position;
 }
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 0b488d49694ca..31c67eac2bcde 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -344,9 +344,9 @@ engines_sample(struct intel_gt *gt, unsigned int period_ns)
 			continue;
if (exclusive_mmio_access(i915)) { - spin_lock_irqsave(&engine->uncore->lock, flags); + raw_spin_lock_irqsave(&engine->uncore->lock, flags); engine_sample(engine, period_ns); - spin_unlock_irqrestore(&engine->uncore->lock, flags); + raw_spin_unlock_irqrestore(&engine->uncore->lock, flags); } else { engine_sample(engine, period_ns); } diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 65bc3709f54c5..c7e1892731110 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -2009,6 +2009,7 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, &crtc_state->wm.vlv.fifo_state; int sprite0_start, sprite1_start, fifo_size; u32 dsparb, dsparb2, dsparb3; + unsigned long flags;
if (!crtc_state->fifo_changed) return; @@ -2031,7 +2032,7 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state, * intel_pipe_update_start() has already disabled interrupts * for us, so a plain spin_lock() is sufficient here. */ - spin_lock(&uncore->lock); + raw_spin_lock_irqsave(&uncore->lock, flags);
switch (crtc->pipe) { case PIPE_A: @@ -2091,7 +2092,7 @@ static void vlv_atomic_update_fifo(struct intel_atomic_state *state,
intel_uncore_posting_read_fw(uncore, DSPARB);
- spin_unlock(&uncore->lock); + raw_spin_unlock_irqrestore(&uncore->lock, flags); }
#undef VLV_FIFO diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index 6b38bc2811c1b..97c750605ec02 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -39,7 +39,7 @@ void intel_uncore_mmio_debug_init_early(struct intel_uncore_mmio_debug *mmio_debug) { - spin_lock_init(&mmio_debug->lock); + raw_spin_lock_init(&mmio_debug->lock); mmio_debug->unclaimed_mmio_check = 1; }
@@ -118,7 +118,7 @@ fw_domain_arm_timer(struct intel_uncore_forcewake_domain *d) hrtimer_start_range_ns(&d->timer, NSEC_PER_MSEC, NSEC_PER_MSEC, - HRTIMER_MODE_REL); + HRTIMER_MODE_REL_HARD); }
static inline int @@ -390,7 +390,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer) if (xchg(&domain->active, false)) return HRTIMER_RESTART;
- spin_lock_irqsave(&uncore->lock, irqflags); + raw_spin_lock_irqsave(&uncore->lock, irqflags);
uncore->fw_domains_timer &= ~domain->mask;
@@ -398,7 +398,7 @@ intel_uncore_fw_release_timer(struct hrtimer *timer) if (--domain->wake_count == 0) uncore->funcs.force_wake_put(uncore, domain->mask);
- spin_unlock_irqrestore(&uncore->lock, irqflags); + raw_spin_unlock_irqrestore(&uncore->lock, irqflags);
return HRTIMER_NORESTART; } @@ -431,7 +431,7 @@ intel_uncore_forcewake_reset(struct intel_uncore *uncore) intel_uncore_fw_release_timer(&domain->timer); }
- spin_lock_irqsave(&uncore->lock, irqflags); + raw_spin_lock_irqsave(&uncore->lock, irqflags);
for_each_fw_domain(domain, uncore, tmp) { if (hrtimer_active(&domain->timer)) @@ -446,7 +446,7 @@ intel_uncore_forcewake_reset(struct intel_uncore *uncore) break; }
- spin_unlock_irqrestore(&uncore->lock, irqflags); + raw_spin_unlock_irqrestore(&uncore->lock, irqflags); cond_resched(); }
@@ -459,7 +459,7 @@ intel_uncore_forcewake_reset(struct intel_uncore *uncore) fw_domains_reset(uncore, uncore->fw_domains); assert_forcewakes_inactive(uncore);
- spin_unlock_irqrestore(&uncore->lock, irqflags); + raw_spin_unlock_irqrestore(&uncore->lock, irqflags);
return fw; /* track the lost user forcewake domains */ } @@ -561,12 +561,12 @@ static void forcewake_early_sanitize(struct intel_uncore *uncore, iosf_mbi_punit_acquire(); intel_uncore_forcewake_reset(uncore); if (restore_forcewake) { - spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); uncore->funcs.force_wake_get(uncore, restore_forcewake);
if (intel_uncore_has_fifo(uncore)) uncore->fifo_count = fifo_free_entries(uncore); - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock); } iosf_mbi_punit_release(); } @@ -649,9 +649,9 @@ void intel_uncore_forcewake_get(struct intel_uncore *uncore,
assert_rpm_wakelock_held(uncore->rpm);
- spin_lock_irqsave(&uncore->lock, irqflags); + raw_spin_lock_irqsave(&uncore->lock, irqflags); __intel_uncore_forcewake_get(uncore, fw_domains); - spin_unlock_irqrestore(&uncore->lock, irqflags); + raw_spin_unlock_irqrestore(&uncore->lock, irqflags); }
/** @@ -664,14 +664,14 @@ void intel_uncore_forcewake_get(struct intel_uncore *uncore, */ void intel_uncore_forcewake_user_get(struct intel_uncore *uncore) { - spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); if (!uncore->user_forcewake_count++) { intel_uncore_forcewake_get__locked(uncore, FORCEWAKE_ALL); - spin_lock(&uncore->debug->lock); + raw_spin_lock(&uncore->debug->lock); mmio_debug_suspend(uncore->debug); - spin_unlock(&uncore->debug->lock); + raw_spin_unlock(&uncore->debug->lock); } - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock); }
/** @@ -683,19 +683,19 @@ void intel_uncore_forcewake_user_get(struct intel_uncore *uncore) */ void intel_uncore_forcewake_user_put(struct intel_uncore *uncore) { - spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); if (!--uncore->user_forcewake_count) { - spin_lock(&uncore->debug->lock); + raw_spin_lock(&uncore->debug->lock); mmio_debug_resume(uncore->debug);
if (check_for_unclaimed_mmio(uncore)) drm_info(&uncore->i915->drm, "Invalid mmio detected during user access\n"); - spin_unlock(&uncore->debug->lock); + raw_spin_unlock(&uncore->debug->lock);
intel_uncore_forcewake_put__locked(uncore, FORCEWAKE_ALL); } - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock); }
/** @@ -753,9 +753,9 @@ void intel_uncore_forcewake_put(struct intel_uncore *uncore, if (!uncore->funcs.force_wake_put) return;
- spin_lock_irqsave(&uncore->lock, irqflags); + raw_spin_lock_irqsave(&uncore->lock, irqflags); __intel_uncore_forcewake_put(uncore, fw_domains); - spin_unlock_irqrestore(&uncore->lock, irqflags); + raw_spin_unlock_irqrestore(&uncore->lock, irqflags); }
/** @@ -821,7 +821,7 @@ void assert_forcewakes_active(struct intel_uncore *uncore, if (!uncore->funcs.force_wake_get) return;
- spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock);
assert_rpm_wakelock_held(uncore->rpm);
@@ -847,7 +847,7 @@ void assert_forcewakes_active(struct intel_uncore *uncore, break; }
- spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock); }
/* We give fast paths for the really cool registers */ @@ -1520,12 +1520,12 @@ unclaimed_reg_debug(struct intel_uncore *uncore, lockdep_assert_held(&uncore->lock);
if (before) - spin_lock(&uncore->debug->lock); + raw_spin_lock(&uncore->debug->lock);
__unclaimed_reg_debug(uncore, reg, read, before);
if (!before) - spin_unlock(&uncore->debug->lock); + raw_spin_unlock(&uncore->debug->lock); }
#define __vgpu_read(x) \ @@ -1585,12 +1585,12 @@ __gen2_read(64) unsigned long irqflags; \ u##x val = 0; \ assert_rpm_wakelock_held(uncore->rpm); \ - spin_lock_irqsave(&uncore->lock, irqflags); \ + raw_spin_lock_irqsave(&uncore->lock, irqflags); \ unclaimed_reg_debug(uncore, reg, true, true)
#define GEN6_READ_FOOTER \ unclaimed_reg_debug(uncore, reg, true, false); \ - spin_unlock_irqrestore(&uncore->lock, irqflags); \ + raw_spin_unlock_irqrestore(&uncore->lock, irqflags); \ trace_i915_reg_rw(false, reg, val, sizeof(val), trace); \ return val
@@ -1694,12 +1694,12 @@ __gen2_write(32) unsigned long irqflags; \ trace_i915_reg_rw(true, reg, val, sizeof(val), trace); \ assert_rpm_wakelock_held(uncore->rpm); \ - spin_lock_irqsave(&uncore->lock, irqflags); \ + raw_spin_lock_irqsave(&uncore->lock, irqflags); \ unclaimed_reg_debug(uncore, reg, false, true)
#define GEN6_WRITE_FOOTER \ unclaimed_reg_debug(uncore, reg, false, false); \ - spin_unlock_irqrestore(&uncore->lock, irqflags) + raw_spin_unlock_irqrestore(&uncore->lock, irqflags)
#define __gen6_write(x) \ static void \ @@ -1828,7 +1828,7 @@ static int __fw_domain_init(struct intel_uncore *uncore,
d->mask = BIT(domain_id);
- hrtimer_init(&d->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); + hrtimer_init(&d->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD); d->timer.function = intel_uncore_fw_release_timer;
uncore->fw_domains |= BIT(domain_id); @@ -1961,11 +1961,11 @@ static int intel_uncore_fw_domains_init(struct intel_uncore *uncore) if (ret) goto out;
- spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); fw_domains_get_with_thread_status(uncore, FORCEWAKE_RENDER); ecobus = __raw_uncore_read32(uncore, ECOBUS); fw_domains_put(uncore, FORCEWAKE_RENDER); - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock);
if (!(ecobus & FORCEWAKE_MT_ENABLE)) { drm_info(&i915->drm, "No MT forcewake available on Ivybridge, this can result in issues\n"); @@ -2077,7 +2077,7 @@ static void uncore_mmio_cleanup(struct intel_uncore *uncore) void intel_uncore_init_early(struct intel_uncore *uncore, struct drm_i915_private *i915) { - spin_lock_init(&uncore->lock); + raw_spin_lock_init(&uncore->lock); uncore->i915 = i915; uncore->rpm = &i915->runtime_pm; uncore->debug = &i915->mmio_debug; @@ -2438,7 +2438,7 @@ int __intel_wait_for_register(struct intel_uncore *uncore,
might_sleep_if(slow_timeout_ms);
- spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); intel_uncore_forcewake_get__locked(uncore, fw);
ret = __intel_wait_for_register_fw(uncore, @@ -2446,7 +2446,7 @@ int __intel_wait_for_register(struct intel_uncore *uncore, fast_timeout_us, 0, ®_value);
intel_uncore_forcewake_put__locked(uncore, fw); - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock);
if (ret && slow_timeout_ms) ret = __wait_for(reg_value = intel_uncore_read_notrace(uncore, @@ -2467,9 +2467,9 @@ bool intel_uncore_unclaimed_mmio(struct intel_uncore *uncore) { bool ret;
- spin_lock_irq(&uncore->debug->lock); + raw_spin_lock_irq(&uncore->debug->lock); ret = check_for_unclaimed_mmio(uncore); - spin_unlock_irq(&uncore->debug->lock); + raw_spin_unlock_irq(&uncore->debug->lock);
return ret; } @@ -2479,7 +2479,7 @@ intel_uncore_arm_unclaimed_mmio_detection(struct intel_uncore *uncore) { bool ret = false;
- spin_lock_irq(&uncore->debug->lock); + raw_spin_lock_irq(&uncore->debug->lock);
if (unlikely(uncore->debug->unclaimed_mmio_check <= 0)) goto out; @@ -2497,7 +2497,7 @@ intel_uncore_arm_unclaimed_mmio_detection(struct intel_uncore *uncore) }
out: - spin_unlock_irq(&uncore->debug->lock); + raw_spin_unlock_irq(&uncore->debug->lock);
return ret; } @@ -2582,13 +2582,13 @@ u32 intel_uncore_read_with_mcr_steering(struct intel_uncore *uncore, GEN8_MCR_SELECTOR, FW_REG_READ | FW_REG_WRITE);
- spin_lock_irq(&uncore->lock); + raw_spin_lock_irq(&uncore->lock); intel_uncore_forcewake_get__locked(uncore, fw_domains);
val = intel_uncore_read_with_mcr_steering_fw(uncore, reg, slice, subslice);
intel_uncore_forcewake_put__locked(uncore, fw_domains); - spin_unlock_irq(&uncore->lock); + raw_spin_unlock_irq(&uncore->lock);
return val; } diff --git a/drivers/gpu/drm/i915/intel_uncore.h b/drivers/gpu/drm/i915/intel_uncore.h index 3c0b0a8b5250d..83062ce85aee8 100644 --- a/drivers/gpu/drm/i915/intel_uncore.h +++ b/drivers/gpu/drm/i915/intel_uncore.h @@ -38,7 +38,7 @@ struct intel_uncore; struct intel_gt;
struct intel_uncore_mmio_debug { - spinlock_t lock; /** lock is also taken in irq contexts. */ + raw_spinlock_t lock; /** lock is also taken in irq contexts. */ int unclaimed_mmio_check; int saved_mmio_check; u32 suspend_count; @@ -125,7 +125,7 @@ struct intel_uncore { struct drm_i915_private *i915; struct intel_runtime_pm *rpm;
- spinlock_t lock; /** lock is also taken in irq contexts. */ + raw_spinlock_t lock; /** lock is also taken in irq contexts. */
unsigned int flags; #define UNCORE_HAS_FORCEWAKE BIT(0)
The order of the header files is important. If this header file is included after tracepoint.h was included then the NOTRACE check here becomes a nop. Currently this happens for two .c files which use the tracepoints behind DRM_I915_LOW_LEVEL_TRACEPOINTS.
Cc: Steven Rostedt rostedt@goodmis.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de --- drivers/gpu/drm/i915/i915_trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 773e7362c4442..5ff6c0a37fdfa 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -826,7 +826,7 @@ DEFINE_EVENT(i915_request, i915_request_add, TP_ARGS(rq) );
-#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS) +#if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS) && !defined(NOTRACE) DEFINE_EVENT(i915_request, i915_request_guc_submit, TP_PROTO(struct i915_request *rq), TP_ARGS(rq)
Disabling interrupts and invoking the irq_work function directly breaks on PREEMPT_RT. PREEMPT_RT does not invoke all irq_work from hardirq context because some of the users have spinlock_t locking in the callback function. On PREEMPT_RT these locks are turned into sleeping locks, which cannot be acquired with disabled interrupts.
Using irq_work_queue() has the benefit that the irq_work will be invoked in its regular context. In general there is "no" delay between enqueuing the callback and its invocation, because the interrupt is raised right away on architectures which support it (which includes x86).
Use irq_work_queue() + irq_work_sync() instead of invoking the callback directly.
Reported-by: Clark Williams williams@redhat.com Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 209cf265bf746..6e1b9068d944c 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -311,10 +311,9 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b) /* Kick the work once more to drain the signalers, and disarm the irq */ irq_work_sync(&b->irq_work); while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) { - local_irq_disable(); - signal_irq_work(&b->irq_work); - local_irq_enable(); + irq_work_queue(&b->irq_work); cond_resched(); + irq_work_sync(&b->irq_work); } }
execlists_dequeue() is invoked from a function which uses local_irq_disable() to disable interrupts, so the spin_lock() behaves like spin_lock_irq(). This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not the same as spin_lock_irq() there.

execlists_dequeue_irq() and execlists_dequeue() each have only one caller. If intel_engine_cs::active::lock is acquired and released with the _irq suffix then it behaves almost as if execlists_dequeue() were invoked with disabled interrupts. The difference is the last part of the function, which is then invoked with enabled interrupts. I can't tell if this makes a difference. From looking at it, it might work to move the last unlock to the end of the function, as I didn't find anything that would acquire the lock again.
Reported-by: Clark Williams williams@redhat.com Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com --- .../drm/i915/gt/intel_execlists_submission.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index de5f9c86b9a44..dbf44f9567449 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -1283,7 +1283,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * and context switches) submission. */
- spin_lock(&sched_engine->lock); + spin_lock_irq(&sched_engine->lock);
/* * If the queue is higher priority than the last @@ -1383,7 +1383,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * Even if ELSP[1] is occupied and not worthy * of timeslices, our queue might be. */ - spin_unlock(&sched_engine->lock); + spin_unlock_irq(&sched_engine->lock); return; } } @@ -1409,7 +1409,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
if (last && !can_merge_rq(last, rq)) { spin_unlock(&ve->base.sched_engine->lock); - spin_unlock(&engine->sched_engine->lock); + spin_unlock_irq(&engine->sched_engine->lock); return; /* leave this for another sibling */ }
@@ -1571,7 +1571,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) */ sched_engine->queue_priority_hint = queue_prio(sched_engine); i915_sched_engine_reset_on_empty(sched_engine); - spin_unlock(&sched_engine->lock); + spin_unlock_irq(&sched_engine->lock);
/* * We can skip poking the HW if we ended up with exactly the same set @@ -1597,13 +1597,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } }
-static void execlists_dequeue_irq(struct intel_engine_cs *engine) -{ - local_irq_disable(); /* Suspend interrupts across request submission */ - execlists_dequeue(engine); - local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ -} - static void clear_ports(struct i915_request **ports, int count) { memset_p((void **)ports, NULL, count); @@ -2427,7 +2420,7 @@ static void execlists_submission_tasklet(struct tasklet_struct *t) }
if (!engine->execlists.pending[0]) { - execlists_dequeue_irq(engine); + execlists_dequeue(engine); start_timeslice(engine); }
The !irqs_disabled() check triggers on PREEMPT_RT even with i915_sched_engine::lock acquired. The reason is that the lock is transformed into a sleeping lock on PREEMPT_RT and does not disable interrupts.

There is no need to check for disabled interrupts: the lockdep annotation below already checks that the lock has been acquired by the caller, and lockdep will yell if interrupts are not disabled where that is required.
Remove the !irqs_disabled() check.
Reported-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de --- drivers/gpu/drm/i915/i915_request.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 79da5eca60af5..b9dd6100c6d17 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -559,7 +559,6 @@ bool __i915_request_submit(struct i915_request *request)
RQ_TRACE(request, "\n");
- GEM_BUG_ON(!irqs_disabled()); lockdep_assert_held(&engine->sched_engine->lock);
/* @@ -668,7 +667,6 @@ void __i915_request_unsubmit(struct i915_request *request) */ RQ_TRACE(request, "\n");
- GEM_BUG_ON(!irqs_disabled()); lockdep_assert_held(&engine->sched_engine->lock);
/*
This is a revert of commits d67739268cf0e ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") 6c69a45445af9 ("drm/i915/gt: Mark context->active_count as protected by timeline->mutex")
The existing code leads to a different behaviour depending on whether lockdep is enabled or not. Any following lock that is acquired without disabling interrupts (but needs to) will not be noticed by lockdep.

This is not just a lockdep annotation: timeline->mutex is an actual mutex that is normally used as a lock, but in the case of __timeline_mark_lock() lockdep is only told that it is acquired while no lock has actually been taken.

It appears that its purpose is just to satisfy the lockdep_assert_held() check in intel_context_mark_active(). The other problem is that interrupts are disabled across the section, which on PREEMPT_RT leads to problems, for instance later during memory allocation.
Add an argument to intel_context_mark_active() which is true if the lock must have been acquired, false if other magic is involved and the lock is not needed. Use the `false' argument only from within switch_to_kernel_context() and remove __timeline_mark_lock().
Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de --- drivers/gpu/drm/i915/gt/intel_context.h | 6 ++- .../gpu/drm/i915/gt/intel_engine_heartbeat.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_pm.c | 38 +------------------ drivers/gpu/drm/i915/i915_request.c | 7 ++-- drivers/gpu/drm/i915/i915_request.h | 3 +- 5 files changed, 12 insertions(+), 44 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index c410989507466..bed60dff93eff 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -161,9 +161,11 @@ static inline void intel_context_enter(struct intel_context *ce) ce->ops->enter(ce); }
-static inline void intel_context_mark_active(struct intel_context *ce) +static inline void intel_context_mark_active(struct intel_context *ce, + bool timeline_mutex_needed) { - lockdep_assert_held(&ce->timeline->mutex); + if (timeline_mutex_needed) + lockdep_assert_held(&ce->timeline->mutex); ++ce->active_count; }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c index 74775ae961b2b..02c0ab9fbb4b8 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_heartbeat.c @@ -42,7 +42,7 @@ heartbeat_create(struct intel_context *ce, gfp_t gfp) struct i915_request *rq;
intel_context_enter(ce); - rq = __i915_request_create(ce, gfp); + rq = __i915_request_create(ce, gfp, true); intel_context_exit(ce);
return rq; diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 1f07ac4e0672a..d75638d1d561e 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -80,39 +80,6 @@ static int __engine_unpark(struct intel_wakeref *wf) return 0; }
-#if IS_ENABLED(CONFIG_LOCKDEP) - -static unsigned long __timeline_mark_lock(struct intel_context *ce) -{ - unsigned long flags; - - local_irq_save(flags); - mutex_acquire(&ce->timeline->mutex.dep_map, 2, 0, _THIS_IP_); - - return flags; -} - -static void __timeline_mark_unlock(struct intel_context *ce, - unsigned long flags) -{ - mutex_release(&ce->timeline->mutex.dep_map, _THIS_IP_); - local_irq_restore(flags); -} - -#else - -static unsigned long __timeline_mark_lock(struct intel_context *ce) -{ - return 0; -} - -static void __timeline_mark_unlock(struct intel_context *ce, - unsigned long flags) -{ -} - -#endif /* !IS_ENABLED(CONFIG_LOCKDEP) */ - static void duration(struct dma_fence *fence, struct dma_fence_cb *cb) { struct i915_request *rq = to_request(fence); @@ -159,7 +126,6 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) { struct intel_context *ce = engine->kernel_context; struct i915_request *rq; - unsigned long flags; bool result = true;
/* GPU is pointing to the void, as good as in the kernel context. */ @@ -201,10 +167,9 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine) * engine->wakeref.count, we may see the request completion and retire * it causing an underflow of the engine->wakeref. */ - flags = __timeline_mark_lock(ce); GEM_BUG_ON(atomic_read(&ce->timeline->active_count) < 0);
- rq = __i915_request_create(ce, GFP_NOWAIT); + rq = __i915_request_create(ce, GFP_NOWAIT, false); if (IS_ERR(rq)) /* Context switch failed, hope for the best! Maybe reset? */ goto out_unlock; @@ -233,7 +198,6 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
result = false; out_unlock: - __timeline_mark_unlock(ce, flags); return result; }
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index b9dd6100c6d17..3bab8f651b4e7 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -833,7 +833,8 @@ static void __i915_request_ctor(void *arg) }
struct i915_request * -__i915_request_create(struct intel_context *ce, gfp_t gfp) +__i915_request_create(struct intel_context *ce, gfp_t gfp, + bool timeline_mutex_needed) { struct intel_timeline *tl = ce->timeline; struct i915_request *rq; @@ -957,7 +958,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
rq->infix = rq->ring->emit; /* end of header; start of user payload */
- intel_context_mark_active(ce); + intel_context_mark_active(ce, timeline_mutex_needed); list_add_tail_rcu(&rq->link, &tl->requests);
return rq; @@ -993,7 +994,7 @@ i915_request_create(struct intel_context *ce) i915_request_retire(rq);
intel_context_enter(ce); - rq = __i915_request_create(ce, GFP_KERNEL); + rq = __i915_request_create(ce, GFP_KERNEL, true); intel_context_exit(ce); /* active reference transferred to request */ if (IS_ERR(rq)) goto err_unlock; diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h index 1bc1349ba3c25..ba1ced79c8d2c 100644 --- a/drivers/gpu/drm/i915/i915_request.h +++ b/drivers/gpu/drm/i915/i915_request.h @@ -320,7 +320,8 @@ static inline bool dma_fence_is_i915(const struct dma_fence *fence) struct kmem_cache *i915_request_slab_cache(void);
struct i915_request * __must_check -__i915_request_create(struct intel_context *ce, gfp_t gfp); +__i915_request_create(struct intel_context *ce, gfp_t gfp, + bool timeline_mutex_needed); struct i915_request * __must_check i915_request_create(struct intel_context *ce);
On Tue, Oct 05, 2021 at 05:00:46PM +0200, Sebastian Andrzej Siewior wrote:
This is a revert of commits d67739268cf0e ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") 6c69a45445af9 ("drm/i915/gt: Mark context->active_count as protected by timeline->mutex")
The existing code leads to a different behaviour depending on whether lockdep is enabled or not. Any following lock that is acquired without disabling interrupts (but needs to) will not be noticed by lockdep.

This is not just a lockdep annotation: timeline->mutex is an actual mutex that is normally used as a lock, but in the case of __timeline_mark_lock() lockdep is only told that it is acquired while no lock has actually been taken.

It appears that its purpose is just to satisfy the lockdep_assert_held() check in intel_context_mark_active(). The other problem is that interrupts are disabled across the section, which on PREEMPT_RT leads to problems, for instance later during memory allocation.
Add an argument to intel_context_mark_active() which is true if the lock must have been acquired, false if other magic is involved and the lock is not needed. Use the `false' argument only from within switch_to_kernel_context() and remove __timeline_mark_lock().
Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de
Eeew, nice find.
-static inline void intel_context_mark_active(struct intel_context *ce)
+static inline void intel_context_mark_active(struct intel_context *ce,
+					     bool timeline_mutex_needed)
 {
-	lockdep_assert_held(&ce->timeline->mutex);
+	if (timeline_mutex_needed)
+		lockdep_assert_held(&ce->timeline->mutex);
 	++ce->active_count;
 }
Chris, might it be possible to write that something like:
lockdep_assert(lockdep_is_held(&ce->timeline->mutex) || engine_is_parked(ce));
instead?
Otherwise,
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
On 2021-10-05 21:16:17 [+0200], Peter Zijlstra wrote:
-static inline void intel_context_mark_active(struct intel_context *ce)
+static inline void intel_context_mark_active(struct intel_context *ce,
+					     bool timeline_mutex_needed)
 {
-	lockdep_assert_held(&ce->timeline->mutex);
+	if (timeline_mutex_needed)
+		lockdep_assert_held(&ce->timeline->mutex);
 	++ce->active_count;
 }
Chris, might it be possible to write that something like:
lockdep_assert(lockdep_is_held(&ce->timeline->mutex) || engine_is_parked(ce));
instead?
This looks indeed way better, given Torvalds' yelling in similar cases.
Otherwise,
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org
Sebastian
Greeting,
FYI, we noticed the following commit (built with gcc-9):
commit: 511e5fb0c3ab42d4897dcc5c0fbd1765e8a3d622 ("[PATCH 8/8] drm/i915: Don't disable interrupts and pretend a lock as been acquired in __timeline_mark_lock().") url: https://github.com/0day-ci/linux/commits/Sebastian-Andrzej-Siewior/drm-i915-... base: git://anongit.freedesktop.org/drm-intel for-linux-next
in testcase: kernel-selftests version: kernel-selftests-x86_64-c8c9111a-1_20210929 with following parameters:
group: x86 ucode: 0xde
test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel. test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
on test machine: 4 threads 1 sockets Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz with 32G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag Reported-by: kernel test robot oliver.sang@intel.com
kern :warn : [ 22.629411] =============================
kern :warn : [ 22.629781] WARNING: suspicious RCU usage
kern :warn : [ 22.630161] 5.15.0-rc1-00267-g511e5fb0c3ab #1 Not tainted
kern :warn : [ 22.630629] -----------------------------
kern :warn : [ 22.630987] drivers/gpu/drm/i915/i915_request.h:612 suspicious rcu_dereference_protected() usage!
kern :warn : [ 22.631697] other info that might help us debug this:

kern :warn : [ 22.632455] rcu_scheduler_active = 2, debug_locks = 1
kern :warn : [ 22.633068] 2 locks held by systemd-udevd/220:
kern :warn : [ 22.633456] #0: ffff8881012e7248 (&dev->mutex){....}-{3:3}, at: __driver_attach+0x93/0x180
kern :warn : [ 22.634148] #1: ffff8881032a21e0 (wakeref.mutex#3){+.+.}-{3:3}, at: __intel_wakeref_put_last+0x1b/0x80 [i915]
kern :warn : [ 22.635067] stack backtrace:
kern :warn : [ 22.635515] CPU: 0 PID: 220 Comm: systemd-udevd Not tainted 5.15.0-rc1-00267-g511e5fb0c3ab #1
kern :warn : [ 22.636184] Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0067.2018.0814.1500 08/14/2018
kern :warn : [ 22.636982] Call Trace:
kern :warn : [ 22.637225]  dump_stack_lvl+0x45/0x59
kern :warn : [ 22.637556]  __i915_request_commit+0x158/0x1c0 [i915]
kern :warn : [ 22.638103]  __engine_park+0x98/0x280 [i915]
kern :warn : [ 22.638575]  ? mutex_trylock+0x141/0x180
kern :warn : [ 22.638921]  ? __intel_wakeref_put_last+0x1b/0x80 [i915]
kern :warn : [ 22.639467]  ____intel_wakeref_put_last+0x1e/0x80 [i915]
kern :warn : [ 22.640025]  intel_gt_resume+0x244/0x2c0 [i915]
kern :warn : [ 22.640521]  intel_gt_init+0x146/0x300 [i915]
kern :warn : [ 22.640999]  i915_gem_init+0x129/0x1c0 [i915]
kern :warn : [ 22.641490]  i915_driver_probe+0x1b5/0x440 [i915]
kern :warn : [ 22.641997]  i915_pci_probe+0x54/0x140 [i915]
kern :warn : [ 22.642471]  local_pci_probe+0x45/0x80
kern :info : [ 22.642479] raid6: sse2x2 gen() 10681 MB/s
kern :warn : [ 22.642823]  pci_device_probe+0x16b/0x200
kern :warn : [ 22.643532]  ? sysfs_do_create_link_sd+0x69/0x100
kern :warn : [ 22.643994]  really_probe+0xc0/0x380
kern :warn : [ 22.644370]  __driver_probe_device+0xfe/0x180
kern :warn : [ 22.644758]  driver_probe_device+0x1e/0xc0
kern :warn : [ 22.645127]  __driver_attach+0x9e/0x180
kern :warn : [ 22.645473]  ? __device_attach_driver+0x100/0x100
kern :warn : [ 22.645880]  ? __device_attach_driver+0x100/0x100
kern :warn : [ 22.646283]  bus_for_each_dev+0x7b/0xc0
kern :warn : [ 22.646630]  bus_add_driver+0x150/0x200
kern :warn : [ 22.646972]  driver_register+0x6c/0xc0
kern :warn : [ 22.647306]  i915_init+0x37/0xf4 [i915]
kern :warn : [ 22.647745]  ? 0xffffffffc0800000
kern :warn : [ 22.648068]  do_one_initcall+0x5b/0x340
kern :warn : [ 22.648406]  ? do_init_module+0x23/0x280
kern :warn : [ 22.648754]  ? kmem_cache_alloc_trace+0x520/0x600
kern :warn : [ 22.649153]  ? lock_is_held_type+0xd5/0x140
kern :warn : [ 22.649521]  do_init_module+0x5c/0x280
kern :warn : [ 22.649855]  load_module+0x11b7/0x1540
kern :warn : [ 22.650214]  ? __do_sys_finit_module+0xae/0x140
kern :warn : [ 22.650599]  __do_sys_finit_module+0xae/0x140
kern :warn : [ 22.650991]  do_syscall_64+0x5c/0x80
kern :warn : [ 22.651313]  ? lockdep_hardirqs_on+0x79/0x100
kern :warn : [ 22.651688]  ? do_syscall_64+0x69/0x80
kern :warn : [ 22.652019]  ? do_syscall_64+0x69/0x80
kern :warn : [ 22.652351]  ? lockdep_hardirqs_on+0x79/0x100
kern :warn : [ 22.652724]  ? do_syscall_64+0x69/0x80
kern :warn : [ 22.653055]  ? asm_exc_page_fault+0x8/0x30
kern :warn : [ 22.653414]  ? lockdep_hardirqs_on+0x79/0x100
kern :warn : [ 22.653789]  entry_SYSCALL_64_after_hwframe+0x44/0xae
kern :warn : [ 22.654213] RIP: 0033:0x7f29cde5cf59
kern :warn : [ 22.654534] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 07 6f 0c 00 f7 d8 64 89 01 48
kern :warn : [ 22.655914] RSP: 002b:00007ffe23342518 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
kern :warn : [ 22.656516] RAX: ffffffffffffffda RBX: 0000561851c9c110 RCX: 00007f29cde5cf59
kern :warn : [ 22.657083] RDX: 0000000000000000 RSI: 00007f29cdd61cad RDI: 0000000000000010
kern :warn : [ 22.657649] RBP: 00007f29cdd61cad R08: 0000000000000000 R09: 0000000000000000
kern :warn : [ 22.658220] R10: 0000000000000010 R11: 0000000000000246 R12: 0000000000000000
kern :warn : [ 22.658787] R13: 0000561851c93f00 R14: 0000000000020000 R15: 0000561851c9c110
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml           # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# If you come across any failure that blocks the test,
# please remove ~/.lkp and the /lkp dir to run from a clean state.
---
0DAY/LKP+ Test Infrastructure
Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org
Intel Corporation
Thanks, Oliver Sang
dri-devel@lists.freedesktop.org