Re: [PATCH v4 5/5] drm/i915: Solve the GPU reset vs. modeset deadlocks with an rw_semaphore

30 Jun 2017


      Quoting Daniel Vetter (2017-06-30 16:30:59)
...
On Fri, Jun 30, 2017 at 04:53:19PM +0300, Ville Syrjälä wrote:
...
On Fri, Jun 30, 2017 at 03:35:03PM +0200, Daniel Vetter wrote:
...
On Thu, Jun 29, 2017 at 10:26:08PM +0300, Ville Syrjälä wrote:
...
On Thu, Jun 29, 2017 at 06:57:30PM +0100, Chris Wilson wrote:
...
Quoting ville.syrjala@linux.intel.com (2017-06-29 15:36:42)
...
From: Ville Syrjälä ville.syrjala@linux.intel.com
Introduce an rw_semaphore to protect the display commits. All normal
commits use down_read() and hence can proceed in parallel, but GPU reset
will use down_write() making sure no other commits are in progress when
we have to pull the plug on the display engine on pre-g4x platforms.
There are no modeset/gem locks taken inside __intel_atomic_commit_tail()
itself, and we wait for all dependencies before the down_read(), and
thus there is no chance of deadlocks with this scheme.
This matches what I thought should be done (I didn't think of using
rwsem just a mutex, nice touch). The point I got stuck on was what
should be done after the reset? I expected another modeset to return the
state back or otherwise the inflight would get confused?
I guess that can happen. For instance, if we have a crtc_enable() in flight,
and then we do a reset before it gets committed we would end up doing
crtc_enable() twice in a row without a crtc_disable in between. For page
flips and such this shouldn't be a big deal in general.
atomic commits are ordered. You have to wait for the previous ones to
complete before you do a new one. If you don't do that, then all hell
breaks loose.
What we're effectively doing here is inserting two new commits in
the middle of the timeline, one to disable everything, and another
one to re-enable everything. At the end of the the re-enable we would
want the hardware state should match exactly what was there before
the disable, hence any commits still in the timeline should work
correctly. That is, their old_state will match the hardware state
when it comes time to commit them.
But we can only do that properly after we start to track the committed
state. Without that tracking we can get into trouble wrt. the
hardware state not matching the old state when it comes time to
commit the new state.
Yeah, but my point is that this here is an extremely fancy and fragile
(and afaics in this form, incomplete) fix for something that in the past
was fixed much, much simpler. Why do we need this massive amount of
complexity now? Who's asking for all this (we don't even have anyone yet
asking for fully queued atomic commits, much less on gen4)?
It was never "fixed", it was always borked; broken by design.
-Chris

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH v4 5/5] drm/i915: Solve the GPU reset vs. modeset deadlocks with an rw_semaphore