OMAP DSS hardware supports changing the output port to which an overlay manager's video stream goes. For example, DPI video stream can come from any of the four overlay managers on OMAP5.
However, as it's difficult to manage the change in the driver, the omapdss driver does not support that at the moment, and has a hardcoded overlay manager per output.
omapdrm, on the other hand, uses the hardware features to find out which overlay manager to use for an output, which causes problems. For example, on OMAP5, omapdrm tries to use DIGIT overlay manager for DPI output, instead of the LCD3 required by the omapdss driver.
This patch changes the omapdrm to use the omapdss driver's hardcoded overlay managers, which fixes the issue.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_drv.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c index 002b9721e85a..26fda74c1e48 100644 --- a/drivers/gpu/drm/omapdrm/omap_drv.c +++ b/drivers/gpu/drm/omapdrm/omap_drv.c @@ -286,14 +286,13 @@ static int omap_modeset_init(struct drm_device *dev) for (id = 0; id < priv->num_crtcs; id++) { struct drm_crtc *crtc = priv->crtcs[id]; enum omap_channel crtc_channel; - enum omap_dss_output_id supported_outputs;
crtc_channel = omap_crtc_channel(crtc); - supported_outputs = - dss_feat_get_supported_outputs(crtc_channel);
- if (supported_outputs & output->id) + if (output->dispc_channel == crtc_channel) { encoder->possible_crtcs |= (1 << id); + break; + } }
omap_dss_put_device(output);
The DRM documentation says:
"If a page flip is already pending, the page_flip operation must return -EBUSY."
Currently omapdrm returns -EINVAL instead. Fix omapdrm by returning -EBUSY.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_crtc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 2d28dc337cfb..0812b0f80167 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -360,7 +360,7 @@ static int omap_crtc_page_flip_locked(struct drm_crtc *crtc, if (omap_crtc->old_fb) { spin_unlock_irqrestore(&dev->event_lock, flags); dev_err(dev->dev, "already a pending flip\n"); - return -EINVAL; + return -EBUSY; }
omap_crtc->event = event;
omap_crtc's apply_worker does:
omap_irq_register(dev, &omap_crtc->apply_irq); dispc_mgr_go(channel);
This is supposed to enable the vsync irq, and set the GO bit. The vsync handler will later check if the HW has cleared the GO bit and queue apply work.
However, what may happen is that the vsync irq is enabled, and it gets ran before the GO bit is set by the apply_worker. In this case the vsync handler will see the GO bit as cleared, and queues the apply work too early.
This causes WARN'ings from dispc driver, and possibly other problems.
This patch fixes the issue by adding a new variable, 'go_bit_set' which is used to track the supposed state of GO bit and helps the apply worker and irq handler handle the job without a race.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_crtc.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 0812b0f80167..193979f97bdb 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -47,6 +47,9 @@ struct omap_crtc { bool enabled; bool full_update;
+ /* tracks the state of GO bit between irq handler and apply worker */ + bool go_bit_set; + struct omap_drm_apply apply;
struct omap_drm_irq apply_irq; @@ -442,11 +445,16 @@ static void omap_crtc_apply_irq(struct omap_drm_irq *irq, uint32_t irqstatus) if (!omap_crtc->error_irq.registered) __omap_irq_register(crtc->dev, &omap_crtc->error_irq);
- if (!dispc_mgr_go_busy(omap_crtc->channel)) { + /* make sure we see the most recent 'go_bit_set' */ + rmb(); + if (omap_crtc->go_bit_set && !dispc_mgr_go_busy(omap_crtc->channel)) { struct omap_drm_private *priv = crtc->dev->dev_private; DBG("%s: apply done", omap_crtc->name); __omap_irq_unregister(crtc->dev, &omap_crtc->apply_irq); + omap_crtc->go_bit_set = false; + /* make sure apple_worker sees 'go_bit_set = false' */ + wmb(); queue_work(priv->wq, &omap_crtc->apply_work); } } @@ -472,7 +480,9 @@ static void apply_worker(struct work_struct *work) * If we are still pending a previous update, wait.. when the * pending update completes, we get kicked again. */ - if (omap_crtc->apply_irq.registered) + /* make sure we see the most recent 'go_bit_set' */ + rmb(); + if (omap_crtc->go_bit_set) goto out;
/* finish up previous apply's: */ @@ -502,6 +512,9 @@ static void apply_worker(struct work_struct *work) if (dispc_mgr_is_enabled(channel)) { omap_irq_register(dev, &omap_crtc->apply_irq); dispc_mgr_go(channel); + omap_crtc->go_bit_set = true; + /* make sure the irq handler sees 'go_bit_set' */ + wmb(); } else { struct omap_drm_private *priv = dev->dev_private; queue_work(priv->wq, &omap_crtc->apply_work);
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc); + struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON); + + drm_modeset_unlock_all(dev); + omap_crtc_flush(crtc); + drm_modeset_lock_all(dev); }
static int omap_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Cheers, Daniel
- omap_crtc_flush(crtc);
- drm_modeset_lock_all(dev);
}
static int omap_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
1.9.1
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On Wed, Sep 3, 2014 at 10:25 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
what about just dropping and re-acquiring crtc->mutex, since that is all the worker actually needs..
it would be a bigger task to get rid of the mutex on the worker, since we'd have to double buffer state. It seemed like rather than re-invent atomic, it would be better just to wait for atomic to land and then restructure the driver
BR, -R
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Cheers, Daniel
omap_crtc_flush(crtc);
drm_modeset_lock_all(dev);
}
static int omap_crtc_mode_set_base(struct drm_crtc *crtc, int x, int y,
1.9.1
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Wed, Sep 03, 2014 at 11:05:08AM -0400, Rob Clark wrote:
On Wed, Sep 3, 2014 at 10:25 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
what about just dropping and re-acquiring crtc->mutex, since that is all the worker actually needs..
it would be a bigger task to get rid of the mutex on the worker, since we'd have to double buffer state. It seemed like rather than re-invent atomic, it would be better just to wait for atomic to land and then restructure the driver
Well atomic kinda has the same idea to not take locks in the worker and instead relying on ordering. -Daniel
On Wed, Sep 3, 2014 at 11:21 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Sep 03, 2014 at 11:05:08AM -0400, Rob Clark wrote:
On Wed, Sep 3, 2014 at 10:25 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
what about just dropping and re-acquiring crtc->mutex, since that is all the worker actually needs..
it would be a bigger task to get rid of the mutex on the worker, since we'd have to double buffer state. It seemed like rather than re-invent atomic, it would be better just to wait for atomic to land and then restructure the driver
Well atomic kinda has the same idea to not take locks in the worker and instead relying on ordering.
so, I am not quite sure that would work.. (and maybe this is an extra thing to think about for atomic).
The problem is that hw has no hw cursor. So cursor is implemented using plane. But thanks to retarded userspace that rapidly enables/disables cursor in some cases, we really need plane updates to be async. The way it currently works is userspace thread updates plane state as rapidly as it wants. When worker eventually gets to run, it takes the most current state, possibly skipping 1000's of enable/disable that happened in the mean time.
But for that to work, at least the way the driver is structured currently, the worker needs to synchronize against userspace somehow.
BR, -R
-Daniel
Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Wed, Sep 3, 2014 at 6:00 PM, Rob Clark robdclark@gmail.com wrote:
so, I am not quite sure that would work.. (and maybe this is an extra thing to think about for atomic).
The problem is that hw has no hw cursor. So cursor is implemented using plane. But thanks to retarded userspace that rapidly enables/disables cursor in some cases, we really need plane updates to be async. The way it currently works is userspace thread updates plane state as rapidly as it wants. When worker eventually gets to run, it takes the most current state, possibly skipping 1000's of enable/disable that happened in the mean time.
But for that to work, at least the way the driver is structured currently, the worker needs to synchronize against userspace somehow.
We could cancel the async worker before the update. But that still leaves the problem that in general this needs full driver support to work, and we can't really assume that. I guess we need to make a special hack for the cursor emulation code to make sure this keeps working: Cursor updates would only update the sw state and then rearm a fully async cursor update work item, except when the cursor size changed since the driver might reject that. I need to think more about this. -Daniel
On 03/09/14 17:25, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Yes, it's ugly. But doing it with either queuing or caching would be a major change. I was just trying to do smallish fixes to the driver for now.
How about only unlocking/locking the crtc->mutex as Rob suggested? I think it should work, but I didn't try it yet. Would that be as bad for the locking rework?
Tomi
On Mon, Sep 08, 2014 at 03:53:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:25, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Yes, it's ugly. But doing it with either queuing or caching would be a major change. I was just trying to do smallish fixes to the driver for now.
How about only unlocking/locking the crtc->mutex as Rob suggested? I think it should work, but I didn't try it yet. Would that be as bad for the locking rework?
Same problem really, you shouldn't drop ww mutexes and reacquire them in the atomic world. ww mutexes have some debug infrastructure for that (ww_acquire_done) to catch abusers of this.
So if you want to go forward with this it needs at least a big FIXME comment explaining that this is wrong. If you use locking to enforce ordering constraints that usually doesn't work well, and dropping locks to wait for async workers is plainly a locking design bug. -Daniel
On Mon, Sep 8, 2014 at 9:24 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Sep 08, 2014 at 03:53:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:25, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Yes, it's ugly. But doing it with either queuing or caching would be a major change. I was just trying to do smallish fixes to the driver for now.
How about only unlocking/locking the crtc->mutex as Rob suggested? I think it should work, but I didn't try it yet. Would that be as bad for the locking rework?
Same problem really, you shouldn't drop ww mutexes and reacquire them in the atomic world. ww mutexes have some debug infrastructure for that (ww_acquire_done) to catch abusers of this.
So if you want to go forward with this it needs at least a big FIXME comment explaining that this is wrong. If you use locking to enforce ordering constraints that usually doesn't work well, and dropping locks to wait for async workers is plainly a locking design bug.
well, the locking in core makes it in some ways a bit of a midlayer.. ;-)
Some crtc->funcs->wait_until_flushed() sort of thing that drm_mode_setcrtc() could call after dropping locks would go a long ways to fix this case. (Although post-atomic, may no longer be required.)
BR, -R
-Daniel
Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
On Mon, Sep 08, 2014 at 09:39:40AM -0400, Rob Clark wrote:
On Mon, Sep 8, 2014 at 9:24 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Sep 08, 2014 at 03:53:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:25, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Yes, it's ugly. But doing it with either queuing or caching would be a major change. I was just trying to do smallish fixes to the driver for now.
How about only unlocking/locking the crtc->mutex as Rob suggested? I think it should work, but I didn't try it yet. Would that be as bad for the locking rework?
Same problem really, you shouldn't drop ww mutexes and reacquire them in the atomic world. ww mutexes have some debug infrastructure for that (ww_acquire_done) to catch abusers of this.
So if you want to go forward with this it needs at least a big FIXME comment explaining that this is wrong. If you use locking to enforce ordering constraints that usually doesn't work well, and dropping locks to wait for async workers is plainly a locking design bug.
well, the locking in core makes it in some ways a bit of a midlayer.. ;-)
It's the locking for the drm core layer, it better be done in drm core. Drivers are free to use it, but if it doesn't fit they can always do their own. And if you look at i915 we actually have a metric pile of mutexes taken from modeset code and other places exactly because the drm core modeset locking can't work for everyone.
Some crtc->funcs->wait_until_flushed() sort of thing that drm_mode_setcrtc() could call after dropping locks would go a long ways to fix this case. (Although post-atomic, may no longer be required.)
Now _this_ smells like a proper midlayer - it enforces behaviour sequencing by providing a bunch of callbacks for drivers to hook into.
Also, locking isn't to enforce ordering constraints at all, that's the job of flush_workqueue, completion and friends. And if such a sync point would result in deadlocks if your worker also uses the modeset locks then your driver needs to grow internal mutexes to coordinate the data accessed by workers. In i915 we've added a pile of such cases in the past (and will add more), e.g. the pps mutex, backlight spinlock, edp mutex, ...
ADF does the same mistake of "hey let's do the synchronization in the core because driver writers get it wrong all the time". Which doesn't make it better ;-)
Now if your idea was to provide this as a helper, which either enforces the ordering correctly to not need locks, or has it's own locking of in-flux requests or similar, then I'm in favour. But adding such a callback at the core is nacked with extreme prejudice from me ...
Cheers, Daniel
On Mon, Sep 8, 2014 at 9:50 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Sep 08, 2014 at 09:39:40AM -0400, Rob Clark wrote:
On Mon, Sep 8, 2014 at 9:24 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Sep 08, 2014 at 03:53:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:25, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:05PM +0300, Tomi Valkeinen wrote:
Currently modesetting is not done synchronously, but it queues work that is done later. In theory this is fine, but the driver doesn't handle it at properly. This means that if an application first does a full modeset, then immediately afterwards starts page flipping, the page flipping will not work properly as there's modeset work still in the queue.
The result with my test application was that a backbuffer was shown on the screen.
Fixing this properly would be rather big undertaking. Thus this patch fixes the issue by making the modesetting synchronous, by waiting for the queued work to be done at the end of omap_crtc->commit().
The ugly part here is that the background work takes crtc->mutex, and the modesetting also holds that lock, which means that the background work never gets done. To get around this, we unclock, wait, and lock again.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_crtc.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 193979f97bdb..3261fbf94957 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -277,8 +277,13 @@ static void omap_crtc_prepare(struct drm_crtc *crtc) static void omap_crtc_commit(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct drm_device *dev = crtc->dev; DBG("%s", omap_crtc->name); omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
- drm_modeset_unlock_all(dev);
This will run afoul of the upcoming locking rework in the atomic work. And I'm fairly sure that the crtc helpers will fall over badly if someone submits a concurrent setCrtc while you've dropped the locks here.
Can't you instead just drop the locking from the worker? As long as you synchronize like here at the right places it can't race. I expect that you might want to synchronize in the crtc_prepare hook, too. But beyond that it should work.
As-is nacked because future headaches for me ;-)
Yes, it's ugly. But doing it with either queuing or caching would be a major change. I was just trying to do smallish fixes to the driver for now.
How about only unlocking/locking the crtc->mutex as Rob suggested? I think it should work, but I didn't try it yet. Would that be as bad for the locking rework?
Same problem really, you shouldn't drop ww mutexes and reacquire them in the atomic world. ww mutexes have some debug infrastructure for that (ww_acquire_done) to catch abusers of this.
So if you want to go forward with this it needs at least a big FIXME comment explaining that this is wrong. If you use locking to enforce ordering constraints that usually doesn't work well, and dropping locks to wait for async workers is plainly a locking design bug.
well, the locking in core makes it in some ways a bit of a midlayer.. ;-)
It's the locking for the drm core layer, it better be done in drm core. Drivers are free to use it, but if it doesn't fit they can always do their own. And if you look at i915 we actually have a metric pile of mutexes taken from modeset code and other places exactly because the drm core modeset locking can't work for everyone.
note that I'm not actually advocating removing/changing the locking in core.. I was just pointing out that the way it is currently structured can get in the way.
Some crtc->funcs->wait_until_flushed() sort of thing that drm_mode_setcrtc() could call after dropping locks would go a long ways to fix this case. (Although post-atomic, may no longer be required.)
Now _this_ smells like a proper midlayer - it enforces behaviour sequencing by providing a bunch of callbacks for drivers to hook into.
well, I was thinking more in terms of an optional callback so that drivers which (at least currently) need to work like omapdrm can actually do what they need to do.
that said, I think when atomic eventually lands, it would be a good idea to revisit how omapdrm works.. I think atomic should give a better way for it to do what it needs to do..
In the mean time, we need *something*. Although I think dropping and re-grabbing locks is an ok temporary solution. It wouldn't exactly be the only driver that does this.
BR, -R
Also, locking isn't to enforce ordering constraints at all, that's the job of flush_workqueue, completion and friends. And if such a sync point would result in deadlocks if your worker also uses the modeset locks then your driver needs to grow internal mutexes to coordinate the data accessed by workers. In i915 we've added a pile of such cases in the past (and will add more), e.g. the pps mutex, backlight spinlock, edp mutex, ...
ADF does the same mistake of "hey let's do the synchronization in the core because driver writers get it wrong all the time". Which doesn't make it better ;-)
Now if your idea was to provide this as a helper, which either enforces the ordering correctly to not need locks, or has it's own locking of in-flux requests or similar, then I'm in favour. But adding such a callback at the core is nacked with extreme prejudice from me ...
Cheers, Daniel
Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Clear omap_obj's paddr when unmapping the memory, so that it's easier to catch bad use of the paddr.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_gem.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c index e4849413ee80..b342e7b3bf00 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -829,6 +829,7 @@ int omap_gem_put_paddr(struct drm_gem_object *obj) dev_err(obj->dev->dev, "could not release unmap: %d\n", ret); } + omap_obj->paddr = 0; omap_obj->block = NULL; } }
omap_framebuffer_pin() and omap_framebuffer_unpin() are currently broken, as they cannot be called multiple times (i.e. pin, pin, unpin, unpin), which is what happens in certain cases. This issue causes the driver to possibly use 0 as an address for a displayed buffer, leading to OCP error from DSS.
This patch fixes the issue by adding a simple pin_count, used to track the number of pins.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_fb.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_fb.c b/drivers/gpu/drm/omapdrm/omap_fb.c index 2a5cacdc344b..58f2af32ede8 100644 --- a/drivers/gpu/drm/omapdrm/omap_fb.c +++ b/drivers/gpu/drm/omapdrm/omap_fb.c @@ -86,6 +86,7 @@ struct plane {
struct omap_framebuffer { struct drm_framebuffer base; + int pin_count; const struct format *format; struct plane planes[4]; }; @@ -261,6 +262,11 @@ int omap_framebuffer_pin(struct drm_framebuffer *fb) struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb); int ret, i, n = drm_format_num_planes(fb->pixel_format);
+ if (omap_fb->pin_count > 0) { + omap_fb->pin_count++; + return 0; + } + for (i = 0; i < n; i++) { struct plane *plane = &omap_fb->planes[i]; ret = omap_gem_get_paddr(plane->bo, &plane->paddr, true); @@ -269,6 +275,8 @@ int omap_framebuffer_pin(struct drm_framebuffer *fb) omap_gem_dma_sync(plane->bo, DMA_TO_DEVICE); }
+ omap_fb->pin_count++; + return 0;
fail: @@ -287,6 +295,11 @@ int omap_framebuffer_unpin(struct drm_framebuffer *fb) struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb); int ret, i, n = drm_format_num_planes(fb->pixel_format);
+ omap_fb->pin_count--; + + if (omap_fb->pin_count > 0) + return 0; + for (i = 0; i < n; i++) { struct plane *plane = &omap_fb->planes[i]; ret = omap_gem_put_paddr(plane->bo);
omap_crtc_flush() is used to wait for scheduled work to be done for the give crtc. However, it's not quite right at the moment.
omap_crtc_flush() does wait for work that is ran via vsync irq to be done. However, work is also queued to the driver's priv->wq workqueue, which is not handled by omap_crtc_flush().
Improve omap_crtc_flush() to flush the workqueue so that work there will be ran.
This fixes a race issue on module unload, where an unpin work may be on the work queue, but does not get ran before drm core starts tearing the driver down, leading to a WARN.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_crtc.c | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 3261fbf94957..506cf8bc804f 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -655,21 +655,42 @@ static void omap_crtc_post_apply(struct omap_drm_apply *apply) /* nothing needed for post-apply */ }
+static bool omap_crtc_work_pending(struct omap_crtc *omap_crtc) +{ + return !list_empty(&omap_crtc->pending_applies) || + !list_empty(&omap_crtc->queued_applies) || + omap_crtc->event || omap_crtc->old_fb; +} + +/* + * Wait for any work on the workqueue to be finished, and any work which will + * be run via vsync irq to be done. + * + * Note that work on workqueue could schedule new vsync work, and vice versa. + */ void omap_crtc_flush(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc); + struct omap_drm_private *priv = crtc->dev->dev_private; int loops = 0;
- while (!list_empty(&omap_crtc->pending_applies) || - !list_empty(&omap_crtc->queued_applies) || - omap_crtc->event || omap_crtc->old_fb) { + while (true) { + /* first flush the wq, so that any scheduled work is done */ + flush_workqueue(priv->wq); + + /* if we have nothing queued for this crtc, we're done */ + if (!omap_crtc_work_pending(omap_crtc)) + break;
if (++loops > 10) { - dev_err(crtc->dev->dev, - "omap_crtc_flush() timeout\n"); + dev_err(crtc->dev->dev, "omap_crtc_flush() timeout\n"); break; }
+ /* + * wait for a bit so that a vsync has (probably) happened, and + * that the crtc work is (probably) done. + */ schedule_timeout_uninterruptible(msecs_to_jiffies(20)); } }
On Wed, Sep 03, 2014 at 02:55:08PM +0300, Tomi Valkeinen wrote:
omap_crtc_flush() is used to wait for scheduled work to be done for the give crtc. However, it's not quite right at the moment.
omap_crtc_flush() does wait for work that is ran via vsync irq to be done. However, work is also queued to the driver's priv->wq workqueue, which is not handled by omap_crtc_flush().
Improve omap_crtc_flush() to flush the workqueue so that work there will be ran.
This fixes a race issue on module unload, where an unpin work may be on the work queue, but does not get ran before drm core starts tearing the driver down, leading to a WARN.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
I didn't really dig into details, but isn't that the same workqueue as used by the async modeset code? So the same deadlocks might happen ...
lockdep won't complain though since you essentially open-code a workqueue_flush, and lockdep also doesn't complain about all possible deadlocks (due to some design issues with lockdep). -Daniel
drivers/gpu/drm/omapdrm/omap_crtc.c | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c index 3261fbf94957..506cf8bc804f 100644 --- a/drivers/gpu/drm/omapdrm/omap_crtc.c +++ b/drivers/gpu/drm/omapdrm/omap_crtc.c @@ -655,21 +655,42 @@ static void omap_crtc_post_apply(struct omap_drm_apply *apply) /* nothing needed for post-apply */ }
+static bool omap_crtc_work_pending(struct omap_crtc *omap_crtc) +{
- return !list_empty(&omap_crtc->pending_applies) ||
!list_empty(&omap_crtc->queued_applies) ||
omap_crtc->event || omap_crtc->old_fb;
+}
+/*
- Wait for any work on the workqueue to be finished, and any work which will
- be run via vsync irq to be done.
- Note that work on workqueue could schedule new vsync work, and vice versa.
- */
void omap_crtc_flush(struct drm_crtc *crtc) { struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
- struct omap_drm_private *priv = crtc->dev->dev_private; int loops = 0;
- while (!list_empty(&omap_crtc->pending_applies) ||
!list_empty(&omap_crtc->queued_applies) ||
omap_crtc->event || omap_crtc->old_fb) {
while (true) {
/* first flush the wq, so that any scheduled work is done */
flush_workqueue(priv->wq);
/* if we have nothing queued for this crtc, we're done */
if (!omap_crtc_work_pending(omap_crtc))
break;
if (++loops > 10) {
dev_err(crtc->dev->dev,
"omap_crtc_flush() timeout\n");
dev_err(crtc->dev->dev, "omap_crtc_flush() timeout\n"); break;
}
/*
* wait for a bit so that a vsync has (probably) happened, and
* that the crtc work is (probably) done.
*/
schedule_timeout_uninterruptible(msecs_to_jiffies(20)); }
}
1.9.1
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On 03/09/14 17:27, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:08PM +0300, Tomi Valkeinen wrote:
omap_crtc_flush() is used to wait for scheduled work to be done for the give crtc. However, it's not quite right at the moment.
omap_crtc_flush() does wait for work that is ran via vsync irq to be done. However, work is also queued to the driver's priv->wq workqueue, which is not handled by omap_crtc_flush().
Improve omap_crtc_flush() to flush the workqueue so that work there will be ran.
This fixes a race issue on module unload, where an unpin work may be on the work queue, but does not get ran before drm core starts tearing the driver down, leading to a WARN.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
I didn't really dig into details, but isn't that the same workqueue as used by the async modeset code? So the same deadlocks might happen ...
Yes, we have just one workqueue in the driver.
Hmm, deadlocks with what lock? The modeconfig or crtc->mutex? I don't think they are locked at any place where omap_crtc_flush is called.
lockdep won't complain though since you essentially open-code a workqueue_flush, and lockdep also doesn't complain about all possible deadlocks (due to some design issues with lockdep).
What do you mean "open-code a workqueue_flush"?. I use flush_workqueue there. We have two things to wait for: work on the workqueue and work which is triggered by the vsync irq. So we loop and test for both of those, until there's no more work.
Tomi
On Mon, Sep 08, 2014 at 04:03:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:27, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:08PM +0300, Tomi Valkeinen wrote:
omap_crtc_flush() is used to wait for scheduled work to be done for the give crtc. However, it's not quite right at the moment.
omap_crtc_flush() does wait for work that is ran via vsync irq to be done. However, work is also queued to the driver's priv->wq workqueue, which is not handled by omap_crtc_flush().
Improve omap_crtc_flush() to flush the workqueue so that work there will be ran.
This fixes a race issue on module unload, where an unpin work may be on the work queue, but does not get ran before drm core starts tearing the driver down, leading to a WARN.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
I didn't really dig into details, but isn't that the same workqueue as used by the async modeset code? So the same deadlocks might happen ...
Yes, we have just one workqueue in the driver.
Hmm, deadlocks with what lock? The modeconfig or crtc->mutex? I don't think they are locked at any place where omap_crtc_flush is called.
Oh, I presumed you're using _flush in the relevant modeset functions - we do that in i915 to make sure that all the pageflips and other stuff completed before we do another modeset. But omap only calls this at driver unload, so no direct problem.
lockdep won't complain though since you essentially open-code a workqueue_flush, and lockdep also doesn't complain about all possible deadlocks (due to some design issues with lockdep).
What do you mean "open-code a workqueue_flush"?. I use flush_workqueue there. We have two things to wait for: work on the workqueue and work which is triggered by the vsync irq. So we loop and test for both of those, until there's no more work.
Oops, missed that. Ordering looks wrong though since if the irq can latch the workqueue you need to wait for irqs to happen first before flushing. And obviously queue the work before signalling the completion of the interrupt. But since this seems to lack locking anyway and is only used for unload it doesn't really matter. -Daniel
On 08/09/14 16:31, Daniel Vetter wrote:
On Mon, Sep 08, 2014 at 04:03:18PM +0300, Tomi Valkeinen wrote:
On 03/09/14 17:27, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:08PM +0300, Tomi Valkeinen wrote:
omap_crtc_flush() is used to wait for scheduled work to be done for the give crtc. However, it's not quite right at the moment.
omap_crtc_flush() does wait for work that is ran via vsync irq to be done. However, work is also queued to the driver's priv->wq workqueue, which is not handled by omap_crtc_flush().
Improve omap_crtc_flush() to flush the workqueue so that work there will be ran.
This fixes a race issue on module unload, where an unpin work may be on the work queue, but does not get ran before drm core starts tearing the driver down, leading to a WARN.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
I didn't really dig into details, but isn't that the same workqueue as used by the async modeset code? So the same deadlocks might happen ...
Yes, we have just one workqueue in the driver.
Hmm, deadlocks with what lock? The modeconfig or crtc->mutex? I don't think they are locked at any place where omap_crtc_flush is called.
Oh, I presumed you're using _flush in the relevant modeset functions - we
No. That's the locking issue again. We can't flush when holding the crtc mutex, as the works in the workqueue also try to grab it...
do that in i915 to make sure that all the pageflips and other stuff completed before we do another modeset. But omap only calls this at driver unload, so no direct problem.
At the moment yes, but in this series I add the same omap_crtc_flush() call to two new places: dev_preclose and omap_crtc_commit. Of which the omap_crtc_commit is the problematic one, discussed in the mail thread for patch 4.
lockdep won't complain though since you essentially open-code a workqueue_flush, and lockdep also doesn't complain about all possible deadlocks (due to some design issues with lockdep).
What do you mean "open-code a workqueue_flush"?. I use flush_workqueue there. We have two things to wait for: work on the workqueue and work which is triggered by the vsync irq. So we loop and test for both of those, until there's no more work.
Oops, missed that. Ordering looks wrong though since if the irq can latch the workqueue you need to wait for irqs to happen first before flushing. And obviously queue the work before signalling the completion of the interrupt. But since this seems to lack locking anyway and is only used for unload it doesn't really matter.
Yeah, well, the workqueue can create work for the irq also. I don't know if it does, currently, but I think it's safer to presume that both workqueue and the irq can create work to the other.
But that's why I have a loop there. So we flush, then check if there is work for the irq. If yes, sleep a bit and go back to start. So if the irq work created new work for the wq, we flush that. And if that work created new work for the irq, we check that again. Etc.
Tomi
On Tue, Sep 9, 2014 at 3:07 AM, Tomi Valkeinen tomi.valkeinen@ti.com wrote:
lockdep won't complain though since you essentially open-code a workqueue_flush, and lockdep also doesn't complain about all possible deadlocks (due to some design issues with lockdep).
What do you mean "open-code a workqueue_flush"?. I use flush_workqueue there. We have two things to wait for: work on the workqueue and work which is triggered by the vsync irq. So we loop and test for both of those, until there's no more work.
Oops, missed that. Ordering looks wrong though since if the irq can latch the workqueue you need to wait for irqs to happen first before flushing. And obviously queue the work before signalling the completion of the interrupt. But since this seems to lack locking anyway and is only used for unload it doesn't really matter.
from a quick look at omap_crtc_flush(), you probably also race w/ apply worker list manipulation (like moving from queued_applies to pending_applies)
Yeah, well, the workqueue can create work for the irq also. I don't know if it does, currently, but I think it's safer to presume that both workqueue and the irq can create work to the other.
But that's why I have a loop there. So we flush, then check if there is work for the irq. If yes, sleep a bit and go back to start. So if the irq work created new work for the wq, we flush that. And if that work created new work for the irq, we check that again. Etc.
I think adding an internal per-crtc apply_lock would solve a lot of problems. I was originally shying away from recommending that, mostly because I didn't want to think about it and I though there was an easier solution.
But with an apply_lock we could at least add some locking to _flush() and make it not racy as well..
Although, I wonder if some waitqueue scheme would be a bit more sane.. ie. end of apply_worker, if there is nothing more queued up, signal the event that _flush() is waiting on.
BR, -R
On 09/09/14 13:43, Rob Clark wrote:
Although, I wonder if some waitqueue scheme would be a bit more sane.. ie. end of apply_worker, if there is nothing more queued up, signal the event that _flush() is waiting on.
Maybe, but we would still need either the separate apply_lock, or unlock the crtc->mutex to allow the workqueue to run.
Tomi
We already wait for all scheduled work to be done on the driver's unload function. However, I think we need to wait on preclose() also, so that when an application closes the drm file descriptor, we are sure that there's no left around.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_drv.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c index 26fda74c1e48..f59fa6600cf8 100644 --- a/drivers/gpu/drm/omapdrm/omap_drv.c +++ b/drivers/gpu/drm/omapdrm/omap_drv.c @@ -594,7 +594,20 @@ static void dev_lastclose(struct drm_device *dev)
static void dev_preclose(struct drm_device *dev, struct drm_file *file) { + struct omap_drm_private *priv = dev->dev_private; + int i; + DBG("preclose: dev=%p", dev); + + /* + * Flush crtcs to finish any pending work. + * Note: this may not be correct if there are multiple applications + * using the drm device, and could possibly result in a timeout from + * omap_crtc_flush() if an other application is actively queuing new + * work. + */ + for (i = 0; i < priv->num_crtcs; i++) + omap_crtc_flush(priv->crtcs[i]); }
static void dev_postclose(struct drm_device *dev, struct drm_file *file)
On Wed, Sep 03, 2014 at 02:55:09PM +0300, Tomi Valkeinen wrote:
We already wait for all scheduled work to be done on the driver's unload function. However, I think we need to wait on preclose() also, so that when an application closes the drm file descriptor, we are sure that there's no left around.
The justification (likely, didn't check omapdrm code) for this is that we need to clear out all outstanding drm_events. Currently i915 and some other drivers (but not all) have that open-coded. We should probably track drm_events on a per-fd list and move this logic into the core instead of adding more driver-specific hacks. -Daniel
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com
drivers/gpu/drm/omapdrm/omap_drv.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c index 26fda74c1e48..f59fa6600cf8 100644 --- a/drivers/gpu/drm/omapdrm/omap_drv.c +++ b/drivers/gpu/drm/omapdrm/omap_drv.c @@ -594,7 +594,20 @@ static void dev_lastclose(struct drm_device *dev)
static void dev_preclose(struct drm_device *dev, struct drm_file *file) {
- struct omap_drm_private *priv = dev->dev_private;
- int i;
- DBG("preclose: dev=%p", dev);
- /*
* Flush crtcs to finish any pending work.
* Note: this may not be correct if there are multiple applications
* using the drm device, and could possibly result in a timeout from
* omap_crtc_flush() if an other application is actively queuing new
* work.
*/
- for (i = 0; i < priv->num_crtcs; i++)
omap_crtc_flush(priv->crtcs[i]);
}
static void dev_postclose(struct drm_device *dev, struct drm_file *file)
1.9.1
dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
On 03/09/14 17:32, Daniel Vetter wrote:
On Wed, Sep 03, 2014 at 02:55:09PM +0300, Tomi Valkeinen wrote:
We already wait for all scheduled work to be done on the driver's unload function. However, I think we need to wait on preclose() also, so that when an application closes the drm file descriptor, we are sure that there's no left around.
Hmm, that's supposed to say "there's no work left around".
The justification (likely, didn't check omapdrm code) for this is that we need to clear out all outstanding drm_events. Currently i915 and some other drivers (but not all) have that open-coded. We should probably track drm_events on a per-fd list and move this logic into the core instead of adding more driver-specific hacks.
To be honest, I'm not 100% sure what we need to wait for =). I'm still learning DRM, and omapdrm's work handling is not the easiest one to follow. But yes, at least the page-flip event is one that rings alarm bells in my head, if the fb is allowed to be closed and there's still an event outstanding.
With this patch we wait until all work is done in the preclose. The performance is not an issue here, so I went for "better safe than sorry".
Tomi
unpin_worker() calls omap_framebuffer_unpin() without any locks, which looks very suspicious. However, both pin and unpin are always called via the driver's private workqueue, so the access is synchronized that way.
Add a comment to make this clear.
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ti.com --- drivers/gpu/drm/omapdrm/omap_plane.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c index 891a4dc608af..743d04981d71 100644 --- a/drivers/gpu/drm/omapdrm/omap_plane.c +++ b/drivers/gpu/drm/omapdrm/omap_plane.c @@ -71,6 +71,10 @@ static void unpin_worker(struct drm_flip_work *work, void *val) container_of(work, struct omap_plane, unpin_work); struct drm_device *dev = omap_plane->base.dev;
+ /* + * omap_framebuffer_pin/unpin are always called from priv->wq, + * so there's no need for locking here. + */ omap_framebuffer_unpin(val); mutex_lock(&dev->mode_config.mutex); drm_framebuffer_unreference(val);
dri-devel@lists.freedesktop.org