The panfrost driver tries to kill in-flight jobs on FD close after destroying the FD scheduler entities. For this to work properly, we need to make sure the jobs popped from the scheduler entities have been queued at the HW level before the entity is declared idle; otherwise we might iterate over a list that doesn't contain those jobs.
Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 81496ae2602e..aa776ebe326a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -811,10 +811,10 @@ static int drm_sched_main(void *param)
 
 		sched_job = drm_sched_entity_pop_job(entity);
 
-		complete(&entity->entity_idle);
-
-		if (!sched_job)
+		if (!sched_job) {
+			complete(&entity->entity_idle);
 			continue;
+		}
 
 		s_fence = sched_job->s_fence;
 
@@ -823,6 +823,7 @@ static int drm_sched_main(void *param)
 
 		trace_drm_run_job(sched_job, entity);
 		fence = sched->ops->run_job(sched_job);
+		complete(&entity->entity_idle);
 		drm_sched_fence_scheduled(s_fence);
 
 		if (!IS_ERR_OR_NULL(fence)) {
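To illustrate the pattern the commit message relies on, here is a minimal sketch of a driver FD-close path that depends on this ordering. It is loosely modeled on panfrost, but it is not actual panfrost code: the example_* names, the hw_queue list, and NUM_JOB_SLOTS are all made up for the sketch, and it assumes entity destruction waits on entity_idle before returning.

#include <linux/list.h>
#include <linux/spinlock.h>
#include <drm/gpu_scheduler.h>

#define NUM_JOB_SLOTS 3	/* illustrative */

struct example_file;

struct example_job {
	struct list_head node;		/* link in the device HW queue */
	struct example_file *owner;	/* FD that submitted the job */
};

struct example_device {
	spinlock_t hw_queue_lock;
	struct list_head hw_queue;	/* jobs handed over to the HW */
};

struct example_file {
	struct example_device *dev;
	struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
};

static void example_kill_job(struct example_job *job)
{
	/* Driver-specific: stop the job on the HW and signal its fences. */
}

static void example_close_fd(struct example_file *efile)
{
	struct example_device *edev = efile->dev;
	struct example_job *job, *tmp;
	int i;

	/*
	 * Entity destruction waits for entity_idle. With this patch,
	 * entity_idle is only completed after run_job(), so every job
	 * popped from the entity is already on the HW queue when the
	 * loop below runs.
	 */
	for (i = 0; i < NUM_JOB_SLOTS; i++)
		drm_sched_entity_destroy(&efile->sched_entity[i]);

	/* Kill in-flight jobs; none can be missed between pop and run_job(). */
	spin_lock(&edev->hw_queue_lock);
	list_for_each_entry_safe(job, tmp, &edev->hw_queue, node) {
		if (job->owner == efile)
			example_kill_job(job);
	}
	spin_unlock(&edev->hw_queue_lock);
}

Without the patch, the window between drm_sched_entity_pop_job() and run_job() could leave a job invisible to the loop above even though the entity had already reported idle.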
On 24/06/2021 15:08, Boris Brezillon wrote:
> The panfrost driver tries to kill in-flight jobs on FD close after destroying the FD scheduler entities. For this to work properly, we need to make sure the jobs popped from the scheduler entities have been queued at the HW level before the entity is declared idle; otherwise we might iterate over a list that doesn't contain those jobs.
>
> Suggested-by: Lucas Stach <l.stach@pengutronix.de>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Steven Price <steven.price@arm.com>
On Thursday, 24.06.2021 at 16:08 +0200, Boris Brezillon wrote:
> The panfrost driver tries to kill in-flight jobs on FD close after destroying the FD scheduler entities. For this to work properly, we need to make sure the jobs popped from the scheduler entities have been queued at the HW level before the entity is declared idle; otherwise we might iterate over a list that doesn't contain those jobs.
>
> Suggested-by: Lucas Stach <l.stach@pengutronix.de>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
Not sure how much it's worth to review my own suggestion, but the implementation looks correct to me. I don't see any downsides for the existing drivers and it solves the race window for drivers that want to cancel jobs on the HW submission queue, without introducing yet another synchronization point.
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
On Mon, 28 Jun 2021 11:46:08 +0200, Lucas Stach <l.stach@pengutronix.de> wrote:
> On Thursday, 24.06.2021 at 16:08 +0200, Boris Brezillon wrote:
> > The panfrost driver tries to kill in-flight jobs on FD close after destroying the FD scheduler entities. For this to work properly, we need to make sure the jobs popped from the scheduler entities have been queued at the HW level before the entity is declared idle; otherwise we might iterate over a list that doesn't contain those jobs.
> >
> > Suggested-by: Lucas Stach <l.stach@pengutronix.de>
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> Not sure how much it's worth to review my own suggestion, but the implementation looks correct to me. I don't see any downsides for the existing drivers and it solves the race window for drivers that want to cancel jobs on the HW submission queue, without introducing yet another synchronization point.
>
> Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Queued to drm-misc-next.
Thanks,
Boris