Re: [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled

8 Jul 2021

On Thu, Jul 8, 2021 at 8:56 PM Andrey Grodzovsky
andrey.grodzovsky@amd.com wrote:
...
On 2021-07-08 1:37 p.m., Daniel Vetter wrote:
...
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.
It's definitely not enough on arm.
Fix this properly, which means

- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.
While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.
Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
Signed-off-by: Daniel Vetter daniel.vetter@intel.com
Cc: "Christian König" christian.koenig@amd.com
Cc: Steven Price steven.price@arm.com
Cc: Daniel Vetter daniel.vetter@ffwll.ch
Cc: Andrey Grodzovsky andrey.grodzovsky@amd.com
Cc: Lee Jones lee.jones@linaro.org
Cc: Boris Brezillon boris.brezillon@collabora.com

drivers/gpu/drm/scheduler/sched_entity.c | 27 ++++++++++++++++++++++--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 64d398166644..4e1124ed80e0 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 		dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);

 	dma_fence_put(entity->last_scheduled);

 	entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);

+	/*
+	 * if the queue is empty we allow drm_sched_job_arm() to locklessly

Probably meant drm_sched_entity_select_rq here
Which is called from drm_sched_job_arm but yes. I'll switch it around.
...
...

+	 * access ->last_scheduled. This only works if we set the pointer before
+	 * we dequeue and if we have a write barrier here.
+	 */
+	smp_wmb();

 	spsc_queue_pop(&entity->job_queue);
 	return sched_job;
 }

@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 	struct drm_gpu_scheduler *sched;
 	struct drm_sched_rq *rq;

-	if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
+	/* single possible engine and already selected */
+	if (!entity->sched_list)
+		return;
+
+	/* queue non-empty, stay on the same engine */
+	if (spsc_queue_count(&entity->job_queue))
 		return;

Shouldn't smp_rmb be here in between ? Given the queue is empty we want to
be certain we are reading the most recent value of entity->last_scheduled
Yeah I had a load_acquire barrier here earlier and then put the
smp_rmb() on the wrong side. Will fix.
...
Andrey
...

-	fence = READ_ONCE(entity->last_scheduled);
+	fence = entity->last_scheduled;
+
+	/*
+	 * Only when the queue is empty are we guaranteed that the scheduler
+	 * thread cannot change ->last_scheduled. To enforce ordering we need
+	 * a read barrier here. See drm_sched_entity_pop_job() for the other
+	 * side.
+	 */
+	smp_rmb();

 	/* stay on the same engine if the previous job hasn't finished */
 	if (fence && !dma_fence_is_signaled(fence))
 		return;
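
For anyone following along, the barrier pairing discussed above can be sketched in user space with C11 atomics. This is only an illustration under assumptions, not the scheduler code: toy_fence, toy_entity, toy_pop_job() and toy_may_select_rq() are hypothetical names, queue_count stands in for spsc_queue_count(), and the C11 release/acquire fences are used as rough analogs of smp_wmb()/smp_rmb() (they are not identical to the kernel primitives). The read-side fence is placed between the queue check and the pointer load, which is the placement agreed on in the review comment above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not the kernel types. */
struct toy_fence {
	atomic_bool signaled;
};

struct toy_entity {
	atomic_int queue_count;                      /* stands in for spsc_queue_count() */
	_Atomic(struct toy_fence *) last_scheduled;
};

/* Scheduler-thread side: publish the fence pointer, then shrink the queue. */
static void toy_pop_job(struct toy_entity *e, struct toy_fence *finished)
{
	assert(atomic_load_explicit(&e->queue_count, memory_order_relaxed) > 0);

	atomic_store_explicit(&e->last_scheduled, finished, memory_order_relaxed);
	/* Rough analog of smp_wmb(): pointer store is ordered before the dequeue. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/* Submission side: returns true if moving to another run queue would be allowed. */
static bool toy_may_select_rq(struct toy_entity *e)
{
	struct toy_fence *fence;

	/* Queue non-empty: stay on the same engine. */
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed) != 0)
		return false;

	/* Rough analog of smp_rmb(): count load is ordered before the pointer load. */
	atomic_thread_fence(memory_order_acquire);
	fence = atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);

	/* Stay on the same engine if the previous job has not finished. */
	if (fence && !atomic_load_explicit(&fence->signaled, memory_order_acquire))
		return false;

	return true;
}
```

The point of the sketch is the pairing: a reader that observes the queue as empty is then guaranteed to see the pointer that was stored before the dequeue. Dropping either fence, or moving the read-side fence after the pointer load, loses that guarantee on weakly ordered hardware.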




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

    
