Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

7 Sep 2021


      Am 07.09.21 um 10:47 schrieb Daniel Vetter:
...
[SNIP]
...
...
...
...
If we embed it, then I think it should start existing latest from
drm_sched_job_arm. Maybe not yet initialized, but at least allocated. So
the right thing to do here is to have the hw fence as a pointer in
struct drm_sched_job. And check in drm_sched_job_arm() that it's at least
allocated.
Why we need to allocate the HW fence if it's embedded within a job struct ?
the hw fence is a refcounted struct, and the drm_sched_job is a different
struct. And we didn't have a dri-devel discussion about whether it's
correct to conflate these two lifetimes, amdgpu folks simply hacked
something together.
Obviously scheduler level changes must be discussed at dri-devel forum
level.
What happened here and as Monk already mentioned - we had internal
discussion
about how to fix the problem in the header of this thread - avoiding
accessing feed job
from TDR handler without the current hack in place of removal and back
insertion
into pending list. It's there we we came up (I think Christian first
mentioned this) with the
idea of embedding the HW fence into amdgpu job - both for avoiding memory
allocations
but also - because this allows to use the HW fence's recounting as a
solution to the above
problem by simply grabbing a reference to the next fence in the pending list
as a first step
in the TDR handler. What we didn't take into account at the time is that
indeed this change
cannot be limited to amdgpu level - this we noticed much later during final
code reviews.
Not sure where this fell through cracks, but imo at least changing where
the hw fence is allocated is a very fundamental change, and latest then
you should have discussed this on dri-devel.
I'm the one who kicked this off in April and I made a nice internal 
presentation to explain what the problems is etc... So the idea of 
embedding the hardware fence into the job came from me.
But during the presentation I also noted that we need to sync up with a 
guy named Daniel Vetter because it was his patch set which surfaced this 
issue by annotating fence completion prerequisite in lockdep.
...
But even the tdr races would probably have been good to start on
dri-devel. Now it looks like Monk&team have lost 6 months for nothing.
Well to make it clear I've noted during the presentation in April that 
this needs to be discussed with you, I've also noted to the first guy 
working on this that this needs to be discussed on dri-devel instead of 
internally and I'm pretty sure that I've noted this a couple of more 
times after it moved to somebody else. And IIRC Andrey also noted that 
we should not discuss this internally pretty early as well.
So if people are not listening it is not a surprise that they spend time 
on stuff which isn't upstreamable like this.
Christian.
...
-Daniel
...
Andrey
...
...
...
Otherwise we're just diverging across drivers and tempting them to do the
wrong thing with the current ->run_job callback interface.
Maybe we should switch from embedding in driver level job struct as it's now
to drm_sched_job and just leave the fence initialization to driver specific
code ?
Maybe? Like I've not been involved in these discussion ont he amd side at
all, I'm just noticing that we do have a now rather inconsistently used
inteface across drivers. Which is no good.
-Daniel
...
Andrey
...
Can you guys look into this?
-Daniel
...
>> Another thing I recently pondered for tdr races looking at i915 code is
>> whether the tdr should first block the completion fence for that job. My
>> motivation is to have a race-free error capture (if the completion races
>> then we might start evicting memory and everything goes boom), but maybe
>> that helps here too. Some kind of atomic "block this fence from
>> completing thing.
>>
>> Or I'm I completely guessing in the wrong direction?
> I think we already do it here - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.boo...
Ah yes this works becase drm/sched has separate hw fence from the logical
job fence.
-Daniel
> Andrey > > >> -Daniel >> >>> Andrey >>> >>> >>>>> Andrey >>>>> >>>>> >>>>>>> Andrey >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> + >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> - >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>>

spin_unlock(&sched->job_list_lock); status = job->sched->ops->timedout_job(job); @@ -342,6 +342,7 @@ static void drm_sched_job_timedout(struct work_struct *work) job->sched->ops->free_job(job); sched->free_guilty = false; } dma_fence_put(fence); } else { spin_unlock(&sched->job_list_lock); } @@ -392,20 +393,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* * Reinsert back the bad job here - now it's safe as * drm_sched_get_cleanup_job cannot race against us and release the * bad job at this point - we parked (waited for) any in progress * (earlier) cleanups and drm_sched_get_cleanup_job will not be called * now until the scheduler thread is unparked. */ if (bad && bad->sched == sched) /* * Add at the head of the queue to reflect it was the earliest * job extracted. */ list_add(&bad->list, &sched->pending_list); /* * Iterate the job list from later to  earlier one and either deactive * their HW callbacks or remove them from pending list if they already 2.25.1
>>>>>>>>>
Daniel Vetter
Software Engineer, Intel Corporation
https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll....

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

>>>>>>>>>