On 2018-09-10 3:05 p.m., Tom St Denis wrote:
On 2018-09-10 9:04 a.m., Christian König wrote:
Hi Tom,
I'm talking about adding new printks to figure out what the heck is going wrong here.
Thanks, Christian.
Hi Christian,
Sure, if you want to send me a simple patch that adds more printk I'll gladly give it a try (doubly so since my workstation depends on our staging tree to work properly...).
Just add a printk to ttm_bo_bulk_move_helper to print pos->first and pos->last.
And another one to amdgpu_bo_destroy to printk the value of tbo.
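For reference, a rough sketch of what that instrumentation might look like. The signatures below are assumptions inferred from this thread (the call site quoted further down, and the tbo parameter named above), and the elided bodies stand in for the existing code:

/* Sketch: log the cached endpoints of the bulk range on every move,
 * so a freed BO reappearing here is visible in dmesg. */
static void ttm_bo_bulk_move_helper(struct ttm_lru_bulk_move_pos *pos,
				    struct list_head *lru, bool is_swap)
{
	printk(KERN_INFO "ttm bulk move: pos->first=%p pos->last=%p\n",
	       pos->first, pos->last);
	/* ... existing bulk move logic ... */
}

/* Sketch: log every BO teardown; a destroy line followed by a bulk
 * move line with the same address would confirm the use-after-free. */
static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
{
	printk(KERN_INFO "amdgpu_bo_destroy: tbo=%p\n", tbo);
	/* ... existing destroy path ... */
}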
Christian.
Tom
On 2018-09-10 2:59 p.m., Tom St Denis wrote:
Hi Christian,
Are you adding new traces or turning on existing ones? Would you like me to try them out in my setup?
Tom
On 2018-09-10 8:49 a.m., Christian König wrote:
On 2018-09-10 2:05 p.m., Huang Rui wrote:
On Mon, Sep 10, 2018 at 05:25:48PM +0800, Koenig, Christian wrote:
On 2018-09-10 11:23 a.m., Huang Rui wrote:
> On Mon, Sep 10, 2018 at 11:00:04AM +0200, Christian König wrote:
>> Hi Ray,
>>
>> well those patches don't make sense, the pointer is only local to
>> the function.
> You're right.
> I narrowed it down with a gdb dump from ttm_bo_bulk_move_lru_tail+0x2b;
> the use-after-free should be in the code below:
>
>     man = &bulk->tt[i].first->bdev->man[TTM_PL_TT];
>     ttm_bo_bulk_move_helper(&bulk->tt[i], &man->lru[i], false);
>
> Is there a case where the original bo is destroyed in the bulk pos but
> pos->first is not updated, so we still use it during the bulk move?

Only when a per VM BO is freed or the VM is destroyed.
The first case should now be handled by "drm/amdgpu: set bulk_moveable to false when a per VM BO is released", and if we were using a destroyed VM we would see other problems as well.
If a VM instance is torn down, all BOs which belong to that VM should be removed from the LRU. But how could we submit a command based on a destroyed VM? You know, we do the bulk move as the last step of submission.
Well, exactly, that's the point: this can't happen :)
Otherwise we would crash from using freed memory much earlier in the command submission.
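To make the suspected scenario from the quoted code above concrete, here is a minimal standalone sketch, using simplified stand-in types rather than the real TTM structures, of how a stale first pointer in a bulk-move position becomes a use-after-free:

#include <stdio.h>
#include <stdlib.h>

struct bo { int id; };                 /* stand-in for a buffer object */

struct bulk_pos {                      /* stand-in for the bulk range */
	struct bo *first;
	struct bo *last;
};

int main(void)
{
	struct bo *a = malloc(sizeof(*a));
	a->id = 42;

	/* The bulk move caches the range endpoints... */
	struct bulk_pos pos = { .first = a, .last = a };

	/* ...the BO is destroyed without updating the cached position... */
	free(a);

	/* ...and the next bulk move dereferences freed memory
	 * (undefined behavior; KASAN would flag this). */
	printf("first->id = %d\n", pos.first->id);
	return 0;
}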
The best idea I have to track this down further is to add some trace_printk calls in ttm_bo_bulk_move_helper and amdgpu_bo_destroy and see why and when we are actually using a destroyed BO.
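As a sketch, those hooks could be single trace_printk lines of the form below; unlike printk, trace_printk writes to the ftrace ring buffer (readable via /sys/kernel/debug/tracing/trace), which is cheap enough to leave enabled while reproducing:

	trace_printk("bulk move: first=%p last=%p\n", pos->first, pos->last);
	trace_printk("destroy bo: tbo=%p\n", tbo);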
Christian.
Thanks, Ray
BTW: Just pushed this commit to the repository, should show up any second.
Christian.
> Thanks,
> Ray
>
>> Regards,
>> Christian.
>>
>> On 2018-09-10 10:57 a.m., Huang Rui wrote:
>>> It avoids the BO being referred to again after it is freed.
>>>
>>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> Cc: Tom StDenis <Tom.StDenis@amd.com>
>>> ---
>>>  drivers/gpu/drm/ttm/ttm_bo.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 138c989..d3ef5f8 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -54,6 +54,7 @@ static struct attribute ttm_bo_count = {
>>>  static void ttm_bo_default_destroy(struct ttm_buffer_object *bo)
>>>  {
>>>  	kfree(bo);
>>> +	bo = NULL;
>>>  }
>>>  static inline int ttm_mem_type_from_place(const struct ttm_place *place,
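As an aside for readers following the patch above, here is a minimal standalone illustration of Christian's point that the NULL assignment cannot help: the parameter is a local copy of the pointer, so the caller's pointer, and any cached reference such as pos->first, still holds the old address. The destroy() helper below is hypothetical and merely mirrors the shape of ttm_bo_default_destroy:

#include <stdlib.h>

struct bo { int id; };

static void destroy(struct bo *bo)      /* mirrors ttm_bo_default_destroy */
{
	free(bo);
	bo = NULL;                       /* only clears the local copy */
}

int main(void)
{
	struct bo *a = malloc(sizeof(*a));
	struct bo *cached = a;           /* e.g. a stale pos->first */

	destroy(a);

	/* Both 'a' and 'cached' still hold the freed address; any
	 * dereference here would be a use-after-free despite the
	 * NULL assignment inside destroy(). */
	(void)cached;
	return 0;
}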