[PATCH] drm/ttm: fix potential null ptr deref when mem space alloc fails

Robert Beckett

18 Mar 2022

7:50 p.m.

When allocating a resource in place, it is common to free the buffer's resource, then allocate a new resource in a different placement.

e.g. amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space.

In this situation, bo->resource will be null as it is cleared during the initial freeing of the previous resource. This leads to a null deref.

Fixes: d3116756a710 (drm/ttm: rename bo->mem and make it a pointer)

Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index db3dc7ef5382..62b29ee7d040 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -875,7 +875,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
 	}
 
 error:
-	if (bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
+	if (bo->resource && bo->resource->mem_type == TTM_PL_SYSTEM && !bo->pin_count)
 		ttm_bo_move_to_lru_tail_unlocked(bo);
 
 	return ret;
-- 
2.25.1
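
For readers who want to see the failure mode in isolation, here is a minimal userspace sketch of the error path the patch touches. It is not kernel code: every `mock_`/`MOCK_` name is a hypothetical stand-in for the corresponding TTM type or function, and the `lru_bumped` flag only records whether the LRU branch ran. It assumes, as the commit message states, that freeing the resource clears `bo->resource`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the TTM types; NOT the kernel definitions. */
enum mock_mem_type { MOCK_PL_SYSTEM, MOCK_PL_VRAM };

struct mock_resource {
	enum mock_mem_type mem_type;
};

struct mock_bo {
	struct mock_resource *resource; /* NULL once the old resource is freed */
	unsigned int pin_count;
	int lru_bumped;                 /* set when the LRU branch is taken */
};

/* Models the "free first" step: the buffer's resource pointer is cleared. */
static void mock_resource_free(struct mock_bo *bo)
{
	assert(bo != NULL);
	bo->resource = NULL;
}

/* Models only the error path of an allocation attempt that always fails. */
static int mock_mem_space_fail(struct mock_bo *bo)
{
	int ret = -ENOMEM;

	assert(bo != NULL);

	/*
	 * The bo->resource test comes first so that && short-circuits
	 * before mem_type is read through a NULL pointer.
	 */
	if (bo->resource && bo->resource->mem_type == MOCK_PL_SYSTEM &&
	    !bo->pin_count)
		bo->lru_bumped = 1;

	return ret;
}
```

Calling `mock_resource_free()` and then `mock_mem_space_fail()` returns `-ENOMEM` without touching the cleared pointer; dropping the first operand of the condition would reintroduce the dereference the patch describes.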

Christian König

21 Mar 2022

9:51 a.m.

On 18.03.22 at 20:50, Robert Beckett wrote:

> When allocating a resource in place, it is common to free the buffer's resource, then allocate a new resource in a different placement.
>
> e.g. amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space.

Well yes I'm working the drivers towards this, but NAK at the moment. Currently bo->resource is never expected to be NULL.

And yes I'm searching for this bug in amdgpu for quite a while. Where exactly does that happen?

Amdgpu is supposed to allocate a new resource first, then do a swap and then free the old one.

Thanks, Christian.
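
The ordering described in this reply (allocate the new resource, swap, then free the old one) can be sketched in plain C as follows. This is an illustration under assumptions, not amdgpu or TTM code: `mock_resource_alloc()` and `mock_bo_replace_resource()` are hypothetical helpers, and `malloc`/`free` stand in for the real resource manager.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Simplified stand-ins; NOT the kernel definitions. */
struct mock_resource {
	int mem_type;
};

struct mock_bo {
	struct mock_resource *resource;
};

/* Hypothetical allocator: returns NULL when 'fail' is nonzero. */
static struct mock_resource *mock_resource_alloc(int mem_type, int fail)
{
	struct mock_resource *res;

	if (fail)
		return NULL;

	res = malloc(sizeof(*res));
	if (res)
		res->mem_type = mem_type;
	return res;
}

/* Allocate first, swap second, free the old resource last. */
static int mock_bo_replace_resource(struct mock_bo *bo, int mem_type, int fail)
{
	struct mock_resource *new_res, *old_res;

	assert(bo != NULL && bo->resource != NULL);

	new_res = mock_resource_alloc(mem_type, fail);
	if (!new_res)
		return -ENOMEM; /* bo->resource is untouched and still valid */

	old_res = bo->resource;
	bo->resource = new_res; /* swap */
	free(old_res);          /* the old resource goes away only on success */
	return 0;
}
```

With this ordering there is no window in which `bo->resource` is NULL, so an error path that reads `bo->resource->mem_type` stays safe even when the allocation fails.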

Robert Beckett

3:44 p.m.

On 21/03/2022 09:51, Christian König wrote:

> On 18.03.22 at 20:50, Robert Beckett wrote:
>
>> When allocating a resource in place, it is common to free the buffer's resource, then allocate a new resource in a different placement.
>>
>> e.g. amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space.
>
> Well yes I'm working the drivers towards this, but NAK at the moment. Currently bo->resource is never expected to be NULL.
>
> And yes I'm searching for this bug in amdgpu for quite a while. Where exactly does that happen?

In my case, I am writing new code for i915 that does this. I will switch it to allocate the new resource first, then free the old one if successful.

For the existing amd case, see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...

amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space. If the ttm_bo_mem_space call fails (e.g. due to memory pressure), then the error path will try to deref bo->resource, which will be null at that point.

To fix this, I honestly don't see a reason not to also have a null check there. It could check early and return an error if bo->resource is null. I think defensive programming makes sense here; it is better than a null deref if someone programs it wrong.
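
The "check early and return an error" alternative mentioned here could look roughly like the sketch below. It is an assumption-laden illustration rather than a proposed kernel change: `mock_mem_space_checked()` is a hypothetical function, the structs are simplified stand-ins, and `-EINVAL` is an error code picked only for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins; NOT the kernel definitions. */
struct mock_resource {
	int mem_type;
};

struct mock_bo {
	struct mock_resource *resource;
};

/*
 * Hypothetical entry point that validates its input before doing any
 * work, so later code (including error paths) may rely on a non-NULL
 * bo->resource.
 */
static int mock_mem_space_checked(struct mock_bo *bo, int *mem_type_out)
{
	assert(mem_type_out != NULL);

	if (!bo || !bo->resource)
		return -EINVAL; /* error code chosen for illustration only */

	*mem_type_out = bo->resource->mem_type;
	return 0;
}
```

The trade-off is that a caller using the free-then-allocate order now gets an error back instead of a crash, which means it still has to be converted to the allocate-first order before it can work.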

Christian König

22 Mar 2022

7:17 a.m.

On 21.03.22 at 16:44, Robert Beckett wrote:

> On 21/03/2022 09:51, Christian König wrote:
>
>> On 18.03.22 at 20:50, Robert Beckett wrote:
>>
>>> When allocating a resource in place, it is common to free the buffer's resource, then allocate a new resource in a different placement.
>>>
>>> e.g. amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space.
>>
>> Well yes I'm working the drivers towards this, but NAK at the moment. Currently bo->resource is never expected to be NULL.
>>
>> And yes I'm searching for this bug in amdgpu for quite a while. Where exactly does that happen?
>
> In my case, I am writing new code for i915 that does this. I will switch it to allocate the new resource first, then free the old one if successful.
>
> For the existing amd case, see https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel...
>
> amdgpu_bo_create_kernel_at calls ttm_resource_free, then calls ttm_bo_mem_space. If the ttm_bo_mem_space call fails (e.g. due to memory pressure), then the error path will try to deref bo->resource, which will be null at that point.

Yeah, but that's special handling only used during driver startup. We somehow have this on systems with DMA-buf sharing as well.

> To fix this, I honestly don't see a reason not to also have a null check there. It could check early and return an error if bo->resource is null. I think defensive programming makes sense here; it is better than a null deref if someone programs it wrong.

Having it here is fine; the problem is that you need to have it in tons of other places as well.

Maybe I should send you my WIP patch set for this? If you handle all the other cases as well I'm perfectly fine with this.

Regards, Christian.

dri-devel@lists.freedesktop.org