This patch set is to fix a bug in amdgpu / radeon drm that results in a crash when dma_map_sg combines elemnets within a scatterlist table.
There are 2 shortfalls in the current kernel.
1) AMDGPU / RADEON assumes that the requested and created scatterlist table lengths using from dma_map_sg are equal. This may not be the case using the newer dma-iommu implementation
2) drm_prime does not fetch the length of the scatterlist via the correct dma macro, this can use the incorrect length being used (>0) in places where dma_map_sg has updated the table elements.
The sg_dma_len macro is representative of the length of the sg item after dma_map_sg
Example Crash :
[drm:amdgpu_ttm_backend_bind [amdgpu]] *ERROR* failed to pin userptr
This happens in OpenCL applications, causing them to crash or hang, on either amdgpu-pro or ROCm OpenCL implementations
Shane Francis (3): drm/prime: use dma length macro when mapping sg to arrays drm/amdgpu: fix scatter-gather mapping with user pages drm/radeon: fix scatter-gather mapping with user pages
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- drivers/gpu/drm/drm_prime.c | 4 +++- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 3 files changed, 5 insertions(+), 3 deletions(-)
As dma_map_sg can reorganize scatter-gather lists in a way that can cause some later segments to be empty we should always use the sg_dma_len macro to fetch the actual length.
This could now be 0 and not need to be mapped to a page or address array
Signed-off-by: Shane Francis bigbeeshane@gmail.com --- drivers/gpu/drm/drm_prime.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 86d9b0e45c8c..e4eff5b84597 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -965,12 +965,14 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages, u32 len, index; dma_addr_t addr;
+ index = 0; for_each_sg(sgt->sgl, sg, sgt->nents, count) { - len = sg->length; + len = sg_dma_len(sg); page = sg_page(sg); addr = sg_dma_address(sg);
+ while (len > 0) { if (WARN_ON(index >= max_entries)) return -1;
Am 25.03.20 um 01:24 schrieb Shane Francis:
As dma_map_sg can reorganize scatter-gather lists in a way that can cause some later segments to be empty we should always use the sg_dma_len macro to fetch the actual length.
This could now be 0 and not need to be mapped to a page or address array
Signed-off-by: Shane Francis bigbeeshane@gmail.com
drivers/gpu/drm/drm_prime.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 86d9b0e45c8c..e4eff5b84597 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -965,12 +965,14 @@ int drm_prime_sg_to_page_addr_arrays(struct sg_table *sgt, struct page **pages, u32 len, index; dma_addr_t addr;
Unrelated newline change, apart from that the patch looks good to me.
Christian.
index = 0; for_each_sg(sgt->sgl, sg, sgt->nents, count) {
len = sg->length;
len = sg_dma_len(sg);
page = sg_page(sg); addr = sg_dma_address(sg);
while (len > 0) { if (WARN_ON(index >= max_entries)) return -1;
Calls to dma_map_sg may return segments / entries than requested if they fall on page bounderies. The old implementation did not support this use case.
Signed-off-by: Shane Francis bigbeeshane@gmail.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index dee446278417..c6e9885c071f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -974,7 +974,7 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm) /* Map SG to device */ r = -ENOMEM; nents = dma_map_sg(adev->dev, ttm->sg->sgl, ttm->sg->nents, direction); - if (nents != ttm->sg->nents) + if (nents == 0) goto release_sg;
/* convert SG to linear array of pages and dma addresses */
Calls to dma_map_sg may return segments / entries than requested if they fall on page bounderies. The old implementation did not support this use case.
Signed-off-by: Shane Francis bigbeeshane@gmail.com --- drivers/gpu/drm/radeon/radeon_ttm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 3b92311d30b9..b3380ffab4c2 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -528,7 +528,7 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
r = -ENOMEM; nents = dma_map_sg(rdev->dev, ttm->sg->sgl, ttm->sg->nents, direction); - if (nents != ttm->sg->nents) + if (nents == 0) goto release_sg;
drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
dri-devel@lists.freedesktop.org