Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
Tianlin Li (2): drm/radeon: have the callers of set_memory_*() check the return value drm/radeon: change call sites to handle return value properly.
drivers/gpu/drm/radeon/r100.c | 3 ++- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++---- drivers/gpu/drm/radeon/rs400.c | 3 ++- 4 files changed, 23 insertions(+), 7 deletions(-)
Have the callers of set_memory_*() in drm/radeon check the return value. Change the return type of the callers properly.
Signed-off-by: Tianlin Li tli@digitalocean.com --- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++---- 2 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 30e32adc1fc6..a23e58397293 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -661,7 +661,7 @@ struct radeon_gart { };
int radeon_gart_table_ram_alloc(struct radeon_device *rdev); -void radeon_gart_table_ram_free(struct radeon_device *rdev); +int radeon_gart_table_ram_free(struct radeon_device *rdev); int radeon_gart_table_vram_alloc(struct radeon_device *rdev); void radeon_gart_table_vram_free(struct radeon_device *rdev); int radeon_gart_table_vram_pin(struct radeon_device *rdev); diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c index d4d3778d0a98..59039ab602e8 100644 --- a/drivers/gpu/drm/radeon/radeon_gart.c +++ b/drivers/gpu/drm/radeon/radeon_gart.c @@ -71,6 +71,7 @@ int radeon_gart_table_ram_alloc(struct radeon_device *rdev) { void *ptr; + int ret;
ptr = pci_alloc_consistent(rdev->pdev, rdev->gart.table_size, &rdev->gart.table_addr); @@ -80,8 +81,16 @@ int radeon_gart_table_ram_alloc(struct radeon_device *rdev) #ifdef CONFIG_X86 if (rdev->family == CHIP_RS400 || rdev->family == CHIP_RS480 || rdev->family == CHIP_RS690 || rdev->family == CHIP_RS740) { - set_memory_uc((unsigned long)ptr, + ret = set_memory_uc((unsigned long)ptr, rdev->gart.table_size >> PAGE_SHIFT); + if (ret) { + pci_free_consistent(rdev->pdev, rdev->gart.table_size, + (void *)rdev->gart.ptr, + rdev->gart.table_addr); + rdev->gart.ptr = NULL; + rdev->gart.table_addr = 0; + return ret; + } } #endif rdev->gart.ptr = ptr; @@ -98,16 +107,20 @@ int radeon_gart_table_ram_alloc(struct radeon_device *rdev) * (r1xx-r3xx, non-pcie r4xx, rs400). These asics require the * gart table to be in system memory. */ -void radeon_gart_table_ram_free(struct radeon_device *rdev) +int radeon_gart_table_ram_free(struct radeon_device *rdev) { + int ret; + if (rdev->gart.ptr == NULL) { - return; + return 0; } #ifdef CONFIG_X86 if (rdev->family == CHIP_RS400 || rdev->family == CHIP_RS480 || rdev->family == CHIP_RS690 || rdev->family == CHIP_RS740) { - set_memory_wb((unsigned long)rdev->gart.ptr, + ret = set_memory_wb((unsigned long)rdev->gart.ptr, rdev->gart.table_size >> PAGE_SHIFT); + if (ret) + return ret; } #endif pci_free_consistent(rdev->pdev, rdev->gart.table_size, @@ -115,6 +128,7 @@ void radeon_gart_table_ram_free(struct radeon_device *rdev) rdev->gart.table_addr); rdev->gart.ptr = NULL; rdev->gart.table_addr = 0; + return 0; }
/**
Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it. Fix those call sites in drm/radeon to handle retval properly. Since fini functions are always void, print errors for the failures.
Signed-off-by: Tianlin Li tli@digitalocean.com --- drivers/gpu/drm/radeon/r100.c | 3 ++- drivers/gpu/drm/radeon/rs400.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index 110fb38004b1..7eafe15ba124 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -706,7 +706,8 @@ void r100_pci_gart_fini(struct radeon_device *rdev) { radeon_gart_fini(rdev); r100_pci_gart_disable(rdev); - radeon_gart_table_ram_free(rdev); + if (radeon_gart_table_ram_free(rdev)) + DRM_ERROR("radeon: failed free system ram for GART page table.\n"); }
int r100_irq_set(struct radeon_device *rdev) diff --git a/drivers/gpu/drm/radeon/rs400.c b/drivers/gpu/drm/radeon/rs400.c index 117f60af1ee4..de3674f5fe23 100644 --- a/drivers/gpu/drm/radeon/rs400.c +++ b/drivers/gpu/drm/radeon/rs400.c @@ -210,7 +210,8 @@ void rs400_gart_fini(struct radeon_device *rdev) { radeon_gart_fini(rdev); rs400_gart_disable(rdev); - radeon_gart_table_ram_free(rdev); + if (radeon_gart_table_ram_free(rdev)) + DRM_ERROR("radeon: failed free system ram for GART page table.\n"); }
#define RS400_PTE_UNSNOOPED (1 << 0)
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
Apart from that certainly a good idea to add __must_check to the functions.
Regards, Christian.
Tianlin Li (2): drm/radeon: have the callers of set_memory_*() check the return value drm/radeon: change call sites to handle return value properly.
drivers/gpu/drm/radeon/r100.c | 3 ++- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++---- drivers/gpu/drm/radeon/rs400.c | 3 ++- 4 files changed, 23 insertions(+), 7 deletions(-)
On Jan 8, 2020, at 6:56 AM, Christian König christian.koenig@amd.com wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
Thank you. I will fix that in patch 1 and remove patch 2 (since no need to fix the call sites to handle the retval).
Best regards, Tianlin
Apart from that certainly a good idea to add __must_check to the functions.
Regards, Christian.
Tianlin Li (2): drm/radeon: have the callers of set_memory_*() check the return value drm/radeon: change call sites to handle return value properly.
drivers/gpu/drm/radeon/r100.c | 3 ++- drivers/gpu/drm/radeon/radeon.h | 2 +- drivers/gpu/drm/radeon/radeon_gart.c | 22 ++++++++++++++++++---- drivers/gpu/drm/radeon/rs400.c | 3 ++- 4 files changed, 23 insertions(+), 7 deletions(-)
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
If that's true it should be removed from the tree. We need to be able to correctly make these kinds of changes in the kernel.
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
This seems like a bad idea -- we shouldn't be papering over failures like this when there is logic available to deal with it.
Apart from that certainly a good idea to add __must_check to the functions.
Agreed!
-Kees
On Wed, Jan 8, 2020 at 12:39 PM Kees Cook keescook@chromium.org wrote:
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
If that's true it should be removed from the tree. We need to be able to correctly make these kinds of changes in the kernel.
This driver supports about 15 years of hardware generations. Newer cards are still prevalent, but the older stuff is less so. It still works and people use it based on feedback I've seen, but the older stuff has no active development at this point. This change just happens to target those older chips.
Alex
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
This seems like a bad idea -- we shouldn't be papering over failures like this when there is logic available to deal with it.
Apart from that certainly a good idea to add __must_check to the functions.
Agreed!
-Kees
-- Kees Cook _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Am 08.01.20 um 18:51 schrieb Alex Deucher:
On Wed, Jan 8, 2020 at 12:39 PM Kees Cook keescook@chromium.org wrote:
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
If that's true it should be removed from the tree. We need to be able to correctly make these kinds of changes in the kernel.
This driver supports about 15 years of hardware generations. Newer cards are still prevalent, but the older stuff is less so. It still works and people use it based on feedback I've seen, but the older stuff has no active development at this point. This change just happens to target those older chips.
Just a few weeks back we've got a mail from somebody using an integrated R128 in a laptop.
After a few mails back and force we figured out that his nearly 20 years old hardware was finally failing.
Up till that he was still successfully updating his kernel from time to time and the driver still worked. I find that pretty impressive.
Alex
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
This seems like a bad idea -- we shouldn't be papering over failures like this when there is logic available to deal with it.
Well I certainly agree to that, but we are talking about a call which happens only once during driver load/unload. If necessary we could also print an error when something goes wrong, but please no larger refactoring of return values and call paths.
It is perfectly possible that this call actually failed on somebodies hardware, but we never noticed because the driver still works fine. If we now handle the error it is possible that the module never loads and the user gets a black screen instead.
Regards, Christian.
Apart from that certainly a good idea to add __must_check to the functions.
Agreed!
-Kees
-- Kees Cook _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
Hi
Am 09.01.20 um 11:15 schrieb Christian König:
Am 08.01.20 um 18:51 schrieb Alex Deucher:
On Wed, Jan 8, 2020 at 12:39 PM Kees Cook keescook@chromium.org wrote:
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
If that's true it should be removed from the tree. We need to be able to correctly make these kinds of changes in the kernel.
This driver supports about 15 years of hardware generations. Newer cards are still prevalent, but the older stuff is less so. It still works and people use it based on feedback I've seen, but the older stuff has no active development at this point. This change just happens to target those older chips.
Just a few weeks back we've got a mail from somebody using an integrated R128 in a laptop.
After a few mails back and force we figured out that his nearly 20 years old hardware was finally failing.
Up till that he was still successfully updating his kernel from time to time and the driver still worked. I find that pretty impressive.
Alex
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
This seems like a bad idea -- we shouldn't be papering over failures like this when there is logic available to deal with it.
Well I certainly agree to that, but we are talking about a call which happens only once during driver load/unload. If necessary we could also print an error when something goes wrong, but please no larger refactoring of return values and call paths.
IMHO radeon should be marked as orphaned or obsolete then.
Best regards Thomas
It is perfectly possible that this call actually failed on somebodies hardware, but we never noticed because the driver still works fine. If we now handle the error it is possible that the module never loads and the user gets a black screen instead.
Regards, Christian.
Apart from that certainly a good idea to add __must_check to the functions.
Agreed!
-Kees
-- Kees Cook _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, Jan 9, 2020 at 5:49 AM Thomas Zimmermann tzimmermann@suse.de wrote:
Hi
Am 09.01.20 um 11:15 schrieb Christian König:
Am 08.01.20 um 18:51 schrieb Alex Deucher:
On Wed, Jan 8, 2020 at 12:39 PM Kees Cook keescook@chromium.org wrote:
On Wed, Jan 08, 2020 at 01:56:47PM +0100, Christian König wrote:
Am 07.01.20 um 20:25 schrieb Tianlin Li:
Right now several architectures allow their set_memory_*() family of functions to fail, but callers may not be checking the return values. If set_memory_*() returns with an error, call-site assumptions may be infact wrong to assume that it would either succeed or not succeed at all. Ideally, the failure of set_memory_*() should be passed up the call stack, and callers should examine the failure and deal with it.
Need to fix the callers and add the __must_check attribute. They also may not provide any level of atomicity, in the sense that the memory protections may be left incomplete on failure. This issue likely has a few steps on effects architectures: 1)Have all callers of set_memory_*() helpers check the return value. 2)Add __must_check to all set_memory_*() helpers so that new uses do not ignore the return value. 3)Add atomicity to the calls so that the memory protections aren't left in a partial state.
This series is part of step 1. Make drm/radeon check the return value of set_memory_*().
I'm a little hesitate merge that. This hardware is >15 years old and nobody of the developers have any system left to test this change on.
If that's true it should be removed from the tree. We need to be able to correctly make these kinds of changes in the kernel.
This driver supports about 15 years of hardware generations. Newer cards are still prevalent, but the older stuff is less so. It still works and people use it based on feedback I've seen, but the older stuff has no active development at this point. This change just happens to target those older chips.
Just a few weeks back we've got a mail from somebody using an integrated R128 in a laptop.
After a few mails back and force we figured out that his nearly 20 years old hardware was finally failing.
Up till that he was still successfully updating his kernel from time to time and the driver still worked. I find that pretty impressive.
Alex
Would it be to much of a problem to just add something like: r = set_memory_*(); (void)r; /* Intentionally ignored */.
This seems like a bad idea -- we shouldn't be papering over failure like this when there is logic available to deal with it.
Well I certainly agree to that, but we are talking about a call which happens only once during driver load/unload. If necessary we could also print an error when something goes wrong, but please no larger refactoring of return values and call paths.
IMHO radeon should be marked as orphaned or obsolete then.
As I said this covers about 15-17 years of GPUs (~60 asic families). The older stuff is hard to test these days because it's PCI or AGP hardware. So far it works for most people. The newer stuff is still tested as used regularly.
Alex
Best regards Thomas
It is perfectly possible that this call actually failed on somebodies hardware, but we never noticed because the driver still works fine. If we now handle the error it is possible that the module never loads and the user gets a black screen instead.
Regards, Christian.
Apart from that certainly a good idea to add __must_check to the functions.
Agreed!
-Kees
-- Kees Cook _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
-- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
dri-devel@lists.freedesktop.org