Hello,
Here are 2 fixes and one improvement for the page fault handling. Those bugs were found while working on indirect draw supports which requires the allocation of a big heap buffer for varyings, and the vertex/tiler shaders seem to have access pattern that trigger those issues. I remember discussing the first issue with Steve or Robin a while back, but we never hit it before (now we do :)).
The last patch is a perf improvement: no need to re-enable hardware interrupts if we know the threaded irq handler will be woken up right away.
Regards,
Boris
Changes in v2: * Rework the MMU irq handling loop to avoid a goto
Boris Brezillon (3): drm/panfrost: Clear MMU irqs before handling the fault drm/panfrost: Don't try to map pages that are already mapped drm/panfrost: Stay in the threaded MMU IRQ handler until we've handled all IRQs
drivers/gpu/drm/panfrost/panfrost_mmu.c | 39 +++++++++++++++---------- 1 file changed, 24 insertions(+), 15 deletions(-)
When a fault is handled it will unblock the GPU which will continue executing its shader and might fault almost immediately on a different page. If we clear interrupts after handling the fault we might miss new faults, so clear them before.
Cc: stable@vger.kernel.org Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations") Signed-off-by: Boris Brezillon boris.brezillon@collabora.com Reviewed-by: Steven Price steven.price@arm.com --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 7c1b3481b785..904d63450862 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -593,6 +593,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) access_type = (fault_status >> 8) & 0x3; source_id = (fault_status >> 16);
+ mmu_write(pfdev, MMU_INT_CLEAR, mask); + /* Page fault only */ ret = -1; if ((status & mask) == BIT(i) && (exception_type & 0xF8) == 0xC0) @@ -616,8 +618,6 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) access_type, access_type_name(pfdev, fault_status), source_id);
- mmu_write(pfdev, MMU_INT_CLEAR, mask); - status &= ~mask; }
We allocate 2MB chunks at a time, so it might appear that a page fault has already been handled by a previous page fault when we reach panfrost_mmu_map_fault_addr(). Bail out in that case to avoid mapping the same area twice.
Cc: stable@vger.kernel.org Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations") Signed-off-by: Boris Brezillon boris.brezillon@collabora.com Reviewed-by: Steven Price steven.price@arm.com --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 904d63450862..21e552d1ac71 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -488,8 +488,14 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as, } bo->base.pages = pages; bo->base.pages_use_count = 1; - } else + } else { pages = bo->base.pages; + if (pages[page_offset]) { + /* Pages are already mapped, bail out. */ + mutex_unlock(&bo->base.pages_lock); + goto out; + } + }
mapping = bo->base.base.filp->f_mapping; mapping_set_unevictable(mapping); @@ -522,6 +528,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr);
+out: panfrost_gem_mapping_put(bomapping);
return 0;
Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay in the threaded irq handler as long as we can.
v2: * Rework the loop to avoid a goto
Signed-off-by: Boris Brezillon boris.brezillon@collabora.com --- drivers/gpu/drm/panfrost/panfrost_mmu.c | 26 +++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 21e552d1ac71..0581186ebfb3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -578,22 +578,20 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) { struct panfrost_device *pfdev = data; u32 status = mmu_read(pfdev, MMU_INT_RAWSTAT); - int i, ret; + int ret;
- for (i = 0; status; i++) { - u32 mask = BIT(i) | BIT(i + 16); + while (status) { + u32 as = ffs(status | (status >> 16)) - 1; + u32 mask = BIT(as) | BIT(as + 16); u64 addr; u32 fault_status; u32 exception_type; u32 access_type; u32 source_id;
- if (!(status & mask)) - continue; - - fault_status = mmu_read(pfdev, AS_FAULTSTATUS(i)); - addr = mmu_read(pfdev, AS_FAULTADDRESS_LO(i)); - addr |= (u64)mmu_read(pfdev, AS_FAULTADDRESS_HI(i)) << 32; + fault_status = mmu_read(pfdev, AS_FAULTSTATUS(as)); + addr = mmu_read(pfdev, AS_FAULTADDRESS_LO(as)); + addr |= (u64)mmu_read(pfdev, AS_FAULTADDRESS_HI(as)) << 32;
/* decode the fault status */ exception_type = fault_status & 0xFF; @@ -604,8 +602,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
/* Page fault only */ ret = -1; - if ((status & mask) == BIT(i) && (exception_type & 0xF8) == 0xC0) - ret = panfrost_mmu_map_fault_addr(pfdev, i, addr); + if ((status & mask) == BIT(as) && (exception_type & 0xF8) == 0xC0) + ret = panfrost_mmu_map_fault_addr(pfdev, as, addr);
if (ret) /* terminal fault, print info about the fault */ @@ -617,7 +615,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) "exception type 0x%X: %s\n" "access type 0x%X: %s\n" "source id 0x%X\n", - i, addr, + as, addr, "TODO", fault_status, (fault_status & (1 << 10) ? "DECODER FAULT" : "SLAVE FAULT"), @@ -626,6 +624,10 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) source_id);
status &= ~mask; + + /* If we received new MMU interrupts, process them before returning. */ + if (!status) + status = mmu_read(pfdev, MMU_INT_RAWSTAT); }
mmu_write(pfdev, MMU_INT_MASK, ~0);
On 05/02/2021 11:17, Boris Brezillon wrote:
Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay in the threaded irq handler as long as we can.
v2:
- Rework the loop to avoid a goto
Signed-off-by: Boris Brezillon boris.brezillon@collabora.com
Reviewed-by: Steven Price steven.price@arm.com
drivers/gpu/drm/panfrost/panfrost_mmu.c | 26 +++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c index 21e552d1ac71..0581186ebfb3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c @@ -578,22 +578,20 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) { struct panfrost_device *pfdev = data; u32 status = mmu_read(pfdev, MMU_INT_RAWSTAT);
- int i, ret;
- int ret;
- for (i = 0; status; i++) {
u32 mask = BIT(i) | BIT(i + 16);
- while (status) {
u32 as = ffs(status | (status >> 16)) - 1;
u64 addr; u32 fault_status; u32 exception_type; u32 access_type; u32 source_id;u32 mask = BIT(as) | BIT(as + 16);
if (!(status & mask))
continue;
fault_status = mmu_read(pfdev, AS_FAULTSTATUS(i));
addr = mmu_read(pfdev, AS_FAULTADDRESS_LO(i));
addr |= (u64)mmu_read(pfdev, AS_FAULTADDRESS_HI(i)) << 32;
fault_status = mmu_read(pfdev, AS_FAULTSTATUS(as));
addr = mmu_read(pfdev, AS_FAULTADDRESS_LO(as));
addr |= (u64)mmu_read(pfdev, AS_FAULTADDRESS_HI(as)) << 32;
/* decode the fault status */ exception_type = fault_status & 0xFF;
@@ -604,8 +602,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
/* Page fault only */ ret = -1;
if ((status & mask) == BIT(i) && (exception_type & 0xF8) == 0xC0)
ret = panfrost_mmu_map_fault_addr(pfdev, i, addr);
if ((status & mask) == BIT(as) && (exception_type & 0xF8) == 0xC0)
ret = panfrost_mmu_map_fault_addr(pfdev, as, addr);
if (ret) /* terminal fault, print info about the fault */
@@ -617,7 +615,7 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) "exception type 0x%X: %s\n" "access type 0x%X: %s\n" "source id 0x%X\n",
i, addr,
as, addr, "TODO", fault_status, (fault_status & (1 << 10) ? "DECODER FAULT" : "SLAVE FAULT"),
@@ -626,6 +624,10 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data) source_id);
status &= ~mask;
/* If we received new MMU interrupts, process them before returning. */
if (!status)
status = mmu_read(pfdev, MMU_INT_RAWSTAT);
}
mmu_write(pfdev, MMU_INT_MASK, ~0);
On Fri, Feb 5, 2021 at 5:18 AM Boris Brezillon boris.brezillon@collabora.com wrote:
Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay in the threaded irq handler as long as we can.
v2:
- Rework the loop to avoid a goto
Signed-off-by: Boris Brezillon boris.brezillon@collabora.com
drivers/gpu/drm/panfrost/panfrost_mmu.c | 26 +++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-)
Reviewed-by: Rob Herring robh@kernel.org
On Fri, 5 Feb 2021 12:17:54 +0100 Boris Brezillon boris.brezillon@collabora.com wrote:
Hello,
Here are 2 fixes and one improvement for the page fault handling. Those bugs were found while working on indirect draw supports which requires the allocation of a big heap buffer for varyings, and the vertex/tiler shaders seem to have access pattern that trigger those issues. I remember discussing the first issue with Steve or Robin a while back, but we never hit it before (now we do :)).
The last patch is a perf improvement: no need to re-enable hardware interrupts if we know the threaded irq handler will be woken up right away.
Regards,
Boris
Changes in v2:
- Rework the MMU irq handling loop to avoid a goto
Queued to drm-misc-next.
Boris Brezillon (3): drm/panfrost: Clear MMU irqs before handling the fault drm/panfrost: Don't try to map pages that are already mapped drm/panfrost: Stay in the threaded MMU IRQ handler until we've handled all IRQs
drivers/gpu/drm/panfrost/panfrost_mmu.c | 39 +++++++++++++++---------- 1 file changed, 24 insertions(+), 15 deletions(-)
dri-devel@lists.freedesktop.org