At least GC7000 fails to enter runtime suspend for long periods of time since the MC becomes busy again even when the FE is idle. The rest of the series makes detecting similar issues easier to debug in the future by checking all known bits in debugfs and also warning in the EBUSY case.
Tested on GC7000 with a reduced runtime delay of 50ms. Patches are against next-20200226.
Thanks to Lucas Stach for pointing me in the right direction.
Guido Günther (5): drm/etnaviv: Fix typo in comment drm/etnaviv: Update idle bits drm/etnaviv: Consider all kwnown idle bits in debugfs drm/etnaviv: Ignore MC when checking runtime suspend idleness drm/etnaviv: Warn when GPU doesn't idle fast enough
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 26 ++++++++++++++++++++++---- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 2 files changed, 29 insertions(+), 4 deletions(-)
Use 'is' instead of 'it' so it becomes a valid sentence and spell 'resetting' correctly.
Signed-off-by: Guido Günther agx@sigxcpu.org --- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 80b99acea1c4..873d9103164d 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -506,7 +506,7 @@ static int etnaviv_hw_reset(struct etnaviv_gpu *gpu) /* read idle register. */ idle = gpu_read(gpu, VIVS_HI_IDLE_STATE);
- /* try reseting again if FE it not idle */ + /* try resetting again if FE is not idle */ if ((idle & VIVS_HI_IDLE_STATE_FE) == 0) { dev_dbg(gpu->dev, "FE is not idle\n"); continue;
Update the state HI and common header from rnndb commit commit 19280a95a (rnndb: Update idle bits)
Signed-off-by: Guido Günther agx@sigxcpu.org --- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/state_hi.xml.h b/drivers/gpu/drm/etnaviv/state_hi.xml.h index 41d8da2b6f4f..8fe4598395f1 100644 --- a/drivers/gpu/drm/etnaviv/state_hi.xml.h +++ b/drivers/gpu/drm/etnaviv/state_hi.xml.h @@ -81,6 +81,13 @@ DEALINGS IN THE SOFTWARE. #define VIVS_HI_IDLE_STATE_IM 0x00000200 #define VIVS_HI_IDLE_STATE_FP 0x00000400 #define VIVS_HI_IDLE_STATE_TS 0x00000800 +#define VIVS_HI_IDLE_STATE_BL 0x00001000 +#define VIVS_HI_IDLE_STATE_ASYNCFE 0x00002000 +#define VIVS_HI_IDLE_STATE_MC 0x00004000 +#define VIVS_HI_IDLE_STATE_PPA 0x00008000 +#define VIVS_HI_IDLE_STATE_WD 0x00010000 +#define VIVS_HI_IDLE_STATE_NN 0x00020000 +#define VIVS_HI_IDLE_STATE_TP 0x00040000 #define VIVS_HI_IDLE_STATE_AXI_LP 0x80000000
#define VIVS_HI_AXI_CONFIG 0x00000008
We were missing out on some bits the vendor kernel driver knows about.
Signed-off-by: Guido Günther agx@sigxcpu.org --- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 873d9103164d..187de610b325 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -930,6 +930,20 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct seq_file *m) seq_puts(m, "\t FP is not idle\n"); if ((idle & VIVS_HI_IDLE_STATE_TS) == 0) seq_puts(m, "\t TS is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_BL) == 0) + seq_puts(m, "\t BL is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_ASYNCFE) == 0) + seq_puts(m, "\t ASYNCFE is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_MC) == 0) + seq_puts(m, "\t MC is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_PPA) == 0) + seq_puts(m, "\t PPA is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_WD) == 0) + seq_puts(m, "\t WD is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_NN) == 0) + seq_puts(m, "\t NN is not idle\n"); + if ((idle & VIVS_HI_IDLE_STATE_TP) == 0) + seq_puts(m, "\t TP is not idle\n"); if (idle & VIVS_HI_IDLE_STATE_AXI_LP) seq_puts(m, "\t AXI low power mode\n");
Without that runtime suspend is often blocked due to etnaviv_gpu_rpm_suspend() returning -EBUSY since the FE seems to trigger the MC in its idle loop.
Ignoring the MC bit makes the GPU suspend as expected. This was tested on GC7000.
Signed-off-by: Guido Günther agx@sigxcpu.org --- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index 187de610b325..da24e433f82a 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -1820,8 +1820,9 @@ static int etnaviv_gpu_rpm_suspend(struct device *dev) if (atomic_read(&gpu->sched.hw_rq_count)) return -EBUSY;
- /* Check whether the hardware (except FE) is idle */ - mask = gpu->idle_mask & ~VIVS_HI_IDLE_STATE_FE; + /* Check whether the hardware (except FE and MC) is idle */ + mask = gpu->idle_mask & ~(VIVS_HI_IDLE_STATE_FE | + VIVS_HI_IDLE_STATE_MC); idle = gpu_read(gpu, VIVS_HI_IDLE_STATE) & mask; if (idle != mask) return -EBUSY;
If the GPU isn't idle after signalling pm_runtime_mark_last_busy() plus waiting for the autosuspend delay there's likely something wrong with the way we check idleness so warn about that.
Signed-off-by: Guido Günther agx@sigxcpu.org --- drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c index da24e433f82a..4fd16b8f8a7a 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c @@ -1824,8 +1824,11 @@ static int etnaviv_gpu_rpm_suspend(struct device *dev) mask = gpu->idle_mask & ~(VIVS_HI_IDLE_STATE_FE | VIVS_HI_IDLE_STATE_MC); idle = gpu_read(gpu, VIVS_HI_IDLE_STATE) & mask; - if (idle != mask) + if (idle != mask) { + dev_warn_ratelimited(dev, "GPU not yet idle, mask: 0x%08x\n", + idle); return -EBUSY; + }
return etnaviv_gpu_hw_suspend(gpu); }
On Mo, 2020-03-02 at 20:13 +0100, Guido Günther wrote:
At least GC7000 fails to enter runtime suspend for long periods of time since the MC becomes busy again even when the FE is idle. The rest of the series makes detecting similar issues easier to debug in the future by checking all known bits in debugfs and also warning in the EBUSY case.
Thanks, series applied to etnaviv/next.
Tested on GC7000 with a reduced runtime delay of 50ms. Patches are against next-20200226.
I've already wondered if 200ms is too long, 50ms sounds more reasonable. Do you have any numbers on the power draw on the i.MX8M with idle GPU, vs. being fully power gated?
Regards, Lucas
Thanks to Lucas Stach for pointing me in the right direction.
Guido Günther (5): drm/etnaviv: Fix typo in comment drm/etnaviv: Update idle bits drm/etnaviv: Consider all kwnown idle bits in debugfs drm/etnaviv: Ignore MC when checking runtime suspend idleness drm/etnaviv: Warn when GPU doesn't idle fast enough
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 26 ++++++++++++++++++++++---- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 2 files changed, 29 insertions(+), 4 deletions(-)
Hi Lucas, On Tue, Mar 03, 2020 at 12:55:04PM +0100, Lucas Stach wrote:
On Mo, 2020-03-02 at 20:13 +0100, Guido Günther wrote:
At least GC7000 fails to enter runtime suspend for long periods of time since the MC becomes busy again even when the FE is idle. The rest of the series makes detecting similar issues easier to debug in the future by checking all known bits in debugfs and also warning in the EBUSY case.
Thanks, series applied to etnaviv/next.
Thanks!
Tested on GC7000 with a reduced runtime delay of 50ms. Patches are against next-20200226.
I've already wondered if 200ms is too long, 50ms sounds more reasonable. Do you have any numbers on the power draw on the i.MX8M with idle GPU, vs. being fully power gated?
I don't have good numbers yet but i'll post here once i do. Cheers, -- Guido
Regards, Lucas
Thanks to Lucas Stach for pointing me in the right direction.
Guido Günther (5): drm/etnaviv: Fix typo in comment drm/etnaviv: Update idle bits drm/etnaviv: Consider all kwnown idle bits in debugfs drm/etnaviv: Ignore MC when checking runtime suspend idleness drm/etnaviv: Warn when GPU doesn't idle fast enough
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 26 ++++++++++++++++++++++---- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 2 files changed, 29 insertions(+), 4 deletions(-)
Hi, On Tue, Mar 03, 2020 at 12:55:04PM +0100, Lucas Stach wrote:
On Mo, 2020-03-02 at 20:13 +0100, Guido Günther wrote:
At least GC7000 fails to enter runtime suspend for long periods of time since the MC becomes busy again even when the FE is idle. The rest of the series makes detecting similar issues easier to debug in the future by checking all known bits in debugfs and also warning in the EBUSY case.
Thanks, series applied to etnaviv/next.
Tested on GC7000 with a reduced runtime delay of 50ms. Patches are against next-20200226.
I've already wondered if 200ms is too long, 50ms sounds more reasonable. Do you have any numbers on the power draw on the i.MX8M with idle GPU, vs. being fully power gated?
The difference is at least 250mW. It makes a huge difference over here. We hit https://lore.kernel.org/dri-devel/20200614064601.7872-1-navid.emamdoost@gmai... recently and you notice instantly when that happens when looking at the SoC temperature.
Cheers, -- Guido
Regards, Lucas
Thanks to Lucas Stach for pointing me in the right direction.
Guido Günther (5): drm/etnaviv: Fix typo in comment drm/etnaviv: Update idle bits drm/etnaviv: Consider all kwnown idle bits in debugfs drm/etnaviv: Ignore MC when checking runtime suspend idleness drm/etnaviv: Warn when GPU doesn't idle fast enough
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 26 ++++++++++++++++++++++---- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 2 files changed, 29 insertions(+), 4 deletions(-)
Would this power difference with the GPU also apply with the GC3000 in the i.MX6qp or the GC2000 in the i.MX6q?
On Thu, Jun 25, 2020 at 8:04 AM Guido Günther agx@sigxcpu.org wrote:
Hi, On Tue, Mar 03, 2020 at 12:55:04PM +0100, Lucas Stach wrote:
On Mo, 2020-03-02 at 20:13 +0100, Guido Günther wrote:
At least GC7000 fails to enter runtime suspend for long periods of
time since
the MC becomes busy again even when the FE is idle. The rest of the
series
makes detecting similar issues easier to debug in the future by
checking
all known bits in debugfs and also warning in the EBUSY case.
Thanks, series applied to etnaviv/next.
Tested on GC7000 with a reduced runtime delay of 50ms. Patches are against next-20200226.
I've already wondered if 200ms is too long, 50ms sounds more reasonable. Do you have any numbers on the power draw on the i.MX8M with idle GPU, vs. being fully power gated?
The difference is at least 250mW. It makes a huge difference over here. We hit
https://lore.kernel.org/dri-devel/20200614064601.7872-1-navid.emamdoost@gmai... recently and you notice instantly when that happens when looking at the SoC temperature.
Cheers, -- Guido
Regards, Lucas
Thanks to Lucas Stach for pointing me in the right direction.
Guido Günther (5): drm/etnaviv: Fix typo in comment drm/etnaviv: Update idle bits drm/etnaviv: Consider all kwnown idle bits in debugfs drm/etnaviv: Ignore MC when checking runtime suspend idleness drm/etnaviv: Warn when GPU doesn't idle fast enough
drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 26 ++++++++++++++++++++++---- drivers/gpu/drm/etnaviv/state_hi.xml.h | 7 +++++++ 2 files changed, 29 insertions(+), 4 deletions(-)
etnaviv mailing list etnaviv@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/etnaviv
dri-devel@lists.freedesktop.org