On Fri, Jan 7, 2022 at 4:27 PM Stephen Boyd <swboyd@chromium.org> wrote:
Quoting Rob Clark (2022-01-06 10:14:46)
From: Rob Clark <robdclark@chromium.org>
System suspend uses pm_runtime_force_suspend(), which cheekily bypasses the runpm reference counts. This doesn't actually work so well when the GPU is active. So add a reasonable delay waiting for the GPU to become idle.
Maybe also say:
Failure to wait during system wide suspend leads to GPU hangs seen on resume.
The fallout can actually be a lot more than just GPU hangs. That is just the case that is easy (for us) to observe, because the crash logging captures them. Sync/async external aborts are also possible, and I think even plain undefined behavior (i.e. if the timing works out right, it can survive but just "lose" rendering that hadn't completed yet).
Alternatively we could just return -EBUSY in this case, but that has the disadvantage of causing system suspend to fail.
Signed-off-by: Rob Clark <robdclark@chromium.org>
drivers/gpu/drm/msm/adreno/adreno_device.c | 9 +++++++++
drivers/gpu/drm/msm/msm_gpu.c              | 3 +++
drivers/gpu/drm/msm/msm_gpu.h              | 3 +++
3 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 93005839b5da..b677ca3fd75e 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -611,6 +611,15 @@ static int adreno_resume(struct device *dev)
static int adreno_suspend(struct device *dev)
{
struct msm_gpu *gpu = dev_to_gpu(dev);
int ret = 0;
Please don't assign and then immediately overwrite.
ret = wait_event_timeout(gpu->retire_event,
!msm_gpu_active(gpu),
msecs_to_jiffies(1000));
if (ret == 0) {
The usual pattern is
long timeleft;

timeleft = wait_event_timeout(...)
if (!timeleft) {
	/* no time left; timed out */
Can it be the same pattern here? It helps because people sometimes forget that wait_event_timeout() returns the time that is left and not an error code when it times out.
ok, I'll update in v2..
BR, -R
dev_err(dev, "Timeout waiting for GPU to suspend\n");
return -EBUSY;
}
return gpu->funcs->pm_suspend(gpu);
}
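
For anyone reading along, here is a rough userspace sketch of the return convention discussed above: 0 means the wait timed out, and any positive value is the time that was left (at least 1) when the condition became true. Everything named fake_* is hypothetical and only models the behavior with a simple tick counter standing in for jiffies. It is not the msm driver code and not the kernel's wait_event_timeout() implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for GPU state; not the real struct msm_gpu. */
struct fake_gpu {
	int busy_ticks;	/* simulated ticks until in-flight work retires */
};

/* Hypothetical helper: true while work is still in flight. */
bool fake_gpu_active(const struct fake_gpu *gpu)
{
	return gpu->busy_ticks > 0;
}

/*
 * Models the return convention of wait_event_timeout():
 *   0    -> timeout elapsed and the GPU is still active
 *   >= 1 -> GPU went idle; value is the remaining ticks (minimum 1)
 */
long fake_wait_idle_timeout(struct fake_gpu *gpu, long timeout)
{
	long left = timeout;

	while (fake_gpu_active(gpu)) {
		if (left == 0)
			return 0;	/* timed out, still busy */
		gpu->busy_ticks--;	/* one tick of simulated progress */
		left--;
	}
	return left ? left : 1;
}

/* The "timeleft" pattern: a zero return is the timeout case. */
int fake_suspend(struct fake_gpu *gpu, long timeout)
{
	long timeleft;

	timeleft = fake_wait_idle_timeout(gpu, timeout);
	if (!timeleft)
		return -EBUSY;	/* no time left; timed out */

	return 0;	/* idle in time; a real driver would suspend here */
}
```

Naming the variable timeleft rather than ret makes it harder to misread the zero case as success, which is the point of the suggested pattern.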