Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote:
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
Fix this by adding a proper return value for devices that don't implement PM QoS implicitly.
Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") Signed-off-by: Tero Kristo t-kristo@ti.com Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
3. I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Do you have a clue? Thanks!
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On 31 October 2017 at 14:55, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote:
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
Fix this by adding a proper return value for devices that don't implement PM QoS implicitly.
Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") Signed-off-by: Tero Kristo t-kristo@ti.com Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Do you have a clue? Thanks!
As I didn't have the time to review the original commit, before it got pushed as a fix, I am planning to review it now instead.
A vague guess is that the genpd governor prevents the device from being suspended. That was also the most tricky part of the changes from the original commit, which is causing problems.
I get back to this when I have reviewed it more thoroughly.
Kind regards Uffe
On Tue, Oct 31, 2017 at 3:04 PM, Ulf Hansson ulf.hansson@linaro.org wrote:
On 31 October 2017 at 14:55, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote: > The recent change to the PM QoS framework to introduce a proper > no constraint value overlooked to handle the devices which don't > implement PM QoS OPS. Runtime PM is one of the more severely > impacted subsystems, failing every attempt to runtime suspend > a device. This leads into some nasty second level issues like > probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
> Fix this by adding a proper return value for devices that don't > implement PM QoS implicitly. > > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") > Signed-off-by: Tero Kristo t-kristo@ti.com > Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Do you have a clue? Thanks!
As I didn't have the time to review the original commit, before it got pushed as a fix, I am planning to review it now instead.
A vague guess is that the genpd governor prevents the device from being suspended. That was also the most tricky part of the changes from the original commit, which is causing problems.
I think you are right.
I get back to this when I have reviewed it more thoroughly.
Thanks, and sorry for breaking stuff.
In retrospect I should just have not pushed it so late in the cycle.
Rafael
On Tue, 31 Oct 2017, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
Cc: Marta, Martin
Our CI is hitting this too.
BR, Jani.
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote:
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
Fix this by adding a proper return value for devices that don't implement PM QoS implicitly.
Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") Signed-off-by: Tero Kristo t-kristo@ti.com Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Do you have a clue? Thanks!
On Tue, Oct 31, 2017 at 05:37:50PM +0200, Jani Nikula wrote:
On Tue, 31 Oct 2017, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
Cc: Marta, Martin
Our CI is hitting this too.
Should be ok again, we've locally reverted this patch. But a big chunk of the pw runs overnight did hit it unfortunately :-( -Daniel
BR, Jani.
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote: > The recent change to the PM QoS framework to introduce a proper > no constraint value overlooked to handle the devices which don't > implement PM QoS OPS. Runtime PM is one of the more severely > impacted subsystems, failing every attempt to runtime suspend > a device. This leads into some nasty second level issues like > probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
> Fix this by adding a proper return value for devices that don't > implement PM QoS implicitly. > > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") > Signed-off-by: Tero Kristo t-kristo@ti.com > Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Do you have a clue? Thanks!
-- Jani Nikula, Intel Open Source Technology Center _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Hi Geert,
On Tuesday, 31 October 2017 15:55:02 EET Geert Uytterhoeven wrote:
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven wrote:
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote:
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
Fix this by adding a proper return value for devices that don't implement PM QoS implicitly.
Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") Signed-off-by: Tero Kristo t-kristo@ti.com Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS")
introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch: i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
- I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s
before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2
(21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Thank you for tracking this and notifying me. I like it even better now that the problem seems to be fixed without requiring any action from my side :-)
Do you have a clue? Thanks!
On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote:
The recent change to the PM QoS framework to introduce a proper no constraint value overlooked to handle the devices which don't implement PM QoS OPS. Runtime PM is one of the more severely impacted subsystems, failing every attempt to runtime suspend a device. This leads into some nasty second level issues like probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
Fix this by adding a proper return value for devices that don't implement PM QoS implicitly.
Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") Signed-off-by: Tero Kristo t-kristo@ti.com Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Sorry for the breakage.
OK, I'll just push the reverts to Linus later today.
Do you have a clue?
Well, kind of.
There is a change in behavior in domain_governor.c that should not have made any difference to my eyes, but maybe that's it.
Can you please check if the attached patch makes any difference?
Thanks, Rafael
Hi Rafael,
On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki rafael@kernel.org wrote:
On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote:
On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote: > The recent change to the PM QoS framework to introduce a proper > no constraint value overlooked to handle the devices which don't > implement PM QoS OPS. Runtime PM is one of the more severely > impacted subsystems, failing every attempt to runtime suspend > a device. This leads into some nasty second level issues like > probe failures and increased power consumption among other things.
Oh, that's bad.
Sorry about breaking it and thanks for the fix!
> Fix this by adding a proper return value for devices that don't > implement PM QoS implicitly. > > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") > Signed-off-by: Tero Kristo t-kristo@ti.com > Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com
Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Sorry for the breakage.
OK, I'll just push the reverts to Linus later today.
Do you have a clue?
Well, kind of.
There is a change in behavior in domain_governor.c that should not have made any difference to my eyes, but maybe that's it.
Can you please check if the attached patch makes any difference?
Thanks, but it doesn't seem to fix the issues.
Gr{oetje,eeting}s,
Geert
-- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Tue, Oct 31, 2017 at 7:07 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael,
On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki rafael@kernel.org wrote:
On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote:
On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote: > On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote: > > The recent change to the PM QoS framework to introduce a proper > > no constraint value overlooked to handle the devices which don't > > implement PM QoS OPS. Runtime PM is one of the more severely > > impacted subsystems, failing every attempt to runtime suspend > > a device. This leads into some nasty second level issues like > > probe failures and increased power consumption among other things. > > Oh, that's bad. > > Sorry about breaking it and thanks for the fix! > > > Fix this by adding a proper return value for devices that don't > > implement PM QoS implicitly. > > > > Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") > > Signed-off-by: Tero Kristo t-kristo@ti.com > > Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com > > Applied.
And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Sorry for the breakage.
OK, I'll just push the reverts to Linus later today.
Do you have a clue?
Well, kind of.
There is a change in behavior in domain_governor.c that should not have made any difference to my eyes, but maybe that's it.
Can you please check if the attached patch makes any difference?
Thanks, but it doesn't seem to fix the issues.
Thanks for testing!
I've just pushed the reverts, but the PM QoS still needs to be fixed, so we have to get to the bottom of this.
The current theory goes that the changes in domain_governor.c are to blame. Is genpd involved in all of the issues with the PM QoS fix you have seen?
Thanks, Rafael
On 01/11/17 00:32, Rafael J. Wysocki wrote:
On Tue, Oct 31, 2017 at 7:07 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael,
On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki rafael@kernel.org wrote:
On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki rjw@rjwysocki.net wrote: > On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote: >> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com wrote: >>> The recent change to the PM QoS framework to introduce a proper >>> no constraint value overlooked to handle the devices which don't >>> implement PM QoS OPS. Runtime PM is one of the more severely >>> impacted subsystems, failing every attempt to runtime suspend >>> a device. This leads into some nasty second level issues like >>> probe failures and increased power consumption among other things. >> >> Oh, that's bad. >> >> Sorry about breaking it and thanks for the fix! >> >>> Fix this by adding a proper return value for devices that don't >>> implement PM QoS implicitly. >>> >>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") >>> Signed-off-by: Tero Kristo t-kristo@ti.com >>> Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com >> >> Applied. > > And pushed to Linus.
I'm afraid it is not sufficient.
Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") introduced two issues on Renesas platforms:
After boot up, many devices have changed their state from "suspended" to "active", according to /sys/kernel/debug/pm_genpd/pm_genpd_summary (comparing that file across boots is one of my standard tests). Interestingly, doing a system suspend/resume cycle restores their state to "suspended".
During system suspend, the following warning is printed on r8a7791/koelsch:
i2c-rcar e6530000.i2c: runtime PM trying to suspend device but
active child
I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR*
[CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") fixes the second issue, but not the first.
... nor the third.
Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM QoS") fixes both.
... all three.
Sorry for the breakage.
OK, I'll just push the reverts to Linus later today.
Do you have a clue?
Well, kind of.
There is a change in behavior in domain_governor.c that should not have made any difference to my eyes, but maybe that's it.
Can you please check if the attached patch makes any difference?
Thanks, but it doesn't seem to fix the issues.
Thanks for testing!
I've just pushed the reverts, but the PM QoS still needs to be fixed, so we have to get to the bottom of this.
The current theory goes that the changes in domain_governor.c are to blame. Is genpd involved in all of the issues with the PM QoS fix you have seen?
Thanks, Rafael
It seems the default values for pm_qos have changed with the patch, and that breaks genpd at least. I only fixed PM runtime initially, but you could try this diff to fix the genpd part also:
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index d68b056..7c8f643 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -34,9 +34,9 @@ enum pm_qos_flags_status { #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0 -#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 +#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE PM_QOS_LATENCY_ANY #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY -#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 +#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1) #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) -- Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
On Wed, Nov 1, 2017 at 11:28 AM, Tero Kristo t-kristo@ti.com wrote:
On 01/11/17 00:32, Rafael J. Wysocki wrote:
On Tue, Oct 31, 2017 at 7:07 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael,
On Tue, Oct 31, 2017 at 6:22 PM, Rafael J. Wysocki rafael@kernel.org wrote:
On Tue, Oct 31, 2017 at 2:55 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
Hi Rafael, Tero,
CC pinchartl, dri-devel
On Tue, Oct 31, 2017 at 2:10 PM, Geert Uytterhoeven geert@linux-m68k.org wrote:
CC linux-renesas-soc
On Tue, Oct 31, 2017 at 2:09 PM, Geert Uytterhoeven geert@linux-m68k.org wrote: > > On Tue, Oct 31, 2017 at 12:27 AM, Rafael J. Wysocki > rjw@rjwysocki.net wrote: >> >> On Monday, October 30, 2017 11:19:08 AM CET Rafael J. Wysocki wrote: >>> >>> On Mon, Oct 30, 2017 at 8:10 AM, Tero Kristo t-kristo@ti.com >>> wrote: >>>> >>>> The recent change to the PM QoS framework to introduce a proper >>>> no constraint value overlooked to handle the devices which don't >>>> implement PM QoS OPS. Runtime PM is one of the more severely >>>> impacted subsystems, failing every attempt to runtime suspend >>>> a device. This leads into some nasty second level issues like >>>> probe failures and increased power consumption among other things. >>> >>> >>> Oh, that's bad. >>> >>> Sorry about breaking it and thanks for the fix! >>> >>>> Fix this by adding a proper return value for devices that don't >>>> implement PM QoS implicitly. >>>> >>>> Fixes: 0cc2b4e5a020 ("PM / QoS: Fix device resume latency PM QoS") >>>> Signed-off-by: Tero Kristo t-kristo@ti.com >>>> Cc: Rafael J. Wysocki rafael.j.wysocki@intel.com >>> >>> >>> Applied. >> >> >> And pushed to Linus. > > > I'm afraid it is not sufficient. > > Commit 0cc2b4e5a020fc7f ("PM / QoS: Fix device resume latency PM > QoS") > introduced two issues on Renesas platforms: > 1. After boot up, many devices have changed their state from > "suspended" > to "active", according to > /sys/kernel/debug/pm_genpd/pm_genpd_summary > (comparing that file across boots is one of my standard tests). > Interestingly, doing a system suspend/resume cycle restores > their state > to "suspended". > > 2. During system suspend, the following warning is printed on > r8a7791/koelsch: > > i2c-rcar e6530000.i2c: runtime PM trying to suspend device > but > active child
- I've just bisected a seemingly unrelated issue to the same commit. On Salvator-XS with R-Car H3, initialization of the rcar-du driver
now takes more than 1 minute due to flip_done time outs, while it took 0.12s before:
[ 3.015035] [drm] Supports vblank timestamp caching Rev 2
(21.10.2013). [ 3.021721] [drm] No driver support for vblank timestamp query. [ 13.280738] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 23.520707] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 33.760708] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.000755] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 44.003597] Console: switching to colour frame buffer device 128x48 [ 54.240707] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.480706] [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:58:crtc-3] flip_done timed out [ 64.544876] rcar-du feb00000.display: fb0: frame buffer device [ 64.552013] [drm] Initialized rcar-du 1.0.0 20130110 for feb00000.display on minor 0 [ 64.559873] [drm] Device feb00000.display probed
> Commit 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm device > resume > latency") fixes the second issue, but not the first.
... nor the third.
> Reverting commits 2a9a86d5c81389cd ("PM / QoS: Fix default runtime_pm > device resume latency") and 0cc2b4e5a020fc7f ("PM / QoS: Fix device > resume > latency PM QoS") fixes both.
... all three.
Sorry for the breakage.
OK, I'll just push the reverts to Linus later today.
> Do you have a clue?
Well, kind of.
There is a change in behavior in domain_governor.c that should not have made any difference to my eyes, but maybe that's it.
Can you please check if the attached patch makes any difference?
Thanks, but it doesn't seem to fix the issues.
Thanks for testing!
I've just pushed the reverts, but the PM QoS still needs to be fixed, so we have to get to the bottom of this.
The current theory goes that the changes in domain_governor.c are to blame. Is genpd involved in all of the issues with the PM QoS fix you have seen?
Thanks, Rafael
It seems the default values for pm_qos have changed with the patch, and that breaks genpd at least. I only fixed PM runtime initially, but you could try this diff to fix the genpd part also:
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index d68b056..7c8f643 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -34,9 +34,9 @@ enum pm_qos_flags_status { #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0 -#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 +#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE PM_QOS_LATENCY_ANY #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY -#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 +#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1) #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0)
This is the original change in pm_qos.h (up to the GMail-induced whitespace breakage):
-#define PM_QOS_DEFAULT_VALUE -1 +#define PM_QOS_DEFAULT_VALUE (-1) +#define PM_QOS_LATENCY_ANY S32_MAX
#define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0 #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 +#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
-#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) #define PM_QOS_FLAG_REMOTE_WAKEUP (1 << 1)
so the only thing that really has changed is the addition of PM_QOS_RESUME_LATENCY_NO_CONSTRAINT, so I'm not really sure what you mean. Care to elaborate?
There is a bug in the genpd part of the original patch (the multiplication by NSEC_PER_USEC in dev_update_qos_constraint() should not be applied to the effective_constraint value), but it doesn't matter too much now that the problematic commit has been reverted.
I'll post a new version of it for testing shortly, but on top of a genpd governor patch to make it behave consistently.
Thanks, Rafael
[cut]
It seems the default values for pm_qos have changed with the patch, and that breaks genpd at least. I only fixed PM runtime initially, but you could try this diff to fix the genpd part also:
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index d68b056..7c8f643 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -34,9 +34,9 @@ enum pm_qos_flags_status { #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0 -#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 +#define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE PM_QOS_LATENCY_ANY #define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY -#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 +#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE (-1) #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0)
This is the original change in pm_qos.h (up to the GMail-induced whitespace breakage):
-#define PM_QOS_DEFAULT_VALUE -1 +#define PM_QOS_DEFAULT_VALUE (-1) +#define PM_QOS_LATENCY_ANY S32_MAX
#define PM_QOS_CPU_DMA_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_LAT_DEFAULT_VALUE (2000 * USEC_PER_SEC) #define PM_QOS_NETWORK_THROUGHPUT_DEFAULT_VALUE 0 #define PM_QOS_MEMORY_BANDWIDTH_DEFAULT_VALUE 0 #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 +#define PM_QOS_RESUME_LATENCY_NO_CONSTRAINT PM_QOS_LATENCY_ANY
OK, so I should have changed PM_QOS_RESUME_LATENCY_DEFAULT_VALUE to PM_QOS_LATENCY_ANY too, so that the default is still "no restriction".
#define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1)
-#define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) #define PM_QOS_FLAG_REMOTE_WAKEUP (1 << 1)
Thanks, Rafael
dri-devel@lists.freedesktop.org