19.08.2021 16:07, Ulf Hansson пишет:
On Wed, 18 Aug 2021 at 17:43, Dmitry Osipenko digetx@gmail.com wrote:
18.08.2021 13:08, Ulf Hansson пишет:
On Wed, 18 Aug 2021 at 11:50, Viresh Kumar viresh.kumar@linaro.org wrote:
On 18-08-21, 11:41, Ulf Hansson wrote:
On Wed, 18 Aug 2021 at 11:14, Viresh Kumar viresh.kumar@linaro.org wrote:
What we need here is just configure. So something like this then:
- genpd->get_performance_state() -> dev_pm_opp_get_current_opp() //New API -> dev_pm_genpd_set_performance_state(dev, current_opp->pstate);
This can be done just once from probe() then.
How would dev_pm_opp_get_current_opp() work? Do you have a suggestion?
The opp core already has a way of finding current OPP, that's what Dmitry is trying to use here. It finds it using clk_get_rate(), if that is zero, it picks the lowest freq possible.
I am sure I understand the problem. When a device is getting probed, it needs to consume power, how else can the corresponding driver successfully probe it?
Dmitry can answer that better, but a device doesn't necessarily need to consume energy in probe. It can consume bus clock, like APB we have, but the more energy consuming stuff can be left disabled until the time a user comes up. Probe will just end up registering the driver and initializing it.
That's perfectly fine, as then it's likely that it won't vote for an OPP, but can postpone that as well.
Perhaps the problem is rather that the HW may already carry a non-zero vote made from a bootloader. If the consumer driver tries to clear that vote (calling dev_pm_opp_set_rate(dev, 0), for example), it would still not lead to any updates of the performance state in genpd, because genpd internally has initialized the performance-state to zero.
We don't need to discover internal SoC devices because we use device-tree on ARM. For most devices power isn't required at a probe time because probe function doesn't touch h/w at all, thus devices are left in suspended state after probe.
We have three components comprising PM on Tegra:
- Power gate
- Clock state
- Voltage state
GENPD on/off represents the 'power gate'.
Clock and reset are controlled by device drivers using clk and rst APIs.
Voltage state is represented by GENPD's performance level.
GENPD core assumes that at a first rpm-resume of a consumer device, its genpd_performance=0. Not true for Tegra because h/w of the device is preconfigured to a non-zero perf level initially, h/w may not support zero level at all.
I think you may be misunderstanding genpd's behaviour around this, but let me elaborate.
In genpd_runtime_resume(), we try to restore the performance state for the device that genpd_runtime_suspend() *may* have dropped earlier. That means, if genpd_runtime_resume() is called prior genpd_runtime_suspend() for the first time, it means that genpd_runtime_resume() will *not* restore a performance state, but instead just leave the performance state as is for the device (see genpd_restore_performance_state()).
In other words, a consumer driver may use the following sequence to set an initial performance state for the device during ->probe():
... rate = clk_get_rate() dev_pm_opp_set_rate(rate)
pm_runtime_enable() pm_runtime_resume_and_get() ...
Note that, it's the consumer driver's responsibility to manage device specific resources, in its ->runtime_suspend|resume() callbacks. Typically that means dealing with clock gating/ungating, for example.
In the other scenario where a consumer driver prefers to *not* call pm_runtime_resume_and_get() in its ->probe(), because it doesn't need to power on the device to complete probing, then we don't want to vote for an OPP at all - and we also want the performance state for the device in genpd to be set to zero. Correct?
Yes
Is this the main problem you are trying to solve, because I think this doesn't work out of the box as of today?
The main problem is that the restored performance state is zero for the first genpd_runtime_resume(), while it's not zero from the h/w perspective.
There is another concern though, but perhaps it's not a problem after all. Viresh told us that dev_pm_opp_set_rate() may turn on resources like clock/regulators. That could certainly be problematic, in particular if the device and its genpd have OPP tables associated with it and the consumer driver wants to follow the above sequence in probe.
dev_pm_opp_set_rate() won't enable clocks and regulators, but it may change the clock rate and voltage. This is also platform/driver specific because it's up to OPP user how to configure OPP table. On Tegra we only assign clock to OPP table, regulators are unused.
Viresh, can you please chime in here and elaborate on some of the magic happening behind dev_pm_opp_set_rate() API - is there a problem here or not?
GENPD core assumes that consumer devices can work at any performance level. Not true for Tegra because voltage needs to be set in accordance to the clock rate before clock is enabled, otherwise h/w won't work properly, perhaps clock may be unstable or h/w won't be latching.
Correct. Genpd relies on the callers to use the OPP framework if there are constraints like you describe above.
That said, it's not forbidden for a consumer driver to call dev_pm_genpd_set_performance_state() directly, but then it better knows exactly what it's doing.
Performance level should be set to 0 while device is suspended.
Do you mean system suspend or runtime suspend? Or both?
Runtime suspend.
Performance level needs to be bumped on rpm-resume of a device in accordance to h/w state before hardware is enabled.
Assuming there was a performance state set for the device when genpd_runtime_suspend() was called, genpd_runtime_resume() will restore that state according to the sequence you described.
What do you think about adding API that will allow drivers to explicitly set the restored performance state of a power domain?
Another option could be to change the GENPD core, making it to set the rpm_pstate when dev_pm_genpd_set_performance_state(dev) is invoked and device is rpm-suspended, instead of calling the genpd->set_performance_state callback.
Then drivers will be able to sync the perf state at a probe time.
What do you think?
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index a934c679e6ce..cc15ab9eacc9 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -435,7 +435,7 @@ static void genpd_restore_performance_state(struct device *dev, int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state) { struct generic_pm_domain *genpd; - int ret; + int ret = 0;
genpd = dev_to_genpd_safe(dev); if (!genpd) @@ -446,7 +446,10 @@ int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state) return -EINVAL;
genpd_lock(genpd); - ret = genpd_set_performance_state(dev, state); + if (pm_runtime_suspended(dev)) + dev_gpd_data(dev)->rpm_pstate = state; + else + ret = genpd_set_performance_state(dev, state); genpd_unlock(genpd);
return ret;