In case of failure we need to remove OPP table.
Use Linux classic error handling with goto usage.
Signed-off-by: Clément Péron peron.clem@gmail.com --- drivers/gpu/drm/panfrost/panfrost_devfreq.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c index 413987038fbf..62541f4edd81 100644 --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c @@ -90,8 +90,11 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) cur_freq = clk_get_rate(pfdev->clock);
opp = devfreq_recommended_opp(dev, &cur_freq, 0); - if (IS_ERR(opp)) - return PTR_ERR(opp); + if (IS_ERR(opp)) { + DRM_DEV_ERROR(dev, "Failed to set recommended OPP\n"); + ret = PTR_ERR(opp); + goto err_opp; + }
panfrost_devfreq_profile.initial_freq = cur_freq; dev_pm_opp_put(opp); @@ -100,8 +103,8 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL); if (IS_ERR(devfreq)) { DRM_DEV_ERROR(dev, "Couldn't initialize GPU devfreq\n"); - dev_pm_opp_of_remove_table(dev); - return PTR_ERR(devfreq); + ret = PTR_ERR(devfreq); + goto err_opp; } pfdev->devfreq.devfreq = devfreq;
@@ -112,6 +115,11 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) pfdev->devfreq.cooling = cooling;
return 0; + +err_opp: + dev_pm_opp_of_remove_table(dev); + + return ret; }
void panfrost_devfreq_fini(struct panfrost_device *pfdev)
OPP table can defined both frequency and voltage.
Register the mali regulator if it exist.
Signed-off-by: Clément Péron peron.clem@gmail.com --- drivers/gpu/drm/panfrost/panfrost_devfreq.c | 34 ++++++++++++++++++--- drivers/gpu/drm/panfrost/panfrost_device.h | 1 + 2 files changed, 31 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.c b/drivers/gpu/drm/panfrost/panfrost_devfreq.c index 62541f4edd81..2dc8e2355358 100644 --- a/drivers/gpu/drm/panfrost/panfrost_devfreq.c +++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.c @@ -78,12 +78,26 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) struct device *dev = &pfdev->pdev->dev; struct devfreq *devfreq; struct thermal_cooling_device *cooling; + const char *mali = "mali"; + struct opp_table *opp_table = NULL; + + /* Regulator is optional */ + opp_table = dev_pm_opp_set_regulators(dev, &mali, 1); + if (IS_ERR(opp_table)) { + ret = PTR_ERR(opp_table); + if (ret != -ENODEV) { + DRM_DEV_ERROR(dev, "Failed to set regulator: %d\n", ret); + return ret; + } + } + pfdev->devfreq.opp_table = opp_table;
ret = dev_pm_opp_of_add_table(dev); - if (ret == -ENODEV) /* Optional, continue without devfreq */ - return 0; - else if (ret) - return ret; + if (ret) { + if (ret == -ENODEV) /* Optional, continue without devfreq */ + ret = 0; + goto err_opp_reg; + }
panfrost_devfreq_reset(pfdev);
@@ -119,6 +133,12 @@ int panfrost_devfreq_init(struct panfrost_device *pfdev) err_opp: dev_pm_opp_of_remove_table(dev);
+err_opp_reg: + if (pfdev->devfreq.opp_table) { + dev_pm_opp_put_regulators(pfdev->devfreq.opp_table); + pfdev->devfreq.opp_table = NULL; + } + return ret; }
@@ -126,7 +146,13 @@ void panfrost_devfreq_fini(struct panfrost_device *pfdev) { if (pfdev->devfreq.cooling) devfreq_cooling_unregister(pfdev->devfreq.cooling); + dev_pm_opp_of_remove_table(&pfdev->pdev->dev); + + if (pfdev->devfreq.opp_table) { + dev_pm_opp_put_regulators(pfdev->devfreq.opp_table); + pfdev->devfreq.opp_table = NULL; + } }
void panfrost_devfreq_resume(struct panfrost_device *pfdev) diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index 06713811b92c..f6b0c779dfe5 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -86,6 +86,7 @@ struct panfrost_device { struct { struct devfreq *devfreq; struct thermal_cooling_device *cooling; + struct opp_table *opp_table; ktime_t busy_time; ktime_t idle_time; ktime_t time_last_update;
Hi Panfrost and OPP Maintainers,
On Sat, 11 Apr 2020 at 22:06, Clément Péron peron.clem@gmail.com wrote:
OPP table can defined both frequency and voltage.
Register the mali regulator if it exist.
After this patch, Panfrost update properly both voltage and frequency. But the GPU is still not properly down-clocked when temperature is high.
I try to add a cooling map like this : https://github.com/clementperon/linux/commit/955961c7c035abbf44e74f608fe8f05...
But got the following error: [ 2.712082] panfrost 1800000.gpu: [drm:panfrost_devfreq_init [panfrost]] Failed to register cooling device
Do you see what I'm missing?
Thanks for your help, Clement
On 11/04/2020 21:06, Clément Péron wrote:
This looks like it applies before 3e1399bccf51 ("drm/panfrost: Add support for multiple regulators") which is currently in drm-misc-next (and linux-next). You want something more like:
opp_table = dev_pm_opp_set_regulators(dev, pfdev->comp->supply_names, pfdev->comp->num_supplies);
Otherwise a platform with multiple regulators won't work correctly.
Also running on my firefly (RK3288) board I get the following warning:
debugfs: Directory 'ffa30000.gpu-mali' with parent 'vdd_gpu' already present!
This is due to the regulator debugfs entries getting created twice (once in panfrost_regulator_init() and once here).
I have been taking a look at doing the same thing (I picked up Martin Blumenstingl's patch series[1]), but haven't had much time to focus on this recently.
Thanks,
Steve
[1] https://lore.kernel.org/dri-devel/20200112001623.2121227-1-martin.blumenstin...
Hi Steven,
On Mon, 13 Apr 2020 at 15:18, Steven Price steven.price@arm.com wrote:
Thanks for you review, indeed I didn't see that multiple regulators support has been added. Will update in v2.
Is it a warning that should be consider as an error? Look's more an info no? What should be the correct behavior if a device want to register two times the same regulator?
Link to original discussion: https://lore.kernel.org/patchwork/patch/1176717/
Thanks, Clement
Hi,
On Mon, 13 Apr 2020 at 16:18, Clément Péron peron.clem@gmail.com wrote:
Or we can change the name from vdd_XXX to opp_vdd_XXX ? https://elixir.bootlin.com/linux/latest/source/drivers/opp/debugfs.c#L45
On 13/04/2020 15:31, Clément Péron wrote:
Yes, I'm not sure that it's actually a problem in practice. And it may well be correct to change this in the generic code rather than try to work around it in Panfrost. But we shouldn't spam the user with warnings as that makes real issues harder to see.
Your suggestion to change the name seems reasonable to me, but I don't fully understand the opp code, so we'd need some review from the OPP maintainers. Hopefully Viresh, Nishanth or Stephen can provide some insight.
Steve
Hi Steven,
On Mon, 13 Apr 2020 at 17:55, Steven Price steven.price@arm.com wrote:
Agree, I will send a v2 with the rename and see if OPP Maintainers agree.
Regards, Clement
Steve
Hi Steven,
On Mon, 13 Apr 2020 at 18:35, Clément Péron peron.clem@gmail.com wrote:
I try to reproduce but it can't regulator is mount at : ./regulator/vdd-gpu whereas OPP is mount : ./opp/soc-1800000.gpu/opp:756000000/supply-0/
I see that firefly as 2 regulators with the same name : vdd_gpu from syr828 (https://github.com/mopplayer/Firefly-RK3288-Kernel-With-Mali764/blob/master/...) vdd_gpu from rk808_dcdc2_reg (https://github.com/mopplayer/Firefly-RK3288-Kernel-With-Mali764/blob/master/...)
So i think the issue is from the firefly device-tree.
Regards, Clement
Hi Clément,
On 13/04/2020 18:28, Clément Péron wrote:
Getting a backtrace from the two occurrences, I see one added from:
(debugfs_create_dir) from [<c04633f8>] (create_regulator+0xe0/0x220) (create_regulator) from [<c04681d8>] (_regulator_get+0x168/0x204) (_regulator_get) from [<c04682e0>] (regulator_bulk_get+0x64/0xf4) (regulator_bulk_get) from [<c04696f0>] (devm_regulator_bulk_get+0x40/0x74) (devm_regulator_bulk_get) from [<bf00af44>] (panfrost_device_init+0x1b4/0x48c [panfrost]) (panfrost_device_init [panfrost]) from [<bf00a4d4>] (panfrost_probe+0x94/0x184 [panfrost]) (panfrost_probe [panfrost]) from [<c04ee694>] (platform_drv_probe+0x48/0x94)
And the other:
(debugfs_create_dir) from [<c04633f8>] (create_regulator+0xe0/0x220) (create_regulator) from [<c04681d8>] (_regulator_get+0x168/0x204) (_regulator_get) from [<c05c1280>] (dev_pm_opp_set_regulators+0x6c/0x184) (dev_pm_opp_set_regulators) from [<bf00b4ac>] (panfrost_devfreq_init+0x38/0x1ac [panfrost]) (panfrost_devfreq_init [panfrost]) from [<bf00a508>] (panfrost_probe+0xc8/0x184 [panfrost]) (panfrost_probe [panfrost]) from [<c04ee694>] (platform_drv_probe+0x48/0x94)
Both are created at /regulator/vdd_gpu
I'm using the DTB from the kernel tree (/arch/arm/boot/dts/rk3288-firefly.dts) which as far as I can see doesn't contain this problem. Certainly decompiling the DTB I can see only one mention of vdd_gpu (in syr828).
Steve
Hi Liam and Mark,
On Tue, 14 Apr 2020 at 15:10, Steven Price steven.price@arm.com wrote:
Sorry, you're right I have indeed the same issue: [ 2.903406] panfrost 1800000.gpu: Features: L2:0x07110206 Shader:0x00000000 Tiler:0x00000809 Mem:0x1 MMU:0x00002821 AS:0xf JS:0x7 [ 2.913297] sunxi-ir 7040000.ir: initialized sunXi IR driver [ 2.919901] panfrost 1800000.gpu: shader_present=0x3 l2_present=0x1 [ 2.920497] panfrost 1800000.gpu: Looking up mali-supply from device tree [ 3.036568] vdd-gpu: could not add device link 1800000.gpu err -17 [ 3.036580] debugfs: Directory '1800000.gpu-mali' with parent 'vdd-gpu' already present! [ 3.046312] panfrost 1800000.gpu: [drm:panfrost_devfreq_init [panfrost]] Failed to register cooling device [ 3.056322] [drm] Initialized panfrost 1.1.0 20180908 for 1800000.gpu on minor 0
We are having an issue with Panfrost driver registering two times the same regulator and giving an error when trying to create the debugfs folder.
Could you clarify if it is allowed for a device to register two times the same regulator?
I check Documentation/power/regulator/regulator.rst but this point is not specified.
Thanks, Clement
On Tue, Apr 14, 2020 at 08:20:23PM +0200, Clément Péron wrote:
Hi Liam and Mark,
You might want to flag stuff like this in the subject line, I very nearly deleted this without opening it since most of the email I get about panfrost appears to be coming from me having sent patches rather than being relevant.
Could you clarify if it is allowed for a device to register two times the same regulator?
I check Documentation/power/regulator/regulator.rst but this point is not specified.
We don't actively prevent it and I can't think what other than debugfs might run into problems (and that's just a warning) but it does seem like a weird thing to want to do and like it's pointing to some confusion in your code with two different parts of the device controlling the same supply independently. What's the use case here?
Hi Mark,
On Tue, 14 Apr 2020 at 20:55, Mark Brown broonie@kernel.org wrote:
Ok will do next time,
Panfrost first probe clock, reset and regulator in device_init: https://elixir.bootlin.com/linux/v5.7-rc1/source/drivers/gpu/drm/panfrost/pa... Then it probe for optional devfreq, devfreq will get opp tables and also get regulator again. https://elixir.bootlin.com/linux/v5.7-rc1/source/drivers/gpu/drm/panfrost/pa...
That's can be reworked and Panfrost can only probe regulator if there is no opp-table.
But if multiple regulator is not an issue and as each request is logic. The first in device_init assure to enable the regulator and the second in OPP assure the voltage level.
Maybe we can just fix this warning?
Thanks, Clement
On 14/04/2020 20:16, Clément Péron wrote:
This is what I was thinking about looking at. But it may make sense instead to extend the regulator API to allow multiple regualtor_get() calls for a single device. I haven't had time to dig into how difficult this would be.
From what I can see in the code, just silencing the warning would lead to 'odd' behaviour with debugfs. The first struct regulator Panfrost acquires is the one that is used purely for turning the GPU on (no voltage scaling). The second struct regulator is the one which is obtained by the OPP framework. However the debugfs entries point into the actual struct regulator, so it would be far more logical/useful if those were pointing into the second struct regulator.
Ideally calling regulator_get a second time for the same device would simply return the same struct regulator object (with a reference count increment).
Perhaps a better approach would be for Panfrost to hand over the struct regulator objects it has already got to the OPP framework. I.e. open code dev_pm_opp_set_regulators(), but instead of calling regulator_get_optional() simply populate the regulators we already have?
The other benefit of that is it would provide a clear hand-over of responsibility between Panfrost handling it's own regulators and the OPP framework picking up the work. The disadvantage is that Panfrost would have to track whether the regulators have been handed over or not.
Steve
On Thu, Apr 16, 2020 at 02:42:13PM +0100, Steven Price wrote:
On 14/04/2020 20:16, Clément Péron wrote:
That's can be reworked and Panfrost can only probe regulator if there is no opp-table.
To repeat what I said before we don't actively stop this, it's just not something that seems particularly tasteful and the warning does find actual errors. I definitely don't think it's a good idea to extend the API for this.
Ideally calling regulator_get a second time for the same device would simply return the same struct regulator object (with a reference count increment).
One of the goals with the distinct struct regulator is to make sure that we track all the user's activity together - if we mix multiple users in there it becomes harder to tell if something is going wrong.
On 2020-04-16 2:42 pm, Steven Price wrote: [...]
Sounds like the most logical thing to do is to shuffle things around so we start by trying to set up an OPP table, then fall back to explicitly claiming clocks and regulators if necessary. Then we can easily make the devfreq decision later in probe based on how that turned out.
Robin.
Hi Robin,
On Fri, 17 Apr 2020 at 13:10, Robin Murphy robin.murphy@arm.com wrote:
Ok I will propose a new serie with this behavior,
Thanks Clement
Robin.
Hi,
On Fri, 17 Apr 2020 at 14:33, Clément Péron peron.clem@gmail.com wrote:
Just saw that a Lima devfreq[0] has been also introduced with I think exactly the same logic.
Is this something that hasn't been triggered by Maintainer or I am missing something?
I will backport some remarks made on the lima devfreq to improve panfrost one. They are almost identical.
Regards, Clement
0: https://cgit.freedesktop.org/drm-misc/commit/?id=1996970773a323533e1cc1b6b97...
On Sun, Apr 19, 2020 at 11:25:08AM +0200, Clément Péron wrote:
Just saw that a Lima devfreq[0] has been also introduced with I think exactly the same logic.
Is this something that hasn't been triggered by Maintainer or I am missing something?
My understanding is that there is very little use of any of this upstream since it's all pretty new, some platforms have OPPs but use a firmware interface rather than the OS to control clocks and regulators while most other platforms don't have OPPs defined and it's only platforms with both regulators and OPPs that are affected.
Hi Steven,
On Tue, 14 Apr 2020 at 15:10, Steven Price steven.price@arm.com wrote:
<snip>
I'm preparing a new version with some clean from lima devfreq. My working branch : https://github.com/clementperon/linux/commits/panfrost_devfreq
Two strange things I observe: - After 30sec the regulator is released by OPP ??? [ 33.757627] vdd-gpu: disabling Introduce the regulator support in this commit: https://github.com/clementperon/linux/commit/be310c37b82010e293b7f129ccdcb71...
- The Cooling map is not probe correctly : [ 2.545756] panfrost 1800000.gpu: [drm:panfrost_devfreq_init [panfrost]] Failed to register cooling device Introduce in this commit : https://github.com/clementperon/linux/commit/0252c38fd55ad78366ac4b1714e285c...
Do you have an hint about what I'm missing ?
Thanks for your help, Clement
On 02/05/2020 23:07, Clément Péron wrote:
I had a look at that branch and gave it a quick spin on my Firefly RK3288 and didn't notice any issues.
I can't see anything wrong with this commit, but equally in my DTS I have a "regulator-always-on" for vdd_gpu. My initial thought was that this could be runtime PM of the GPU - but I can't see how panfrost_device_suspend() would end up turning off the regulator. So unless there's some way that the regulator itself suspends (but it should know it's in use) I've no clue why this would be happening.
Since you've got a reproduction - can you get a backtrace where the regulator is getting disabled?
Sorry, my knowledge of the cooling framework is very limited. What you've got looks plausible, but I'm afraid I can't really help beyond that! As before - can you try adding some printk()s in e.g. of_devfreq_cooling_register_power() and find out where it is bailing out?
Steve
Hi Steven,
On Thu, 7 May 2020 at 16:30, Steven Price steven.price@arm.com wrote:
<snip>
Since you've got a reproduction - can you get a backtrace where the regulator is getting disabled?
Regulator is disabled from regulator_late_cleanup()
[ 33.757650] vdd-gpu: disabling [ 33.760718] CPU: 2 PID: 31 Comm: kworker/2:1 Not tainted 5.7.0-rc2-next-20200424 #8 [ 33.768362] Hardware name: Beelink GS1 (DT) [ 33.772553] Workqueue: events regulator_init_complete_work_function [ 33.778813] Call trace: [ 33.781261] dump_backtrace+0x0/0x1a0 [ 33.784922] show_stack+0x18/0x30 [ 33.788238] dump_stack+0xc0/0x114 [ 33.791638] regulator_late_cleanup+0x164/0x1f0 [ 33.796165] class_for_each_device+0x64/0xe0 [ 33.800431] regulator_init_complete_work_function+0x4c/0x60 [ 33.806084] process_one_work+0x19c/0x330 [ 33.810090] worker_thread+0x4c/0x430 [ 33.813748] kthread+0x138/0x160 [ 33.816973] ret_from_fork+0x10/0x24
the use_count is at 0...
I have check and the regulator_get is called and regulator_put is never called for vdd-gpu. Not sure what is happening here...
Dumb issue, I was missing the CONFIG_DEVFREQ_THERMAL -_-, I will make a patch to enable it in arm64 defconfig.
Regards, Clement
Steve
On Sat, 9 May 2020 at 18:28, Clément Péron peron.clem@gmail.com wrote:
Looks like the OPP framework only get the regulator but never enable it...
I will send a question to OPP Maintainer about this.
Regards, CLement
On 2020-04-13 3:18 pm, Clément Péron wrote:
Although note that we probably also want a version that can be backported to stable without that dependency, since this behaviour is arguably a regression since 221bc77914cb ("drm/panfrost: Use generic code for devfreq"), and does appear to have been causing subtle problems for users.
Robin.
dri-devel@lists.freedesktop.org