[PATCH 0/5] msm/drm: A6x DCVS series

Sharat Masetty

23 Aug 2018

9:18 a.m.

This patch series starts off with a few bug fixes in devfreq code, followed by refactoring the devfreq code needed for supporting different chipsets, and ends with adding devfreq support for A6x.

Sharat Masetty (5):
  drm/msm: suspend devfreq on init
  drm/msm: unregister devfreq upon clean up
  drm/msm/A6x: Add gmu_read64() register read op
  drm/msm: re-factor devfreq code
  drm/msm/A6x: Add devfreq support in A6x

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 16 ++++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 15 ++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c         | 53 +++++++++++++++++++++--------------
 drivers/gpu/drm/msm/msm_gpu.h         |  5 +++-
 7 files changed, 133 insertions(+), 31 deletions(-)

-- 1.9.1

Sharat Masetty

23 Aug

9:18 a.m.

New subject: [PATCH 1/5] drm/msm: suspend devfreq on init

Devfreq turns on and starts recommending power levels as soon as it is initialized. The GPU is still not powered on by the time the devfreq init happens, and this leads to problems on GPUs where register access is needed to get/set power levels. So we start suspended and only restart devfreq when the GPU is powered on.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/msm_gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 5281a32..04f9604 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -104,6 +104,8 @@ static void msm_devfreq_init(struct msm_gpu *gpu)
 		dev_err(&gpu->pdev->dev, "Couldn't initialize GPU devfreq\n");
 		gpu->devfreq.devfreq = NULL;
 	}
+
+	devfreq_suspend_device(gpu->devfreq.devfreq);
 }

 static int enable_pwrrail(struct msm_gpu *gpu)

-- 1.9.1

Jordan Crouse

3:39 p.m.

New subject: [PATCH 1/5] drm/msm: suspend devfreq on init

On Thu, Aug 23, 2018 at 02:48:27PM +0530, Sharat Masetty wrote:

...

Devfreq turns on and starts recommending power levels as soon as it is initialized. The GPU is still not powered on by the time the devfreq init happens, and this leads to problems on GPUs where register access is needed to get/set power levels. So we start suspended and only restart devfreq when the GPU is powered on.

Reviewed-by: Jordan Crouse jcrouse@codeaurora.org

...

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/msm_gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 5281a32..04f9604 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -104,6 +104,8 @@ static void msm_devfreq_init(struct msm_gpu *gpu)
 		dev_err(&gpu->pdev->dev, "Couldn't initialize GPU devfreq\n");
 		gpu->devfreq.devfreq = NULL;
 	}
+
+	devfreq_suspend_device(gpu->devfreq.devfreq);
 }

 static int enable_pwrrail(struct msm_gpu *gpu)

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Sharat Masetty

9:18 a.m.

New subject: [PATCH 2/5] drm/msm: unregister devfreq upon clean up

Call the devfreq_remove_device() API to remove the GPU devfreq instance during GPU driver cleanup.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/msm_gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 04f9604..83fd602 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -970,6 +970,8 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)

 	WARN_ON(!list_empty(&gpu->active_list));

+	devfreq_remove_device(gpu->devfreq.devfreq);
+
 	for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
 		msm_ringbuffer_destroy(gpu->rb[i]);
 		gpu->rb[i] = NULL;

-- 1.9.1

Jordan Crouse

3:42 p.m.

New subject: [PATCH 2/5] drm/msm: unregister devfreq upon clean up

On Thu, Aug 23, 2018 at 02:48:28PM +0530, Sharat Masetty wrote:

...

Call the devfreq_remove_device() API to remove the GPU devfreq instance during GPU driver cleanup.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/msm_gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 04f9604..83fd602 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -970,6 +970,8 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)

 	WARN_ON(!list_empty(&gpu->active_list));

+	devfreq_remove_device(gpu->devfreq.devfreq);

Because everything eventually gets a devm_ wrapper we do have devm_devfreq_add_device() - maybe that would be a better solution?

Jordan

...

+
 	for (i = 0; i < ARRAY_SIZE(gpu->rb); i++) {
 		msm_ringbuffer_destroy(gpu->rb[i]);
 		gpu->rb[i] = NULL;

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Sharat Masetty

9:18 a.m.

New subject: [PATCH 3/5] drm/msm/A6x: Add gmu_read64() register read op

Add a simple function to read 64-bit registers in the GMU domain.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index a08ee8f..f9e4dfe 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -106,6 +106,19 @@ static inline void gmu_rmw(struct a6xx_gmu *gmu, u32 reg, u32 mask, u32 or)
 	gmu_write(gmu, reg, val | or);
 }

+static inline u64 gmu_read64(struct a6xx_gmu *gmu, u32 lo, u32 hi)
+{
+	u64 val;
+
+	/*
+	 * Implementation similar to gpu_read64()
+	 */
+	val = (u64) msm_readl(gmu->mmio + (lo << 2));
+	val |= ((u64) msm_readl(gmu->mmio + (hi << 2)) << 32);
+
+	return val;
+}
+
 #define gmu_poll_timeout(gmu, addr, val, cond, interval, timeout) \
 	readl_poll_timeout((gmu)->mmio + ((addr) << 2), val, cond, \
 		interval, timeout)

-- 1.9.1
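
The addressing in gmu_read64() can be illustrated outside the kernel. The register arguments are dword offsets, so the helper shifts them left by two to get byte offsets, then places the second read in the upper 32 bits. The following is a minimal userspace sketch, not the driver code: a plain buffer stands in for the MMIO region, and mock_readl() and mock_read64() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for msm_readl(): read 32 bits at a byte address. */
static uint32_t mock_readl(const uint8_t *addr)
{
	uint32_t val;

	memcpy(&val, addr, sizeof(val));
	return val;
}

/*
 * Combine two 32-bit registers into one 64-bit value. 'lo' and 'hi' are
 * dword offsets, so each is shifted left by 2 to become a byte offset.
 */
static uint64_t mock_read64(const uint8_t *mmio, uint32_t lo, uint32_t hi)
{
	uint64_t val;

	assert(mmio != NULL);

	val = (uint64_t)mock_readl(mmio + ((size_t)lo << 2));
	val |= (uint64_t)mock_readl(mmio + ((size_t)hi << 2)) << 32;

	return val;
}
```

Note that the two reads are separate accesses, so a counter that carries from the low word into the high word between them can produce an inconsistent value. Whether that matters for busy-time estimation is not discussed in this thread.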

Jordan Crouse

3:44 p.m.

New subject: [PATCH 3/5] drm/msm/A6x: Add gmu_read64() register read op

On Thu, Aug 23, 2018 at 02:48:29PM +0530, Sharat Masetty wrote:

...

Add a simple function to read 64-bit registers in the GMU domain.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index a08ee8f..f9e4dfe 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -106,6 +106,19 @@ static inline void gmu_rmw(struct a6xx_gmu *gmu, u32 reg, u32 mask, u32 or)
 	gmu_write(gmu, reg, val | or);
 }

+static inline u64 gmu_read64(struct a6xx_gmu *gmu, u32 lo, u32 hi)
+{
+	u64 val;
+
+	/*
+	 * Implementation similar to gpu_read64()
+	 */

I'm not sure this comment is really needed, and it certainly could just be one line if it is.

...

+	val = (u64) msm_readl(gmu->mmio + (lo << 2));
+	val |= ((u64) msm_readl(gmu->mmio + (hi << 2)) << 32);
+
+	return val;
+}
+
 #define gmu_poll_timeout(gmu, addr, val, cond, interval, timeout) \
 	readl_poll_timeout((gmu)->mmio + ((addr) << 2), val, cond, \
 		interval, timeout)

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Sharat Masetty

9:18 a.m.

New subject: [PATCH 4/5] drm/msm: re-factor devfreq code

devfreq framework requires the drivers to provide busy time estimations. The GPU driver relies on the hardware performance counters for the busy time estimations, but different hardware revisions have counters which can be sourced from different clocks. So the busy time estimation will be target dependent. Additionally, on targets where the clocks are completely controlled by the on-chip microcontroller, fetching and setting the current GPU frequency will be different. This patch aims to embrace these differences by re-factoring the devfreq code a bit.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 16 +++++++++---
 drivers/gpu/drm/msm/msm_gpu.c         | 49 ++++++++++++++++++++---------------
 drivers/gpu/drm/msm/msm_gpu.h         |  5 +++-
 3 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 897f3e2..043e680 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1369,12 +1369,20 @@ static struct msm_ringbuffer *a5xx_active_ring(struct msm_gpu *gpu)
 	return a5xx_gpu->cur_ring;
 }

-static int a5xx_gpu_busy(struct msm_gpu *gpu, uint64_t *value)
+static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
 {
-	*value = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
-		REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+	u64 busy_cycles;
+	unsigned long busy_time;

-	return 0;
+	busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
+			REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+
+	busy_time = (busy_cycles - gpu->devfreq.busy_cycles) /
+		(clk_get_rate(gpu->core_clk) / 1000000);
+
+	gpu->devfreq.busy_cycles = busy_cycles;
+
+	return busy_time;
 }

 static const struct adreno_gpu_funcs funcs = {
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 83fd602..32269ef 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -36,12 +36,16 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
 	struct dev_pm_opp *opp;

-	opp = dev_pm_opp_find_freq_ceil(dev, freq);
+	opp = devfreq_recommended_opp(dev, freq, flags);
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);

-	if (!IS_ERR(opp)) {
+	if (gpu->funcs->gpu_set_freq)
+		gpu->funcs->gpu_set_freq(gpu, (u64)*freq);
+	else
 		clk_set_rate(gpu->core_clk, *freq);
-		dev_pm_opp_put(opp);
-	}
+
+	dev_pm_opp_put(opp);

 	return 0;
 }
@@ -50,16 +54,14 @@ static int msm_devfreq_get_dev_status(struct device *dev,
 		struct devfreq_dev_status *status)
 {
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
-	u64 cycles;
 	ktime_t time;

-	status->current_frequency = (unsigned long) clk_get_rate(gpu->core_clk);
-	gpu->funcs->gpu_busy(gpu, &cycles);
-
-	status->busy_time = (cycles - gpu->devfreq.busy_cycles) /
-		(status->current_frequency / 1000000);
+	if (gpu->funcs->gpu_get_freq)
+		status->current_frequency = gpu->funcs->gpu_get_freq(gpu);
+	else
+		status->current_frequency = clk_get_rate(gpu->core_clk);

-	gpu->devfreq.busy_cycles = cycles;
+	status->busy_time = gpu->funcs->gpu_busy(gpu);

 	time = ktime_get();
 	status->total_time = ktime_us_delta(time, gpu->devfreq.time);
@@ -72,7 +74,10 @@ static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
 {
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));

-	*freq = (unsigned long) clk_get_rate(gpu->core_clk);
+	if (gpu->funcs->gpu_get_freq)
+		*freq = gpu->funcs->gpu_get_freq(gpu);
+	else
+		*freq = clk_get_rate(gpu->core_clk);

 	return 0;
 }
@@ -87,7 +92,7 @@ static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
 static void msm_devfreq_init(struct msm_gpu *gpu)
 {
 	/* We need target support to do devfreq */
-	if (!gpu->funcs->gpu_busy || !gpu->core_clk)
+	if (!gpu->funcs->gpu_busy)
 		return;

 	msm_devfreq_profile.initial_freq = gpu->fast_rate;
@@ -185,6 +190,14 @@ static int disable_axi(struct msm_gpu *gpu)
 	return 0;
 }

+void msm_gpu_resume_devfreq(struct msm_gpu *gpu)
+{
+	gpu->devfreq.busy_cycles = 0;
+	gpu->devfreq.time = ktime_get();
+
+	devfreq_resume_device(gpu->devfreq.devfreq);
+}
+
 int msm_gpu_pm_resume(struct msm_gpu *gpu)
 {
 	int ret;
@@ -203,12 +216,7 @@ int msm_gpu_pm_resume(struct msm_gpu *gpu)
 	if (ret)
 		return ret;

-	if (gpu->devfreq.devfreq) {
-		gpu->devfreq.busy_cycles = 0;
-		gpu->devfreq.time = ktime_get();
-
-		devfreq_resume_device(gpu->devfreq.devfreq);
-	}
+	msm_gpu_resume_devfreq(gpu);

 	gpu->needs_hw_init = true;

@@ -221,8 +229,7 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)

 	DBG("%s", gpu->name);

-	if (gpu->devfreq.devfreq)
-		devfreq_suspend_device(gpu->devfreq.devfreq);
+	devfreq_suspend_device(gpu->devfreq.devfreq);

 	ret = disable_axi(gpu);
 	if (ret)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 2ae34e3..2446066 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -68,9 +68,11 @@ struct msm_gpu_funcs {
 	void (*show)(struct msm_gpu *gpu, struct msm_gpu_state *state,
 			struct drm_printer *p);
 #endif
-	int (*gpu_busy)(struct msm_gpu *gpu, uint64_t *value);
+	unsigned long (*gpu_busy)(struct msm_gpu *gpu);
 	struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu);
 	int (*gpu_state_put)(struct msm_gpu_state *state);
+	unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
+	int (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
 };

 struct msm_gpu {
@@ -262,6 +264,7 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)

 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);
+void msm_gpu_resume_devfreq(struct msm_gpu *gpu);

 int msm_gpu_hw_init(struct msm_gpu *gpu);

-- 1.9.1
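
The busy-time arithmetic that this patch moves into a5xx_gpu_busy() can be tried in isolation. The sketch below is a userspace model under stated assumptions, not the driver code: struct busy_state and busy_time_us() are hypothetical names, the counter is assumed to tick at the rate passed in, and the rate is assumed to be at least 1 MHz so the divisor is non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping, mirroring gpu->devfreq.busy_cycles. */
struct busy_state {
	uint64_t last_cycles;
};

/*
 * Convert the busy-cycle delta since the previous sample into microseconds.
 * rate_hz / 1000000 is the number of cycles per microsecond, so rate_hz
 * must be at least 1 MHz.
 */
static unsigned long busy_time_us(struct busy_state *st, uint64_t cycles,
				  unsigned long rate_hz)
{
	unsigned long cycles_per_us = rate_hz / 1000000;
	unsigned long busy;

	assert(cycles_per_us > 0);

	busy = (unsigned long)((cycles - st->last_cycles) / cycles_per_us);
	st->last_cycles = cycles;

	return busy;
}
```

For example, 3,000,000 busy cycles sampled at 300 MHz come out as 10,000 microseconds. Because the rate is truncated to whole megahertz before the division, a rate that is not a multiple of 1 MHz gives a slightly inflated estimate.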

Jordan Crouse

3:48 p.m.

New subject: [PATCH 4/5] drm/msm: re-factor devfreq code

On Thu, Aug 23, 2018 at 02:48:30PM +0530, Sharat Masetty wrote:

...

devfreq framework requires the drivers to provide busy time estimations.

It would help if you added an article to this sentence, i.e: "The devfreq framework..."

...

The GPU driver relies on the hardware performance counters for the busy time estimations, but different hardware revisions have counters which can be sourced from different clocks. So the busy time estimation will be target dependent. Additionally, on targets where the clocks are completely controlled by the on-chip microcontroller, fetching and setting the current GPU frequency will be different. This patch aims to embrace these differences by re-factoring the devfreq code a bit.

Other than that, the code looks good. A bit of churn, but for a good cause.

Reviewed-by: Jordan Crouse jcrouse@codeaurora.org

...

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 16 +++++++++---
 drivers/gpu/drm/msm/msm_gpu.c         | 49 ++++++++++++++++++++---------------
 drivers/gpu/drm/msm/msm_gpu.h         |  5 +++-
 3 files changed, 44 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 897f3e2..043e680 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1369,12 +1369,20 @@ static struct msm_ringbuffer *a5xx_active_ring(struct msm_gpu *gpu)
 	return a5xx_gpu->cur_ring;
 }

-static int a5xx_gpu_busy(struct msm_gpu *gpu, uint64_t *value)
+static unsigned long a5xx_gpu_busy(struct msm_gpu *gpu)
 {
-	*value = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
-		REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+	u64 busy_cycles;
+	unsigned long busy_time;

-	return 0;
+	busy_cycles = gpu_read64(gpu, REG_A5XX_RBBM_PERFCTR_RBBM_0_LO,
+			REG_A5XX_RBBM_PERFCTR_RBBM_0_HI);
+
+	busy_time = (busy_cycles - gpu->devfreq.busy_cycles) /
+		(clk_get_rate(gpu->core_clk) / 1000000);
+
+	gpu->devfreq.busy_cycles = busy_cycles;
+
+	return busy_time;
 }

 static const struct adreno_gpu_funcs funcs = {
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 83fd602..32269ef 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -36,12 +36,16 @@ static int msm_devfreq_target(struct device *dev, unsigned long *freq,
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
 	struct dev_pm_opp *opp;

-	opp = dev_pm_opp_find_freq_ceil(dev, freq);
+	opp = devfreq_recommended_opp(dev, freq, flags);
+	if (IS_ERR(opp))
+		return PTR_ERR(opp);

-	if (!IS_ERR(opp)) {
+	if (gpu->funcs->gpu_set_freq)
+		gpu->funcs->gpu_set_freq(gpu, (u64)*freq);
+	else
 		clk_set_rate(gpu->core_clk, *freq);
-		dev_pm_opp_put(opp);
-	}
+
+	dev_pm_opp_put(opp);

 	return 0;
 }
@@ -50,16 +54,14 @@ static int msm_devfreq_get_dev_status(struct device *dev,
 		struct devfreq_dev_status *status)
 {
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));
-	u64 cycles;
 	ktime_t time;

-	status->current_frequency = (unsigned long) clk_get_rate(gpu->core_clk);
-	gpu->funcs->gpu_busy(gpu, &cycles);
-
-	status->busy_time = (cycles - gpu->devfreq.busy_cycles) /
-		(status->current_frequency / 1000000);
+	if (gpu->funcs->gpu_get_freq)
+		status->current_frequency = gpu->funcs->gpu_get_freq(gpu);
+	else
+		status->current_frequency = clk_get_rate(gpu->core_clk);

-	gpu->devfreq.busy_cycles = cycles;
+	status->busy_time = gpu->funcs->gpu_busy(gpu);

 	time = ktime_get();
 	status->total_time = ktime_us_delta(time, gpu->devfreq.time);
@@ -72,7 +74,10 @@ static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
 {
 	struct msm_gpu *gpu = platform_get_drvdata(to_platform_device(dev));

-	*freq = (unsigned long) clk_get_rate(gpu->core_clk);
+	if (gpu->funcs->gpu_get_freq)
+		*freq = gpu->funcs->gpu_get_freq(gpu);
+	else
+		*freq = clk_get_rate(gpu->core_clk);

 	return 0;
 }
@@ -87,7 +92,7 @@ static int msm_devfreq_get_cur_freq(struct device *dev, unsigned long *freq)
 static void msm_devfreq_init(struct msm_gpu *gpu)
 {
 	/* We need target support to do devfreq */
-	if (!gpu->funcs->gpu_busy || !gpu->core_clk)
+	if (!gpu->funcs->gpu_busy)
 		return;

 	msm_devfreq_profile.initial_freq = gpu->fast_rate;
@@ -185,6 +190,14 @@ static int disable_axi(struct msm_gpu *gpu)
 	return 0;
 }

+void msm_gpu_resume_devfreq(struct msm_gpu *gpu)
+{
+	gpu->devfreq.busy_cycles = 0;
+	gpu->devfreq.time = ktime_get();
+
+	devfreq_resume_device(gpu->devfreq.devfreq);
+}
+
 int msm_gpu_pm_resume(struct msm_gpu *gpu)
 {
 	int ret;
@@ -203,12 +216,7 @@ int msm_gpu_pm_resume(struct msm_gpu *gpu)
 	if (ret)
 		return ret;

-	if (gpu->devfreq.devfreq) {
-		gpu->devfreq.busy_cycles = 0;
-		gpu->devfreq.time = ktime_get();
-
-		devfreq_resume_device(gpu->devfreq.devfreq);
-	}
+	msm_gpu_resume_devfreq(gpu);

 	gpu->needs_hw_init = true;

@@ -221,8 +229,7 @@ int msm_gpu_pm_suspend(struct msm_gpu *gpu)

 	DBG("%s", gpu->name);

-	if (gpu->devfreq.devfreq)
-		devfreq_suspend_device(gpu->devfreq.devfreq);
+	devfreq_suspend_device(gpu->devfreq.devfreq);

 	ret = disable_axi(gpu);
 	if (ret)
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 2ae34e3..2446066 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -68,9 +68,11 @@ struct msm_gpu_funcs {
 	void (*show)(struct msm_gpu *gpu, struct msm_gpu_state *state,
 			struct drm_printer *p);
 #endif
-	int (*gpu_busy)(struct msm_gpu *gpu, uint64_t *value);
+	unsigned long (*gpu_busy)(struct msm_gpu *gpu);
 	struct msm_gpu_state *(*gpu_state_get)(struct msm_gpu *gpu);
 	int (*gpu_state_put)(struct msm_gpu_state *state);
+	unsigned long (*gpu_get_freq)(struct msm_gpu *gpu);
+	int (*gpu_set_freq)(struct msm_gpu *gpu, unsigned long freq);
 };

 struct msm_gpu {
@@ -262,6 +264,7 @@ static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)

 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);
+void msm_gpu_resume_devfreq(struct msm_gpu *gpu);

 int msm_gpu_hw_init(struct msm_gpu *gpu);

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Sharat Masetty

9:18 a.m.

New subject: [PATCH 5/5] drm/msm/A6x: Add devfreq support in A6x

Implement routines to estimate GPU busy time and fetch the current frequency for the polling interval. This is required by the devfreq framework, which recommends a frequency change if needed. The driver code then tries to set this new frequency on the GPU by sending an Out Of Band (OOB) request.

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 +++++++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f6634c0..92ff48b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -67,8 +67,10 @@ static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
 		A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
 }

-static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
+static int __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
 {
+	int ret;
+
 	gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);

 	gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
@@ -84,7 +86,41 @@ static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
 	a6xx_gmu_set_oob(gmu, GMU_OOB_DCVS_SET);
 	a6xx_gmu_clear_oob(gmu, GMU_OOB_DCVS_SET);

-	return gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	ret = gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	if (!ret)
+		gmu->cur_freq = gmu->gpu_freqs[index];
+
+	return ret;
+}
+
+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+	u32 perf_index = 0;
+
+	if (freq == gmu->cur_freq)
+		return 0;
+
+	//TODO: Use a hashmap instead? This gets called potentially every ~10 ms
+	for (perf_index = 0; perf_index < gmu->nr_gpu_freqs; perf_index++)
+		if (freq == gmu->gpu_freqs[perf_index])
+			break;
+
+	if (perf_index == gmu->nr_gpu_freqs)
+		return -EINVAL;
+
+	return __a6xx_gmu_set_freq(gmu, perf_index);
+}
+
+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+
+	return gmu->cur_freq;
 }

 static bool a6xx_gmu_check_idle_level(struct a6xx_gmu *gmu)
@@ -629,8 +665,8 @@ int a6xx_gmu_reset(struct a6xx_gpu *a6xx_gpu)
 	if (!ret)
 		ret = a6xx_hfi_start(gmu, GMU_COLD_BOOT);

-	/* Set the GPU back to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	/* Save the current frequency for devfreq */
+	gmu->cur_freq = gmu->gpu_freqs[gmu->nr_gpu_freqs - 1];

 out:
 	if (ret)
@@ -671,7 +707,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 		ret = a6xx_hfi_start(gmu, status);

 	/* Set the GPU to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	__a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);

 out:
 	/* Make sure to turn off the boot OOB request on error */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index f9e4dfe..ce6e5ca 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -77,6 +77,8 @@ struct a6xx_gmu {
 	unsigned long gmu_freqs[4];
 	u32 cx_arc_votes[4];

+	unsigned long cur_freq;
+
 	struct a6xx_hfi_queue queues[2];

 	struct tasklet_struct hfi_tasklet;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3429d33a..af90706 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -7,6 +7,8 @@
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"

+#include <linux/devfreq.h>
+
 static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -682,6 +684,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)

 	gpu->needs_hw_init = true;

+	msm_gpu_resume_devfreq(gpu);
+
 	return ret;
 }

@@ -690,6 +694,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

+	devfreq_suspend_device(gpu->devfreq.devfreq);
+
 	/*
 	 * Make sure the GMU is idle before continuing (because some transitions
 	 * may use VBIF
@@ -753,6 +759,24 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 	kfree(a6xx_gpu);
 }

+static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	u64 busy_cycles;
+	unsigned long busy_time;
+
+	busy_cycles = gmu_read64(&a6xx_gpu->gmu,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
+
+	busy_time = ((busy_cycles - gpu->devfreq.busy_cycles) * 10) / 192;
+
+	gpu->devfreq.busy_cycles = busy_cycles;
+
+	return busy_time;
+}
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -768,6 +792,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
 		.show = a6xx_show,
 #endif
+		.gpu_busy = a6xx_gpu_busy,
+		.gpu_get_freq = a6xx_gmu_get_freq,
+		.gpu_set_freq = a6xx_gmu_set_freq,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 32c2501..f236767 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -56,5 +56,7 @@ struct a6xx_gpu {

 int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node, struct platform_device *gpu_pdev);
 void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq);
+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);

 #endif /* __A6XX_GPU_H__ */

-- 1.9.1
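
The constant in a6xx_gpu_busy() reads as a unit conversion: multiplying by 10 and dividing by 192 is the integer form of dividing by 19.2, which would turn ticks of a 19.2 MHz counter into microseconds. The 19.2 MHz rate is an inference from those constants and from the XOCLK register name; the patch does not state it. A minimal sketch under that assumption, with xo_cycles_to_us() as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed counter rate: 19.2 MHz, i.e. 19.2 cycles per microsecond.
 * The sketch does not model counter wrap or reset.
 */
static unsigned long xo_cycles_to_us(uint64_t prev, uint64_t now)
{
	assert(now >= prev);

	/* (delta * 10) / 192 == delta / 19.2, without floating point. */
	return (unsigned long)(((now - prev) * 10) / 192);
}
```

With these constants, one second of counting (19,200,000 ticks) converts to 1,000,000 microseconds, which is the unit devfreq compares against total_time in the refactored msm_devfreq_get_dev_status().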

Jordan Crouse

4 p.m.

New subject: [PATCH 5/5] drm/msm/A6x: Add devfreq support in A6x

On Thu, Aug 23, 2018 at 02:48:31PM +0530, Sharat Masetty wrote:

...

Implement routines to estimate GPU busy time and fetching the current frequency for the polling interval. This is required by the devfreq framework which recommends a frequency change if needed. The driver code then tries to set this new frequency on the GPU by sending an Out Of Band(OOB) request.

"sending an Out of Band (OOB) request _to the GMU_". Otherwise it is a little confusing as to who is doing what.

...

Signed-off-by: Sharat Masetty smasetty@codeaurora.org
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 +++++++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f6634c0..92ff48b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -67,8 +67,10 @@ static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
 		A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
 }

-static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
+static int __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
 {
+	int ret;

Should be a u32 since we are doing a gmu_read().

...

+
 	gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);

 	gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,
@@ -84,7 +86,41 @@ static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
 	a6xx_gmu_set_oob(gmu, GMU_OOB_DCVS_SET);
 	a6xx_gmu_clear_oob(gmu, GMU_OOB_DCVS_SET);

-	return gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	ret = gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	if (!ret)
+		gmu->cur_freq = gmu->gpu_freqs[index];

'ret' from the register read won't be an appropriate Unix error code so it should be translated - otherwise it will be confusing because 'a6xx_gmu_set_freq' otherwise returns 0 or valid error codes.
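
One way to read this suggestion, sketched in userspace: keep the raw register value in an unsigned 32-bit variable, update the cached frequency only on success, and hand the caller 0 or a negative errno. Everything here is hypothetical: struct mock_gmu and mock_set_freq_done() are names invented for the illustration, and -EINVAL is an assumed choice that the thread does not settle.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the GMU state touched by the frequency change. */
struct mock_gmu {
	unsigned long cur_freq;
	const unsigned long *gpu_freqs;
	unsigned int nr_gpu_freqs;
};

/*
 * Finish a frequency change given the raw status word read back from the
 * hardware. A non-zero status is translated to -EINVAL (an assumption)
 * instead of being returned as-is; the cached frequency is only updated
 * on success.
 */
static int mock_set_freq_done(struct mock_gmu *gmu, unsigned int index,
			      uint32_t status)
{
	assert(index < gmu->nr_gpu_freqs);

	if (status)
		return -EINVAL;

	gmu->cur_freq = gmu->gpu_freqs[index];
	return 0;
}
```

The frequency values used to exercise such a helper would be arbitrary; the point is only that callers never see a raw register value where they expect an errno.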

...

+
+	return ret;
+}
+
+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+	u32 perf_index = 0;
+
+	if (freq == gmu->cur_freq)
+		return 0;
+
+	//TODO: Use a hashmap instead? This gets called potentially every ~10 ms

Please don't use C++ style comments. A TODO is okay, but I would prefer if you solved this question. I'm not sure if walking a short list of 10 items is a big concern if it happens every 10ms or so.

...

+	for (perf_index = 0; perf_index < gmu->nr_gpu_freqs; perf_index++)
+		if (freq == gmu->gpu_freqs[perf_index])
+			break;

Are you positive we don't need to worry about rounding here - will devfreq *always* give you an exact frequency value? I know the clock subsystem allows for rounding. You might want to double check just to be sure that we don't need to worry about that here.

In particular, I would be concerned about the userspace governor for devfreq where the user can set anything they want. I'm not 100% sure that gets vetted against the OPP table before we get to this point.

...

+
+	if (perf_index == gmu->nr_gpu_freqs)
+		return -EINVAL;

Related to the previous comment slightly, if devfreq wants to set a frequency of a hundred million HZ is it an error or should we just clamp to the highest available frequency and call it good?
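
The two questions above (inexact requests and requests above the table) can be made concrete with a lookup that rounds up to the next table entry and clamps at the top. This is only a sketch of the clamping alternative, not what the patch does: freq_to_index_clamped() is a hypothetical helper, and it assumes the frequency table is sorted in ascending order, which is consistent with the patch treating the last entry as the highest frequency.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the lowest table entry that is >= freq. If freq is
 * above every entry, clamp to the last (highest) index. 'freqs' must be
 * sorted in ascending order and 'n' must be non-zero.
 */
static size_t freq_to_index_clamped(const unsigned long *freqs, size_t n,
				    unsigned long freq)
{
	size_t i;

	assert(freqs != NULL && n > 0);

	for (i = 0; i < n; i++)
		if (freqs[i] >= freq)
			return i;

	return n - 1;
}
```

With a table of {100, 200, 300}, a request for 150 maps to index 1 and a request for 1000 maps to index 2, so the exact-match -EINVAL path disappears. Whether silently clamping is preferable to reporting an error is the open question in this review.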

...

+
+	return __a6xx_gmu_set_freq(gmu, perf_index);
+}
+
+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+
+	return gmu->cur_freq;
 }

 static bool a6xx_gmu_check_idle_level(struct a6xx_gmu *gmu)
@@ -629,8 +665,8 @@ int a6xx_gmu_reset(struct a6xx_gpu *a6xx_gpu)
 	if (!ret)
 		ret = a6xx_hfi_start(gmu, GMU_COLD_BOOT);

-	/* Set the GPU back to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	/* Save the current frequency for devfreq */
+	gmu->cur_freq = gmu->gpu_freqs[gmu->nr_gpu_freqs - 1];

I'm not sure I understand this change - don't we need to set the GPU frequency immediately out of reset even if DCVS is expected to change it soon?

...


 out:
 	if (ret)
@@ -671,7 +707,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
 		ret = a6xx_hfi_start(gmu, status);

 	/* Set the GPU to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	__a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);

 out:
 	/* Make sure to turn off the boot OOB request on error */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index f9e4dfe..ce6e5ca 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -77,6 +77,8 @@ struct a6xx_gmu {
 	unsigned long gmu_freqs[4];
 	u32 cx_arc_votes[4];

+	unsigned long cur_freq;

This could just be 'freq'.

...

+
 	struct a6xx_hfi_queue queues[2];

 	struct tasklet_struct hfi_tasklet;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3429d33a..af90706 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -7,6 +7,8 @@
 #include "a6xx_gpu.h"
 #include "a6xx_gmu.xml.h"

+#include <linux/devfreq.h>
+
 static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -682,6 +684,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)

 	gpu->needs_hw_init = true;

+	msm_gpu_resume_devfreq(gpu);
+
 	return ret;
 }

@@ -690,6 +694,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

+	devfreq_suspend_device(gpu->devfreq.devfreq);
+
 	/*
 	 * Make sure the GMU is idle before continuing (because some transitions
 	 * may use VBIF
@@ -753,6 +759,24 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 	kfree(a6xx_gpu);
 }

+static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	u64 busy_cycles;
+	unsigned long busy_time;
+
+	busy_cycles = gmu_read64(&a6xx_gpu->gmu,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
+
+	busy_time = ((busy_cycles - gpu->devfreq.busy_cycles) * 10) / 192;
+
+	gpu->devfreq.busy_cycles = busy_cycles;
+
+	return busy_time;
+}
+
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
@@ -768,6 +792,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 #if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
 		.show = a6xx_show,
 #endif
+		.gpu_busy = a6xx_gpu_busy,
+		.gpu_get_freq = a6xx_gmu_get_freq,
+		.gpu_set_freq = a6xx_gmu_set_freq,
 	},
 	.get_timestamp = a6xx_get_timestamp,
 };
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 32c2501..f236767 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -56,5 +56,7 @@ struct a6xx_gpu {

 int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node, struct platform_device *gpu_pdev);
 void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq);
+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);

 #endif /* __A6XX_GPU_H__ */

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Sharat Masetty

24 Aug

9:54 a.m.

New subject: [Freedreno] [PATCH 5/5] drm/msm/A6x: Add devfreq support in A6x

On 8/23/2018 9:30 PM, Jordan Crouse wrote:

...

On Thu, Aug 23, 2018 at 02:48:31PM +0530, Sharat Masetty wrote:

...
Implement routines to estimate GPU busy time and fetching the current frequency for the polling interval. This is required by the devfreq framework which recommends a frequency change if needed. The driver code then tries to set this new frequency on the GPU by sending an Out Of Band(OOB) request.

"sending an Out of Band (OOB) request _to the GMU_". Otherwise it is a little confusing as to who is doing what.

...
Signed-off-by: Sharat Masetty smasetty@codeaurora.org

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 +++++++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f6634c0..92ff48b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -67,8 +67,10 @@ static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
		A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
}

-static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
+static int __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
{

int ret;

Should be a u32 since we are doing a gmu_read().

...
gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);

gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,

@@ -84,7 +86,41 @@ static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
	a6xx_gmu_set_oob(gmu, GMU_OOB_DCVS_SET);
	a6xx_gmu_clear_oob(gmu, GMU_OOB_DCVS_SET);

-	return gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	ret = gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+
+	if (!ret)
+		gmu->cur_freq = gmu->gpu_freqs[index];

'ret' from the register read won't be an appropriate Unix error code, so it should be translated. Otherwise it will be confusing, because 'a6xx_gmu_set_freq' otherwise returns 0 or valid error codes.
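
One possible shape for that translation, sketched in userspace C: hold the register value in an unsigned 32-bit variable and map any nonzero status to a negative errno before returning. The function name and the choice of -EINVAL for every failure are assumptions made for illustration; the thread does not say which errno the driver should use.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical translation of a raw DCVS status word into a Unix-style
 * return value: 0 on success, a negative errno otherwise. Mapping every
 * nonzero status to -EINVAL is an assumption made for this sketch.
 */
static int dcvs_status_to_errno(uint32_t status)
{
	return status ? -EINVAL : 0;
}
```

With a helper like this, callers only ever see 0 or a negative errno, which keeps the return convention consistent with the -EINVAL path in the public a6xx_gmu_set_freq().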

...
return ret;

+}

+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq) +{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);

struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

struct a6xx_gmu *gmu = &a6xx_gpu->gmu;

u32 perf_index = 0;

if (freq == gmu->cur_freq)
return 0;
//TODO: Use a hashmap instead? This gets called potentially every ~10 ms
Please don't use C++ style comments. A TODO is okay, but I would prefer if you solved this question. I'm not sure if walking a short list of 10 items is a big concern if it happens every 10ms or so.

Sure, I will take care of this...

...

...
for (perf_index = 0; perf_index < gmu->nr_gpu_freqs; perf_index++)
if (freq == gmu->gpu_freqs[perf_index])
	break;
Are you positive we don't need to worry about rounding here - will devfreq *always* give you an exact frequency value? I know the clock subsystem allows for rounding. You might want to double check just to be sure that we don't need to worry about that here.

In particular, I would be concerned about the userspace governor for devfreq where the user can set anything they want. I'm not 100% sure that gets vetted against the OPP table before we get to this point.

...
if (perf_index == gmu->nr_gpu_freqs)
return -EINVAL;
Related to the previous comment slightly: if devfreq wants to set a frequency of a hundred million Hz, is it an error or should we just clamp to the highest available frequency and call it good?

For this and the comment above, we use the devfreq_recommended_opp() function to get a proper OPP from our OPP list in the dt for the GPU device.
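
For context, devfreq_recommended_opp() with default flags resolves a request to the lowest available OPP at or above it, and falls back to the highest OPP when the request exceeds the table. The userspace model below mimics that rounding on a plain ascending array so the behavior is easy to see. It does not call the kernel API, and the function name and the sample frequencies in the usage note are hypothetical.

```c
#include <stddef.h>

/*
 * Userspace model (not the kernel API): pick the lowest entry in an
 * ascending table that is >= the request; if the request is above every
 * entry, fall back to the highest one. Assumes n > 0.
 */
static unsigned long model_recommended_freq(const unsigned long *table,
					    size_t n, unsigned long req)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i] >= req)
			return table[i];

	return table[n - 1];
}
```

For a made-up table of 180, 257, 430 and 710 MHz, a request of 300 MHz resolves to 430 MHz and a request of 900 MHz resolves to 710 MHz, so whatever reaches the set_freq callback is always an exact table entry.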

...

...

return __a6xx_gmu_set_freq(gmu, perf_index);

+}

+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu) +{

struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);

struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

struct a6xx_gmu *gmu = &a6xx_gpu->gmu;

return gmu->cur_freq;
}

static bool a6xx_gmu_check_idle_level(struct a6xx_gmu *gmu)

@@ -629,8 +665,8 @@ int a6xx_gmu_reset(struct a6xx_gpu *a6xx_gpu)
	if (!ret)
		ret = a6xx_hfi_start(gmu, GMU_COLD_BOOT);

-	/* Set the GPU back to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	/* Save the current frequency for devfreq */
+	gmu->cur_freq = gmu->gpu_freqs[gmu->nr_gpu_freqs - 1];

I'm not sure I understand this change - don't we need to set the GPU frequency immediately out of reset even if DCVS is expected to change it soon?

Oops, this somehow slipped my attention and should not be here. I will revert this. Thanks for the catch.

...

...
out:
	if (ret)
@@ -671,7 +707,7 @@ int a6xx_gmu_resume(struct a6xx_gpu *a6xx_gpu)
	ret = a6xx_hfi_start(gmu, status);

	/* Set the GPU to the highest power frequency */
-	a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);
+	__a6xx_gmu_set_freq(gmu, gmu->nr_gpu_freqs - 1);

out: /* Make sure to turn off the boot OOB request on error */

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index f9e4dfe..ce6e5ca 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -77,6 +77,8 @@ struct a6xx_gmu {
	unsigned long gmu_freqs[4];
	u32 cx_arc_votes[4];

unsigned long cur_freq;

This could just be 'freq'.

...
struct a6xx_hfi_queue queues[2];

struct tasklet_struct hfi_tasklet;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 3429d33a..af90706 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -7,6 +7,8 @@
#include "a6xx_gpu.h"
#include "a6xx_gmu.xml.h"

+#include <linux/devfreq.h>

static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
{
	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);

@@ -682,6 +684,8 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)

gpu->needs_hw_init = true;

msm_gpu_resume_devfreq(gpu);

return ret;
}

@@ -690,6 +694,8 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

devfreq_suspend_device(gpu->devfreq.devfreq);

	/*
	 * Make sure the GMU is idle before continuing (because some transitions
	 * may use VBIF

@@ -753,6 +759,24 @@ static void a6xx_destroy(struct msm_gpu *gpu)
	kfree(a6xx_gpu);
}

+static unsigned long a6xx_gpu_busy(struct msm_gpu *gpu)
+{
+	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
+	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	u64 busy_cycles;
+	unsigned long busy_time;
+
+	busy_cycles = gmu_read64(&a6xx_gpu->gmu,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_L,
+			REG_A6XX_GMU_CX_GMU_POWER_COUNTER_XOCLK_0_H);
+
+	busy_time = ((busy_cycles - gpu->devfreq.busy_cycles) * 10) / 192;
+
+	gpu->devfreq.busy_cycles = busy_cycles;
+
+	return busy_time;
+}

static const struct adreno_gpu_funcs funcs = {
	.base = {
		.get_param = adreno_get_param,

@@ -768,6 +792,9 @@ static void a6xx_destroy(struct msm_gpu *gpu)
#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_DEV_COREDUMP)
		.show = a6xx_show,
#endif
+		.gpu_busy = a6xx_gpu_busy,
+		.gpu_get_freq = a6xx_gmu_get_freq,
+		.gpu_set_freq = a6xx_gmu_set_freq,
	},
	.get_timestamp = a6xx_get_timestamp,
};
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 32c2501..f236767 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -56,5 +56,7 @@ struct a6xx_gpu {

int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node, struct platform_device *gpu_pdev);
void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu);
+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq);
+unsigned long a6xx_gmu_get_freq(struct msm_gpu *gpu);

#endif /* __A6XX_GPU_H__ */

1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Jordan Crouse

2:45 p.m.

New subject: [Freedreno] [PATCH 5/5] drm/msm/A6x: Add devfreq support in A6x

On Fri, Aug 24, 2018 at 03:24:04PM +0530, Sharat Masetty wrote:

...

On 8/23/2018 9:30 PM, Jordan Crouse wrote:

...
On Thu, Aug 23, 2018 at 02:48:31PM +0530, Sharat Masetty wrote:

...
Implement routines to estimate GPU busy time and fetching the current frequency for the polling interval. This is required by the devfreq framework which recommends a frequency change if needed. The driver code then tries to set this new frequency on the GPU by sending an Out Of Band(OOB) request.

"sending an Out of Band (OOB) request _to the GMU_". Otherwise it is a little confusing as to who is doing what.

...
Signed-off-by: Sharat Masetty smasetty@codeaurora.org

 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 +++++++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h |  2 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index f6634c0..92ff48b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -67,8 +67,10 @@ static bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu)
		A6XX_GMU_SPTPRAC_PWR_CLK_STATUS_GX_HM_CLK_OFF));
}

-static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
+static int __a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
{

int ret;

Should be a u32 since we are doing a gmu_read().

...
gmu_write(gmu, REG_A6XX_GMU_DCVS_ACK_OPTION, 0);

gmu_write(gmu, REG_A6XX_GMU_DCVS_PERF_SETTING,

@@ -84,7 +86,41 @@ static int a6xx_gmu_set_freq(struct a6xx_gmu *gmu, int index)
	a6xx_gmu_set_oob(gmu, GMU_OOB_DCVS_SET);
	a6xx_gmu_clear_oob(gmu, GMU_OOB_DCVS_SET);

-	return gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+	ret = gmu_read(gmu, REG_A6XX_GMU_DCVS_RETURN);
+
+	if (!ret)
+		gmu->cur_freq = gmu->gpu_freqs[index];

'ret' from the register read won't be an appropriate Unix error code, so it should be translated. Otherwise it will be confusing, because 'a6xx_gmu_set_freq' otherwise returns 0 or valid error codes.

...
return ret;

+}

+int a6xx_gmu_set_freq(struct msm_gpu *gpu, unsigned long freq) +{
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);

struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);

struct a6xx_gmu *gmu = &a6xx_gpu->gmu;

u32 perf_index = 0;

if (freq == gmu->cur_freq)
return 0;
//TODO: Use a hashmap instead? This gets called potentially every ~10 ms
Please don't use C++ style comments. A TODO is okay, but I would prefer if you solved this question. I'm not sure if walking a short list of 10 items is a big concern if it happens every 10ms or so.

Sure, I will take care of this...

...
...
for (perf_index = 0; perf_index < gmu->nr_gpu_freqs; perf_index++)
if (freq == gmu->gpu_freqs[perf_index])
	break;
Are you positive we don't need to worry about rounding here - will devfreq *always* give you an exact frequency value? I know the clock subsystem allows for rounding. You might want to double check just to be sure that we don't need to worry about that here.

In particular, I would be concerned about the userspace governor for devfreq where the user can set anything they want. I'm not 100% sure that gets vetted against the OPP table before we get to this point.

...
if (perf_index == gmu->nr_gpu_freqs)
return -EINVAL;
Related to the previous comment slightly: if devfreq wants to set a frequency of a hundred million Hz, is it an error or should we just clamp to the highest available frequency and call it good?

For this and the comment above, we use the devfreq_recommended_opp() function to get a proper OPP from our OPP list in the dt for the GPU device.

So if we are sure the incoming frequency is always valid then perf_index will always match and we know this if statement will never be true. So we should get rid of it.

If you are paranoid about the list being wrong, you could change the for loop so that it always defaults to the highest frequency and then remove the if statement:

-	for (perf_index = 0; perf_index < gmu->nr_gpu_freqs; perf_index++)
+	for (perf_index = 0; perf_index < gmu->nr_gpu_freqs - 1; perf_index++)
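
The effect of the shortened loop bound can be seen in a small userspace sketch: when nothing matches, the index comes to rest on the last slot, which in this driver's table is the highest frequency, so no separate error path is needed. The function name and the test values are hypothetical, and the sketch assumes a non-empty table.

```c
#include <stddef.h>

/*
 * Search all but the last entry; when nothing matches, the loop leaves
 * the index at n - 1, i.e. the last (highest) entry. Assumes n > 0.
 */
static size_t find_perf_index(const unsigned long *freqs, size_t n,
			      unsigned long freq)
{
	size_t i;

	for (i = 0; i < n - 1; i++)
		if (freq == freqs[i])
			break;

	return i;
}
```

An exact match on the last entry and a complete miss both return n - 1, which is the trade-off of this approach: an unexpected frequency is silently treated as a request for the top level instead of being reported.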

Jordan

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

Jordan Crouse

23 Aug

3:38 p.m.

On Thu, Aug 23, 2018 at 02:48:26PM +0530, Sharat Masetty wrote:

...

This patch series starts off with a few bug fixes in devfreq code, followed by refactoring the devfreq code needed for supporting different chipsets, and ends with adding devfreq support for A6x.

Just an aside: I'm a sucker for consistency, and I know the forms A6x and A6xx are used interchangeably downstream, but I feel like we should be more consistent upstream. I like the form a6xx since it matches the file names and functions and is technically more correct in terms of the actual core name.

...

Sharat Masetty (5):
  drm/msm: suspend devfreq on init
  drm/msm: unregister devfreq upon clean up
  drm/msm/A6x: Add gmu_read64() register read op
  drm/msm: re-factor devfreq code
  drm/msm/A6x: Add devfreq support in A6x

 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 16 ++++++++---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 46 ++++++++++++++++++++++++++----
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 15 ++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 ++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 drivers/gpu/drm/msm/msm_gpu.c         | 53 +++++++++++++++++++++--------------
 drivers/gpu/drm/msm/msm_gpu.h         |  5 +++-
 7 files changed, 133 insertions(+), 31 deletions(-)

-- 1.9.1

-- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

dri-devel@lists.freedesktop.org
