The "break-even" point for the two formulas is L==8, which is also what the code actually implements. [Incidentally, at that point one has Y=0.008856, not 0.08856].
Moreover, all the sources I can find say the linear factor is 903.3 rather than 902.3, which makes sense since then the formulas agree at L==8, both yielding the 0.008856 figure to four significant digits.
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk --- drivers/video/backlight/pwm_bl.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index 2201b8c78641..be36be1cacb7 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -155,8 +155,8 @@ static const struct backlight_ops pwm_backlight_ops = { * * The CIE 1931 lightness formula is what actually describes how we perceive * light: - * Y = (L* / 902.3) if L* ≤ 0.08856 - * Y = ((L* + 16) / 116)^3 if L* > 0.08856 + * Y = (L* / 903.3) if L* ≤ 8 + * Y = ((L* + 16) / 116)^3 if L* > 8 * * Where Y is the luminance, the amount of light coming out of the screen, and * is a number between 0.0 and 1.0; and L* is the lightness, how bright a human @@ -169,9 +169,15 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) { u64 retval;
+ /* + * @lightness is given as a number between 0 and 1, expressed + * as a fixed-point number in scale @scale. Convert to a + * percentage, still expressed as a fixed-point number, so the + * above formulas can be applied. + */ lightness *= 100; if (lightness <= (8 * scale)) { - retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9023); + retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033); } else { retval = int_pow((lightness + (16 * scale)) / 116, 3); retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
lightness*1000 is nowhere near overflowing 32 bits, so we can just use an ordinary 32/32 division, which is much cheaper than the 64/32 done via do_div().
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk --- drivers/video/backlight/pwm_bl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index be36be1cacb7..9252d51f31b9 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -177,7 +177,7 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) */ lightness *= 100; if (lightness <= (8 * scale)) { - retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033); + retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else { retval = int_pow((lightness + (16 * scale)) / 116, 3); retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
On Thu, Sep 19, 2019 at 04:06:17PM +0200, Rasmus Villemoes wrote:
lightness*1000 is nowhere near overflowing 32 bits, so we can just use an ordinary 32/32 division, which is much cheaper than the 64/32 done via do_div().
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk
Reviewed-by: Daniel Thompson daniel.thompson@linaro.org
drivers/video/backlight/pwm_bl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index be36be1cacb7..9252d51f31b9 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -177,7 +177,7 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) */ lightness *= 100; if (lightness <= (8 * scale)) {
retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033);
} else { retval = int_pow((lightness + (16 * scale)) / 116, 3); retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));retval = DIV_ROUND_CLOSEST(lightness * 10, 9033);
-- 2.20.1
The scheduler uses a (currently private) fixed_power_int() in its load average computation for computing powers of numbers 0 < x < 1 expressed as fixed-point numbers, which is also what we want here. But that requires the scale to be a power-of-2.
We could (and a following patch will) change to use a power-of-2 scale, but for a fixed small exponent of 3, there's no advantage in using repeated squaring.
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk --- drivers/video/backlight/pwm_bl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index 9252d51f31b9..aee6839e024a 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -179,7 +179,8 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) if (lightness <= (8 * scale)) { retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else { - retval = int_pow((lightness + (16 * scale)) / 116, 3); + retval = (lightness + (16 * scale)) / 116; + retval *= retval * retval; retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale)); }
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
The scheduler uses a (currently private) fixed_power_int() in its load average computation for computing powers of numbers 0 < x < 1 expressed as fixed-point numbers, which is also what we want here. But that requires the scale to be a power-of-2.
It feels like there is some rationale missing in the description here.
What is the benefit of replacing the explicit int_pow() with the implicit multiplications?
Daniel.
We could (and a following patch will) change to use a power-of-2 scale, but for a fixed small exponent of 3, there's no advantage in using repeated squaring.
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk
drivers/video/backlight/pwm_bl.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index 9252d51f31b9..aee6839e024a 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -179,7 +179,8 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) if (lightness <= (8 * scale)) { retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else {
retval = int_pow((lightness + (16 * scale)) / 116, 3);
retval = (lightness + (16 * scale)) / 116;
retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale)); }retval *= retval * retval;
-- 2.20.1
On 07/10/2019 17.28, Daniel Thompson wrote:
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
It feels like there is some rationale missing in the description here.
What is the benefit of replacing the explicit int_pow() with the implicit multiplications?
Daniel.
We could (and a following patch will) change to use a power-of-2 scale, but for a fixed small exponent of 3, there's no advantage in using repeated squaring.
^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Apart from the function call overhead (and resulting register pressure etc.), using int_pow is less efficient (for an exponent of 3, it ends up doing four 64x64 multiplications instead of just two). But feel free to drop it, I'm not going to pursue it further - it just seemed like a sensible thing to do while I was optimizing the code anyway.
[At the time I wrote the patch, this was also the only user of int_pow in the tree, so it also allowed removing int_pow altogether.]
Rasmus
On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote:
On 07/10/2019 17.28, Daniel Thompson wrote:
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
It feels like there is some rationale missing in the description here.
What is the benefit of replacing the explicit int_pow() with the implicit multiplications?
Daniel.
We could (and a following patch will) change to use a power-of-2 scale, but for a fixed small exponent of 3, there's no advantage in using repeated squaring.
^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Apart from the function call overhead (and resulting register pressure etc.), using int_pow is less efficient (for an exponent of 3, it ends up doing four 64x64 multiplications instead of just two). But feel free to drop it, I'm not going to pursue it further - it just seemed like a sensible thing to do while I was optimizing the code anyway.
[At the time I wrote the patch, this was also the only user of int_pow in the tree, so it also allowed removing int_pow altogether.]
To be honest the change is fine but the patch description doesn't make sense if the only current purpose of the patch is as a optimization.
Daniel.
On 08/10/2019 11.31, Daniel Thompson wrote:
On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote:
On 07/10/2019 17.28, Daniel Thompson wrote:
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
It feels like there is some rationale missing in the description here.
Apart from the function call overhead (and resulting register pressure etc.), using int_pow is less efficient (for an exponent of 3, it ends up doing four 64x64 multiplications instead of just two). But feel free to drop it, I'm not going to pursue it further - it just seemed like a sensible thing to do while I was optimizing the code anyway.
[At the time I wrote the patch, this was also the only user of int_pow in the tree, so it also allowed removing int_pow altogether.]
To be honest the change is fine but the patch description doesn't make sense if the only current purpose of the patch is as a optimization.
Agreed. Do you want me to resend the series with patch 3 updated to read
"For a fixed small exponent of 3, it is more efficient to simply use two explicit multiplications rather than calling the int_pow() library function: Aside from the function call overhead, its implementation using repeated squaring means it ends up doing four 64x64 multiplications."
(and obviously patch 5 dropped)?
Rasmus
On Tue, Oct 08, 2019 at 12:02:07PM +0200, Rasmus Villemoes wrote:
On 08/10/2019 11.31, Daniel Thompson wrote:
On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote:
On 07/10/2019 17.28, Daniel Thompson wrote:
On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
It feels like there is some rationale missing in the description here.
Apart from the function call overhead (and resulting register pressure etc.), using int_pow is less efficient (for an exponent of 3, it ends up doing four 64x64 multiplications instead of just two). But feel free to drop it, I'm not going to pursue it further - it just seemed like a sensible thing to do while I was optimizing the code anyway.
[At the time I wrote the patch, this was also the only user of int_pow in the tree, so it also allowed removing int_pow altogether.]
To be honest the change is fine but the patch description doesn't make sense if the only current purpose of the patch is as a optimization.
Agreed. Do you want me to resend the series with patch 3 updated to read
"For a fixed small exponent of 3, it is more efficient to simply use two explicit multiplications rather than calling the int_pow() library function: Aside from the function call overhead, its implementation using repeated squaring means it ends up doing four 64x64 multiplications."
(and obviously patch 5 dropped)?
Yes, please.
When you resend you can add my R-B: to all patches:
Reviewed-by: Daniel Thompson daniel.thompson@linaro.org
Daniel.
PS Don't mind either way but I wondered the following is clearer than the slightly funky multiply-and-assign expression (which isn't wrong but isn't very common either so my brain won't speed read it):
retval = DIV_ROUND_CLOSEST_ULL(retval * retval * retval, scale * scale);
Using a power-of-2 instead of power-of-10 base makes the computations much cheaper. 2^16 is safe; retval never becomes more than 2^48 + 2^16/2. On a 32 bit platform, the very expensive 64/32 division at the end of cie1931() instead becomes essentially free (a shift by 32 is just a register rename).
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk --- drivers/video/backlight/pwm_bl.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index aee6839e024a..102bc191310f 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -148,7 +148,8 @@ static const struct backlight_ops pwm_backlight_ops = { };
#ifdef CONFIG_OF -#define PWM_LUMINANCE_SCALE 10000 /* luminance scale */ +#define PWM_LUMINANCE_SHIFT 16 +#define PWM_LUMINANCE_SCALE (1 << PWM_LUMINANCE_SHIFT) /* luminance scale */
/* * CIE lightness to PWM conversion. @@ -165,23 +166,25 @@ static const struct backlight_ops pwm_backlight_ops = { * The following function does the fixed point maths needed to implement the * above formula. */ -static u64 cie1931(unsigned int lightness, unsigned int scale) +static u64 cie1931(unsigned int lightness) { u64 retval;
/* * @lightness is given as a number between 0 and 1, expressed - * as a fixed-point number in scale @scale. Convert to a - * percentage, still expressed as a fixed-point number, so the - * above formulas can be applied. + * as a fixed-point number in scale + * PWM_LUMINANCE_SCALE. Convert to a percentage, still + * expressed as a fixed-point number, so the above formulas + * can be applied. */ lightness *= 100; - if (lightness <= (8 * scale)) { + if (lightness <= (8 * PWM_LUMINANCE_SCALE)) { retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else { - retval = (lightness + (16 * scale)) / 116; + retval = (lightness + (16 * PWM_LUMINANCE_SCALE)) / 116; retval *= retval * retval; - retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale)); + retval += PWM_LUMINANCE_SCALE/2; + retval >>= 2*PWM_LUMINANCE_SHIFT; }
return retval; @@ -215,8 +218,7 @@ int pwm_backlight_brightness_default(struct device *dev, /* Fill the table using the cie1931 algorithm */ for (i = 0; i < data->max_brightness; i++) { retval = cie1931((i * PWM_LUMINANCE_SCALE) / - data->max_brightness, PWM_LUMINANCE_SCALE) * - period; + data->max_brightness) * period; retval = DIV_ROUND_CLOSEST_ULL(retval, PWM_LUMINANCE_SCALE); if (retval > UINT_MAX) return -EINVAL;
On Thu, Sep 19, 2019 at 04:06:19PM +0200, Rasmus Villemoes wrote:
Using a power-of-2 instead of power-of-10 base makes the computations much cheaper. 2^16 is safe; retval never becomes more than 2^48 + 2^16/2. On a 32 bit platform, the very expensive 64/32 division at the end of cie1931() instead becomes essentially free (a shift by 32 is just a register rename).
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk
Reviewed-by: Daniel Thompson daniel.thompson@linaro.org
drivers/video/backlight/pwm_bl.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index aee6839e024a..102bc191310f 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -148,7 +148,8 @@ static const struct backlight_ops pwm_backlight_ops = { };
#ifdef CONFIG_OF -#define PWM_LUMINANCE_SCALE 10000 /* luminance scale */ +#define PWM_LUMINANCE_SHIFT 16 +#define PWM_LUMINANCE_SCALE (1 << PWM_LUMINANCE_SHIFT) /* luminance scale */
/*
- CIE lightness to PWM conversion.
@@ -165,23 +166,25 @@ static const struct backlight_ops pwm_backlight_ops = {
- The following function does the fixed point maths needed to implement the
- above formula.
*/ -static u64 cie1931(unsigned int lightness, unsigned int scale) +static u64 cie1931(unsigned int lightness) { u64 retval;
/* * @lightness is given as a number between 0 and 1, expressed
* as a fixed-point number in scale @scale. Convert to a
* percentage, still expressed as a fixed-point number, so the
* above formulas can be applied.
* as a fixed-point number in scale
* PWM_LUMINANCE_SCALE. Convert to a percentage, still
* expressed as a fixed-point number, so the above formulas
*/ lightness *= 100;* can be applied.
- if (lightness <= (8 * scale)) {
- if (lightness <= (8 * PWM_LUMINANCE_SCALE)) { retval = DIV_ROUND_CLOSEST(lightness * 10, 9033); } else {
retval = (lightness + (16 * scale)) / 116;
retval *= retval * retval;retval = (lightness + (16 * PWM_LUMINANCE_SCALE)) / 116;
retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
retval += PWM_LUMINANCE_SCALE/2;
retval >>= 2*PWM_LUMINANCE_SHIFT;
}
return retval;
@@ -215,8 +218,7 @@ int pwm_backlight_brightness_default(struct device *dev, /* Fill the table using the cie1931 algorithm */ for (i = 0; i < data->max_brightness; i++) { retval = cie1931((i * PWM_LUMINANCE_SCALE) /
data->max_brightness, PWM_LUMINANCE_SCALE) *
period;
retval = DIV_ROUND_CLOSEST_ULL(retval, PWM_LUMINANCE_SCALE); if (retval > UINT_MAX) return -EINVAL;data->max_brightness) * period;
-- 2.20.1
Hi Rasmus
Has something gone wrong with the mail out for this patch set. I didn't get a covering letter or patch 5/5?
Daniel.
On 20/09/2019 12.14, Daniel Thompson wrote:
Hi Rasmus
Has something gone wrong with the mail out for this patch set. I didn't get a covering letter or patch 5/5?
Sorry about that. I should have included a cover letter so you'd know that patch 5 wasn't directly related to the other patches.
https://lore.kernel.org/lkml/20190919140620.32407-5-linux@rasmusvillemoes.dk...
I was removing the now unused int_pow() function, but Andy has told me there are new users in -next, so it's moot. Only the first four patches are relevant.
Thanks, Rasmus
On Thu, Sep 19, 2019 at 04:06:16PM +0200, Rasmus Villemoes wrote:
The "break-even" point for the two formulas is L==8, which is also what the code actually implements. [Incidentally, at that point one has Y=0.008856, not 0.08856].
Moreover, all the sources I can find say the linear factor is 903.3 rather than 902.3, which makes sense since then the formulas agree at L==8, both yielding the 0.008856 figure to four significant digits.
Indeed. Interestingly the following doc (with a high search rank in Google) has exactly this inconsistency and uses different values at different times: http://www.photonstophotos.net/GeneralTopics/Exposure/Psychometric_Lightness...
Signed-off-by: Rasmus Villemoes linux@rasmusvillemoes.dk
Reviewed-by: Daniel Thompson daniel.thompson@linaro.org
drivers/video/backlight/pwm_bl.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c index 2201b8c78641..be36be1cacb7 100644 --- a/drivers/video/backlight/pwm_bl.c +++ b/drivers/video/backlight/pwm_bl.c @@ -155,8 +155,8 @@ static const struct backlight_ops pwm_backlight_ops = {
- The CIE 1931 lightness formula is what actually describes how we perceive
- light:
Y = (L* / 902.3) if L* ≤ 0.08856
Y = ((L* + 16) / 116)^3 if L* > 0.08856
Y = (L* / 903.3) if L* ≤ 8
Y = ((L* + 16) / 116)^3 if L* > 8
- Where Y is the luminance, the amount of light coming out of the screen, and
- is a number between 0.0 and 1.0; and L* is the lightness, how bright a human
@@ -169,9 +169,15 @@ static u64 cie1931(unsigned int lightness, unsigned int scale) { u64 retval;
- /*
* @lightness is given as a number between 0 and 1, expressed
* as a fixed-point number in scale @scale. Convert to a
* percentage, still expressed as a fixed-point number, so the
* above formulas can be applied.
lightness *= 100; if (lightness <= (8 * scale)) {*/
retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9023);
} else { retval = int_pow((lightness + (16 * scale)) / 116, 3); retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033);
-- 2.20.1
dri-devel@lists.freedesktop.org