On Thu, Jun 04, 2015 at 07:12:31PM +0530, Kausal Malladi wrote:
From: Kausal Malladi &lt;Kausal.Malladi@intel.com&gt;
This patch set adds a color manager implementation in the drm/i915 layer. Color Manager is an extension in the i915 driver to support color correction/enhancement. Various Intel platforms support several color correction capabilities. Color Manager provides an abstraction of these properties and allows a user space UI agent to correct/enhance the display.
So I did a first rough pass on the API itself. The big question that isn't solved at the moment is: do we want to try to do generic KMS properties for pre-LUT + matrix + post-LUT or not. "Generic" has 3 levels:
1/ Generic for all KMS drivers
2/ Generic for i915-supported platforms
3/ Specific to each platform
At this point, I'm quite tempted to say we should give 1/ a shot. We should be able to have pre-LUT + matrix + post-LUT on CRTC objects and guarantee that, when drivers expose such properties, user space can at least supply an 8-bit LUT + 3x3 matrix + 8-bit LUT.
It may be possible to use the "try" version of the atomic ioctl to explore the space of possibilities from a generic user space to use bigger LUTs as well. A HAL layer (which is already there in some but not all OSes) would still be able to use those generic properties to load "precision optimized" LUTs with some knowledge of the hardware.
Option 3/ is, IMHO, a no-go; we should really try hard to limit the work we need to do per platform, which means defining a common format for the values we give to the kernel. As stated in various places, 16.16 seems the format of choice, even for the LUTs, as we have wide-gamut support in some of the LUTs where we can map values > 1.0 to other values > 1.0.
Another thing: the documentation of the interface needs to be a bit more crisp. For instance, we don't currently define the order in which the CSC and LUT transforms of this patch set are applied. Is this a de-gamma LUT, so the CSC runs in linear space? But then the output to the display is linear, oops. So it must be a post-CSC LUT, but then we don't de-gamma sRGB (sRGB isn't technically a single power curve, but details, details) before applying a linear transform. With this interface, then, we have to require that the fbs are linear, losing dynamic range. I'm sure later patches would expose more properties, but as a stand-alone patch set it would seem we can't do anything useful?
On Tue, Jun 09, 2015 at 01:50:48PM +0100, Damien Lespiau wrote:
Yeah, imo 1/ should be doable. For the matrix we should be able to be fully generic with a 16.16 format. For gamma, one option would be to have an enum property listing all the supported gamma table formats, of which 8-bit 256-entry (the current standard) would be one. This enum space would need to be drm-wide ofc. Then the gamma blob would just contain the table. This way we can allow funky stuff like the 1025th entry for 1.0+ values some Intel tables have, and similar things.
Wrt pre/post and plane/crtc, I guess we'd just add the properties to all the objects where they're possible on a given platform, and then the driver must check if there are constraints (e.g. post-LUT gamma only on 1 plane or the CRTC or similar stuff).
Also there's the legacy gamma ioctl. That should forward to the CRTC gamma (and there probably pick the post-LUT, and the pre-LUT only if there's no post-LUT). For names I'd suggest "pre-gamma-type", "pre-gamma-data", "post-gamma-type" and "post-gamma-data", but I don't care terribly much about them.
-Daniel
--
Damien
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
+Susanta/Shashank
How does this review from Daniel sound to you guys? I know you've asked for the Display team to review the latest design doc before you start the external communication, and there's been some discussion below...
Thanks.
Annie Matheson
Intel Corporation
Phone: (503) 712-0586
Email: annie.j.matheson@intel.com
-----Original Message-----
From: Daniel Vetter [mailto:daniel.vetter@ffwll.ch] On Behalf Of Daniel Vetter
Sent: Sunday, June 14, 2015 11:53 PM
To: Lespiau, Damien
Cc: Malladi, Kausal; Matheson, Annie J; R, Dhanya p; intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Purushothaman, Vijay A; Barnes, Jesse; Vetter, Daniel
Subject: Re: [Intel-gfx] [PATCH v2 00/10] Color Manager Implementation
Hi Annie,
I missed this comment (I was missing from the to/cc list, so my filter did the trick :)). Overall, the comments from Daniel look good and are pretty much aligned with what we are trying to do (1. Generic for all KMS drivers).
If you check the latest design, we are also planning to expose a read-only property to userspace that lists all the color capabilities. It's similar to what Danvet suggested; we are just doing it for all color properties instead of just for gamma. Userspace can read this blob property, extract all the color capabilities of the platform (gamma, CSC, degamma), and decide which ones to opt for.
I would again recommend everyone go through the document (the "how to query/get/set" section alone would be enough), and let us know if we need a change at this level.
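[Editorial note: a minimal sketch of what such a read-only capability blob might contain; the struct and field names below are invented for illustration, not the layout from the design document.]

```c
/* Hypothetical layout for a read-only "color capabilities" blob:
 * userspace reads it once and decides which features to program. */
#include <stdint.h>

struct color_caps {
    uint32_t num_gamma_entries;    /* 0 = gamma LUT not supported */
    uint32_t num_degamma_entries;  /* 0 = degamma LUT not supported */
    uint32_t csc_rows, csc_cols;   /* 0x0 = no CSC block */
};

static int has_csc(const struct color_caps *c)
{
    return c->csc_rows != 0 && c->csc_cols != 0;
}
```

A single blob like this lets a generic client discover gamma/CSC/degamma support without per-platform knowledge, which is the intent described above.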
Regards,
Shashank

-----Original Message-----
From: Matheson, Annie J
Sent: Tuesday, June 16, 2015 2:00 AM
To: Daniel Vetter; Lespiau, Damien; Sharma, Shashank; Bhattacharjee, Susanta
Cc: Malladi, Kausal; R, Dhanya p; intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Purushothaman, Vijay A; Barnes, Jesse; Vetter, Daniel
Subject: RE: [Intel-gfx] [PATCH v2 00/10] Color Manager Implementation
Jesse, Daniel:
Can you take a look at this and let us know your thoughts please?
Thanks.
Annie Matheson
Intel Corporation
Phone: (503) 712-0586
Email: annie.j.matheson@intel.com
-----Original Message-----
From: Sharma, Shashank
Sent: Monday, June 15, 2015 8:12 PM
To: Matheson, Annie J; Daniel Vetter; Lespiau, Damien; Bhattacharjee, Susanta
Cc: Malladi, Kausal; R, Dhanya p; intel-gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; Purushothaman, Vijay A; Barnes, Jesse; Vetter, Daniel
Subject: RE: [Intel-gfx] [PATCH v2 00/10] Color Manager Implementation
On 06/15/2015 08:53 AM, Daniel Vetter wrote:
Yeah, imo 1/ should be doable. For the matrix we should be able to be fully generic with a 16.16 format.
I know I am late replying, apologies for that.
I've been working on CSC support for V4L2 as well (still a work in progress) and I would like to at least end up with the same low-level fixed-point format as DRM, so we can share matrix/vector calculations.

Based on my experience I have concerns about the 16.16 format: the precision is quite low, which can be a problem when such values are used in matrix multiplications.

In addition, while the precision may be sufficient for 8-bit color component values, I'm pretty sure it will be insufficient when dealing with 12- or 16-bit color components.

In earlier versions of my CSC code I used a 12.20 format, but in the latest I switched to 32.32. This fits nicely in a u64, and it's easy to extract the integer and fractional parts.

If this is going to be a generic and future-proof API, then my suggestion would be to increase the precision of the underlying data type.
Regards,
Hans
On Mon, Jul 13, 2015 at 10:29:32AM +0200, Hans Verkuil wrote:
If this is going to be a generic and future proof API, then my suggestion would be to increase the precision of the underlying data type.
We discussed this a bit more internally and figured it would be nice to have the same fixed point for both CSC matrix and LUT/gamma tables. Current consensus seems to be to go with 8.24 for both. Since LUTs are fairly big I think it makes sense if we try to be not too wasteful (while still future-proof ofc).
But yeah, agreeing on the underlying layout would be good so that we could share in-kernel code. We're aiming to not have any LUT interpolation in the kernel (just dropping samples at most, if e.g. the hw table doesn't have linear sample positions). But with the LUT we might need to multiply it with an in-kernel one (we need the CSC unit on some platforms to compress the color output range for HDMI). And maybe compress the LUTs too.
-Daniel
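[Editorial note: a minimal sketch of what the proposed S7.24 arithmetic and the in-kernel limited-range compression for HDMI might look like. The helper names are invented; the 16..235 scale/offset is the standard CEA limited range, used here only as an example of composing a user value with an in-kernel transform.]

```c
/* S7.24 fixed point: 1 sign bit, 7 integer bits, 24 fractional bits. */
#include <stdint.h>

#define S7_24_ONE (1 << 24)

/* Multiply two S7.24 values; widen to 64 bits to avoid overflow. */
static int32_t s7_24_mul(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 24);
}

/* Compress full range 0..1 to CEA limited range 16/255..235/255,
 * i.e. scale by 219/255 and add 16/255, all in S7.24. */
static int32_t limited_range(int32_t v)
{
    const int32_t scale  = (int32_t)(((int64_t)219 << 24) / 255);
    const int32_t offset = (int32_t)(((int64_t)16 << 24) / 255);
    return s7_24_mul(v, scale) + offset;
}
```

Folding this scale into each entry of a user-supplied LUT is one way the kernel could "multiply it with an in-kernel one" without exposing the compression to userspace.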
On 07/13/2015 11:18 AM, Daniel Vetter wrote:
On Mon, Jul 13, 2015 at 10:29:32AM +0200, Hans Verkuil wrote:
On 06/15/2015 08:53 AM, Daniel Vetter wrote:
On Tue, Jun 09, 2015 at 01:50:48PM +0100, Damien Lespiau wrote:
On Thu, Jun 04, 2015 at 07:12:31PM +0530, Kausal Malladi wrote:
From: Kausal Malladi Kausal.Malladi@intel.com
This patch set adds color manager implementation in drm/i915 layer. Color Manager is an extension in i915 driver to support color correction/enhancement. Various Intel platforms support several color correction capabilities. Color Manager provides abstraction of these properties and allows a user space UI agent to correct/enhance the display.
So I did a first rough pass on the API itself. The big question that isn't solved at the moment is: do we want to try to do generic KMS properties for pre-LUT + matrix + post-LUT or not. "Generic" has 3 levels:
1/ Generic for all KMS drivers 2/ Generic for i915 supported platfoms 3/ Specific to each platform
At this point, I'm quite tempted to say we should give 1/ a shot. We should be able to have pre-LUT + matrix + post-LUT on CRTC objects and guarantee that, when the drivers expose such properties, user space can at least give 8 bits LUT + 3x3 matrix + 8 bits LUT.
It may be possible to use the "try" version of the atomic ioctl to explore the space of possibilities from a generic user space to use bigger LUTs as well. A HAL layer (which is already there in some but not all OSes) would still be able to use those generic properties to load "precision optimized" LUTs with some knowledge of the hardware.
Yeah, imo 1/ should be doable. For the matrix we should be able to be fully generic with a 16.16 format. For gamma one option would be to have
I know I am late replying, apologies for that.
I've been working on CSC support for V4L2 as well (still work in progress) and I would like to at least end up with the same low-level fixed point format as DRM so we can share matrix/vector calculations.
Based on my experiences I have concerns about the 16.16 format: the precision is quite low which can be a problem when such values are used in matrix multiplications.
In addition, while the precision may be sufficient for 8 bit color component values, I'm pretty sure it will be insufficient when dealing with 12 or 16 bit color components.
In earlier versions of my CSC code I used a 12.20 format, but in the latest I switched to 32.32. This fits nicely in a u64 and it's easy to extract the integer and fractional parts.
If this is going to be a generic and future proof API, then my suggestion would be to increase the precision of the underlying data type.
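[Editor's note: the 32.32-in-a-u64 layout Hans describes is easy to sketch. A minimal illustration follows; all helper names are hypothetical and not taken from any posted patch, and a real multiply would use a 128-bit intermediate.]

```c
#include <stdint.h>

/* Signed 32.32 fixed point in a 64-bit word, as described above:
 * integer part in the high 32 bits, fraction in the low 32 bits.
 * Names are hypothetical, for illustration only. */
typedef int64_t fp32_32;

static inline fp32_32 fp_from_int(int32_t i)
{
	return (fp32_32)i << 32;
}

static inline int32_t fp_int_part(fp32_32 v)
{
	return (int32_t)(v >> 32);	/* arithmetic shift keeps the sign */
}

static inline uint32_t fp_frac_part(fp32_32 v)
{
	return (uint32_t)v;
}

/* Multiply by pre-shifting both operands 16 bits; a real implementation
 * would use a 128-bit intermediate to avoid dropping fraction bits. */
static inline fp32_32 fp_mul(fp32_32 a, fp32_32 b)
{
	return (a >> 16) * (b >> 16);
}
```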
We discussed this a bit more internally and figured it would be nice to have the same fixed point for both CSC matrix and LUT/gamma tables. Current consensus seems to be to go with 8.24 for both. Since LUTs are fairly big I think it makes sense if we try to be not too wasteful (while still future-proof ofc).
The .24 should have enough precision, but I am worried about the 8: while this works for 8 bit components, you can't use it to represent values > 255, which might be needed (now or in the future) for 10, 12 or 16 bit color components.
It's why I ended up with 32.32: it's very generic so usable for other things besides CSC.
Note that 8.24 is really 7.24 + one sign bit. So 255 can't be represented in this format.
That said, all values I'm working with in my current code are small integers (say between -4 and 4 worst case), so 8.24 would work. But I am not at all confident that this is future proof. My gut feeling is that you need to be able to represent at least the max component value + a sign bit + 7 decimals precision. Which makes 17.24.
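[Editor's note: the range limit is easy to verify. A toy check, not kernel code: in S7.24 the largest representable value is INT32_MAX / 2^24, just under 128, so raw 10/12/16-bit code values overflow unless rescaled first.]

```c
#include <stdint.h>

/* S7.24: 1 sign bit, 7 integer bits, 24 fraction bits in an s32.
 * Largest representable value is INT32_MAX / 2^24 ~= 127.99999994. */
#define S7_24_ONE (1 << 24)

static int fits_in_s7_24(int64_t integer_value)
{
	return integer_value * S7_24_ONE <= INT32_MAX &&
	       integer_value * S7_24_ONE >= INT32_MIN;
}
```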
Regards,
Hans
On Mon, Jul 13, 2015 at 11:43:31AM +0200, Hans Verkuil wrote:
The idea is to steal from GL and always normalize everything to [0.0, 1.0], irrespective of the source color format. We need that in drm since if you blend together planes with different formats it's completely undefined which one you should pick. 8 bits of precision for values out of range should be enough ;-)
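[Editor's note: normalizing an N-bit component into [0.0, 1.0] with 24 fraction bits, as in GL, would look roughly like this. A sketch only; a real UAPI would have to pin down the rounding rule.]

```c
#include <stdint.h>

/* Map an N-bit component to [0, 1 << 24]: component / (2^bits - 1),
 * truncating. Illustrative only. */
static uint32_t normalize_to_0_24(uint32_t component, unsigned int bits)
{
	return (uint32_t)(((uint64_t)component << 24) / ((1u << bits) - 1));
}
```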
Oh and we might need those since for CSC and at least some LUTs you can do this. It's probably needed if your destination color space is much smaller than the source and you need to expand it. Will result in some clamping ofc. -Daniel
On 07/13/2015 11:54 AM, Daniel Vetter wrote:
That doesn't really help much, using a [0-1] range just means that you need more precision for the fraction since the integer precision is now added to the fractional precision.
So for 16-bit color components the 8.24 format will leave you with only 8 bits precision if you scale each component to the [0-1] range. That's slightly more than 2 decimals. I don't believe that is enough. If you do a gamma table lookup and then feed the result to a CSC matrix you need more precision if you want to get accurate results.
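[Editor's note: the arithmetic here can be made concrete. Adjacent 16-bit codes, normalized to [0-1] with 24 fraction bits, land about 2^8 .24-steps apart, i.e. roughly 8 bits of precision remain below one source LSB. Illustrative helper, hypothetical name:]

```c
#include <stdint.h>

/* Size of one source LSB, measured in .24 steps, after normalizing an
 * n-bit component to [0-1]. For 16-bit input this is ~2^8. */
static uint32_t lsb_in_24_steps(unsigned int bits)
{
	return (uint32_t)((1u << 24) / ((1u << bits) - 1));
}
```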
Oh and we might need those since for CSC and at least some LUTs you can do this.
Sorry, I don't understand this sentence. What does 'those' and 'this' refer to?
It's probably needed if your destination color space is much smaller than the source and you need to expand it. Will result in some clamping ofc. -Daniel
Regards,
Hans
On Mon, Jul 13, 2015 at 12:11:08PM +0200, Hans Verkuil wrote:
Hm, why do we need 8 bits more precision than source data? At least in the intel hw I've seen the most bits we can stuff into the hw is 0.12 (again for rescaled range to 0.0-1.0). 24 bits means as-is we'll throw 12 bits away. What would you want to use these bits for?
Oh and we might need those since for CSC and at least some LUTs you can do this.
Sorry, I don't understand this sentence. What does 'those' and 'this' refer to?
I meant that the higher bits before the decimal are needed in some cases by the hw since it allows values > logical 1.0. At least some hw supports 16bit (half) floats as scanout sources too. -Daniel
On 07/13/15 16:07, Daniel Vetter wrote:
The intel hardware uses 12 bits today, but what about the next-gen? If you are defining an API and data type just for the hardware the kernel supports today, then 12 bits might be enough precision. If you want to be future proof then you need to be prepared for more capable future hardware.
So 0.12 will obviously not be enough if you want to support 16 bit color components in the future.
In addition, to fully support HW colorspace conversion (e.g. sRGB to Rec.709) where lookup tables are used for implementing the transfer functions (normal and inverse), you need more precision than just the number of bits per component, or you will get quite large errors in the calculation.
It all depends on how a LUT is used: if the value from the LUT is the 'final' value, then you don't need more precision than the number of bits of a color component. But if it is used in other calculations (3x3 matrices, full/limited range scaling, etc.), then the LUT should provide more bits of precision.
Which seems to be the case with Intel hardware: 12 bits is 4 bits more than the 8 bits per component it probably uses.
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
Oh and we might need those since for CSC and at least some LUTs you can do this.
Sorry, I don't understand this sentence. What does 'those' and 'this' refer to?
I meant that the higher bits before the decimal are needed in some cases by the hw since it allows values > logical 1.0. At least some hw supports 16bit (half) floats as scanout sources too.
Ah, OK. Thanks for the clarification.
Regards,
Hans
On Tue, Jul 14, 2015 at 10:17:09AM +0200, Hans Verkuil wrote:
Intel hw supports a 12bpp pixel pipeline. They didn't add _any_ additional precision at all afaik. Which is why I wonder why we need it. I'm also not aware of any plans for pushing past 12bpp of data sent to the sink, but I honestly don't have much clue really.
I guess input is a different story; today's CMOS sensors already easily do 14 bits, with more to come I guess with all the noise about HDR. We probably need more headroom on the v4l input side than we ever need on the drm display side. Still, 24 bits is an awful lot of headroom, at least for the next few years. Or do you expect to hit that already soonish on the v4l side?
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10 years or so) we need to add a high-bpp variant if it really comes to it. But for the next few years (and that's the pace at which we completely rewrite gfx drivers anyway) I don't see anything going past .24. .16 was indeed a bit too little I think, which is why we decided to move the fixed point a bit. -Daniel
On 07/14/15 11:11, Daniel Vetter wrote:
On Tue, Jul 14, 2015 at 10:17:09AM +0200, Hans Verkuil wrote:
On 07/13/15 16:07, Daniel Vetter wrote:
On Mon, Jul 13, 2015 at 12:11:08PM +0200, Hans Verkuil wrote:
On 07/13/2015 11:54 AM, Daniel Vetter wrote:
On Mon, Jul 13, 2015 at 11:43:31AM +0200, Hans Verkuil wrote:
On 07/13/2015 11:18 AM, Daniel Vetter wrote:
<cut away old quotes>
>> I know I am late replying, apologies for that.
>>
>> I've been working on CSC support for V4L2 as well (still work in progress)
>> and I would like to at least end up with the same low-level fixed point
>> format as DRM so we can share matrix/vector calculations.
>>
>> Based on my experiences I have concerns about the 16.16 format: the precision
>> is quite low, which can be a problem when such values are used in matrix
>> multiplications.
>>
>> In addition, while the precision may be sufficient for 8 bit color component
>> values, I'm pretty sure it will be insufficient when dealing with 12 or 16 bit
>> color components.
>>
>> In earlier versions of my CSC code I used a 12.20 format, but in the latest I
>> switched to 32.32. This fits nicely in a u64 and it's easy to extract the
>> integer and fractional parts.
>>
>> If this is going to be a generic and future proof API, then my suggestion
>> would be to increase the precision of the underlying data type.
>
> We discussed this a bit more internally and figured it would be nice to have
> the same fixed point for both CSC matrix and LUT/gamma tables. Current
> consensus seems to be to go with 8.24 for both. Since LUTs are fairly big I
> think it makes sense if we try to be not too wasteful (while still
> future-proof ofc).
The .24 should have enough precision, but I am worried about the 8: while this works for 8 bit components, you can't use it to represent values > 255, which might be needed (now or in the future) for 10, 12 or 16 bit color components.
It's why I ended up with 32.32: it's very generic so usable for other things besides CSC.
Note that 8.24 is really 7.24 + one sign bit. So 255 can't be represented in this format.
That said, all values I'm working with in my current code are small integers (say between -4 and 4 worst case), so 8.24 would work. But I am not at all confident that this is future proof. My gut feeling is that you need to be able to represent at least the max component value + a sign bit + 7 decimals precision. Which makes 17.24.
The idea is to steal from GL and always normalize everything to [0.0, 1.0], irrespective of the source color format. We need that in drm since if you blend together planes with different formats it's completely undefined which one you should pick. 8 bits of precision for values out of range should be enough ;-)
That doesn't really help much: using a [0-1] range just means that you need more precision for the fraction, since the integer precision is now added to the fractional precision.
So for 16-bit color components the 8.24 format will leave you with only 8 bits precision if you scale each component to the [0-1] range. That's slightly more than 2 decimals. I don't believe that is enough. If you do a gamma table lookup and then feed the result to a CSC matrix you need more precision if you want to get accurate results.
Hm, why do we need 8 bits more precision than source data? At least in the intel hw I've seen the most bits we can stuff into the hw is 0.12 (again for rescaled range to 0.0-1.0). 24 bits means as-is we'll throw 12 bits away. What would you want to use these bits for?
The intel hardware uses 12 bits today, but what about the next-gen? If you are defining an API and data type just for the hardware the kernel supports today, then 12 bits might be enough precision. If you want to be future proof then you need to be prepared for more capable future hardware.
So 0.12 will obviously not be enough if you want to support 16 bit color components in the future.
In addition, to fully support HW colorspace conversion (e.g. sRGB to Rec.709) where lookup tables are used for implementing the transfer functions (normal and inverse), you need more precision than just the number of bits per component, or you will get quite large errors in the calculation.
It all depends on how a LUT is used: if the value from the LUT is the 'final' value, then you don't need more precision than the number of bits of a color component. But if it is used in other calculations (3x3 matrices, full/limited range scaling, etc.), then the LUT should provide more bits of precision.
Which seems to be the case with Intel hardware: 12 bits is 4 bits more than the 8 bits per component it probably uses.
Intel hw supports a 12bpp pixel pipeline. They didn't add _any_ additional precision at all afaik. Which is why I wonder why we need it. I'm also not aware of any plans for pushing past 12bpp of data sent to the sink, but I honestly don't have much clue really.
I guess input is a different story; today's CMOS sensors already easily do 14 bit, with more to come I guess with all the noise about HDR. We probably need more headroom on the v4l input side than we ever need on the drm display side. Still, 24 bits is an awful lot of headroom, at least for the next few years. Or do you expect to hit that already soonish on the v4l side?
I think 24 bits precision is enough, but that assumes that the integer part will be between -128 and 127. And I am not so sure that that is a valid assumption.
It's true today, but what if you have a HW LUT that maps integer values and expects 16.0 or perhaps 12.4?
BTW, I am assuming that the proposed 8.24 format is a signed format: the CSC 3x3 matrices contain negative values, so any fixed point data type has to be signed.
I'm just wondering: is it really such a big deal to use a 32.32 format? Yes, the amount of data doubles, but it's quite rare that you need to configure a LUT, right?
For a 12 bit LUT it's 16 kB vs 32 kB. Yes, it's more data, but the advantage is that the data type is future proof (well, probably :-) ) and much more likely to be usable in other subsystems.
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10 years or so) we need to add a high-bpp variant if it really comes to it.
I think this is much closer than you think. I agree that you are not likely to see this soon for consumer graphics cards, but for professional equipment and high-end consumer electronics this is another story.
And if it is being done for input, then output will need it as well: after all, what's the point of 16-bit color components if you can't display it? Whether Intel will support it is another matter, but there are other vendors, you know... :-)
Regards,
Hans
On Tue, Jul 14, 2015 at 11:35:30AM +0200, Hans Verkuil wrote:
On 07/14/15 11:11, Daniel Vetter wrote:
On Tue, Jul 14, 2015 at 10:17:09AM +0200, Hans Verkuil wrote:
On 07/13/15 16:07, Daniel Vetter wrote:
On Mon, Jul 13, 2015 at 12:11:08PM +0200, Hans Verkuil wrote:
On 07/13/2015 11:54 AM, Daniel Vetter wrote:
On Mon, Jul 13, 2015 at 11:43:31AM +0200, Hans Verkuil wrote:
<cut away old quotes>
> That said, all values I'm working with in my current code are small integers
> (say between -4 and 4 worst case), so 8.24 would work. But I am not at all
> confident that this is future proof. My gut feeling is that you need to be
> able to represent at least the max component value + a sign bit + 7 decimals
> precision. Which makes 17.24.
The idea is to steal from GL and always normalize everything to [0.0, 1.0], irrespective of the source color format. We need that in drm since if you blend together planes with different formats it's completely undefined which one you should pick. 8 bits of precision for values out of range should be enough ;-)
That doesn't really help much: using a [0-1] range just means that you need more precision for the fraction, since the integer precision is now added to the fractional precision.
So for 16-bit color components the 8.24 format will leave you with only 8 bits precision if you scale each component to the [0-1] range. That's slightly more than 2 decimals. I don't believe that is enough. If you do a gamma table lookup and then feed the result to a CSC matrix you need more precision if you want to get accurate results.
Hm, why do we need 8 bits more precision than source data? At least in the intel hw I've seen the most bits we can stuff into the hw is 0.12 (again for rescaled range to 0.0-1.0). 24 bits means as-is we'll throw 12 bits away. What would you want to use these bits for?
The intel hardware uses 12 bits today, but what about the next-gen? If you are defining an API and data type just for the hardware the kernel supports today, then 12 bits might be enough precision. If you want to be future proof then you need to be prepared for more capable future hardware.
So 0.12 will obviously not be enough if you want to support 16 bit color components in the future.
In addition, to fully support HW colorspace conversion (e.g. sRGB to Rec.709) where lookup tables are used for implementing the transfer functions (normal and inverse), you need more precision than just the number of bits per component, or you will get quite large errors in the calculation.
It all depends on how a LUT is used: if the value from the LUT is the 'final' value, then you don't need more precision than the number of bits of a color component. But if it is used in other calculations (3x3 matrices, full/limited range scaling, etc.), then the LUT should provide more bits of precision.
Which seems to be the case with Intel hardware: 12 bits is 4 bits more than the 8 bits per component it probably uses.
Intel hw supports a 12bpp pixel pipeline. They didn't add _any_ additional precision at all afaik. Which is why I wonder why we need it. I'm also not aware of any plans for pushing past 12bpp of data sent to the sink, but I honestly don't have much clue really.
I guess input is a different story; today's CMOS sensors already easily do 14 bit, with more to come I guess with all the noise about HDR. We probably need more headroom on the v4l input side than we ever need on the drm display side. Still, 24 bits is an awful lot of headroom, at least for the next few years. Or do you expect to hit that already soonish on the v4l side?
I think 24 bits precision is enough, but that assumes that the integer part will be between -128 and 127. And I am not so sure that that is a valid assumption.
The idea is always that you'd normalize to 0.0-1.0 of the range going over the wire to the sink. The 7 bits of headroom is just for smoother clamping when your colorspaces don't match up. The most I've seen in intel hw is 3 additional bits used there.
It's true today, but what if you have a HW LUT that maps integer values and expects 16.0 or perhaps 12.4?
BTW, I am assuming that the proposed 8.24 format is a signed format: the CSC 3x3 matrices contain negative values, so any fixed point data type has to be signed.
Yeah, it's s8.24 really. We definitely need a signed integer part, agreed on that.
I'm just wondering: is it really such a big deal to use a 32.32 format? Yes, the amount of data doubles, but it's quite rare that you need to configure a LUT, right?
For a 12 bit LUT it's 16 kB vs 32 kB. Yes, it's more data, but the advantage is that the data type is future proof (well, probably :-) ) and much more likely to be usable in other subsystems.
We need to bash this stuff into hw under spin_lock_irqsave in i915. Yeah props to our hw engineers for screwing things up, but I'd like to not be too wasteful. But otoh the mmios will totally swamp any kind of memory loads we're doing.
The other bit is that our android folks are a bit over the top with reducing overhead sometimes, e.g. they don't like keeping around metadata-only drm_framebuffer objects for xrgb vs. argb because it takes away a few bytes ;-)
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10 years or so) we need to add a high-bpp variant if it really comes to it.
I think this is much closer than you think. I agree that you are not likely to see this soon for consumer graphics cards, but for professional equipment and high-end consumer electronics this is another story.
And if it is being done for input, then output will need it as well: after all, what's the point of 16-bit color components if you can't display it? Whether Intel will support it is another matter, but there are other vendors, you know... :-)
Input is different because of post-processing - you need that much depth to be able to get useful data out of the dark areas without the risk of the highlights clipping. While processing you need that depth to avoid banding (because integer math sucks). But tbh I haven't seen anything but 12bpc (and those usually use dithered 10bpc panels internally) anywhere, and the common screens top out at 10bpc.
So from my pov of drm s8.24 will be enough for a long time, but if you're convinced that the input side needs this soon I guess it makes sense to go with the bit more overhead and 32.32. Otoh we'll never need 32 bits of integer part if we normalize to 0.0-1.0, and that normalization is really something I think we want. -Daniel
On 07/14/15 12:16, Daniel Vetter wrote:
<cut away old quotes>
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10 years or so) we need to add a high-bpp variant if it really comes to it.
I think this is much closer than you think. I agree that you are not likely to see this soon for consumer graphics cards, but for professional equipment and high-end consumer electronics this is another story.
And if it is being done for input, then output will need it as well: after all, what's the point of 16-bit color components if you can't display it? Whether Intel will support it is another matter, but there are other vendors, you know... :-)
Input is different because of post-processing - you need that much depth to be able to get useful data out of the dark areas without the risk of the highlights clipping. While processing you need that depth to avoid banding (because integer math sucks). But tbh I haven't seen anything but 12bpc (and those usually use dithered 10bpc panels internally) anywhere, and the common screens top out at 10bpc.
So from my pov of drm s8.24 will be enough for a long time, but if you're convinced that the input side needs this soon I guess it makes sense to go with the bit more overhead and 32.32. Otoh we'll never need 32 bits of integer part if we normalize to 0.0-1.0, and that normalization is really something I think we want.
I think 32.32 is primarily important as a standard data type for public APIs and as the standard data type for math operations. How it is stored in the driver is driver (or possibly subsystem) specific. Is it a problem for you to store it as 8.24 in the driver (specifically for LUTs) to reduce memory usage? Converting from 8.24 to 32.32 and vice versa is trivial, so this might be the best of both worlds.
Regards,
Hans
On 07/15/15 14:35, Hans Verkuil wrote:
On 07/14/15 12:16, Daniel Vetter wrote:
<cut away old quotes>
I would guess that a LUT supporting 16 bit color components would need a precision of 0.20 or so (assuming the resulting values are used in further calculations).
High dynamic range video will be an important driving force towards higher bit depths and accurate color handling, so you can expect to see this become much more important in the coming years.
And as I mentioned another consideration is that this fixed point data type might be useful elsewhere in the kernel where you need to do some precision arithmetic. So using a standard type that anyone can use with functions in lib/ to do basic operations can be very useful indeed beyond just DRM and V4L2.
0.20 would still comfortably fit into 8.24. And yeah worst-case (in 10 years or so) we need to add a high-bpp variant if it really comes to it.
I think this is much closer than you think. I agree that you are not likely to see this soon for consumer graphics cards, but for professional equipment and high-end consumer electronics this is another story.
And if it is being done for input, then output will need it as well: after all, what's the point of 16-bit color components if you can't display it? Whether Intel will support it is another matter, but there are other vendors, you know... :-)
Input is different because of post-processing - you need that much depth to be able to get useful data out of the dark areas without the risk of the highlights clipping. While processing you need that depth to avoid banding (because integer math sucks). But tbh I haven't seen anything but 12bpc (and those usually use dithered 10bpc panels internally) anywhere, and the common screens top out at 10bpc.
So from my pov of drm s8.24 will be enough for a long time, but if you're convinced that the input side needs this soon I guess it makes sense to go with the bit more overhead and 32.32. Otoh we'll never need 32 bits of integer part if we normalize to 0.0-1.0, and that normalization is really something I think we want.
I think 32.32 is primarily important as a standard data type for public APIs and as the standard data type for math operations. How it is stored in the driver is driver (or possibly subsystem) specific. Is it a problem for you to store it as 8.24 in the driver (specifically for LUTs) to reduce memory usage? Converting from 8.24 to 32.32 and vice versa is trivial, so this might be the best of both worlds.
Follow-up: I just read through the latest Color Manager Implementation patch series, and there the LUTs use U8.24 and the CSCs use S31.32.
Since DRM will always use LUT values in the range [0-1] and given the fact that these values are just stored and not used in further calculations, I think that this is OK. V4L2 might still use 32.32 for a LUT (currently we don't support that yet, but we likely will in the near future), but that's OK.
As long as any fixed point math functions we create in the kernel all use S31.32 so they can be shared between DRM and V4L2 (and quite possibly other subsystems) I have no problems with this scheme.
Regards,
Hans
dri-devel@lists.freedesktop.org