Hi Daniel, Jani,
is it ok to merge this patch along with 2/2 via the i915 tree?
--Imre
On Mon, Nov 23, 2020 at 08:26:30PM +0200, Imre Deak wrote:
> From: Radhakrishna Sripada radhakrishna.sripada@intel.com
> Gen12 display can decompress surfaces compressed by render engine with Clear Color, add a new modifier as the driver needs to know the surface was compressed by render engine.
> V2: Description changes as suggested by Rafael.
> V3: Mention the Clear Color size of 64 bits in the comments(DK)
> v4: Fix trailing whitespaces
> v5: Explain Clear Color in the documentation.
> v6: Documentation Nitpicks(Nanley)
> Cc: Ville Syrjala ville.syrjala@linux.intel.com
> Cc: Dhinakaran Pandiyan dhinakaran.pandiyan@intel.com
> Cc: Kalyan Kondapally kalyan.kondapally@intel.com
> Cc: Rafael Antognolli rafael.antognolli@intel.com
> Cc: Nanley Chery nanley.g.chery@intel.com
> Signed-off-by: Radhakrishna Sripada radhakrishna.sripada@intel.com
> Signed-off-by: Imre Deak imre.deak@intel.com
>  include/uapi/drm/drm_fourcc.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index ca48ed0e6bc1..0a1b2c4c4bee 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -527,6 +527,25 @@ extern "C" {
>   */
>  #define I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS fourcc_mod_code(INTEL, 7)
> +/*
> + * Intel Color Control Surface with Clear Color (CCS) for Gen-12 render
> + * compression.
> + *
> + * The main surface is Y-tiled and is at plane index 0 whereas CCS is linear
> + * and at index 1. The clear color is stored at index 2, and the pitch should
> + * be ignored. The clear color structure is 256 bits. The first 128 bits
> + * represents Raw Clear Color Red, Green, Blue and Alpha color each represented
> + * by 32 bits. The raw clear color is consumed by the 3d engine and generates
> + * the converted clear color of size 64 bits. The first 32 bits store the Lower
> + * Converted Clear Color value and the next 32 bits store the Higher Converted
> + * Clear Color value when applicable. The Converted Clear Color values are
> + * consumed by the DE. The last 64 bits are used to store Color Discard Enable
> + * and Depth Clear Value Valid which are ignored by the DE. A CCS cache line
> + * corresponds to an area of 4x1 tiles in the main surface. The main surface
> + * pitch is required to be a multiple of 4 tile widths.
> + */
> +#define I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC fourcc_mod_code(INTEL, 8)
>  /*
>   * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks
> --
> 2.25.1
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
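For readers following the layout described in the patch comment, the 256-bit clear color plane can be pictured as a C struct. This is only an illustrative sketch: the struct, field, macro and function names below are invented for this note and do not appear in the patch or in any kernel header; only the field sizes and their order come from the comment. The modifier value is recomputed locally, assuming the Intel vendor id is 0x01 and that fourcc_mod_code() places the vendor in the top 8 bits. Combining the lower and upper converted words into one 64-bit value is likewise an assumption about how they relate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical struct: field names are invented for illustration, only the
 * sizes and ordering follow the comment in the patch.
 */
struct example_gen12_clear_color {
	uint32_t raw_rgba[4];      /* first 128 bits: raw R, G, B, A (3D engine) */
	uint32_t converted_lower;  /* next 32 bits: Lower Converted Clear Color */
	uint32_t converted_upper;  /* next 32 bits: Higher Converted Clear Color */
	uint32_t ignored_by_de[2]; /* last 64 bits: Color Discard Enable,
				    * Depth Clear Value Valid */
};

/* The comment states the whole structure is 256 bits. */
static_assert(sizeof(struct example_gen12_clear_color) == 256 / 8,
	      "clear color structure must be 256 bits");
/* The converted value starts right after the 128 bits of raw color. */
static_assert(offsetof(struct example_gen12_clear_color, converted_lower) == 16,
	      "converted clear color must follow the raw RGBA words");

/*
 * Local restatement of fourcc_mod_code(INTEL, 8).
 * Assumption: vendor id 0x01 stored in the top 8 bits of the modifier.
 */
#define EXAMPLE_MOD_VENDOR_INTEL 0x01ULL
#define EXAMPLE_MOD_CODE(vendor, val) \
	(((uint64_t)(vendor) << 56) | ((val) & 0x00ffffffffffffffULL))
#define EXAMPLE_RC_CCS_CC EXAMPLE_MOD_CODE(EXAMPLE_MOD_VENDOR_INTEL, 8)

/*
 * Assumption: the upper word is the high half of a single 64-bit converted
 * value; it only matters for formats wider than 32bpp.
 */
static uint64_t
example_converted_clear_color(const struct example_gen12_clear_color *cc)
{
	return ((uint64_t)cc->converted_upper << 32) | cc->converted_lower;
}
```

Under these assumptions the display side would only care about the two converted words at byte offsets 16 and 20; the raw RGBA words and the trailing 64 bits are for the render engine.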
On Fri, Nov 27, 2020 at 04:31:00PM +0200, Imre Deak wrote:
> Hi Daniel, Jani,
> is it ok to merge this patch along with 2/2 via the i915 tree?
Ack from mesa (userspace in general, but mesa is kinda mandatory) is missing I think. With that
Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
> --Imre
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Fri, Nov 27, 2020 at 04:19:20PM +0100, Daniel Vetter wrote:
> On Fri, Nov 27, 2020 at 04:31:00PM +0200, Imre Deak wrote:
> > Hi Daniel, Jani,
> > is it ok to merge this patch along with 2/2 via the i915 tree?
> Ack from mesa (userspace in general, but mesa is kinda mandatory) is missing I think. With that
> Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
Thanks.
Nanley, could you ACK the patchset if they look ok from Mesa's POV? It works as expected at least with the igt/kms_ccs RC-CC subtest.
--Imre
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
-----Original Message-----
From: Imre Deak imre.deak@intel.com
Sent: Friday, November 27, 2020 10:06 AM
To: Daniel Vetter daniel@ffwll.ch; Chery, Nanley G nanley.g.chery@intel.com
Cc: intel-gfx@lists.freedesktop.org; Nikula, Jani jani.nikula@intel.com; Daniel Vetter daniel.vetter@ffwll.ch; Rafael Antognolli rafael.antognolli@intel.com; Kondapally, Kalyan kalyan.kondapally@intel.com; Pandiyan, Dhinakaran dhinakaran.pandiyan@intel.com; dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/2] drm/framebuffer: Format modifier for Intel Gen 12 render compression with Clear Color
> On Fri, Nov 27, 2020 at 04:19:20PM +0100, Daniel Vetter wrote:
> > On Fri, Nov 27, 2020 at 04:31:00PM +0200, Imre Deak wrote:
> > > Hi Daniel, Jani,
> > > is it ok to merge this patch along with 2/2 via the i915 tree?
> > Ack from mesa (userspace in general, but mesa is kinda mandatory) is missing I think. With that
> > Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
> Thanks.
> Nanley, could you ACK the patchset if they look ok from Mesa's POV? It works as expected at least with the igt/kms_ccs RC-CC subtest.
Hi Imre,
I have a question and a couple comments:
Is the map of the clear color address creating a new synchronization point between the GPU and CPU? If so, I wonder how this will impact performance. There was some talk of asynchronously updating the clear color register a while back.
We probably don't have to update the header, but we noticed in our testing that the clear color prefers an alignment greater than 64B. Unfortunately, I can't find any bspec note about this. As long as the buffer creators are aware though, I think we should be fine. I don't know if this is the best forum to bring it up, but I thought I'd share.
Seems like the upper converted clear color is untested due to the lack of RGBX16 support. I suppose that if there are any issues there, they can be fixed later...
-Nanley
> > > --Imre
Hi Nanley,
thanks for the review.
+Ville, Chris.
On Tue, Dec 01, 2020 at 02:18:26AM +0200, Chery, Nanley G wrote:
> Hi Imre,
> I have a question and a couple comments:
> Is the map of the clear color address creating a new synchronization point between the GPU and CPU? If so, I wonder how this will impact performance.
The kmap to read the clear value is not adding any sync overhead if that's what you mean. But the clear value must be in place before we read it out and that should be guaranteed by the flush we do anyway to wait for the render result (even considering the explicit L3/RT flush, depth stall the spec requires for fast clears).
However now that you mention: atm the kmap/readout happens after the explicit but before the implicit fence-wait. I think it should happen after the implicit fence-wait.
Ville, Chris, could you confirm the above and also that the above flush is enough to ensure the CPU read is coherent?
> There was some talk of asynchronously updating the clear color register a while back.
Couldn't find anything with a quick search, do you have a pointer? Just before the flip we must wait for the render results anyway, as we do now, so not sure how it could be optimized.
> We probably don't have to update the header, but we noticed in our testing that the clear color prefers an alignment greater than 64B. Unfortunately, I can't find any bspec note about this. As long as the buffer creators are aware though, I think we should be fine. I don't know if this is the best forum to bring it up, but I thought I'd share.
Yes, would be good to clarify this and get it also to the spec. Then the driver should also check the alignment of the 3rd FB plane.
> Seems like the upper converted clear color is untested due to the lack of RGBX16 support. I suppose that if there are any issues there, they can be fixed later...
Yes, a 64bpp RC-CC subtest in IGT is missing, should be easy to add that.
--Imre
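The two framebuffer checks touched on in this thread (the main surface pitch rule from the header comment, and the suggested alignment check on the third FB plane) could be sketched as below. This is not code from the patch or from i915: the helper names are hypothetical, the required clear color alignment is left as a parameter because the thread only says it appears to be "greater than 64B" with no spec reference, and the 128-byte Y-tile row width is an assumption stated here rather than something the patch spells out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one Y-tile row is 128 bytes wide. */
#define EXAMPLE_Y_TILE_WIDTH_BYTES 128u

/*
 * Hypothetical helper: true if offset is a multiple of align.
 * align must be a non-zero power of two, otherwise the check fails.
 */
static bool example_is_aligned(uint64_t offset, uint64_t align)
{
	if (align == 0 || (align & (align - 1)) != 0)
		return false;
	return (offset & (align - 1)) == 0;
}

/*
 * Hypothetical helper for the rule in the header comment: the main surface
 * pitch must be a multiple of 4 tile widths.
 */
static bool example_main_pitch_ok(uint32_t pitch_bytes)
{
	return pitch_bytes != 0 &&
	       pitch_bytes % (4u * EXAMPLE_Y_TILE_WIDTH_BYTES) == 0;
}
```

A caller would pass the plane 2 offset and whatever alignment the spec eventually requires, for example example_is_aligned(offset, 4096) if a page turned out to be the requirement; that value is purely a placeholder.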
-----Original Message-----
From: Imre Deak imre.deak@intel.com
Sent: Tuesday, December 1, 2020 4:05 AM
To: Chery, Nanley G nanley.g.chery@intel.com; Chris Wilson <chris@chris-wilson.co.uk>; Ville Syrjälä ville.syrjala@linux.intel.com
Cc: Daniel Vetter daniel@ffwll.ch; intel-gfx@lists.freedesktop.org; Nikula, Jani jani.nikula@intel.com; Daniel Vetter daniel.vetter@ffwll.ch; Kondapally, Kalyan kalyan.kondapally@intel.com; Pandiyan, Dhinakaran dhinakaran.pandiyan@intel.com; dri-devel@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 1/2] drm/framebuffer: Format modifier for Intel Gen 12 render compression with Clear Color
> Hi Nanley,
> thanks for the review.
> +Ville, Chris.
> On Tue, Dec 01, 2020 at 02:18:26AM +0200, Chery, Nanley G wrote:
> > Hi Imre,
> > I have a question and a couple comments:
> > Is the map of the clear color address creating a new synchronization point between the GPU and CPU? If so, I wonder how this will impact performance.
> The kmap to read the clear value is not adding any sync overhead if that's what you mean. But the clear value must be in place before we read it out and that should be guaranteed by the flush we do anyway to wait for the render result (even considering the explicit L3/RT flush, depth stall the spec requires for fast clears).
> However now that you mention: atm the kmap/readout happens after the explicit but before the implicit fence-wait. I think it should happen after the implicit fence-wait.
> Ville, Chris, could you confirm the above and also that the above flush is enough to ensure the CPU read is coherent?
> > There was some talk of asynchronously updating the clear color register a while back.
> Couldn't find anything with a quick search, do you have a pointer? Just before the flip we must wait for the render results anyway, as we do now, so not sure how it could be optimized.
There were some offline discussions, so I don't have a reference unfortunately. Though, given what you shared above it seems like it's actually not an issue.
> > We probably don't have to update the header, but we noticed in our testing that the clear color prefers an alignment greater than 64B. Unfortunately, I can't find any bspec note about this. As long as the buffer creators are aware though, I think we should be fine. I don't know if this is the best forum to bring it up, but I thought I'd share.
> Yes, would be good to clarify this and get it also to the spec. Then the driver should also check the alignment of the 3rd FB plane.
I plan to run some more tests and file a bug in the spec.
I see that the IGT test only clears the fb once. Just to confirm, is the clear color offset read from on every frame? Userspace would like to be able to pass different clear colors for an fb.
-Nanley
> > Seems like the upper converted clear color is untested due to the lack of RGBX16 support. I suppose that if there are any issues there, they can be fixed later...
> Yes, a 64bpp RC-CC subtest in IGT is missing, should be easy to add that.
> --Imre
On Fri, Dec 11, 2020 at 09:04:02AM +0200, Chery, Nanley G wrote:
> [...]
> > > We probably don't have to update the header, but we noticed in our testing that the clear color prefers an alignment greater than 64B. Unfortunately, I can't find any bspec note about this. As long as the buffer creators are aware though, I think we should be fine. I don't know if this is the best forum to bring it up, but I thought I'd share.
> > Yes, would be good to clarify this and get it also to the spec. Then the driver should also check the alignment of the 3rd FB plane.
> I plan to run some more tests and file a bug in the spec.
Ok, thanks. Note that this patch has a problem with synchronization, and based on Chris' response I'm planning to update it once I've figured out the proper way to map the CC plane. Until then you could still use it on TGL if you wait for the RT result explicitly after the fast clear and before flipping to it (that's what the IGT test does atm).
> I see that the IGT test only clears the fb once. Just to confirm, is the clear color offset read from on every frame? Userspace would like to be able to pass different clear colors for an fb.
Yes, every time you do a flip the kernel will re-read the CC value and program it to the display.
--Imre
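The ordering concern raised earlier in the thread (the CPU readout of the clear color should come after the render fences have been waited on, and the value is re-read on every flip) can be shown with a toy model. Everything here is hypothetical: the struct and functions only simulate a GPU that writes the clear color when rendering completes, and none of it reflects how the i915 patch actually maps or reads the CC plane.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy framebuffer whose clear color the (simulated) GPU has yet to write. */
struct toy_fb {
	bool render_done;    /* set once the simulated fence has signaled */
	uint64_t pending_cc; /* value the GPU writes when rendering completes */
	uint64_t cc_plane;   /* what a CPU read of the CC plane returns now */
	uint64_t display_cc; /* value programmed to the simulated display */
};

/* Simulated fence wait: rendering finishes and the CC plane becomes valid. */
static void toy_wait_for_render(struct toy_fb *fb)
{
	fb->cc_plane = fb->pending_cc;
	fb->render_done = true;
}

/* Simulated CPU readout of the CC plane. */
static uint64_t toy_read_cc(const struct toy_fb *fb)
{
	return fb->cc_plane;
}

/* Intended order: wait for the render result, then read and program. */
static void toy_flip(struct toy_fb *fb)
{
	toy_wait_for_render(fb);
	fb->display_cc = toy_read_cc(fb);
}

/* Problematic order, for comparison: the readout happens before the wait. */
static void toy_flip_read_too_early(struct toy_fb *fb)
{
	uint64_t cc = toy_read_cc(fb);

	toy_wait_for_render(fb);
	fb->display_cc = cc; /* stale: the GPU had not written it yet */
}
```

Because toy_flip() reads the plane on each call, changing pending_cc between two flips changes what gets programmed, which mirrors the "re-read on every flip" behaviour described above.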
On Fri, 27 Nov 2020, Daniel Vetter daniel@ffwll.ch wrote:
> On Fri, Nov 27, 2020 at 04:31:00PM +0200, Imre Deak wrote:
> > Hi Daniel, Jani,
> > is it ok to merge this patch along with 2/2 via the i915 tree?
> Ack from mesa (userspace in general, but mesa is kinda mandatory) is missing I think. With that
> Acked-by: Daniel Vetter daniel.vetter@ffwll.ch
With the same conditions,
Acked-by: Jani Nikula jani.nikula@intel.com
> > --Imre