In case anyone's curious about 30bpp framebuffer support, here's the current status:
Kernel:
Ben and I have switched the code to using a 256-based LUT for Kepler+, and I've also written a patch to cause the addfb ioctl to use the proper format. You can pick this up at:
https://github.com/skeggsb/linux/commits/linux-4.16 (note the branch!) https://patchwork.freedesktop.org/patch/202322/
With these two, you should be able to use "X -depth 30" again on any G80+ GPU to bring up a screen (as you could in kernel 4.9 and earlier). However this still has some deficiencies, some of which I've addressed:
xf86-video-nouveau:
DRI3 was broken, and Xv was broken. Patches available at:
https://github.com/imirkin/xf86-video-nouveau/commits/master
mesa:
The NVIDIA hardware (pre-Kepler) can only do XBGR scanout. Further, the nouveau KMS doesn't add XRGB scanout for Kepler+ (although it could). Mesa was only enabled for XRGB, so I've piped XBGR through all the same places:
https://github.com/imirkin/mesa/commits/30bpp
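For anyone wiring this up in their own KMS/GBM client rather than through X: the distinction above is DRM_FORMAT_XBGR2101010 vs DRM_FORMAT_XRGB2101010. A minimal sketch of asking GBM for a 10bpc scanout surface (illustrative only, not code from the branch above):

    #include <stdint.h>
    #include <gbm.h>

    /* Rough sketch: allocate a 10bpc scanout surface using the XBGR
     * ordering that nouveau actually scans out. */
    static struct gbm_surface *create_30bpp_surface(struct gbm_device *gbm,
                                                    uint32_t w, uint32_t h)
    {
        struct gbm_surface *s =
            gbm_surface_create(gbm, w, h, GBM_FORMAT_XBGR2101010,
                               GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
        /* NULL here means the driver doesn't expose XBGR2101010; fall back
         * to GBM_FORMAT_XRGB2101010 (or plain XRGB8888) in that case. */
        return s;
    }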
libdrm:
For testing, I added a modetest gradient pattern split horizontally. Top half is 10bpc, bottom half is 8bpc. This is useful for seeing whether you're really getting 10bpc, or if things are getting truncated along the way. Definitely hacky, but ... wasn't intending on upstreaming it anyways:
https://github.com/imirkin/drm/commit/9b8776f58448b5745675c3a7f5eb2735e39894...
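The pattern itself is trivial; roughly the following (a from-memory sketch of the idea, not the code in that commit; pixels packed as little-endian XBGR2101010, i.e. R in bits 0-9):

    #include <stdint.h>

    /* Horizontal gray gradient: top half keeps all 10 bits, bottom half is
     * truncated to 8 bits and expanded back, so banding shows up there if
     * (and only if) the full pipeline really is 10bpc. */
    static void fill_split_gradient(uint32_t *fb, int width, int height,
                                    int stride_px)
    {
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                uint32_t v = (uint32_t)x * 1023 / (width - 1);
                if (y >= height / 2) {
                    uint32_t v8 = v >> 2;      /* truncate to 8 bits */
                    v = (v8 << 2) | (v8 >> 6); /* expand back to 10 bits */
                }
                fb[y * stride_px + x] = v | (v << 10) | (v << 20);
            }
        }
    }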
-------------------------------------
Results with the patches (tested on a GK208B and a "deep color" TV over HDMI):
- modetest with a 10bpc gradient shows up smoother than an 8bpc gradient. However it's still dithered to 8bpc, not "real" 10bpc.
- things generally work in X -- dri2 and dri3, xv, and obviously regular X rendering / acceleration
- lots of X software can't handle 30bpp modes (mplayer hates it for xv and x11 rendering, aterm bails on shading the root pixmap, probably others)
I'm also told that with DP, it should actually send the higher-bpc data over the wire. With HDMI, we're still stuck at 24bpp for now (although the hardware can do 36bpp as well). This is why my gradient result above was still dithered.
Things to do - mostly nouveau specific, but probably some general infra needed too:
- Figure out how to properly expose the 1024-sized LUT
- Add fp16 scanout
- Stop relying on the max bpc of the monitor/connector and make decisions based on the "effective" bpc (e.g. based on the currently-set fb format, take hdmi/dp into account, etc). This will also affect the max clock somehow. Perhaps there should be a way to force a connector to a certain bpc.
- Add higher-bpc HDMI support
- Add 10bpc dithering (only makes sense if >= 10bpc output is *actually* enabled first)
- Investigate YUV HDMI modes (esp since they can enable 4K@60 on HDMI 1.4 hardware)
- Test out Wayland compositors
- Teach xf86-video-modesetting about addfb2 or that nouveau's ordering is different.
I don't necessarily plan on working further on this, so if there are interested parties, they should definitely try to pick it up. I'll try to upstream all my changes though.
Cheers,
-ilia
On Sun, Feb 04, 2018 at 06:50:45PM -0500, Ilia Mirkin wrote:
> Things to do - mostly nouveau specific, but probably some general infra needed too:
> - Figure out how to properly expose the 1024-sized LUT
We have the properties in the kernel. Not sure if x11 could expose it to clients somehow, or would we just have to interpolate the missing bits in the ddx?
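The interpolation fallback in the ddx would be something like this (purely illustrative; the 256/1024 sizes match the discussion above, the names are made up):

    #include <stdint.h>

    /* Stretch the 256-entry ramp X hands us to the 1024 entries the
     * hardware LUT wants, by simple linear interpolation. */
    static void expand_ramp_256_to_1024(const uint16_t in[256],
                                        uint16_t out[1024])
    {
        for (int i = 0; i < 1024; i++) {
            uint32_t pos  = (uint32_t)i * 255;  /* position scaled by 1023 */
            uint32_t idx  = pos / 1023;
            int32_t  frac = (int32_t)(pos % 1023);
            int32_t  a = in[idx];
            int32_t  b = in[idx < 255 ? idx + 1 : 255];
            out[i] = (uint16_t)(a + ((b - a) * frac) / 1023);
        }
    }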
> - Add fp16 scanout
i915 could do this as well. There was a patch to just add the fourcc on account of gvt needing it for some Windows thing. IIRC I asked them to actually implement it in i915 proper but no patch ever surfaced.
> - Stop relying on the max bpc of the monitor/connector and make decisions based on the "effective" bpc (e.g. based on the currently-set fb format, take hdmi/dp into account, etc). This will also affect the max clock somehow. Perhaps there should be a way to force a connector to a certain bpc.
We used to look at the fb depth for the primary plane when picking the output bpc, but that doesn't really work when you have multiple planes, and you generally don't want to have to do a modeset to flip to a fb with another format. So in the end we just chose to go for the max bpc possible.
There are some potential issues with deep color though (crappy HDMI cables, dongles etc.) so I suggested a property to allow the user to limit it below a certain value. Problem is that IIRC the patch we got was just adding it to i915, whereas we really want to put it into the drm core so that everyone will implement the same thing.
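For illustration, this is what userspace would then do with such a per-connector limit (the property name "max bpc" below is only a placeholder for whatever the final uapi ends up calling it):

    #include <stdint.h>
    #include <string.h>
    #include <xf86drmMode.h>

    /* Clamp a connector's output depth via a (hypothetical) "max bpc"
     * property, e.g. set_connector_max_bpc(fd, conn, 8) for a flaky cable. */
    static int set_connector_max_bpc(int fd, uint32_t connector_id, uint64_t bpc)
    {
        drmModeObjectProperties *props =
            drmModeObjectGetProperties(fd, connector_id,
                                       DRM_MODE_OBJECT_CONNECTOR);
        int ret = -1;

        for (uint32_t i = 0; props && i < props->count_props; i++) {
            drmModePropertyRes *p = drmModeGetProperty(fd, props->props[i]);
            if (p && !strcmp(p->name, "max bpc"))
                ret = drmModeObjectSetProperty(fd, connector_id,
                                               DRM_MODE_OBJECT_CONNECTOR,
                                               p->prop_id, bpc);
            drmModeFreeProperty(p);
        }
        drmModeFreeObjectProperties(props);
        return ret;
    }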
> - Add higher-bpc HDMI support
Bunch of interesting stuff in i915 to figure out the sink/dongle clock limit etc. If someone else is going to implement HDMI deep color we should perhaps look into lifting some of that stuff into some common place.
> - Add 10bpc dithering (only makes sense if >= 10bpc output is *actually* enabled first)
> - Investigate YUV HDMI modes (esp since they can enable 4K@60 on HDMI 1.4 hardware)
We have 4:2:0 in i915, and pretty close to having YCbCr 4:4:4 too. The 4:4:4 thing would need some new properties though so that the user can actually enable it. What we do with 4:2:0 is enable it automagically when the display can't do RGB 4:4:4 for the given mode. But there's currently no way for the user to say that they prefer YCbCr 4:2:0 over RGB 4:4:4.
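The decision logic amounts to something like this (standalone simplified sketch, not actual i915 code; all of the names are invented):

    #include <stdbool.h>

    enum output_format { OUTPUT_RGB444, OUTPUT_YCBCR420 };

    /* Prefer RGB 4:4:4; fall back to YCbCr 4:2:0 only when the mode can't
     * be driven as RGB at all (e.g. 4K@60 on HDMI 1.4 link rates).  There
     * is currently no knob for the user to *prefer* 4:2:0. */
    static int pick_hdmi_output_format(bool rgb444_fits_link,
                                       bool sink_does_420_for_mode,
                                       enum output_format *out)
    {
        if (rgb444_fits_link)
            *out = OUTPUT_RGB444;
        else if (sink_does_420_for_mode)
            *out = OUTPUT_YCBCR420;
        else
            return -1;  /* mode not achievable on this link */
        return 0;
    }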
On Wed, Feb 07, 2018 at 06:28:42PM +0200, Ville Syrjälä wrote:
> On Sun, Feb 04, 2018 at 06:50:45PM -0500, Ilia Mirkin wrote:
> > - Figure out how to properly expose the 1024-sized LUT
> We have the properties in the kernel. Not sure if x11 could expose it to clients somehow, or would we just have to interpolate the missing bits in the ddx?
Oh, and I think we're going to have to come up with a fancier uapi for this stuff because in the future the input points may not be evenly spaced (for HDR stuff). Also the hardware may provide various different modes for the gamma LUTs with different tradeoffs. So we may even want to somehow try to enumerate the different modes and let userspace pick the mode that best suits its needs.
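To make that concrete, a completely hypothetical shape for such a uapi (nothing like this exists today; it only illustrates per-mode segments with non-uniform input spacing):

    #include <stdint.h>

    /* Hypothetical: each LUT mode the hardware offers would be described
     * by segments, so e.g. an HDR/PQ-oriented mode can pack more entries
     * near zero.  Userspace enumerates the modes, picks one, and supplies
     * entry_count entries per segment. */
    struct lut_segment {
        uint32_t start;        /* first input value covered */
        uint32_t end;          /* last input value covered */
        uint32_t entry_count;  /* entries spread across [start, end] */
    };

    struct lut_mode {
        uint32_t id;           /* handle userspace passes back when choosing */
        uint32_t bit_depth;    /* precision of each entry */
        uint32_t num_segments;
        struct lut_segment segments[];
    };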
On Wed, Feb 7, 2018 at 12:01 PM, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> Oh, and I think we're going to have to come up with a fancier uapi for this stuff because in the future the input points may not be evenly spaced (for HDR stuff). Also the hardware may provide various different modes for the gamma LUTs with different tradeoffs. So we may even want to somehow try to enumerate the different modes and let userspace pick the mode that best suits its needs.
That's already the case -- NVIDIA actually has like 5 different LUT modes on recent chips.
https://github.com/envytools/envytools/blob/master/rnndb/display/nv_evo.xml#...
<value value="0x4" name="INTERPOLATE_1025_UNITY_RANGE" variants="GF119-"/> <value value="0x5" name="INTERPOLATE_1025_XRBIAS_RANGE" variants="GF119-"/> <value value="0x6" name="INTERPOLATE_1025_XVYCC_RANGE" variants="GF119-"/> <value value="0x7" name="INTERPOLATE_257_UNITY_RANGE" variants="GF119-"/> <value value="0x8" name="INTERPOLATE_257_LEGACY_RANGE" variants="GF119-"/>
On Thu, Feb 08, 2018 at 07:34:11PM -0500, Ilia Mirkin wrote:
> That's already the case -- NVIDIA actually has like 5 different LUT modes on recent chips.
Yeah, we also have several LUT modes on intel hw. IIRC ~4 on current hw. The main questions are whether all of them are actually useful for userspace, and how we can expose the relevant details to userspace in a succinct and hw independent way.
On 02/05/2018 12:50 AM, Ilia Mirkin wrote:
> mesa:
> The NVIDIA hardware (pre-Kepler) can only do XBGR scanout. Further the nouveau KMS doesn't add XRGB scanout for Kepler+ (although it could). Mesa was only enabled for XRGB, so I've piped XBGR through all the same places:
Wrt. mesa, those patches are now in master and I think we have a bit of a problem under X11+GLX:
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/state_trackers/dri/d...
dri_fill_in_modes() defines MESA_FORMAT_R10G10B10A2_UNORM and MESA_FORMAT_R10G10B10X2_UNORM at the top, in between the BGRX/A formats, ignoring the instructions in the comment there: "The 32-bit RGBA format must not precede the 32-bit BGRA format. Likewise for RGBX and BGRX. Otherwise, the GLX client and the GLX server may disagree on which format the GLXFBConfig represents, resulting in swapped color channels."
RGBA/X formats should only be exposed if (dri_loader_get_cap(screen, DRI_LOADER_CAP_RGBA_ORDERING))
and that is only the case for the Android loader.
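In other words the guard is meant to work like this (sketch of the intent only, not the actual dri_fill_in_modes() code; apart from dri_loader_get_cap()/DRI_LOADER_CAP_RGBA_ORDERING the names are invented):

    #include <stdbool.h>

    /* Only emit RGB-ordered 10bpc configs when the loader can tell the two
     * orderings apart; otherwise keep the BGR-ordered configs first and
     * alone, so GLX client and server agree on what each GLXFBConfig is. */
    static bool should_expose_config(bool format_is_rgb_ordered,
                                     bool loader_has_rgba_ordering_cap)
    {
        return !format_is_rgb_ordered || loader_has_rgba_ordering_cap;
    }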
The GLX code doesn't use the red/green/blueChannelMasks for proper matching of formats, and the server doesn't even transmit those masks to the client in the case of GLX. So whatever 10 bit format comes first will win when building the assignment to GLXFBConfigs.
I looked at the code and how it behaves. In practice Intel gfx works because it's a classic DRI driver with its own method of building the DRIconfigs, and it only exposes the BGR101010 formats, so no danger of mixups. AMD's gallium drivers expose both BGR and RGB ordered 10 bit formats, but due to the ordering, the matching ends up only assigning the desired BGR formats that are good for AMD hw, discarding the RGB formats. nouveau works because it only exposes the desired RGB format for the hw. But with other gallium drivers for some SoCs or future gallium drivers it is not so clear if the right thing will happen. E.g., freedreno seems to support both BGR and RGB 10 bit formats as PIPE_BIND_DISPLAY_TARGET afaics, so I don't know if by luck the right thing would happen?
Afaics EGL does the right thing wrt. channelmask matching of EGLConfigs to DRIconfigs, so we could probably implement dri_loader_get_cap(screen, DRI_LOADER_CAP_RGBA_ORDERING) == TRUE for the EGL loaders.
But for GLX it is not so easy or quick. I looked at whether I could make the server's GLX code send proper channel-mask attributes and have Mesa parse them, but there aren't any GLX tags defined for channel masks, and all the other tags come from official GLX extension headers. I'm not sure what the proper procedure for defining new tags is. Do we have to define a new GLX extension for that and get it in the Khronos registry and then back into the server/mesa code-base?
The current patches in mesa for XBGR also lack enablement pieces for EGL, Wayland and X11 compositing, but that's a different problem.
-mario
On Mon, Mar 5, 2018 at 2:25 AM, Mario Kleiner <mario.kleiner.de@gmail.com> wrote:
> [...] E.g., freedreno seems to support both BGR and RGB 10 bit formats as PIPE_BIND_DISPLAY_TARGET afaics, so I don't know if by luck the right thing would happen?
FWIW freedreno does not presently support 10bpc scanout.
> But for GLX it is not so easy or quick. [...] I'm not sure what the proper procedure for defining new tags is. Do we have to define a new GLX extension for that and get it in the Khronos registry and then back into the server/mesa code-base?
Can all of this be solved by a healthy dose of "don't do that"? i.e. make sure that the DDX only ever exposes one of these at a time? And also make the mesa driver only expose one as a DISPLAY_TARGET?
> The current patches in mesa for XBGR also lack enablement pieces for EGL, Wayland and X11 compositing, but that's a different problem.
EGL/drm and EGL/wayland should be enabled (look at Daniel Stone's patches from a short while back, also upstream now). kmscube (with some patches that are upstream now) and weston both run OK for me. I think EGL/x11 is iffy though - haven't played with it.
-ilia
Cc'ing mesa-dev, which was left out.
On 03/05/2018 01:40 PM, Ilia Mirkin wrote:
> Can all of this be solved by a healthy dose of "don't do that"? i.e. make sure that the DDX only ever exposes one of these at a time? And also make the mesa driver only expose one as a DISPLAY_TARGET?
Yes, if "don't do that" is consistently possible on all future drivers. Under EGL there is matching of channel masks, so only X11+GLX is problematic. Not sure if anything special would need to be done for XWayland, haven't looked at that at all so far. Or the modesetting ddx, which currently assumes xrgb ordering for 10 bit.
> > The current patches in mesa for XBGR also lack enablement pieces for EGL, Wayland and X11 compositing, but that's a different problem.
> EGL/drm and EGL/wayland should be enabled (look at Daniel Stone's patches from a short while back, also upstream now). kmscube (with some patches that are upstream now) and weston both run OK for me. I think EGL/x11 is iffy though - haven't played with it.
There are some from Daniel which unify the handling of formats inside egl, not with any abgr2101010 definitions though. Indeed on master compositing doesn't work for depth 30 windows. I have some patches that fix this, and some hack for EGL/x11 compositing that seems to work. Will send them out soon.
-mario
On Thu, Mar 8, 2018 at 11:57 AM, Mario Kleiner <mario.kleiner.de@gmail.com> wrote:
Yes, if "don't do that" is consistently possible on all future drivers.
I don't think it'd be undue burden for a driver to have to decide on one ordering which is The Way To Do It (tm) for that hw, even if the hw supports both. Could also drop some logic into the glx thing to always pick a specific one in case both are supported, and hopefully the DDX would have identical logic.
> Under EGL there is matching of channel masks, so only X11+GLX is problematic. Not sure if anything special would need to be done for XWayland, haven't looked at that at all so far. Or the modesetting ddx, which currently assumes xrgb ordering for 10 bit.
For the modesetting ddx, it has to switch to drmAddFB2 so that it knows the exact format. No other way around that, unfortunately. But that'll require work, and I'm happy enough that xf86-video-nouveau works (as that is what I recommend to anyone who'll listen).
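For reference, the AddFB2 path lets the ddx say exactly which of the two 30bpp layouts a BO contains; a hedged sketch (the handle/pitch are assumed to come from the pixmap's BO):

    #include <stdint.h>
    #include <drm_fourcc.h>
    #include <xf86drmMode.h>

    /* With AddFB2 the exact fourcc is passed, so depth-30 on nouveau can be
     * DRM_FORMAT_XBGR2101010 instead of the XRGB2101010 that the legacy
     * depth/bpp-based AddFB path implies. */
    static int add_30bpp_fb(int fd, uint32_t width, uint32_t height,
                            uint32_t handle, uint32_t pitch, uint32_t *fb_id)
    {
        uint32_t handles[4] = { handle };
        uint32_t pitches[4] = { pitch };
        uint32_t offsets[4] = { 0 };

        return drmModeAddFB2(fd, width, height, DRM_FORMAT_XBGR2101010,
                             handles, pitches, offsets, fb_id, 0);
    }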
> There are some from Daniel which unify the handling of formats inside egl, not with any abgr2101010 definitions though. Indeed on master compositing doesn't work for depth 30 windows. I have some patches that fix this, and some hack for EGL/x11 compositing that seems to work. Will send them out soon.
D'oh! Those patches were definitely there. I guess they got dropped at some point. Daniel, can you resend those?
-ilia
Hi,
On 8 March 2018 at 17:08, Ilia Mirkin <imirkin@alum.mit.edu> wrote:
> For the modesetting ddx, it has to switch to drmAddFB2 so that it knows the exact format. No other way around that, unfortunately. But that'll require work, and I'm happy enough that xf86-video-nouveau works (as that is what I recommend to anyone who'll listen).
modesetting now uses AddFB2, as of relatively recently.
> D'oh! Those patches were definitely there. I guess they got dropped at some point. Daniel, can you resend those?
Oops. Is this X11 or Wayland compositing? I'll resend those two, but it would probably be better to hold off merging them until you can verify I haven't done anything stupid.
Cheers, Daniel