Upstreaming DAL

List overview All Threads
Download

newer

older

[Bug 94671] [radeonsi] Blue-ish...

[Bug 93178] Textures are filled...

Alex Deucher

23 Mar 2016 23 Mar '16

5:27 p.m.

Our goal is to transition to our new DAL display stack. It is the path we are validating internally for both the open and hybrid stacks and will be the only display stack we support going forward with new asics. When we initially released the patches, there were some rough edges and quite a few things were pointed out that need to be fixed. Some are relatively easy to fix, others will take time. Our goal is to make those changes. We'd like to target 4.7 for upstreaming DAL. To that end, I think it would be easier to review and track our progress if I maintained a public DAL branch and send out patches against that branch rather than respinning giant squashed patches every time we fix something. It's much harder to track what progress has been made. DAL is currently at ~400 patches. We previously tried to squash that down into a bunch of larger commits, but I'm not sure that is particularly easy to review. I'm also not sure it's worth spamming the list with 400 patches. I've posted our current DAL tree here for review: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.7-wip-dal I can certainly send out the patches if that is preferred. We will be sending out all new patches to the list as the further clean up tasks and new features are implemented.

Our goal is to fix the following items in time for 4.7. Additional fixes and re-factoring will continue as we work to address all of the comments and concerns.

Tasks to be completed for 4.7: - Remove (malloc/free/sleep/delay/etc.) wrappers for building DAL in userspace (done) - Switch to using Linux i2c subsystem (done) - Switch to using Linux drm aux interface (in progress) - Convert cursors to planes API (in progress) - Switch to native drm EDID fetching (in progress)

Tasks post 4.7 (we will attempt to fix these sooner as time permits): - Using common Linux infoframe infrastructure - Migrating to common logging infrastructure - Refactoring DC

Current DAL status: - DCE8 Support (Hawaii, Bonaire) - DCE10 Support (Fiji, Tonga) - DCE11 Support (CZ, ST) - DP support (SST, MST, Audio) - HDMI Support - HDMI 2.0 Support (on supported asics) - DVI Support - Full integration with power management - Atomic KMS support - Solid 4K@60 Support

Thanks,

Alex

Show replies by date

Dave Airlie

23 Mar 23 Mar

11:15 p.m.

On 24 March 2016 at 03:27, Alex Deucher alexdeucher@gmail.com wrote:

...

Our goal is to transition to our new DAL display stack. It is the path we are validating internally for both the open and hybrid stacks and will be the only display stack we support going forward with new asics. When we initially released the patches, there were some rough edges and quite a few things were pointed out that need to be fixed. Some are relatively easy to fix, others will take time. Our goal is to make those changes. We'd like to target 4.7 for upstreaming DAL. To that end, I think it would be easier to review and track our progress if I maintained a public DAL branch and send out patches against that branch rather than respinning giant squashed patches every time we fix something. It's much harder to track what progress has been made. DAL is currently at ~400 patches. We previously tried to squash that down into a bunch of larger commits, but I'm not sure that is particularly easy to review. I'm also not sure it's worth spamming the list with 400 patches. I've posted our current DAL tree here for review: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.7-wip-dal I can certainly send out the patches if that is preferred. We will be sending out all new patches to the list as the further clean up tasks and new features are implemented.

Our goal is to fix the following items in time for 4.7. Additional fixes and re-factoring will continue as we work to address all of the comments and concerns.

I think people are focusing on the minor comments and concerns and possibly deliberately ignoring the bigger concern that this code is base is pretty much unmergeable as-is.

Cleaning this up won't be a matter of just removing some memset's, mallocs and moving on.

This code needs redesign by someone with Linux kernel development experience. Alex if you have any other tasks in AMD, I suggest you apply to have them all scrapped and you take this on board.

So much of this code seems to be "architected" design. I mean someone senior draws a bunch of powerpoint slides with nice blocks, with what are abstraction layers between them, then a bunch of other coders take that powerpoint slide and work on a box each until it all comes together. However the abstractions don't survive so well and you end up with a bunch of leaky boxes all co-dependent and joined together but still abstracted.

So let's take a massive step back from the edge here and figure out what we expect a modesetting codebase for a driver in the Linux kernel to look like and write that.

Can we work out what code is actually per-GPU family and what code you want to write per GPU bringup etc.

Because looking at this, for a new DCE variant or GPU you need to write:

DC support: (dc/dce*) lots of code 9k-20k per DCE family DC clock gating support (dc/gpu/dce*) BIOS support: (bios_support/dce*) (can you spot leaky abstraction number one?) 1k lines bandwidth_calcs : wow that file alone is impenetrable, even after coding standards. adapter support: dc/adapter/dce* irq support gpio support asic_capability : per GPU code.

Now tell me what the abstractions bring if you end up having to poke per-DCE holes into each layer to get them to talk.

Maybe we should treat DAL as an example of how to drive the hardware, and evolve the driver to support things.

Like you could start with pulling the bandwidth_calcs into the current driver codebase, and using it, then you could pull the bios parser into the current driver codebase and start cleaning up using that and iterate across things.

So my advice is to spend more time on how a Linux display driver looks and not how to abstract everything away.

I'm actually nearly getting to the point of realising nobody actually understands my concerns here and just trying to write a driver on my own.

I'm still slightly open to some sort of STAGING for this, but I think I'm nearly at the level, that I'd only accept that on the understanding we'd try and evolve the driver to this level, rather than think we can clean DAL to the kernel standards.

Dave.

Daniel Vetter

24 Mar 24 Mar

6:37 a.m.

Just my 2 cents^Wcomments.

On Thu, Mar 24, 2016 at 12:15 AM, Dave Airlie airlied@gmail.com wrote:

...

On 24 March 2016 at 03:27, Alex Deucher alexdeucher@gmail.com wrote:

...
Our goal is to transition to our new DAL display stack. It is the path we are validating internally for both the open and hybrid stacks and will be the only display stack we support going forward with new asics. When we initially released the patches, there were some rough edges and quite a few things were pointed out that need to be fixed. Some are relatively easy to fix, others will take time. Our goal is to make those changes. We'd like to target 4.7 for upstreaming DAL. To that end, I think it would be easier to review and track our progress if I maintained a public DAL branch and send out patches against that branch rather than respinning giant squashed patches every time we fix something. It's much harder to track what progress has been made. DAL is currently at ~400 patches. We previously tried to squash that down into a bunch of larger commits, but I'm not sure that is particularly easy to review. I'm also not sure it's worth spamming the list with 400 patches. I've posted our current DAL tree here for review: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.7-wip-dal I can certainly send out the patches if that is preferred. We will be sending out all new patches to the list as the further clean up tasks and new features are implemented.

Our goal is to fix the following items in time for 4.7. Additional fixes and re-factoring will continue as we work to address all of the comments and concerns.

I think people are focusing on the minor comments and concerns and possibly deliberately ignoring the bigger concern that this code is base is pretty much unmergeable as-is.

Cleaning this up won't be a matter of just removing some memset's, mallocs and moving on.

This code needs redesign by someone with Linux kernel development experience. Alex if you have any other tasks in AMD, I suggest you apply to have them all scrapped and you take this on board.

So much of this code seems to be "architected" design. I mean someone senior draws a bunch of powerpoint slides with nice blocks, with what are abstraction layers between them, then a bunch of other coders take that powerpoint slide and work on a box each until it all comes together. However the abstractions don't survive so well and you end up with a bunch of leaky boxes all co-dependent and joined together but still abstracted.

So let's take a massive step back from the edge here and figure out what we expect a modesetting codebase for a driver in the Linux kernel to look like and write that.

Can we work out what code is actually per-GPU family and what code you want to write per GPU bringup etc.

Because looking at this, for a new DCE variant or GPU you need to write:

DC support: (dc/dce*) lots of code 9k-20k per DCE family DC clock gating support (dc/gpu/dce*) BIOS support: (bios_support/dce*) (can you spot leaky abstraction number one?) 1k lines bandwidth_calcs : wow that file alone is impenetrable, even after coding standards. adapter support: dc/adapter/dce* irq support gpio support asic_capability : per GPU code.

Now tell me what the abstractions bring if you end up having to poke per-DCE holes into each layer to get them to talk.

Maybe we should treat DAL as an example of how to drive the hardware, and evolve the driver to support things.

Still the approach I'd suggest as the one with likely the best end-result, and the fastest way to get immediate improvements upstream. Also probably the least amount of work.

On top of that by chunking it up it's easier for the actual code owners to submit it and you have a much higher chance of getting external review. So much better chance DAL developers learn how to fence for their own things in an open&collaborative environment - assuming its not all Harry typing this much code ;-)

...

Like you could start with pulling the bandwidth_calcs into the current driver codebase, and using it, then you could pull the bios parser into the current driver codebase and start cleaning up using that and iterate across things.

So my advice is to spend more time on how a Linux display driver looks and not how to abstract everything away.

I'm actually nearly getting to the point of realising nobody actually understands my concerns here and just trying to write a driver on my own.

I'm still slightly open to some sort of STAGING for this, but I think I'm nearly at the level, that I'd only accept that on the understanding we'd try and evolve the driver to this level, rather than think we can clean DAL to the kernel standards.

This might work, as in merge DAL now, as-is, knowlingly still steaming. And then put all the focus on cleaning it up in public, with the goal that everyone on the DAL team at least learns the ropes of open collaboration while cleaning things up. Again you'd need to bring all the individual owners into the sunlight here, not just Harry forwarding patches.

But in my experience of trying this at much smaller scale with various groups&projects this has a rather large chance of failure, if engineers&their managers don't yet grok the basics of the upstream process. And to me it looks like the DAL team does prefer to hide behind Harry&Alex, so probably this is the case.

2cents out ;-)

Cheers, Daniel

-- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch

Jerome Glisse

3:23 p.m.

On Wed, Mar 23, 2016 at 01:27:21PM -0400, Alex Deucher wrote:

...

Our goal is to transition to our new DAL display stack. It is the path we are validating internally for both the open and hybrid stacks and will be the only display stack we support going forward with new asics. When we initially released the patches, there were some rough edges and quite a few things were pointed out that need to be fixed. Some are relatively easy to fix, others will take time. Our goal is to make those changes. We'd like to target 4.7 for upstreaming DAL. To that end, I think it would be easier to review and track our progress if I maintained a public DAL branch and send out patches against that branch rather than respinning giant squashed patches every time we fix something. It's much harder to track what progress has been made. DAL is currently at ~400 patches. We previously tried to squash that down into a bunch of larger commits, but I'm not sure that is particularly easy to review. I'm also not sure it's worth spamming the list with 400 patches. I've posted our current DAL tree here for review: https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.7-wip-dal I can certainly send out the patches if that is preferred. We will be sending out all new patches to the list as the further clean up tasks and new features are implemented.

Our goal is to fix the following items in time for 4.7. Additional fixes and re-factoring will continue as we work to address all of the comments and concerns.

So the dal code is kind of revulsing, for two reasons. The first one as to do with the overall state of the code. It is full of dead code, unuse struct, unuse enum, unuse union, ... I pushed a branch on fdo which just barely scratch the surface at removing dead stuff (see dal branch at https://cgit.freedesktop.org/~glisse/linux/log/?h=dal). It is also poorly commented, not even per file top comments giving information about what each file is about, like a brief overlook.

If i focus on the patchset and not on the final result then things are even worse. First a mega patch that adds all bunch of files instead of splitting things up in digestible chunk. Second you have several files that are added, then remove in latter patch, and some are added again latter with other name. Well this can be fix too, by redoing the patchset.

But the second reason is far more troubling and so fundamental that DAL should just never be consider for merging, not even in STAGING. It is about the whole design of it. DAL is like a full replacement of DRM modesetting code all wrap into several layers of abstraction. Seriously DAL is abstraction on top of abstraction on top of abstraction (grep for ->base.base.base). It is just wrong. Like Dave said, it is so obviously wrong that you have to poke hole into your abstraction layer for hardware specific code. So you have hw specific code that call abstraction layer that endup just calling the function above the one you were looking at. The whole layer on top of layer become a maze in which you get lost.

This is not how you do thing. If the DRM abstraction are not good enough then fix them working with others. Hardware is not fundamentaly different, it is not like their are specific brand of display for specific brand of GPU. This is a whole industry that rely on standardization. The DRM core modesetting have grown over the years to adapt not only to new userspace requirement (atomic) but also to new display technology like DP and MST.

So if you feel core DRM modesetting abstractions does not allow for a specific feature then propose change there, in the core DRM modesetting abstraction. Also DAL seems to implement quite a few things that should be move as drm helpers function (whole infoframes stuff feels that way). This is a linux device driver, we share things and improve on each others.

New modesetting code for amdgpu should not be about shoehorn an existing design behind an abstraction layer to interface with core modesetting. It should be about implementing core modesetting API (atomic being the flavor of the day). All this with only small abstraction inside device driver to leverage architectural similarities accross different GPU generation. But it should not be a monster of several thousand lines of code with dozens and dozens of functions callback struct.

I do not see why existing modesetting design (modulo the fact that we need to implement atomic modesetting) does not allow you to implement any of the features you listed (freesync, sync timing accross different display, power features, 6 displays, ...). None of the abstraction are needed to implement any of this. If i am mistaken, take one thing and explain in details why ! What would be the roadblocks with the existing code.

To sum up DAL is not only in an unreviewable state (code mess, patch mess) but it is fundamently wrong. I would not feel confortable having to fix thing inside it and i would forever be fearful of breaking things due to abstraction card castle effect (also known as jenga principle).

Regards, Jérôme

3336

Age (days ago)

3337

Last active (days ago)

dri-devel@lists.freedesktop.org

3 comments

4 participants

tags (0)

participants (4)

Alex Deucher
Daniel Vetter
Dave Airlie
Jerome Glisse