Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
On Mon, 14 Mar 2022 18:16:36 -0400 Lyude Paul lyude@redhat.com wrote:
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
Hi Lyude,
as mentioned in IRC, I believe parallel atomic modesets on separate CRTCs have very limited use, so serialising them in the kernel is good. Any userspace that wants to reliably program more than one CRTC will collect all CRTCs into the same atomic commit.
The reason is that userspace does TEST_ONLY commits first to see if things will work, and then do the real commit. If some other commit lands in between, then the test results are invalid and would need to be re-done. So parallel modesets are a gamble at best.
However, parallel modesets are not just an attack vector, they can happen accidentally through DRM leasing. With DRM leasing, the DRM master gives access to some CRTC to another process, which will be doing modesets of its own. These two processes will race each other, not having any idea what the other one is doing or when.
Thanks, pq
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
On Wed, 2022-03-16 at 13:01 +0200, Ville Syrjälä wrote:
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
JFYI I definitely don't have any kind of plan to try parallel modesetting with MST, I think it'd be near impossible to actually get working correctly for pretty little benefit :). I was just not entirely sure of the work that would be required to get private objects to do the right thing here in parallel modesets (e.g. make sure we wait on all CRTC commits like you mentioned).
Anyway - I looked at the code for this the other day and a solution that seems pretty reasonable for this to me would be to add a hook for DRM private objects which provides drivers a spot to inform the DRM core what drm_crtc_commits need to be waited on before starting a modeset. I should have some patches on the list soon so folks can tell me if what I'm doing looks sensible or not :).
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
OK so - this has become a bit of a larger rabbit hole. I've been putting quite a bit of work into this and I think I'm starting to make some progress - although on a different aspect of this issue. After talking with danvet they realized that we're also potentially not handling encoder stealing with MST correctly - which we need to do in order to know that we're correctly pulling in every related crtc/connector into the state - along with avoiding encoder conflicts if something tries to use a GPU's DP encoder in SST mode while it's driving other displays in MST mode.
So - it seems this will likely need to be solved first before we can deal with ensuring that we perform the correct CRTC waits in atomic commits with the MST helpers. This has been pretty painful to figure out, but I think I'm starting to make some progress - but I'd really appreciate getting some feedback on this approach I've came up with so I maybe can skip having to rewrite it later.
So: to clarify the problem, it boils down to something like this:
State 1: * DP-1 (hosts MST topology, so is disconnected + no encoder) * MST topology * DP-2 (has display) * DP-3 (has display) (In hardware) * drm_encoder A drives: * DP-2 * DP-3 (In software) * drm_encoder A unused * Fake MST drm_encoder B -> DP-2 * Fake MST drm_encoder C -> DP-3
Problems: * DP-1 gets disconnected, MST topology disappears * We disable maybe 1 display * DP-1 is disconnected, suddenly replaced with SST display * Driver tries to assign drm_encoder A to new DP-1 display *** Error! drm_encoder A is still driving fake encoders B and C ***
I'm not sure if the exact above example would actually happen - you might need to do some tricks to get it into such a state. But you get the general idea - there's missing coverage with how we handle encoder conflicts when it comes to encoders that aren't directly handling CRTCs. If we can fix this, I think we should be able to reliably figure out every CRTC involved in modesets like this - and ensure that nonblocking modesets come up with the right CRTC dependencies.
My current idea for handling this is as follows: * Add the following fields to drm_connector_state: * reserved_encoder → an encoder that is "reserved" by the connector, but is not directly attached to a monitor. Note reserved connectors cannot have CRTCs attached to them. When a connector has no more CRTCs reserved, it should lose it's reserved encoder. * dependent_crtcs → a bitmask of CRTCs that depend on this connector's state, but are not directly attached to it. * Add the following fields to drm_crtc_state: * connector_dependency → a connector whose state this CRTC relies on, but is not directly attached to. This connector must be pulled into the atomic state whenever this CRTC requires a modeset.
The reason for adding all of these fields to drm_connector_state and drm_crtc_state is because I don't think it's possible for us to rely on a particular private object being in all atomic states - so we need a way for the DRM core to be able to understand these object relationships on it's own and reference them from any type of atomic state change so that we can pull in dependent CRTCs as needed.
From there, we'd just:
* Add some functions to handle these new fields, something like: * drm_atomic_reserve_crtc_for_connector(crtc, encoder, conn_state) * drm_atomic_release_crtc_from_connector(crtc, conn_state) * Teach the various DRM core functions how to handle these new fields
Does this seem like I'm on the right track so far? JFYI - I've been busy trying to write up some patches for this, but there's definitely a lot of code to go through and change.
On Wed, 2022-03-16 at 16:28 -0400, Lyude Paul wrote:
On Wed, 2022-03-16 at 13:01 +0200, Ville Syrjälä wrote:
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
JFYI I definitely don't have any kind of plan to try parallel modesetting with MST, I think it'd be near impossible to actually get working correctly for pretty little benefit :). I was just not entirely sure of the work that would be required to get private objects to do the right thing here in parallel modesets (e.g. make sure we wait on all CRTC commits like you mentioned).
Anyway - I looked at the code for this the other day and a solution that seems pretty reasonable for this to me would be to add a hook for DRM private objects which provides drivers a spot to inform the DRM core what drm_crtc_commits need to be waited on before starting a modeset. I should have some patches on the list soon so folks can tell me if what I'm doing looks sensible or not :).
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
On Tue, Mar 22, 2022 at 05:37:40PM -0400, Lyude Paul wrote:
OK so - this has become a bit of a larger rabbit hole. I've been putting quite a bit of work into this and I think I'm starting to make some progress - although on a different aspect of this issue. After talking with danvet they realized that we're also potentially not handling encoder stealing with MST correctly - which we need to do in order to know that we're correctly pulling in every related crtc/connector into the state - along with avoiding encoder conflicts if something tries to use a GPU's DP encoder in SST mode while it's driving other displays in MST mode.
So - it seems this will likely need to be solved first before we can deal with ensuring that we perform the correct CRTC waits in atomic commits with the MST helpers. This has been pretty painful to figure out, but I think I'm starting to make some progress - but I'd really appreciate getting some feedback on this approach I've came up with so I maybe can skip having to rewrite it later.
So: to clarify the problem, it boils down to something like this:
State 1:
- DP-1 (hosts MST topology, so is disconnected + no encoder)
- MST topology
- DP-2 (has display)
- DP-3 (has display)
(In hardware)
- drm_encoder A drives:
- DP-2
- DP-3
(In software)
- drm_encoder A unused
- Fake MST drm_encoder B -> DP-2
- Fake MST drm_encoder C -> DP-3
Problems:
- DP-1 gets disconnected, MST topology disappears
- We disable maybe 1 display
- DP-1 is disconnected, suddenly replaced with SST display
- Driver tries to assign drm_encoder A to new DP-1 display
*** Error! drm_encoder A is still driving fake encoders B and C ***
I'm not sure if the exact above example would actually happen - you might need to do some tricks to get it into such a state. But you get the general idea - there's missing coverage with how we handle encoder conflicts when it comes to encoders that aren't directly handling CRTCs. If we can fix this, I think we should be able to reliably figure out every CRTC involved in modesets like this - and ensure that nonblocking modesets come up with the right CRTC dependencies.
My current idea for handling this is as follows:
- Add the following fields to drm_connector_state:
- reserved_encoder → an encoder that is "reserved" by the connector, but is not directly attached to a monitor. Note reserved connectors cannot have CRTCs attached to them. When a connector has no more CRTCs reserved, it should lose it's reserved encoder.
- dependent_crtcs → a bitmask of CRTCs that depend on this connector's state, but are not directly attached to it.
- Add the following fields to drm_crtc_state:
- connector_dependency → a connector whose state this CRTC relies on, but is not directly attached to. This connector must be pulled into the atomic state whenever this CRTC requires a modeset.
The reason for adding all of these fields to drm_connector_state and drm_crtc_state is because I don't think it's possible for us to rely on a particular private object being in all atomic states - so we need a way for the DRM core to be able to understand these object relationships on it's own and reference them from any type of atomic state change so that we can pull in dependent CRTCs as needed.
Why would tracking the mst private state object not be good enough? In any of the modesets which touch mst state you'd need to grab the mst state to change anything anyway (or there's a quite serious driver bug somewhere), so the private object should always be part of actual modeset changes.
Maybe there's some creative ways for drivers to get this wrong, but then I think it'd be better to think about how to prevent those than work around it. Since doing an mst modeset without having the mst state handy sounds rather broken irrespective of nonblocking atomic commit issues.
From there, we'd just:
- Add some functions to handle these new fields, something like:
- drm_atomic_reserve_crtc_for_connector(crtc, encoder, conn_state)
- drm_atomic_release_crtc_from_connector(crtc, conn_state)
- Teach the various DRM core functions how to handle these new fields
Does this seem like I'm on the right track so far? JFYI - I've been busy trying to write up some patches for this, but there's definitely a lot of code to go through and change.
tbh your entire scheme feels like adding commit tracking for private state objects, except we somehow don't track it on the private state itself, but instead spread it all around to semi-related existing modeset objects. And note that wrt the atomic commit machinery, drm_encoder isn't even a modeset object (it has no state of its own and is fully tracked by the (crtc, connector) combo). So this all feels very backwards.
Plus vc4 shows that there's other cases where tracking on the private state object is needed, mst wouldn't be the only thing. Your scheme would not be useful for vc4 at all, only for mst dependency tracking.
Also the encoder has the additional fun that there's multiple fake encoders for a single real mst port, whereas the mst state is an actual single struct per mst port.
Note that drm_crtc_commit is intentionally a stand-alone refcounted thing, so that it can attached to random other pieces for tracking dependencies, and we do attache them to various pieces all over already (for connector and plane switching). Your proposal is inventing a new way to track cross-crtc dependencies, and I'm confused why that's needed.
I guess in all this I don't get in what way your proposal here is better than just adding dependency tracking to private state structs?
Cheers, Daniel
On Wed, 2022-03-16 at 16:28 -0400, Lyude Paul wrote:
On Wed, 2022-03-16 at 13:01 +0200, Ville Syrjälä wrote:
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
JFYI I definitely don't have any kind of plan to try parallel modesetting with MST, I think it'd be near impossible to actually get working correctly for pretty little benefit :). I was just not entirely sure of the work that would be required to get private objects to do the right thing here in parallel modesets (e.g. make sure we wait on all CRTC commits like you mentioned).
Anyway - I looked at the code for this the other day and a solution that seems pretty reasonable for this to me would be to add a hook for DRM private objects which provides drivers a spot to inform the DRM core what drm_crtc_commits need to be waited on before starting a modeset. I should have some patches on the list soon so folks can tell me if what I'm doing looks sensible or not :).
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
-- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat
On Wed, 2022-03-23 at 11:25 +0100, Daniel Vetter wrote:
On Tue, Mar 22, 2022 at 05:37:40PM -0400, Lyude Paul wrote:
OK so - this has become a bit of a larger rabbit hole. I've been putting quite a bit of work into this and I think I'm starting to make some progress - although on a different aspect of this issue. After talking with danvet they realized that we're also potentially not handling encoder stealing with MST correctly - which we need to do in order to know that we're correctly pulling in every related crtc/connector into the state - along with avoiding encoder conflicts if something tries to use a GPU's DP encoder in SST mode while it's driving other displays in MST mode.
So - it seems this will likely need to be solved first before we can deal with ensuring that we perform the correct CRTC waits in atomic commits with the MST helpers. This has been pretty painful to figure out, but I think I'm starting to make some progress - but I'd really appreciate getting some feedback on this approach I've came up with so I maybe can skip having to rewrite it later.
So: to clarify the problem, it boils down to something like this:
State 1: * DP-1 (hosts MST topology, so is disconnected + no encoder) * MST topology * DP-2 (has display) * DP-3 (has display) (In hardware) * drm_encoder A drives: * DP-2 * DP-3 (In software) * drm_encoder A unused * Fake MST drm_encoder B -> DP-2 * Fake MST drm_encoder C -> DP-3
Problems: * DP-1 gets disconnected, MST topology disappears * We disable maybe 1 display * DP-1 is disconnected, suddenly replaced with SST display * Driver tries to assign drm_encoder A to new DP-1 display *** Error! drm_encoder A is still driving fake encoders B and C ***
I'm not sure if the exact above example would actually happen - you might need to do some tricks to get it into such a state. But you get the general idea - there's missing coverage with how we handle encoder conflicts when it comes to encoders that aren't directly handling CRTCs. If we can fix this, I think we should be able to reliably figure out every CRTC involved in modesets like this - and ensure that nonblocking modesets come up with the right CRTC dependencies.
My current idea for handling this is as follows: * Add the following fields to drm_connector_state: * reserved_encoder → an encoder that is "reserved" by the connector, but is not directly attached to a monitor. Note reserved connectors cannot have CRTCs attached to them. When a connector has no more CRTCs reserved, it should lose it's reserved encoder. * dependent_crtcs → a bitmask of CRTCs that depend on this connector's state, but are not directly attached to it. * Add the following fields to drm_crtc_state: * connector_dependency → a connector whose state this CRTC relies on, but is not directly attached to. This connector must be pulled into the atomic state whenever this CRTC requires a modeset.
The reason for adding all of these fields to drm_connector_state and drm_crtc_state is because I don't think it's possible for us to rely on a particular private object being in all atomic states - so we need a way for the DRM core to be able to understand these object relationships on it's own and reference them from any type of atomic state change so that we can pull in dependent CRTCs as needed.
Why would tracking the mst private state object not be good enough? In any of the modesets which touch mst state you'd need to grab the mst state to change anything anyway (or there's a quite serious driver bug somewhere), so the private object should always be part of actual modeset changes.
Maybe there's some creative ways for drivers to get this wrong, but then I think it'd be better to think about how to prevent those than work around it. Since doing an mst modeset without having the mst state handy sounds rather broken irrespective of nonblocking atomic commit issues.
From there, we'd just: * Add some functions to handle these new fields, something like: * drm_atomic_reserve_crtc_for_connector(crtc, encoder, conn_state) * drm_atomic_release_crtc_from_connector(crtc, conn_state) * Teach the various DRM core functions how to handle these new fields
Does this seem like I'm on the right track so far? JFYI - I've been busy trying to write up some patches for this, but there's definitely a lot of code to go through and change.
tbh your entire scheme feels like adding commit tracking for private state objects, except we somehow don't track it on the private state itself, but instead spread it all around to semi-related existing modeset objects. And note that wrt the atomic commit machinery, drm_encoder isn't even a modeset object (it has no state of its own and is fully tracked by the (crtc, connector) combo). So this all feels very backwards.
Plus vc4 shows that there's other cases where tracking on the private state object is needed, mst wouldn't be the only thing. Your scheme would not be useful for vc4 at all, only for mst dependency tracking.
Also the encoder has the additional fun that there's multiple fake encoders for a single real mst port, whereas the mst state is an actual single struct per mst port.
Note that drm_crtc_commit is intentionally a stand-alone refcounted thing, so that it can attached to random other pieces for tracking dependencies, and we do attache them to various pieces all over already (for connector and plane switching). Your proposal is inventing a new way to track cross-crtc dependencies, and I'm confused why that's needed.
I guess in all this I don't get in what way your proposal here is better than just adding dependency tracking to private state structs?
I wonder if I was misunderstanding the issue you were pointing out then. The reason I ended up trying to add this to the connector structs is because I thought one of the issues that you brought up was the fact that we wouldn't be able to handle encoder conflicts with the MST encoder properly because of the fake encoder - and that issue would cause us not to be able to properly determine which CRTCs we need to block on for commits that try to use the real encoder behind the fake encoder later. I had attempted this because in such a situation, it doesn't seem like we'd be guaranteed that the MST manager would actually be in the state if say - the only thing we're doing is trying to enable an SST display on a connector that previously had an MST topology (which has since been disabled in a non-blocking commit that hasn't finished yet).
So I guess I'm not really sure where to go from here then? This whole rabbit hole dive started from me trying to move MST over to using private objects for state tracking as much as possible - which lead to questions on whether or not that would be safe at all with MST because of connectors changing and fake encoders. FWIW The issue that started things was me asking whether we could fill certain information into a private object's state from the context of an atomic commit (since this makes certain aspects of payload table management easier). So I'm really just looking for a way to make these things work, and ensure that we're not doing anything unsafe by using the private state for the topology manager this way.
Cheers, Daniel
On Wed, 2022-03-16 at 16:28 -0400, Lyude Paul wrote:
On Wed, 2022-03-16 at 13:01 +0200, Ville Syrjälä wrote:
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
JFYI I definitely don't have any kind of plan to try parallel modesetting with MST, I think it'd be near impossible to actually get working correctly for pretty little benefit :). I was just not entirely sure of the work that would be required to get private objects to do the right thing here in parallel modesets (e.g. make sure we wait on all CRTC commits like you mentioned).
Anyway - I looked at the code for this the other day and a solution that seems pretty reasonable for this to me would be to add a hook for DRM private objects which provides drivers a spot to inform the DRM core what drm_crtc_commits need to be waited on before starting a modeset. I should have some patches on the list soon so folks can tell me if what I'm doing looks sensible or not :).
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
-- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat
On Wed, Mar 23, 2022 at 02:34:35PM -0400, Lyude Paul wrote:
On Wed, 2022-03-23 at 11:25 +0100, Daniel Vetter wrote:
On Tue, Mar 22, 2022 at 05:37:40PM -0400, Lyude Paul wrote:
OK so - this has become a bit of a larger rabbit hole. I've been putting quite a bit of work into this and I think I'm starting to make some progress - although on a different aspect of this issue. After talking with danvet they realized that we're also potentially not handling encoder stealing with MST correctly - which we need to do in order to know that we're correctly pulling in every related crtc/connector into the state - along with avoiding encoder conflicts if something tries to use a GPU's DP encoder in SST mode while it's driving other displays in MST mode.
So - it seems this will likely need to be solved first before we can deal with ensuring that we perform the correct CRTC waits in atomic commits with the MST helpers. This has been pretty painful to figure out, but I think I'm starting to make some progress - but I'd really appreciate getting some feedback on this approach I've came up with so I maybe can skip having to rewrite it later.
So: to clarify the problem, it boils down to something like this:
State 1: * DP-1 (hosts MST topology, so is disconnected + no encoder) * MST topology * DP-2 (has display) * DP-3 (has display) (In hardware) * drm_encoder A drives: * DP-2 * DP-3 (In software) * drm_encoder A unused * Fake MST drm_encoder B -> DP-2 * Fake MST drm_encoder C -> DP-3
Problems: * DP-1 gets disconnected, MST topology disappears * We disable maybe 1 display * DP-1 is disconnected, suddenly replaced with SST display * Driver tries to assign drm_encoder A to new DP-1 display *** Error! drm_encoder A is still driving fake encoders B and C ***
I'm not sure if the exact above example would actually happen - you might need to do some tricks to get it into such a state. But you get the general idea - there's missing coverage with how we handle encoder conflicts when it comes to encoders that aren't directly handling CRTCs. If we can fix this, I think we should be able to reliably figure out every CRTC involved in modesets like this - and ensure that nonblocking modesets come up with the right CRTC dependencies.
My current idea for handling this is as follows: * Add the following fields to drm_connector_state: * reserved_encoder → an encoder that is "reserved" by the connector, but is not directly attached to a monitor. Note reserved connectors cannot have CRTCs attached to them. When a connector has no more CRTCs reserved, it should lose it's reserved encoder. * dependent_crtcs → a bitmask of CRTCs that depend on this connector's state, but are not directly attached to it. * Add the following fields to drm_crtc_state: * connector_dependency → a connector whose state this CRTC relies on, but is not directly attached to. This connector must be pulled into the atomic state whenever this CRTC requires a modeset.
The reason for adding all of these fields to drm_connector_state and drm_crtc_state is because I don't think it's possible for us to rely on a particular private object being in all atomic states - so we need a way for the DRM core to be able to understand these object relationships on it's own and reference them from any type of atomic state change so that we can pull in dependent CRTCs as needed.
Why would tracking the mst private state object not be good enough? In any of the modesets which touch mst state you'd need to grab the mst state to change anything anyway (or there's a quite serious driver bug somewhere), so the private object should always be part of actual modeset changes.
Maybe there's some creative ways for drivers to get this wrong, but then I think it'd be better to think about how to prevent those than work around it. Since doing an mst modeset without having the mst state handy sounds rather broken irrespective of nonblocking atomic commit issues.
From there, we'd just: * Add some functions to handle these new fields, something like: * drm_atomic_reserve_crtc_for_connector(crtc, encoder, conn_state) * drm_atomic_release_crtc_from_connector(crtc, conn_state) * Teach the various DRM core functions how to handle these new fields
Does this seem like I'm on the right track so far? JFYI - I've been busy trying to write up some patches for this, but there's definitely a lot of code to go through and change.
tbh your entire scheme feels like adding commit tracking for private state objects, except we somehow don't track it on the private state itself, but instead spread it all around to semi-related existing modeset objects. And note that wrt the atomic commit machinery, drm_encoder isn't even a modeset object (it has no state of its own and is fully tracked by the (crtc, connector) combo). So this all feels very backwards.
Plus vc4 shows that there's other cases where tracking on the private state object is needed, mst wouldn't be the only thing. Your scheme would not be useful for vc4 at all, only for mst dependency tracking.
Also the encoder has the additional fun that there's multiple fake encoders for a single real mst port, whereas the mst state is an actual single struct per mst port.
Note that drm_crtc_commit is intentionally a stand-alone refcounted thing, so that it can attached to random other pieces for tracking dependencies, and we do attache them to various pieces all over already (for connector and plane switching). Your proposal is inventing a new way to track cross-crtc dependencies, and I'm confused why that's needed.
I guess in all this I don't get in what way your proposal here is better than just adding dependency tracking to private state structs?
I wonder if I was misunderstanding the issue you were pointing out then. The reason I ended up trying to add this to the connector structs is because I thought one of the issues that you brought up was the fact that we wouldn't be able to handle encoder conflicts with the MST encoder properly because of the fake encoder - and that issue would cause us not to be able to properly determine which CRTCs we need to block on for commits that try to use the real encoder behind the fake encoder later. I had attempted this because in such a situation, it doesn't seem like we'd be guaranteed that the MST manager would actually be in the state if say - the only thing we're doing is trying to enable an SST display on a connector that previously had an MST topology (which has since been disabled in a non-blocking commit that hasn't finished yet).
Oh this was all idle musings trying to figure out whether just relying on the drm_connector syncing is enough, and realizing that it's not enough by far.
So I guess I'm not really sure where to go from here then? This whole rabbit hole dive started from me trying to move MST over to using private objects for state tracking as much as possible - which lead to questions on whether or not that would be safe at all with MST because of connectors changing and fake encoders. FWIW The issue that started things was me asking whether we could fill certain information into a private object's state from the context of an atomic commit (since this makes certain aspects of payload table management easier). So I'm really just looking for a way to make these things work, and ensure that we're not doing anything unsafe by using the private state for the topology manager this way.
So what I think we need here is roughly:
- Move the drm_crtc_commit tracking Maxime added to vc4 private state from vc4 to drm atomic helpers. Both the datastructure and the book-keeping handling.
- To make this all easier we might want to allocate the drm_crtc_commit stuff a lot earlier, so that it's available in atomic_check already. Otherwise a lot more hooks are needed. Essentially allocate it as part of crtc state duplication.
- Then in the mst handling, every time you change anything for a sink add the drm_crtc_commit from its connector as a sync point so we order stuff correctly. I'm not sure we have this already.
It is a bit a mess :-(
The other thing I'm pondering is whether we should just give up on nonblocking atomic commits, but unfortunately it's not that easy because at least in general we do sometimes upgrade page flips to modesets under the hood, while lying to userspace. This is done in the self refresh helpers, and it might spread.
If we just completely nuke nonblocking modesets then that wouldn't work anymore, even though userspace itself does not seem to rely on these.
For details it's probably best to sort them out with Maxime, he's got a pile of experience in making this work for at least the specific case of vc4. -Daniel
Cheers, Daniel
On Wed, 2022-03-16 at 16:28 -0400, Lyude Paul wrote:
On Wed, 2022-03-16 at 13:01 +0200, Ville Syrjälä wrote:
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it. If this is all correct, I'd definitely appreciate being able to take a look at your work on this to see how I can help move things forward. I've got a WIP of my atomic only MST branch as well:
https://gitlab.freedesktop.org/lyudess/linux/-/commits/wip/mst-atomic-only-v...
However it's very certainly broken right now (it compiles and I had thought it worked already, but I realized I totally forgot to come up with a way of doing bookkeeping for VC start slots atomically - which is what led me down this current rabbit hole), but it should at least give a general idea of what I'm trying to do.
Anyway, let me know what you think.
For MST in particular parallel modeset on the same physical link sounds pretty crazy to me. Trying to make sure everything happens in the right order would not be pleasant. I think a simple solution would be just to add all the crtcs on the affected link to the state and call it a day.
JFYI I definitely don't have any kind of plan to try parallel modesetting with MST, I think it'd be near impossible to actually get working correctly for pretty little benefit :). I was just not entirely sure of the work that would be required to get private objects to do the right thing here in parallel modesets (e.g. make sure we wait on all CRTC commits like you mentioned).
Anyway - I looked at the code for this the other day and a solution that seems pretty reasonable for this to me would be to add a hook for DRM private objects which provides drivers a spot to inform the DRM core what drm_crtc_commits need to be waited on before starting a modeset. I should have some patches on the list soon so folks can tell me if what I'm doing looks sensible or not :).
i915 already does that on modern platforms actually because the hardware architecture kinda needs it. Although we could perhaps optimize it a bit to skip it in some cases, but not sure the extra complexity would really be justified.
In i915 we also serialize *all* modesets by running them on a ordered wq (+ explicit flush_workqueue() to serialize non-blocking vs. blocking modesets). We did semi-accidentally enable parallel modesets once but I undid that because there was just way too much pre-existing code that wasn't even considering the possibility of a parallel modeset, and I didn't really feel like reviewing the entire codebase to find all of it.
-- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat
-- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat
Hi Lyude,
On Mon, Mar 14, 2022 at 06:16:36PM -0400, Lyude Paul wrote:
Hi! First a little bit of background: I've recently been trying to get rid of all of the non-atomic payload bandwidth management code in the MST helpers in order to make it easier to implement DSC and fallback link rate retraining support down the line. Currently bandwidth information is stored in two places, partially in the MST atomic state and partially in the mst manager's payload table (which exists outside of the atomic state and has its own locking). The portions in the atomic state are used to try to determine if a given display configuration can fit within the given bandwidth limitations during the atomic check phase, and are implemented through the use of private state objects.
Yeah, this looks awfully similar to what vc4 is doing.
The way the hardware is setup is that there's a device (HVS) that does the composition of all planes, over 3 FIFO, and each FIFO can be output to a CRTC with a mux. The HVS then needs to raise its internal clock rate depending on the load (based on the resolution and number of planes, mostly) of each FIFO.
Thus we have a private object (vc4_hvs_state) that lists all the FIFOs, with their associated loads, and we pull that state in for each commit.
My current goal has been to move as much of this as possible over to the atomic state and entirely get rid of the payload table along with it's locks. My main reason for doing this is that it both decomplicates things quite a bit, and I'm really also hoping that getting rid of that payload code will make it clearer to others how it works - and stop the influx of bandaid patches (e.g. adding more and more special cases to MST to fix poorly understood issues being hit in one specific driver and nowhere else) that I've had to spend way more time then I'd like trying to investigate and review.
So, the actual problem: following a conversation with Daniel Vetter today I've gotten the impression that private modesetting objects are basically just broken with parallel modesets? I'm still wrapping my head around all of this honestly, but from what I've gathered: CRTC atomic infra knows how to do waits in the proper places for when other CRTCs need to be waited on to continue a modeset, but there's no such tracking with private objects. If I understand this correctly, that means that even if two CRTC modesets require pulling in the same private object state for the MST mgr: we're only provided a guarantee that the atomic checks pulling in that private object state won't concurrently. But when it comes to commits, it doesn't sound like there's any actual tracking for this and as such - two CRTC modesets which have both pulled in the MST private state object are not prevented from running concurrently.
Yeah, in our case the issue was two-fold:
- The first one is that since the load is shared over each CRTC, we need, for each commit, to have the global load. This is non-obvious because you might get some new states that only affect a single CRTC, or a plane but not its CRTC, so making sure we always get the entire picture was a bit challenging.
You'll need to pull the state at each commit in atomic_check, and then basically remove the old state of whatever is in your commit, and add the new stuff. Just iterating over all the connectors in a state for example won't work.
- Then, indeed, there's a race issue. IIRC, the basic idea is that if two (non-blocking) commits don't share any resources (so like both have a single plane, CRTC and connector, but none shared), there's no execution ordering guaranteed by default. But there is one in the structures: you still have your old and new states.
So if you commit your changes like this:
* Initial State * State 1 * State 2
Your old private object in state 1 will be the initial state one, the new will be the one from state 1. And for state 2, the old will be state 1, the new will be state 2.
But if state 2 gets committed first, then your old state is weird already, because it's actually the next one. It get worse when state 1 gets committed since the old state is the initial state, but since state 2 has been committed the initial state has been freed already.
This unfortunately throws an enormous wrench into the MST atomic conversion I've been working on - as I was under the understanding while writing the code for this that all objects in an atomic state are blocked from being used in any new atomic commits (not checks, as parallel checks should be fine in my case) until there's no commits active with said object pulled into the atomic state. I certainly am not aware of any way parallel modesetting could actually be supported on MST, so it's not really a feature we want to deal with at all besides stopping it from happening. This also unfortunately means that the current atomic modesetting code we have for MST is potentially broken, although I assume we've never hit any real world issues with it because of the non-atomic locking we currently have for the payload table.
So, Daniel had mentioned that supposedly you've been dealing with similar issues with VC4 and might have already made progress on coming up with ways to deal with it.
This is all upstream already :)
You can have a look at vc4_atomic_commit_setup, that pull the HVS state in, and associates the drm_crtc_commit to the HVS state.
Then, vc4_atomic_commit_tail, we'll wait for the old HVS state drm_crtc_commit before moving forward.
It's not really challenging: the basic idea is that whenever your commit start you would wait for the drm_crtc_commit to be completed, forcing the ordering.
This requires some code in atomic_commit path though, so it might be difficult to integrate properly in the drivers that would use MST.
It would be an interesting discussion, I have a similar issue with HDMI scrambling support: I'd like to have most of the logic generic, but making sure all the HDMI drivers (but only them) use that generic code properly is challenging.
Maxime
dri-devel@lists.freedesktop.org