Hi,
I've tried to bring-up the DSI controller on the RaspberryPi4, and I've just encountered something that could make it troublesome to support.
For context, the RaspberryPi has an official panel that uses a DSI->DPI bridge, a DPI panel, a touchscreen and a small micro-controller. That microcontroller controls the power management on the screen, so communicating with it is very much needed, and it's done through an i2c bus.
To reflect that, the entire panel has been described in the Device Tree as an I2C device (since that's how you would control it), together with an OF-Graph endpoint linking that i2c device to the DSI controller[1].
That deviates a bit from the generic DSI binding though[2], since it requires that the panel should be a subnode of the DSI controller (which also makes sense since DCS commands is usually how you would control that device).
This is where the trouble begins. Since the two devices are on entirely different buses, there's basically no guarantee on the probe order. The driver has tried to address this by using the OF-Graph and the component framework. Indeed, the DSI driver (component-based) will register a MIPI-DSI host in its probe, and call component_add[3]. If component_add fails, it will remove the DSI host and return the error code. It makes sense.
The panel on the other hand will probe, and look for a DSI host through the OF-Graph [4]. If it isn't there, it will return EPROBE_DEFER, hoping that it will be available at some point. It also makes complete sense.
Where the issue lies is that component_add has two very different behaviours here that will create the bug that we see on the RPi4:
- If there's still components to probe, component_add will simply return 0 [5][6]
- And if we're the last component to probe, component_add will then run all the bind callbacks and return the result on error of the master bind callback[7]. That master bind will usually have component_bind_all that will return the result of the bind callback of each component.
Now, on the RPi4, the last component to probe is the DSI controller since it relies on a power-domain exposed by the firmware driver, so the driver core will defer its probe function until the power-domain is there [8]. We're thus pretty much guaranteed to fall in the second case above and run through all the bind callbacks. The DSI bind callback however will try to find and attach its panel, and return EPROBE_DEFER if it doesn't find it[9]. That error will then be propagated to the return code of component_bind_all, then to the master bind callback, and finally will be the return code of component_add.
And since component_add is failing, we remove the DSI host. Since the DSI host isn't there, on the next occasion the i2c panel driver will not probe either, and we enter a loop that cannot really be broken, since one depends on the other.
This was working on the RPi3 because the DSI is not the last driver to probe: indeed the v3d is depending on the same power domain[10][11] and is further down the list of components to add in the driver [12], so we're always in the first component_add case for DSI above, the DSI host sticks around, and the i2c driver can probe.
I'm not entirely sure how we can fix that though. I guess the real flaw here is the assumption that component_add will not fail if one of the bind fails, which isn't true, but we can't really ignore those errors either since it might be something else than DSI that returns that error.
One way to work around it is to make the mailbox, firmware and power domain drivers probe earlier by tweaking the initcalls at which they register, but it's not really fixing anything and compiling them as module would make it broken again.
Maxime
1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... 2: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... 3: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 4: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 5: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 6: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 7: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 8: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 9: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 10: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch... 11: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch... 12: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...
On 30.06.2020 15:27, Maxime Ripard wrote:
Hi,
I've tried to bring-up the DSI controller on the RaspberryPi4, and I've just encountered something that could make it troublesome to support.
For context, the RaspberryPi has an official panel that uses a DSI->DPI bridge, a DPI panel, a touchscreen and a small micro-controller. That microcontroller controls the power management on the screen, so communicating with it is very much needed, and it's done through an i2c bus.
To reflect that, the entire panel has been described in the Device Tree as an I2C device (since that's how you would control it), together with an OF-Graph endpoint linking that i2c device to the DSI controller[1].
That deviates a bit from the generic DSI binding though[2], since it requires that the panel should be a subnode of the DSI controller (which also makes sense since DCS commands is usually how you would control that device).
This is where the trouble begins. Since the two devices are on entirely different buses, there's basically no guarantee on the probe order. The driver has tried to address this by using the OF-Graph and the component framework. Indeed, the DSI driver (component-based) will register a MIPI-DSI host in its probe, and call component_add[3]. If component_add fails, it will remove the DSI host and return the error code. It makes sense.
The panel on the other hand will probe, and look for a DSI host through the OF-Graph [4]. If it isn't there, it will return EPROBE_DEFER, hoping that it will be available at some point. It also makes complete sense.
Where the issue lies is that component_add has two very different behaviours here that will create the bug that we see on the RPi4:
If there's still components to probe, component_add will simply return 0 [5][6]
And if we're the last component to probe, component_add will then run all the bind callbacks and return the result on error of the master bind callback[7]. That master bind will usually have component_bind_all that will return the result of the bind callback of each component.
Now, on the RPi4, the last component to probe is the DSI controller since it relies on a power-domain exposed by the firmware driver, so the driver core will defer its probe function until the power-domain is there [8]. We're thus pretty much guaranteed to fall in the second case above and run through all the bind callbacks. The DSI bind callback however will try to find and attach its panel, and return EPROBE_DEFER if it doesn't find it[9]. That error will then be propagated to the return code of component_bind_all, then to the master bind callback, and finally will be the return code of component_add.
And since component_add is failing, we remove the DSI host. Since the DSI host isn't there, on the next occasion the i2c panel driver will not probe either, and we enter a loop that cannot really be broken, since one depends on the other.
This was working on the RPi3 because the DSI is not the last driver to probe: indeed the v3d is depending on the same power domain[10][11] and is further down the list of components to add in the driver [12], so we're always in the first component_add case for DSI above, the DSI host sticks around, and the i2c driver can probe.
I'm not entirely sure how we can fix that though. I guess the real flaw here is the assumption that component_add will not fail if one of the bind fails, which isn't true, but we can't really ignore those errors either since it might be something else than DSI that returns that error.
One way to work around it is to make the mailbox, firmware and power domain drivers probe earlier by tweaking the initcalls at which they register, but it's not really fixing anything and compiling them as module would make it broken again.
Forgive me - I have not read whole post, but I hope you have a problem already solved.
As I understand you have:
1. Componentized DSI-host.
2. Some sink laying on DSI bus.
General rule I promote: "do not expose functionality, until you have all dependencies", in this case it would be "do not call component_add until you know sink(your dependency) is ready".
Already tested solution (to be checked in drivers):
1. In DSI-host you register dsi bus in probe, but do not call component_add.
2. In DSI-host callback informing about DSI device registration you get the sink and since you have all resources then you call component_add.
I hope it will be helpful.
Regards
Andrzej
Maxime
1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... 2: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Docu... 3: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 4: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 5: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 6: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 7: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 8: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 9: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv... 10: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch... 11: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch... 12: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/driv...
dri-devel mailing list dri-devel@lists.freedesktop.org https://protect2.fireeye.com/url?k=44989d8e-194c21e6-449916c1-0cc47a3356b2-0...
Hi
On Fri, Jul 03, 2020 at 05:47:08PM +0200, Andrzej Hajda wrote:
On 30.06.2020 15:27, Maxime Ripard wrote:
I've tried to bring-up the DSI controller on the RaspberryPi4, and I've just encountered something that could make it troublesome to support.
For context, the RaspberryPi has an official panel that uses a DSI->DPI bridge, a DPI panel, a touchscreen and a small micro-controller. That microcontroller controls the power management on the screen, so communicating with it is very much needed, and it's done through an i2c bus.
To reflect that, the entire panel has been described in the Device Tree as an I2C device (since that's how you would control it), together with an OF-Graph endpoint linking that i2c device to the DSI controller[1].
That deviates a bit from the generic DSI binding though[2], since it requires that the panel should be a subnode of the DSI controller (which also makes sense since DCS commands is usually how you would control that device).
This is where the trouble begins. Since the two devices are on entirely different buses, there's basically no guarantee on the probe order. The driver has tried to address this by using the OF-Graph and the component framework. Indeed, the DSI driver (component-based) will register a MIPI-DSI host in its probe, and call component_add[3]. If component_add fails, it will remove the DSI host and return the error code. It makes sense.
The panel on the other hand will probe, and look for a DSI host through the OF-Graph [4]. If it isn't there, it will return EPROBE_DEFER, hoping that it will be available at some point. It also makes complete sense.
Where the issue lies is that component_add has two very different behaviours here that will create the bug that we see on the RPi4:
If there's still components to probe, component_add will simply return 0 [5][6]
And if we're the last component to probe, component_add will then run all the bind callbacks and return the result on error of the master bind callback[7]. That master bind will usually have component_bind_all that will return the result of the bind callback of each component.
Now, on the RPi4, the last component to probe is the DSI controller since it relies on a power-domain exposed by the firmware driver, so the driver core will defer its probe function until the power-domain is there [8]. We're thus pretty much guaranteed to fall in the second case above and run through all the bind callbacks. The DSI bind callback however will try to find and attach its panel, and return EPROBE_DEFER if it doesn't find it[9]. That error will then be propagated to the return code of component_bind_all, then to the master bind callback, and finally will be the return code of component_add.
And since component_add is failing, we remove the DSI host. Since the DSI host isn't there, on the next occasion the i2c panel driver will not probe either, and we enter a loop that cannot really be broken, since one depends on the other.
This was working on the RPi3 because the DSI is not the last driver to probe: indeed the v3d is depending on the same power domain[10][11] and is further down the list of components to add in the driver [12], so we're always in the first component_add case for DSI above, the DSI host sticks around, and the i2c driver can probe.
I'm not entirely sure how we can fix that though. I guess the real flaw here is the assumption that component_add will not fail if one of the bind fails, which isn't true, but we can't really ignore those errors either since it might be something else than DSI that returns that error.
One way to work around it is to make the mailbox, firmware and power domain drivers probe earlier by tweaking the initcalls at which they register, but it's not really fixing anything and compiling them as module would make it broken again.
Forgive me - I have not read whole post, but I hope you have a problem already solved.
As I understand you have:
Componentized DSI-host.
Some sink laying on DSI bus.
General rule I promote: "do not expose functionality, until you have all dependencies", in this case it would be "do not call component_add until you know sink(your dependency) is ready".
Already tested solution (to be checked in drivers):
In DSI-host you register dsi bus in probe, but do not call component_add.
In DSI-host callback informing about DSI device registration you get the
sink and since you have all resources then you call component_add.
That's a great idea :)
I just tested and it works, so it ended up to much easier to fix than I anticipated :)
Thanks! Maxime
dri-devel@lists.freedesktop.org