On Fri, 8 Apr 2022 at 16:56, Doug Anderson dianders@chromium.org wrote:
Hi,
On Fri, Apr 8, 2022 at 5:13 AM Dmitry Baryshkov dmitry.baryshkov@linaro.org wrote:
On Fri, 8 Apr 2022 at 03:28, Doug Anderson dianders@chromium.org wrote:
Hi,
On Thu, Apr 7, 2022 at 4:36 PM Dmitry Baryshkov dmitry.baryshkov@linaro.org wrote:
The ps8640 driver looks 'working by coincidence'. It calls dp_aux_populate, then immediately after the function returns it checks for the panel. If panel-edp is built as a module, the probe might fail easily. The anx7625 driver has the same kind of issue. The DP AUX bus is populated from the probe() and after some additional work the panel is being checked. This design is fragile and from my quick glance it can break (or be broken) too easy. It reminds me of our drm msm 'probe' loops preventing the device to boot completely if the dsi bridge/panel could not be probed in time.
I did spend some time thinking about this, at least for ps8640. I believe that as long as the panel's probe isn't asynchronous.
By panel probe (as a probe of any device) is potentially asynchronous. As in your example, the PWM might not be present, the regulator probe might have been delayed, the panel-edp might be built as a module, which is not present for some reason.
Well, in those cases it's not exactly asynchronous, or at least not in the way I was thinking about. Either it will work now, or we should try again later when more drivers have probed. The case I was worried about was:
We call devm_of_dp_aux_populate_ep_devices()
devm_of_dp_aux_populate_ep_devices() kicks off a probe to the panel
in the background
- devm_of_dp_aux_populate_ep_devices() returns to us without waiting
for the panel's probe to finish.
I think that's possible if the panel driver sets PROBE_PREFER_ASYNCHRONOUS.
That would be a separate story, yes. However the general case is still valid: one can not guarantee that the panel device is available immediately after aux bus population. So ps8640 works at this moment and in the particular configuration.
I was less worried about the resources missing since I thought that would be handled by the normal probe deferral case. IIRC the "infinite loop" that we used to end up with MSM's probe was because something about the way the MSM DRM driver worked would cause the deferral mechanisms to retry instantly each time we deferred. I don't remember exactly what triggered it, but I don't _think_ that's true for ps8640?
I'm not sure either. If you have a system with that bridge, it can be easily tried by returning -EPROBE_DEFER from the panel driver's probe().
For the msm driver it was the following sequence of events: - mdss probes - mdss creates subdevices including dsi host - subdevices are probed - mdss drivers tries to perform component binding - dsi host determines that the next item is not available - it returns -EPROBE_DEFER to the component bind - mdss reverts registration of subdevices, returning probe defer.
However as there were devices added to the device list, the deferral list was retried immediately. Thus we faced the probe loop.
I think this looks close to the ps8640, but I wouldn't bet on that.
Basically if the panel isn't ready then ps8640 should return and we'll retry later. I do remember the probe loops that we used to have with msm and I don't _think_ this would trigger it.
I don't have proof here. But as I wrote, this thing might break at some point.
That being said, if we need to separate out the AUX bus into a sub-device like we did in sn65dsi86 we certainly could.
Ideally we should separate the "bridge" subdevice, like it was done in sn65dsi86. So that the aux host probes, registers the EP device, then the bridge device can not be probed (and thus the drm_bridge is not created) until the next bridge becomes available.
You're definitely right, that's best. I was hesitant to force the issue during review of the ps8640 because it adds a bunch of complexity and didn't _seem_ to be needed. Maybe it makes sense to just code it up, though...
As I wrote, the test is easy. If the system boots fine, there is no need to fix that immediately. However I think in general we should stop depending on the panel being available right after populating the aux bus.
-- With best wishes Dmitry