On 19/11/2020 19:35, Marc Zyngier wrote:
On 2020-11-19 18:13, Jerome Brunet wrote:
On Thu 19 Nov 2020 at 19:04, Guillaume Tucker guillaume.tucker@collabora.com wrote:
Hi Marc,
On 19/11/2020 11:58, Marc Zyngier wrote:
On 2020-11-19 10:26, Neil Armstrong wrote:
On 19/11/2020 11:20, Marc Zyngier wrote:
On 2020-11-19 08:50, Guillaume Tucker wrote: > Please see the automated bisection report below about some kernel > errors on meson-gxbb-p200. > > Reports aren't automatically sent to the public while we're > trialing new bisection features on kernelci.org, however this one > looks valid. > > The bisection started with next-20201118 but the errors are still > present in next-20201119. Details for this regression: > > https://kernelci.org/test/case/id/5fb6196bfd0127fd68d8d902/ > > The first error is: > > [ 14.757489] Internal error: synchronous external abort: 96000210 > [#1] PREEMPT SMP
Looks like yet another clock ordering setup. I guess different Amlogic platforms have slightly different ordering requirements.
Neil, do you have any idea of which platform requires which ordering? The variability in DT and platforms is pretty difficult to follow (and I don't think I have such board around).
The requirements should be the same, here the init was done before calling dw_hdmi_probe to be sure the clocks and internals resets were deasserted. But since you boot from u-boot already enabling these, it's already active.
The solution would be to revert and do some check in meson_dw_hdmi_init() to check if already enabled and do nothing.
A better fix seems to be this, which makes it explicit that there is a dependency between some of the registers accessed from meson_dw_hdmi_init() and the iahb clock.
Guillaume, can you give this a go on your failing box?
I confirm it solves the problem. Please add this to your fix patch if it's OK with you:
Reported-by: "kernelci.org bot" bot@kernelci.org Tested-by: Guillaume Tucker guillaume.tucker@collabora.com
For the record, it passed all the tests when applied on top of the "bad" revision found by the bisection:
http://lava.baylibre.com:10080/scheduler/alljobs?search=v5.10-rc3-1021-gb866...
and the exact same test on the "bad" revision without the fix consistently showed the error:
http://lava.baylibre.com:10080/scheduler/job/374176
Thanks, Guillaume
diff --git a/drivers/gpu/drm/meson/meson_dw_hdmi.c b/drivers/gpu/drm/meson/meson_dw_hdmi.c index 7f8eea494147..52af8ba94311 100644 --- a/drivers/gpu/drm/meson/meson_dw_hdmi.c +++ b/drivers/gpu/drm/meson/meson_dw_hdmi.c @@ -146,6 +146,7 @@ struct meson_dw_hdmi { struct reset_control *hdmitx_ctrl; struct reset_control *hdmitx_phy; struct clk *hdmi_pclk; + struct clk *iahb_clk; struct clk *venci_clk; struct regulator *hdmi_supply; u32 irq_stat; @@ -1033,6 +1034,13 @@ static int meson_dw_hdmi_bind(struct device *dev, struct device *master, } clk_prepare_enable(meson_dw_hdmi->hdmi_pclk);
+ meson_dw_hdmi->iahb_clk = devm_clk_get(dev, "iahb"); + if (IS_ERR(meson_dw_hdmi->iahb_clk)) { + dev_err(dev, "Unable to get iahb clk\n"); + return PTR_ERR(meson_dw_hdmi->iahb_clk); + } + clk_prepare_enable(meson_dw_hdmi->iahb_clk);
On previous SoCs, iahb was directly the bus clock (clk81), and on recent socs this clock is a gate.
The question is why is it disabled. Maybe a previous failed probe disabled it in the dw-hdmi probe failure code and this clock is needed for meson_dw_hdmi_init(), so yeah this is the right fix.
Thanks.
Could you send a revert of b33340e33acdfe5ca6a5aa1244709575ae1e0432 and then proper fix with clk_disable_unprepare added ?
If you guys are going ahead with this fix, this call to clk_prepare_enable() needs to be balanced with clk_disable_unprepare() somehow
Yup, good point.
Although this driver *never* disables any clock it enables, and leaves it to the main DW driver, which I guess makes it leak references.
So all 3 clocks need fixing.
Exact.
Thx Guillaume for testing,
Neil
Thanks,
M.