Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there

18 Oct 2011


      On 10/18/2011 11:14, Jean Delvare wrote:
...
Hi Dave, Eugeni, Alex,
On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
...
...
This allows to avoid talking to a non-existent bus repeatedly until we
finally timeout. The non-existent bus is signalled by -ENXIO error,
provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use
drm_get_edid would benefit for this timeout.
This change should fix
https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall
edid detection timing by 10-30% in most cases, and by a much larger margin
in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
This seems to be the wrong place for the test. The code below is really
testing for non-responsive (possibly not present) EDID, NOT
"non-existent adapter". Non-existent adapter should be checked in the
firmware if possible, or failing that, by testing the bus lines at bus
creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting
has been known to cause trouble recently (not per se but because it was
triggering a bug somewhere else in the radeon driver), it might still
have value, and could be changed to a per-adapter setting by exporting
the test function (I have a patch ready doing exactly this) and letting
video drivers test their I2C buses and discard the non-working ones if
they want.
I am worried that the patch below will cause regressions on other
machines. There's a comment right before the loop which reads:
/* The core i2c driver will automatically retry the transfer if the
    * adapter reports EAGAIN. However, we find that bit-banging transfers
    * are susceptible to errors under a heavily loaded machine and
    * generate spurious NAKs and timeouts. Retrying the transfer
    * of the individual block a few times seems to overcome this.
    */
So the retries are there for a reason, and -ENXIO is exactly what you
get on spurious NAKs.
Hi Jean,
You are right about the bit_test=1 testing, my initial attempt at the
fix did exactly that
(https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).
The problem is that for some buses, namely HDMI ones in my testing,
bit_test-like tests always consider them as non-existent when the
connectivity is not setup; and as working when it is. So in any case, we
could not just blacklist them - when they do, they are gone for good,
and won't work anymore, and we need to keep re-trying them at each attempt.
And in case of continuous pre-testing for the stuck bits and like when
driver attempts to get the edid (for example, when xrandr is run), we
still hit the same issue - the drm_do_probe_ddc_edid will continue to
retry the attempts until it reaches the maximum number of retries. E.g., there
is no way to tell drm_do_probe_ddc_edid to treat any return code as a
permanent failure and make it give up faster.
It could be fixed either in per-driver fashion, like I did with the other patch
(which also tests for -ENXIO); or in a generic way, in DRM. I couldn't reproduce
any false positives coming from -ENXIO on i915 driver, but perhaps it is
easier with radeon? Do you have any specific workload which trick the
driver into generating spurious NAKs by a chance?
...
One thing which may make sense would be to make the retry count a
module parameter, instead of a hard-coded value, so that users can
lower the value if they care more about boot time than reliability. But
again, ideally problematic buses would not have been created in the
first place.
Or perhaps it would be possible to have any error code coming from
i2c_transfer to signal a permanent error, which should not be retried..
what do you think?
--
Eugeni Dodonov

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

Re: [Intel-gfx] [PATCH 1/2] Give up on edid retries when i2c tells us that bus is not there