Those are two identical fixes for improving EDID detection timings, which also fix extremely slow xrandr queries in case of phantom outputs (https://bugs.freedesktop.org/show_bug.cgi?id=41059)
The first fix is a small change to drm_edid, which prevents it from talking to non-existent adapters by detecting them faster.
The second fix replicates the first one within the i915 driver. It does the same thing, but without touching core DRM files.
Those are some of the testing results from our QA team:
Regressions and functional testing: Machine+ports result Ironlake(mobile) eDP/VGA pass Ironlake(desktop) DP/VGA pass Ironlake(mobile) LVDS/VGA/DP pass G45(desktop) VGA/HDMI pass Pineview(mobile) LVDS/VGA pass
xrandr performance: Without patch with patch E6510(Ironlake mobile) 0.119 0.111 PK1(Ironlake desktop) 0.101 0.080 T410b(Ironlake mobile) 0.406 0.114 G45b( G45 desktop) 0.121 0.091 Pnv1(Pineview mobile) 0.043 0.040
Those are the results for machines affected by phantom outputs issue, based on fd.o #41059 feedback:
xrandr performance: Without patch with patch System 1 0.840 0.290 System 2 0.690 0.140 System 3 0.315 0.280 System 4 0.175 0.140 System 6 (original issue) 4s 0.184
We have observed no regressions in any cases, and performance improvements of 20-30% for edid detection timing. Combining it with the results obtained at https://bugs.freedesktop.org/show_bug.cgi?id=41059, besides those improvements it also improves xrandr timing by up to 20x in the worst case of phantom outputs.
I believe that the better way to fix this is via the drm_get_edid() fix, as it is a one-line fix and could benefit all other chipsets. And we won't have to reinvent the wheel with intel_drm_get_edid, which only duplicates the existent functionality with no additional benefits.
Could we have any feedback or reviewed-by or from non-intel drm maintainers?
Thanks!
Eugeni Dodonov (2): Give up on edid retries when i2c tells us that bus is not there Check if the bus is valid prior to discovering edid.
drivers/gpu/drm/drm_edid.c | 5 ++++ drivers/gpu/drm/i915/intel_crt.c | 46 ++++++++++++++++++------------------ drivers/gpu/drm/i915/intel_dp.c | 4 +- drivers/gpu/drm/i915/intel_drv.h | 3 +- drivers/gpu/drm/i915/intel_hdmi.c | 4 +- drivers/gpu/drm/i915/intel_i2c.c | 42 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/i915/intel_lvds.c | 2 +- drivers/gpu/drm/i915/intel_modes.c | 29 +---------------------- drivers/gpu/drm/i915/intel_sdvo.c | 4 +- 9 files changed, 80 insertions(+), 59 deletions(-)
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com --- drivers/gpu/drm/drm_edid.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..1bca6d7 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2); + if (ret == -ENXIO) { + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n", + adapter->name); + break; + } } while (ret != 2 && --retries);
return ret == 2 ? 0 : -1;
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov eugeni.dodonov@intel.com wrote:
if (ret == -ENXIO) {
DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
adapter->name);
break;
}
This seems good to me; are there additional error values which should also be considered fatal and not subject to retry?
Reviewed-by: Keith Packard keithp@keithp.com
-keith
On Mon, Oct 17, 2011 at 18:41, Keith Packard keithp@keithp.com wrote:
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov < eugeni.dodonov@intel.com> wrote:
if (ret == -ENXIO) {
DRM_DEBUG_KMS("drm: skipping non-existent adapter
%s\n",
adapter->name);
break;
}
This seems good to me; are there additional error values which should also be considered fatal and not subject to retry?
From what I've checked, the other return error value in this context could
be -EREMOTEIO, which could be caused by transmission error so it should be retried.
On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov eugeni@dodonov.net wrote:
From what I've checked, the other return error value in this context could be -EREMOTEIO, which could be caused by transmission error so it should be retried.
Oh, there's -ENOMEM, -EINVAL and probably a few others down in the bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems safest to me.
On Mon, Oct 17, 2011 at 20:41, Keith Packard keithp@keithp.com wrote:
On Mon, 17 Oct 2011 19:07:51 -0200, Eugeni Dodonov eugeni@dodonov.net wrote:
From what I've checked, the other return error value in this context
could
be -EREMOTEIO, which could be caused by transmission error so it should
be
retried.
Oh, there's -ENOMEM, -EINVAL and probably a few others down in the bowels of the kernel i2c bits. Starting with the obvious (ENXIO) seems safest to me.
Yes, of course, but I was referring to the values which could be returned through the i2c-algo-bit call used in this edid detection call.
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
Dave.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com
drivers/gpu/drm/drm_edid.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..1bca6d7 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2);
- if (ret == -ENXIO) {
- DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
- adapter->name);
- break;
- }
} while (ret != 2 && --retries);
return ret == 2 ? 0 : -1;
1.7.7
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Hi Dave, Eugeni, Alex,
On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
This seems to be the wrong place for the test. The code below is really testing for non-responsive (possibly not present) EDID, NOT "non-existent adapter". Non-existent adapter should be checked in the firmware if possible, or failing that, by testing the bus lines at bus creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting has been known to cause trouble recently (not per se but because it was triggering a bug somewhere else in the radeon driver), it might still have value, and could be changed to a per-adapter setting by exporting the test function (I have a patch ready doing exactly this) and letting video drivers test their I2C buses and discard the non-working ones if they want.
I am worried that the patch below will cause regressions on other machines. There's a comment right before the loop which reads:
/* The core i2c driver will automatically retry the transfer if the * adapter reports EAGAIN. However, we find that bit-banging transfers * are susceptible to errors under a heavily loaded machine and * generate spurious NAKs and timeouts. Retrying the transfer * of the individual block a few times seems to overcome this. */
So the retries are there for a reason, and -ENXIO is exactly what you get on spurious NAKs.
One thing which may make sense would be to make the retry count a module parameter, instead of a hard-coded value, so that users can lower the value if they care more about boot time than reliability. But again, ideally problematic buses would not have been created in the first place.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com
drivers/gpu/drm/drm_edid.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..1bca6d7 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2);
- if (ret == -ENXIO) {
- DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
- adapter->name);
- break;
- }
} while (ret != 2 && --retries);
return ret == 2 ? 0 : -1;
On 10/18/2011 11:14, Jean Delvare wrote:
Hi Dave, Eugeni, Alex,
On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
This seems to be the wrong place for the test. The code below is really testing for non-responsive (possibly not present) EDID, NOT "non-existent adapter". Non-existent adapter should be checked in the firmware if possible, or failing that, by testing the bus lines at bus creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting has been known to cause trouble recently (not per se but because it was triggering a bug somewhere else in the radeon driver), it might still have value, and could be changed to a per-adapter setting by exporting the test function (I have a patch ready doing exactly this) and letting video drivers test their I2C buses and discard the non-working ones if they want.
I am worried that the patch below will cause regressions on other machines. There's a comment right before the loop which reads:
/* The core i2c driver will automatically retry the transfer if the * adapter reports EAGAIN. However, we find that bit-banging transfers * are susceptible to errors under a heavily loaded machine and * generate spurious NAKs and timeouts. Retrying the transfer * of the individual block a few times seems to overcome this. */
So the retries are there for a reason, and -ENXIO is exactly what you get on spurious NAKs.
Hi Jean,
You are right about the bit_test=1 testing, my initial attempt at the fix did exactly that (https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).
The problem is that for some buses, namely HDMI ones in my testing, bit_test-like tests always consider them as non-existent when the connectivity is not setup; and as working when it is. So in any case, we could not just blacklist them - when they do, they are gone for good, and won't work anymore, and we need to keep re-trying them at each attempt.
And in case of continuous pre-testing for the stuck bits and like when driver attempts to get the edid (for example, when xrandr is run), we still hit the same issue - the drm_do_probe_ddc_edid will continue to retry the attempts until it reaches the maximum number of retries. E.g., there is no way to tell drm_do_probe_ddc_edid to treat any return code as a permanent failure and make it give up faster.
It could be fixed either in per-driver fashion, like I did with the other patch (which also tests for -ENXIO); or in a generic way, in DRM. I couldn't reproduce any false positives coming from -ENXIO on i915 driver, but perhaps it is easier with radeon? Do you have any specific workload which trick the driver into generating spurious NAKs by a chance?
One thing which may make sense would be to make the retry count a module parameter, instead of a hard-coded value, so that users can lower the value if they care more about boot time than reliability. But again, ideally problematic buses would not have been created in the first place.
Or perhaps it would be possible to have any error code coming from i2c_transfer to signal a permanent error, which should not be retried.. what do you think?
-- Eugeni Dodonov
On Tue, 18 Oct 2011 11:37:38 -0200, Eugeni Dodonov wrote:
On 10/18/2011 11:14, Jean Delvare wrote:
Hi Dave, Eugeni, Alex,
On Tue, 18 Oct 2011 11:02:00 +0100, Dave Airlie wrote:
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
This seems to be the wrong place for the test. The code below is really testing for non-responsive (possibly not present) EDID, NOT "non-existent adapter". Non-existent adapter should be checked in the firmware if possible, or failing that, by testing the bus lines at bus creation time, as i2c_algo_bit.bit_test=1 is doing. While this setting has been known to cause trouble recently (not per se but because it was triggering a bug somewhere else in the radeon driver), it might still have value, and could be changed to a per-adapter setting by exporting the test function (I have a patch ready doing exactly this) and letting video drivers test their I2C buses and discard the non-working ones if they want.
I am worried that the patch below will cause regressions on other machines. There's a comment right before the loop which reads:
/* The core i2c driver will automatically retry the transfer if the * adapter reports EAGAIN. However, we find that bit-banging transfers * are susceptible to errors under a heavily loaded machine and * generate spurious NAKs and timeouts. Retrying the transfer * of the individual block a few times seems to overcome this. */
So the retries are there for a reason, and -ENXIO is exactly what you get on spurious NAKs.
You are right about the bit_test=1 testing, my initial attempt at the fix did exactly that (https://bugs.freedesktop.org/show_bug.cgi?id=41059, comments 14 and 15).
The problem is that for some buses, namely HDMI ones in my testing, bit_test-like tests always consider them as non-existent when the connectivity is not setup; and as working when it is.
Just to clarify: by "connectivity is setup", do you mean code in the driver setting the GPIO pin direction etc., or a display being connected to the graphics card?
I admit I am a little surprised. I2C buses should have their lines up by default, thanks to pull-up resistors, and masters and slaves should merely drive the lines low as needed. The absence of slaves should have no impact on the line behavior. If the Intel graphics chips (or the driver) rely on the display to hold the lines high, I'd say this is a seriously broken design.
As a side note, I thought that HDMI had the capability of presence detection, so there may be a better way to know if a display is connected or not, and this information could used to dynamically instantiate and delete the I2C buses? That way we could skip probing for EDID when there is no chance of success.
In the specific case of the user report that started this discussion, the card has no HDMI port to start with, so it seems weird to even attempt to create a DDC bus for HDMI. If there no way the i915 driver can detect this case and skip the whole HDMI setup? As I understand it the radeon driver is able to do that. If the user report is an isolated case of faulty BIOS/firmware, you could consider handling it specifically (as the radeon driver does, too.)
So in any case, we could not just blacklist them - when they do, they are gone for good, and won't work anymore, and we need to keep re-trying them at each attempt.
And in case of continuous pre-testing for the stuck bits and like when driver attempts to get the edid (for example, when xrandr is run), we still hit the same issue - the drm_do_probe_ddc_edid will continue to retry the attempts until it reaches the maximum number of retries. E.g., there is no way to tell drm_do_probe_ddc_edid to treat any return code as a permanent failure and make it give up faster.
Well, you could always do manual line testing of the I2C bus _before_ calling drm_do_probe_ddc_edid()? And skip the call if the test fails (i.e. the bus isn't ready for use.) As said before, I am willing to export bit_test if it helps any. I've attached a patch doing exactly this. Let me know if you want me to commit it.
Your proposed patch is better than I first thought. I had forgotten that i2c-algo-bit tries up to 3 times to get a first ack from a slave. So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has already attempted 3 times to contact the slave, with no reply, so there's probably no point going on. A communication problem with a present slave would have returned a different error code.
Without your patch, a missing slave would cause 3 * 5 = 15 retries, which is definitely too much.
That being said, even then the whole probe sequence shouldn't exceed 10 ms, which I wouldn't expect a user to notice. The user-reported 4 second delay when running xrandr can't be caused by this. 4 seconds for 15 attempts is 250 ms per attempt, this looks like a timeout due to non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250, which I guess was what the reporting user was running. So even with your patch, there's still 750 ms to wait, I don't think this is acceptable. You really have to fix that I2C bus or skip probing it.
It could be fixed either in per-driver fashion, like I did with the other patch (which also tests for -ENXIO); or in a generic way, in DRM. I couldn't reproduce any false positives coming from -ENXIO on i915 driver, but perhaps it is easier with radeon? Do you have any specific workload which trick the driver into generating spurious NAKs by a chance?
I'm not sure if it is related to the driver, graphics chip or display. The way I read the comment in the code, the cause of the problem would rather be CPU load. And as a matter of fact, in bit-banging mode, the I2C clock is generated by the CPU itself, and we have no guarantee that it can be done on time. For example if interrupts fire during the I2C transfer and they take time to be processed, we might exceed the standard time constraints and slaves might consider that the transfer is aborted. This was discussed before on the linux-i2c list [1], but no solution was found yet and I'm not sure if such a solution exists. If anyone has ideas, they are welcome.
[1] http://marc.info/?l=linux-i2c&m=129250841025737&w=2
I don't think I ever hit the problem (but with the retry code in place, that's hard to tell for sure.) Best would be to ask the developers involved in 4819d2e4310796c4e9eef674499af9b9caf36b5a, which added the retry mechanism: Chris Wilson and Michael Buesch (both Cc'd.) Chris, Michael, do you know of ways to reproduce the issue? Can you please also confirm that the error code you were receiving was not -ENXIO?
Note that the problem is more likely to happen with slow masters, because (1) every transaction takes longer and thus has a higher probability to be hit by interrupts and (2) long delays mean smaller margins to the spec limits, so interrupts are more likely to cause trouble. I see that both radeon_i2c and intel_i2c set udelay to 20 µs, which means a 25 kbps I2C bus. In general it is recommended to not drive the I2C bus below 10 kbps (that's even a hard limit for SMBus compliance.) nouveau_i2c is even worse with udelay = 40 µs which is equivalent to a 12.5 kbps I2C bus, very close to the low limit.
I would recommend lowering udelay to at least 10 µs (50 kbps I2C bus). It might even make sense to lower even more, maybe down to 5 µs to hit the max speed of standard I2C (100 kbps). Compliant slaves can slow down the clock on the fly if they need more time (but I wouldn't expect EEPROMs to need time as they don't have anything to process.) Not only this may help avoid the problems retries attempt to work around, but it will also make EDID block read faster (both on success and failure). At 25 kbps, reading a 128-byte EDID block takes about 50 ms. This could be lowered to 25 ms with udelay = 10, or even 12 ms if driving the I2C clock at max speed.
FWIW, framebuffer drivers radeon, cyber2000, i810, matrox, s3, savage, tdfx and via all have udelay = 10 or lower, so it can't be that bad. I'll submit a patch for the 3 KMS DRM drivers.
One thing which may make sense would be to make the retry count a module parameter, instead of a hard-coded value, so that users can lower the value if they care more about boot time than reliability. But again, ideally problematic buses would not have been created in the first place.
Or perhaps it would be possible to have any error code coming from i2c_transfer to signal a permanent error, which should not be retried.. what do you think?
i2c_transfer doesn't do anything by itself, it delegates all the work to the i2c bus driver or algorithm (in this case i2c-algo-bit and specifically bit_xfer() in this module.) Unfortunately there is no way to tell a permanent error from a transient one. What we can do OTOH is differentiate between the different error types and have adjusted retry strategies. As your patch does, and i2c-algo-bit too.
I see a number of issues in i2c-algo-bit as I read through the code. First issue, it doesn't handle multi-master setups, i.e. no busy bus detection and no arbitration loss detection. I don't know if multi-master I2C topologies are possible on graphics cards?
Second issue, it returns error codes (-EREMOTEIO) not documented in Documentation/i2c/fault-codes. These should be converted to comply. I'll look into it.
Forgot to attach the patch, sorry. Here it is.
On Thu, 20 Oct 2011 14:33:39 +0200, Jean Delvare wrote:
That being said, even then the whole probe sequence shouldn't exceed 10 ms, which I wouldn't expect a user to notice. The user-reported 4 second delay when running xrandr can't be caused by this. 4 seconds for 15 attempts is 250 ms per attempt, this looks like a timeout due to non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250,
D'oh. Today is not my day, I can't believe I wrote this :/ 1 jiffy is obviously 4 ms only at HZ=250. So I can't explain why it is taking so much time on the reporter's machine.
which I guess was what the reporting user was running. So even with your patch, there's still 750 ms to wait, I don't think this is acceptable. You really have to fix that I2C bus or skip probing it.
This conclusion probably still holds.
On Thu, 20 Oct 2011 14:33:39 +0200 Jean Delvare khali@linux-fr.org wrote:
retry mechanism: Chris Wilson and Michael Buesch (both Cc'd.) Chris, Michael, do you know of ways to reproduce the issue?
The error could easily reproduced by loading the machine heavily. So my guess is that it is caused by electrical noise injected into the i2c bus. Probably due to bad hardware design. But that doesn't matter. I have to live with that. The error did not trigger again with the workaround in place, though.
Can you please also confirm that the error code you were receiving was not -ENXIO?
I really don't remember what error code it was.
Note that the problem is more likely to happen with slow masters, because (1) every transaction takes longer and thus has a higher probability to be hit by interrupts and (2) long delays mean smaller margins to the spec limits, so interrupts are more likely to cause trouble. I see that both radeon_i2c and intel_i2c set udelay to 20 µs, which means a 25 kbps I2C bus. In general it is recommended to not drive the I2C bus below 10 kbps (that's even a hard limit for SMBus compliance.) nouveau_i2c is even worse with udelay = 40 µs which is equivalent to a 12.5 kbps I2C bus, very close to the low limit.
Hm. Not sure. I don't think the machine had heavy IRQ load. Just high CPU and memory load (compile the kernel or something like that).
On Thu, Oct 20, 2011 at 10:33, Jean Delvare khali@linux-fr.org wrote:
Just to clarify: by "connectivity is setup", do you mean code in the driver setting the GPIO pin direction etc., or a display being connected to the graphics card?
I admit I am a little surprised. I2C buses should have their lines up by default, thanks to pull-up resistors, and masters and slaves should merely drive the lines low as needed. The absence of slaves should have no impact on the line behavior. If the Intel graphics chips (or the driver) rely on the display to hold the lines high, I'd say this is a seriously broken design.
As a side note, I thought that HDMI had the capability of presence detection, so there may be a better way to know if a display is connected or not, and this information could used to dynamically instantiate and delete the I2C buses? That way we could skip probing for EDID when there is no chance of success.
Yes, I think so too.
I admit I haven't got to the root of the problem here. My test was really simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable plugged in, I was getting the "SDA stuck high" messages; with the cable plugged in, I wasn't getting any of those.
But in any case, I haven't investigated it deeper in the hardware direction after figuring out that drm_get_edid would retry its attempts for retreiving the edid for 15 times, getting the -ENXIO error for all of them.
Well, you could always do manual line testing of the I2C bus _before_ calling drm_do_probe_ddc_edid()? And skip the call if the test fails (i.e. the bus isn't ready for use.) As said before, I am willing to export bit_test if it helps any. I've attached a patch doing exactly this. Let me know if you want me to commit it.
Yes, surely, I can do this. I did a similar test in the i915-specific patch, checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid, but I thought that perhaps it would be easier for all the cards relying on drm_get_edid to have a faster return path in case of problems.
I am fine with any solution, if this problem is happening to appear on i915 cards only, we could do this in our driver. It is that 15 attempts looks a bit overkill.
Your proposed patch is better than I first thought. I had forgotten that i2c-algo-bit tries up to 3 times to get a first ack from a slave. So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has already attempted 3 times to contact the slave, with no reply, so there's probably no point going on. A communication problem with a present slave would have returned a different error code.
Without your patch, a missing slave would cause 3 * 5 = 15 retries, which is definitely too much.
That being said, even then the whole probe sequence shouldn't exceed 10 ms, which I wouldn't expect a user to notice. The user-reported 4 second delay when running xrandr can't be caused by this. 4 seconds for 15 attempts is 250 ms per attempt, this looks like a timeout due to non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250, which I guess was what the reporting user was running. So even with your patch, there's still 750 ms to wait, I don't think this is acceptable. You really have to fix that I2C bus or skip probing it.
Yep, true. I've followed the easiest route so far - find out where the initial problem happens, and attempt at solving it. This change in drm_get_edid solves the delay at its origin, but we certainly could have additional delays propagated somewhere else. I couldn't reproduce the original reporter's huge delay, so I looked at what was within my reach.
In any case - looking at a faster way to leave the drm_do_probe_ddc_edid, while also allowing a way for having a more detailed probing - would it be an acceptable solution to add a:
MODULE_PARM(skip_unresponsive_edid, "Ignore outputs which do not provide edid on first attempt");
and update the patch to use this option?
This allows to avoid talking to a non-responding bus repeatedly until we finally timeout after 15 attempts. We can do this by catching the -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
Within the bit_doAddress we already try 3 times to get the edid data, so if the routine tells us that bus is not responding, it is mostly pointless to keep re-trying those attempts over and over again until we reach final number of retries.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
v2: added a module parameter to control this behavior. The idea came from discussion with Jean Delvare.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com --- drivers/gpu/drm/drm_edid.c | 5 +++++ drivers/gpu/drm/drm_stub.c | 5 +++++ include/drm/drmP.h | 2 ++ 3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..c7426ab 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2); + if (drm_ignore_unresponsive_edid && ret == -ENXIO) { + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n", + adapter->name); + break; + } } while (ret != 2 && --retries);
return ret == 2 ? 0 : -1; diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c index 6d7b083..b1013fe 100644 --- a/drivers/gpu/drm/drm_stub.c +++ b/drivers/gpu/drm/drm_stub.c @@ -46,16 +46,21 @@ EXPORT_SYMBOL(drm_vblank_offdelay); unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */ EXPORT_SYMBOL(drm_timestamp_precision);
+unsigned int drm_ignore_unresponsive_edid = 1; /* Ignore non-responding edid by default */ +EXPORT_SYMBOL(drm_ignore_unresponsive_edid); + MODULE_AUTHOR(CORE_AUTHOR); MODULE_DESCRIPTION(CORE_DESC); MODULE_LICENSE("GPL and additional rights"); MODULE_PARM_DESC(debug, "Enable debug output"); MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]"); MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]"); +MODULE_PARM_DESC(ignore_unresponsive_edid, "Automatically ignore outputs which do not provide EDID after 3 attempts");
module_param_named(debug, drm_debug, int, 0600); module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); +module_param_named(ignore_unresponsive_edid, drm_ignore_unresponsive_edid, int, 0600);
struct idr drm_minors_idr;
diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 9b7c2bb..d7b0286 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1465,6 +1465,8 @@ extern unsigned int drm_debug; extern unsigned int drm_vblank_offdelay; extern unsigned int drm_timestamp_precision;
+extern unsigned int drm_ignore_unresponsive_edid; + extern struct class *drm_class; extern struct proc_dir_entry *drm_proc_root; extern struct dentry *drm_debugfs_root;
This allows to avoid talking to a non-responding bus repeatedly until we finally timeout after 15 attempts. We can do this by catching the -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
Within the bit_doAddress we already try 3 times to get the edid data, so if the routine tells us that bus is not responding, it is mostly pointless to keep re-trying those attempts over and over again until we reach final number of retries.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Timing results for i915-powered machines for 'time xrandr' command: Machine 1: from 0.840s to 0.290s Machine 2: from 0.315s to 0.280s Machine 3: from +/- 1s to 0.184s
Timing results for HD5770 with 'time xrandr' command: Machine 4: from 3.210s to 1.060s
v2: added a module parameter to control this behavior. The idea came from discussion with Jean Delvare.
v3: added external tested-by's and timing results.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com Tested-By: Sean Finney seanius@seanius.net Tested-By: Soren Hansen soren@linux2go.dk Tested-By: Hernando Torque sirius@sonnenkinder.org --- drivers/gpu/drm/drm_edid.c | 5 +++++ drivers/gpu/drm/drm_stub.c | 5 +++++ include/drm/drmP.h | 2 ++ 3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..c7426ab 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2); + if (drm_ignore_unresponsive_edid && ret == -ENXIO) { + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n", + adapter->name); + break; + } } while (ret != 2 && --retries);
return ret == 2 ? 0 : -1; diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c index 6d7b083..b1013fe 100644 --- a/drivers/gpu/drm/drm_stub.c +++ b/drivers/gpu/drm/drm_stub.c @@ -46,16 +46,21 @@ EXPORT_SYMBOL(drm_vblank_offdelay); unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */ EXPORT_SYMBOL(drm_timestamp_precision);
+unsigned int drm_ignore_unresponsive_edid = 1; /* Ignore non-responding edid by default */ +EXPORT_SYMBOL(drm_ignore_unresponsive_edid); + MODULE_AUTHOR(CORE_AUTHOR); MODULE_DESCRIPTION(CORE_DESC); MODULE_LICENSE("GPL and additional rights"); MODULE_PARM_DESC(debug, "Enable debug output"); MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]"); MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]"); +MODULE_PARM_DESC(ignore_unresponsive_edid, "Automatically ignore outputs which do not provide EDID after 3 attempts");
module_param_named(debug, drm_debug, int, 0600); module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); +module_param_named(ignore_unresponsive_edid, drm_ignore_unresponsive_edid, int, 0600);
struct idr drm_minors_idr;
diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 9b7c2bb..d7b0286 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1465,6 +1465,8 @@ extern unsigned int drm_debug; extern unsigned int drm_vblank_offdelay; extern unsigned int drm_timestamp_precision;
+extern unsigned int drm_ignore_unresponsive_edid; + extern struct class *drm_class; extern struct proc_dir_entry *drm_proc_root; extern struct dentry *drm_debugfs_root;
On Wed, 26 Oct 2011 15:06:24 -0200, Eugeni Dodonov eugeni.dodonov@intel.com wrote:
This allows to avoid talking to a non-responding bus repeatedly until we finally timeout after 15 attempts. We can do this by catching the -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
Within the bit_doAddress we already try 3 times to get the edid data, so if the routine tells us that bus is not responding, it is mostly pointless to keep re-trying those attempts over and over again until we reach final number of retries.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Timing results for i915-powered machines for 'time xrandr' command: Machine 1: from 0.840s to 0.290s Machine 2: from 0.315s to 0.280s Machine 3: from +/- 1s to 0.184s
Timing results for HD5770 with 'time xrandr' command: Machine 4: from 3.210s to 1.060s
v2: added a module parameter to control this behavior. The idea came from discussion with Jean Delvare.
Just one minor nitpick here... Otherwise it looks a very tasty patch.
+unsigned int drm_ignore_unresponsive_edid = 1; /* Ignore non-responding edid by default */ +EXPORT_SYMBOL(drm_ignore_unresponsive_edid);
This comment would be better in the user-facing description, i.e. (default: true). Do we have a candidate user for exporting the symbol?
MODULE_AUTHOR(CORE_AUTHOR); MODULE_DESCRIPTION(CORE_DESC); MODULE_LICENSE("GPL and additional rights"); MODULE_PARM_DESC(debug, "Enable debug output"); MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]"); MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]"); +MODULE_PARM_DESC(ignore_unresponsive_edid, "Automatically ignore outputs which do not provide EDID after 3 attempts");
This allows to avoid talking to a non-responding bus repeatedly until we finally timeout after 15 attempts. We can do this by catching the -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
Within the bit_doAddress we already try 3 times to get the edid data, so if the routine tells us that bus is not responding, it is mostly pointless to keep re-trying those attempts over and over again until we reach final number of retries.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Timing results for i915-powered machines for 'time xrandr' command: Machine 1: from 0.840s to 0.290s Machine 2: from 0.315s to 0.280s Machine 3: from +/- 1s to 0.184s
Timing results for HD5770 with 'time xrandr' command: Machine 4: from 3.210s to 1.060s
v2: added a module parameter to control this behavior. The idea came from discussion with Jean Delvare.
v3: added external tested-by's and timing results.
v4: changed module parameter to bool, added default value description
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com Reviewed-By: Chris Wilson chris@chris-wilson.co.uk Tested-By: Sean Finney seanius@seanius.net Tested-By: Soren Hansen soren@linux2go.dk Tested-By: Hernando Torque sirius@sonnenkinder.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41059 --- drivers/gpu/drm/drm_edid.c | 5 +++++ drivers/gpu/drm/drm_stub.c | 5 +++++ include/drm/drmP.h | 2 ++ 3 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..c7426ab 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2); + if (drm_ignore_unresponsive_edid && ret == -ENXIO) { + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n", + adapter->name); + break; + } } while (ret != 2 && --retries);
return ret == 2 ? 0 : -1; diff --git a/drivers/gpu/drm/drm_stub.c b/drivers/gpu/drm/drm_stub.c index 6d7b083..dea0fcd 100644 --- a/drivers/gpu/drm/drm_stub.c +++ b/drivers/gpu/drm/drm_stub.c @@ -46,16 +46,21 @@ EXPORT_SYMBOL(drm_vblank_offdelay); unsigned int drm_timestamp_precision = 20; /* Default to 20 usecs. */ EXPORT_SYMBOL(drm_timestamp_precision);
+unsigned int drm_ignore_unresponsive_edid = 1; /* Ignore non-responding edid by default */ +EXPORT_SYMBOL(drm_ignore_unresponsive_edid); + MODULE_AUTHOR(CORE_AUTHOR); MODULE_DESCRIPTION(CORE_DESC); MODULE_LICENSE("GPL and additional rights"); MODULE_PARM_DESC(debug, "Enable debug output"); MODULE_PARM_DESC(vblankoffdelay, "Delay until vblank irq auto-disable [msecs]"); MODULE_PARM_DESC(timestamp_precision_usec, "Max. error on timestamps [usecs]"); +MODULE_PARM_DESC(ignore_unresponsive_edid, "Automatically ignore outputs which do not provide EDID after 3 attempts (default: true)");
module_param_named(debug, drm_debug, int, 0600); module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600); module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600); +module_param_named(ignore_unresponsive_edid, drm_ignore_unresponsive_edid, bool, 0600);
struct idr drm_minors_idr;
diff --git a/include/drm/drmP.h b/include/drm/drmP.h index 9b7c2bb..d7b0286 100644 --- a/include/drm/drmP.h +++ b/include/drm/drmP.h @@ -1465,6 +1465,8 @@ extern unsigned int drm_debug; extern unsigned int drm_vblank_offdelay; extern unsigned int drm_timestamp_precision;
+extern unsigned int drm_ignore_unresponsive_edid; + extern struct class *drm_class; extern struct proc_dir_entry *drm_proc_root; extern struct dentry *drm_debugfs_root;
Hi Eugeni,
On Mon, 24 Oct 2011 12:40:14 -0200, Eugeni Dodonov wrote:
On Thu, Oct 20, 2011 at 10:33, Jean Delvare khali@linux-fr.org wrote:
Just to clarify: by "connectivity is setup", do you mean code in the driver setting the GPIO pin direction etc., or a display being connected to the graphics card?
I admit I am a little surprised. I2C buses should have their lines up by default, thanks to pull-up resistors, and masters and slaves should merely drive the lines low as needed. The absence of slaves should have no impact on the line behavior. If the Intel graphics chips (or the driver) rely on the display to hold the lines high, I'd say this is a seriously broken design.
As a side note, I thought that HDMI had the capability of presence detection, so there may be a better way to know if a display is connected or not, and this information could used to dynamically instantiate and delete the I2C buses? That way we could skip probing for EDID when there is no chance of success.
Yes, I think so too.
I admit I haven't got to the root of the problem here. My test was really simple, borrowed from the test_bus() at i2c-algo-bit.c - without HDMI cable plugged in, I was getting the "SDA stuck high" messages; with the cable plugged in, I wasn't getting any of those.
Just the HDMI cable, or with a screen at the other end?
Either way, this smells like bad hardware design. The graphics card itself should pull the I2C bus lines up by default, it shouldn't rely on a cable (I'm not familiar with HDMI but I'm not sure if that makes sense at all) or external display (more likely) to do it. But I can also imagine that this could be a driver issue, maybe the GPIO lines aren't setup properly by the driver? I'm not familiar enough with the Intel graphics hardware and driver to tell, you'll know better.
But in any case, I haven't investigated it deeper in the hardware direction after figuring out that drm_get_edid would retry its attempts for retreiving the edid for 15 times, getting the -ENXIO error for all of them.
Well, you could always do manual line testing of the I2C bus _before_ calling drm_do_probe_ddc_edid()? And skip the call if the test fails (i.e. the bus isn't ready for use.) As said before, I am willing to export bit_test if it helps any. I've attached a patch doing exactly this. Let me know if you want me to commit it.
Yes, surely, I can do this. I did a similar test in the i915-specific patch, checking if we can talk to i2c adapter pior to call drm_do_probe_ddc_edid, but I thought that perhaps it would be easier for all the cards relying on drm_get_edid to have a faster return path in case of problems.
I am fine with any solution, if this problem is happening to appear on i915 cards only, we could do this in our driver. It is that 15 attempts looks a bit overkill.
Yes, I agree, 15 retries makes no sense. And I like your original patch, it makes a lot of sense.
Your proposed patch is better than I first thought. I had forgotten that i2c-algo-bit tries up to 3 times to get a first ack from a slave. So if i2c_transfer returns -ENXIO, this means that i2c-algo-bit has already attempted 3 times to contact the slave, with no reply, so there's probably no point going on. A communication problem with a present slave would have returned a different error code.
Without your patch, a missing slave would cause 3 * 5 = 15 retries, which is definitely too much.
That being said, even then the whole probe sequence shouldn't exceed 10 ms, which I wouldn't expect a user to notice. The user-reported 4 second delay when running xrandr can't be caused by this. 4 seconds for 15 attempts is 250 ms per attempt, this looks like a timeout due to non-functional bus, not a nack. Not that 250 ms is 1 jiffy for HZ=250, which I guess was what the reporting user was running. So even with your patch, there's still 750 ms to wait, I don't think this is acceptable. You really have to fix that I2C bus or skip probing it.
Yep, true. I've followed the easiest route so far - find out where the initial problem happens, and attempt at solving it. This change in drm_get_edid solves the delay at its origin, but we certainly could have additional delays propagated somewhere else. I couldn't reproduce the original reporter's huge delay, so I looked at what was within my reach.
Your fix is not really "at the origin" of the problem. Fixing it at the origin would be not creating the I2C bus if it won't work (or marking it as unavailable in some way, if the drm framework allows this.) If that is not possible, testing the lines before accessing the bus would be closer to the origin, however it might be argued that it adds delays in the working case, and also somehow violates the I2C specification (the line testing is not part of the specification and could confuse some chips, in theory at least.)
In any case - looking at a faster way to leave the drm_do_probe_ddc_edid, while also allowing a way for having a more detailed probing - would it be an acceptable solution to add a:
MODULE_PARM(skip_unresponsive_edid, "Ignore outputs which do not provide edid on first attempt");
and update the patch to use this option?
I am not in favor of this, as it makes the code more complex while your original patch looks just fine to me. I no longer expect it to cause any trouble, so I suggest applying it now. If any problem is reported during the next two months, we can always reconsider. But even then, expecting users to pass module parameters is usually a bad strategy, as this means things aren't working for them out of the box in the first place.
On Tue, Oct 18, 2011 at 6:02 AM, Dave Airlie airlied@gmail.com wrote:
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Jean, Alex,
I'm thinking of thowing this into -next, any objections?
I haven't really hit the issue this patch attempts to fix on any cards, but on radeon at least, we know which i2c buses are used for ddc and which are not, so barring the occasional oem vbios bug, we won't be trying ddc on arbitrary i2c buses.
Alex
Dave.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com
drivers/gpu/drm/drm_edid.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..1bca6d7 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2);
- if (ret == -ENXIO) {
- DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n",
- adapter->name);
- break;
- }
} while (ret != 2 && --retries);
return ret == 2 ? 0 : -1;
1.7.7
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
So as we all agree on this patch now, I updated it with all the Tested-by and Reviewed-by.
Dave, I think it is good for pulling into -next now.
Thanks!
This allows to avoid talking to a non-responding bus repeatedly until we finally timeout after 15 attempts. We can do this by catching the -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
Within the bit_doAddress we already try 3 times to get the edid data, so if the routine tells us that bus is not responding, it is mostly pointless to keep re-trying those attempts over and over again until we reach final number of retries.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs (up to 30x in one worst case).
Timing results for i915-powered machines for 'time xrandr' command: Machine 1: from 0.840s to 0.290s Machine 2: from 0.315s to 0.280s Machine 3: from +/- 4s to 0.184s
Timing results for HD5770 with 'time xrandr' command: Machine 4: from 3.210s to 1.060s
Reviewed-by: Chris Wilson chris@hchris-wilson.co.uk Reviewed-by: Keith Packard keithp@keithp.com Tested-by: Sean Finney seanius@seanius.net Tested-by: Soren Hansen soren@linux2go.dk Tested-by: Hernando Torque sirius@sonnenkinder.org Tested-by: Mike Lothian mike@fireburn.co.uk Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41059 Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com --- drivers/gpu/drm/drm_edid.c | 5 +++++ 1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 7425e5c..1bca6d7 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -265,6 +265,11 @@ drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, } }; ret = i2c_transfer(adapter, msgs, 2); + if (ret == -ENXIO) { + DRM_DEBUG_KMS("drm: skipping non-existent adapter %s\n", + adapter->name); + break; + } } while (ret != 2 && --retries);
return ret == 2 ? 0 : -1;
On Mon, 17 Oct 2011 11:12:29 -0200, Eugeni Dodonov eugeni.dodonov@intel.com wrote:
This allows to avoid talking to a non-existent bus repeatedly until we finally timeout. The non-existent bus is signalled by -ENXIO error, provided by i2c_algo_bit:bit_doAddress call.
As the advantage of such change, all the other routines which use drm_get_edid would benefit for this timeout.
This change should fix https://bugs.freedesktop.org/show_bug.cgi?id=41059 and improve overall edid detection timing by 10-30% in most cases, and by a much larger margin in case of phantom outputs.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com
Looks like we have reached the conclusion that this simple patch is the least likely to cause problems and easiest to fix if it does. :) Reviewed-by: Chris Wilson chris@hchris-wilson.co.uk -Chris
This adds a new function intel_drm_get_valid_edid, which attempts to detect EDID values from available devices and returns quickly in case no connection is present. We detect non-existing devices by checking the i2c_transfer status, and if it fails with the -ENXIO result, we know that the i2c_algo_bit was unable to locate the hardware, so we give up on further edid discovery.
This should improve the edid detection timeouts by about 10-30% in most cases, and by a much larger margin in case of broken or phantom outputs https://bugs.freedesktop.org/show_bug.cgi?id=41059.
This also removes intel_ddc_probe, whose functionality is now done by intel_drm_get_valid_edid.
Signed-off-by: Eugeni Dodonov eugeni.dodonov@intel.com --- drivers/gpu/drm/i915/intel_crt.c | 46 ++++++++++++++++++------------------ drivers/gpu/drm/i915/intel_dp.c | 4 +- drivers/gpu/drm/i915/intel_drv.h | 3 +- drivers/gpu/drm/i915/intel_hdmi.c | 4 +- drivers/gpu/drm/i915/intel_i2c.c | 42 ++++++++++++++++++++++++++++++++ drivers/gpu/drm/i915/intel_lvds.c | 2 +- drivers/gpu/drm/i915/intel_modes.c | 29 +---------------------- drivers/gpu/drm/i915/intel_sdvo.c | 4 +- 8 files changed, 75 insertions(+), 59 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c index 0979d88..3b55fdf 100644 --- a/drivers/gpu/drm/i915/intel_crt.c +++ b/drivers/gpu/drm/i915/intel_crt.c @@ -273,36 +273,36 @@ static bool intel_crt_detect_ddc(struct drm_connector *connector) { struct intel_crt *crt = intel_attached_crt(connector); struct drm_i915_private *dev_priv = crt->base.base.dev->dev_private; + struct edid *edid = NULL; + bool is_digital = false;
/* CRT should always be at 0, but check anyway */ if (crt->base.type != INTEL_OUTPUT_ANALOG) return false;
- if (intel_ddc_probe(&crt->base, dev_priv->crt_ddc_pin)) { - struct edid *edid; - bool is_digital = false; + edid = intel_drm_get_valid_edid(connector, + &dev_priv->gmbus[dev_priv->crt_ddc_pin].adapter); + if (!edid) + return false;
- edid = drm_get_edid(connector, - &dev_priv->gmbus[dev_priv->crt_ddc_pin].adapter); - /* - * This may be a DVI-I connector with a shared DDC - * link between analog and digital outputs, so we - * have to check the EDID input spec of the attached device. - * - * On the other hand, what should we do if it is a broken EDID? - */ - if (edid != NULL) { - is_digital = edid->input & DRM_EDID_INPUT_DIGITAL; - connector->display_info.raw_edid = NULL; - kfree(edid); - } + /* + * This may be a DVI-I connector with a shared DDC + * link between analog and digital outputs, so we + * have to check the EDID input spec of the attached device. + * + * On the other hand, what should we do if it is a broken EDID? + */ + if (edid != NULL) { + is_digital = edid->input & DRM_EDID_INPUT_DIGITAL; + connector->display_info.raw_edid = NULL; + kfree(edid); + }
- if (!is_digital) { - DRM_DEBUG_KMS("CRT detected via DDC:0x50 [EDID]\n"); - return true; - } else { - DRM_DEBUG_KMS("CRT not detected via DDC:0x50 [EDID reports a digital panel]\n"); - } + if (!is_digital) { + DRM_DEBUG_KMS("CRT detected via DDC:0x50 [EDID]\n"); + return true; + } else { + DRM_DEBUG_KMS("CRT not detected via DDC:0x50 [EDID reports a digital panel]\n"); }
return false; diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 44fef5e..dd0d8b1 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -1715,7 +1715,7 @@ intel_dp_detect(struct drm_connector *connector, bool force) if (intel_dp->force_audio) { intel_dp->has_audio = intel_dp->force_audio > 0; } else { - edid = drm_get_edid(connector, &intel_dp->adapter); + edid = intel_drm_get_valid_edid(connector, &intel_dp->adapter); if (edid) { intel_dp->has_audio = drm_detect_monitor_audio(edid); connector->display_info.raw_edid = NULL; @@ -1772,7 +1772,7 @@ intel_dp_detect_audio(struct drm_connector *connector) struct edid *edid; bool has_audio = false;
- edid = drm_get_edid(connector, &intel_dp->adapter); + edid = intel_drm_get_valid_edid(connector, &intel_dp->adapter); if (edid) { has_audio = drm_detect_monitor_audio(edid);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h index fe1099d..0e4b823 100644 --- a/drivers/gpu/drm/i915/intel_drv.h +++ b/drivers/gpu/drm/i915/intel_drv.h @@ -264,8 +264,9 @@ struct intel_fbc_work { int interval; };
+struct edid * intel_drm_get_valid_edid(struct drm_connector *connector, + struct i2c_adapter *adapter); int intel_ddc_get_modes(struct drm_connector *c, struct i2c_adapter *adapter); -extern bool intel_ddc_probe(struct intel_encoder *intel_encoder, int ddc_bus);
extern void intel_attach_force_audio_property(struct drm_connector *connector); extern void intel_attach_broadcast_rgb_property(struct drm_connector *connector); diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c index 226ba83..714f2fb 100644 --- a/drivers/gpu/drm/i915/intel_hdmi.c +++ b/drivers/gpu/drm/i915/intel_hdmi.c @@ -329,7 +329,7 @@ intel_hdmi_detect(struct drm_connector *connector, bool force)
intel_hdmi->has_hdmi_sink = false; intel_hdmi->has_audio = false; - edid = drm_get_edid(connector, + edid = intel_drm_get_valid_edid(connector, &dev_priv->gmbus[intel_hdmi->ddc_bus].adapter);
if (edid) { @@ -371,7 +371,7 @@ intel_hdmi_detect_audio(struct drm_connector *connector) struct edid *edid; bool has_audio = false;
- edid = drm_get_edid(connector, + edid = intel_drm_get_valid_edid(connector, &dev_priv->gmbus[intel_hdmi->ddc_bus].adapter); if (edid) { if (edid->input & DRM_EDID_INPUT_DIGITAL) diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/intel_i2c.c index d98cee6..b3a6eda 100644 --- a/drivers/gpu/drm/i915/intel_i2c.c +++ b/drivers/gpu/drm/i915/intel_i2c.c @@ -470,3 +470,45 @@ void intel_teardown_gmbus(struct drm_device *dev) kfree(dev_priv->gmbus); dev_priv->gmbus = NULL; } + +/** + * intel_drm_get_valid_edid - gets edid from existent adapters only + * @connector: DRM connector device to use + * @adapter: i2c adapter + * + * Verifies if the i2c adapter is responding to our queries before + * attempting to do proper communication with it. If it does, + * retreive the EDID with help of drm_get_edid + */ +struct edid * +intel_drm_get_valid_edid(struct drm_connector *connector, + struct i2c_adapter *adapter) +{ + int ret; + u8 out_buf[] = { 0x0, 0x0}; + u8 buf[2]; + struct i2c_msg msgs[] = { + { + .addr = 0x50, + .flags = 0, + .len = 1, + .buf = out_buf, + }, + { + .addr = 0x50, + .flags = I2C_M_RD, + .len = 1, + .buf = buf, + } + }; + + /* We just check for -ENXIO - drm_get_edid checks if the transfer + * works and manages the remaining parts of the EDID */ + ret = i2c_transfer(adapter, msgs, 2); + if (ret == -ENXIO) { + DRM_DEBUG_KMS("i915: adapter %s not responding: %d\n", + adapter->name, ret); + return NULL; + } + return drm_get_edid(connector, adapter); +} diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c index 31da77f..8f59a8e 100644 --- a/drivers/gpu/drm/i915/intel_lvds.c +++ b/drivers/gpu/drm/i915/intel_lvds.c @@ -921,7 +921,7 @@ bool intel_lvds_init(struct drm_device *dev) * Attempt to get the fixed panel mode from DDC. Assume that the * preferred mode is the right one. */ - intel_lvds->edid = drm_get_edid(connector, + intel_lvds->edid = intel_drm_get_valid_edid(connector, &dev_priv->gmbus[pin].adapter); if (intel_lvds->edid) { if (drm_add_edid_modes(connector, diff --git a/drivers/gpu/drm/i915/intel_modes.c b/drivers/gpu/drm/i915/intel_modes.c index 3b26a3b..ae1a989 100644 --- a/drivers/gpu/drm/i915/intel_modes.c +++ b/drivers/gpu/drm/i915/intel_modes.c @@ -31,33 +31,6 @@ #include "i915_drv.h"
/** - * intel_ddc_probe - * - */ -bool intel_ddc_probe(struct intel_encoder *intel_encoder, int ddc_bus) -{ - struct drm_i915_private *dev_priv = intel_encoder->base.dev->dev_private; - u8 out_buf[] = { 0x0, 0x0}; - u8 buf[2]; - struct i2c_msg msgs[] = { - { - .addr = 0x50, - .flags = 0, - .len = 1, - .buf = out_buf, - }, - { - .addr = 0x50, - .flags = I2C_M_RD, - .len = 1, - .buf = buf, - } - }; - - return i2c_transfer(&dev_priv->gmbus[ddc_bus].adapter, msgs, 2) == 2; -} - -/** * intel_ddc_get_modes - get modelist from monitor * @connector: DRM connector device to use * @adapter: i2c adapter @@ -70,7 +43,7 @@ int intel_ddc_get_modes(struct drm_connector *connector, struct edid *edid; int ret = 0;
- edid = drm_get_edid(connector, adapter); + edid = intel_drm_get_valid_edid(connector, adapter); if (edid) { drm_mode_connector_update_edid_property(connector, edid); ret = drm_add_edid_modes(connector, edid); diff --git a/drivers/gpu/drm/i915/intel_sdvo.c b/drivers/gpu/drm/i915/intel_sdvo.c index 6348c49..8b4ac61 100644 --- a/drivers/gpu/drm/i915/intel_sdvo.c +++ b/drivers/gpu/drm/i915/intel_sdvo.c @@ -1240,7 +1240,7 @@ static struct edid * intel_sdvo_get_edid(struct drm_connector *connector) { struct intel_sdvo *sdvo = intel_attached_sdvo(connector); - return drm_get_edid(connector, &sdvo->ddc); + return intel_drm_get_valid_edid(connector, &sdvo->ddc); }
/* Mac mini hack -- use the same DDC as the analog connector */ @@ -1249,7 +1249,7 @@ intel_sdvo_get_analog_edid(struct drm_connector *connector) { struct drm_i915_private *dev_priv = connector->dev->dev_private;
- return drm_get_edid(connector, + return intel_drm_get_valid_edid(connector, &dev_priv->gmbus[dev_priv->crt_ddc_pin].adapter); }
dri-devel@lists.freedesktop.org