i2c retries if sees an EGAIN, drm_do_probe_ddc_edid retries until it gets a result and *then* drm_do_get_edid retries until it gets a result it is happy with. All in all, that is a lot of processor intensive looping in cases where we do not expect and cannot get valid data - for example on Intel with disconnected hardware we will busy-spin until we hit the i2c timeout. This is then repeated for every connector when querying the current status of outputs.
So to improve the situation, we can trim the number of retries for reading the base block and to check for a reschedule before proceeding so that we do not hog the machine whilst probing outputs. (Though since we will be likely blocking the graphics device, the user is still going to notice the latency.)
Reported-by: Linus Torvalds torvalds@linux-foundation.org Reported-by: Stephan Bärwolf stephan.baerwolf@tu-ilmenau.de Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Eugeni Dodonov eugeni.dodonov@intel.com Cc: Dave Airlie airlied@redhat.com Cc: Alex Deucher alexdeucher@gmail.com --- drivers/gpu/drm/drm_edid.c | 30 +++++++++++++++++++++++------- 1 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index ece03fc..4409cd4 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -35,6 +35,9 @@ #include "drm_edid.h" #include "drm_edid_modes.h"
+#define EDID_PROBE_RETRIES 1 +#define EDID_READ_RETRIES 5 + #define version_greater(edid, maj, min) \ (((edid)->version > (maj)) || \ ((edid)->version == (maj) && (edid)->revision > (min))) @@ -239,11 +242,12 @@ EXPORT_SYMBOL(drm_edid_is_valid); * Try to fetch EDID information by calling i2c driver function. */ static int -drm_do_probe_ddc_edid(struct i2c_adapter *adapter, unsigned char *buf, - int block, int len) +drm_get_edid_block(struct i2c_adapter *adapter, + unsigned char *buf, int block, int len, + int retries) { unsigned char start = block * EDID_LENGTH; - int ret, retries = 5; + int ret;
/* The core i2c driver will automatically retry the transfer if the * adapter reports EAGAIN. However, we find that bit-banging transfers @@ -293,7 +297,19 @@ drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
/* base block fetch */ for (i = 0; i < 4; i++) { - if (drm_do_probe_ddc_edid(adapter, block, 0, EDID_LENGTH)) + /* EDID transfer may be quite processor intensive and last + * a long time. For example, when waiting for a timeout on + * a non-existent connector whilst using bit-banging. As a + * result we can end up hogging the machine, so give someone + * else the chance to run first. But when we have started a + * transfer don't interrupt until finished. + */ + if (need_resched()) + schedule(); + + if (drm_get_edid_block(adapter, + block, 0, EDID_LENGTH, + EDID_PROBE_RETRIES)) goto out; if (drm_edid_block_valid(block)) break; @@ -316,9 +332,9 @@ drm_do_get_edid(struct drm_connector *connector, struct i2c_adapter *adapter)
for (j = 1; j <= block[0x7e]; j++) { for (i = 0; i < 4; i++) { - if (drm_do_probe_ddc_edid(adapter, + if (drm_get_edid_block(adapter, block + (valid_extensions + 1) * EDID_LENGTH, - j, EDID_LENGTH)) + j, EDID_LENGTH, EDID_READ_RETRIES)) goto out; if (drm_edid_block_valid(block + (valid_extensions + 1) * EDID_LENGTH)) { valid_extensions++; @@ -362,7 +378,7 @@ drm_probe_ddc(struct i2c_adapter *adapter) { unsigned char out;
- return (drm_do_probe_ddc_edid(adapter, &out, 0, 1) == 0); + return drm_get_edid_block(adapter, &out, 0, 1, EDID_PROBE_RETRIES) == 0; }
/**
On Thu, Feb 23, 2012 at 11:52 AM, Chris Wilson chris@chris-wilson.co.uk wrote:
i2c retries if sees an EGAIN, drm_do_probe_ddc_edid retries until it gets a result and *then* drm_do_get_edid retries until it gets a result it is happy with. All in all, that is a lot of processor intensive looping in cases where we do not expect and cannot get valid data - for example on Intel with disconnected hardware we will busy-spin until we hit the i2c timeout. This is then repeated for every connector when querying the current status of outputs.
Sadly, this doesn't seem to make any difference to my case. My xrandr stays at 0.555s even with this patch.
So apparently the Apple Mac Mini issue is not about retries. But maybe this fixes the problem Stephan Bärwolf reported? The Apple Mac Mini is known for doing things oddly, so ..
You didn't include Stephan on the cc for that patch, though.
Btw, clearly X does *not* cache the EDID results, at least not for this case. So the explicit xrandr example is probably pretty close to what wine does. Maybe the proper fix is to just make X.org force caching when clients do this (because it's definitely X that does the drm_mode_getconnector() thing - xrandr itself spends zero time on this, it just does an X request and waits for the result).
The fact that EDID takes half a second to get is not a problem per se. It's a problem only when X does it over and over again because some client does something unexpected.
Linus
On Thu, Feb 23, 2012 at 12:15 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
Sadly, this doesn't seem to make any difference to my case. My xrandr stays at 0.555s even with this patch.
Btw, profiling with call chains seems to say that it all comes from intel_sdvo_get_analog_edid() (about half from intel_sdvo_get_modes() and half from intel_sdvo_tmds_sink_detect()).
All called through drm_ioctl() -> drm_mode_getconnector() -> drm_helper_probe_single_connector_modes().
Which I guess isn't anything interesting, but that intel_sdvo_get_analog_edid() thing seems to be very much a Mac mini hack. There's a comment about that in the sources too:
/* * Mac mini hack. On this device, the DVI-I connector shares one DDC * link between analog and digital outputs. So, if the regular SDVO * DDC fails, check to see if the analog output is disconnected, in * which case we'll look there for the digital DDC data. */
and maybe that mac mini hack ends up interacting badly with something else? I'll happily test patches if people have any ideas.
Linus
On 02/23/2012 06:15 PM, Linus Torvalds wrote:
On Thu, Feb 23, 2012 at 11:52 AM, Chris Wilsonchris@chris-wilson.co.uk wrote:
i2c retries if sees an EGAIN, drm_do_probe_ddc_edid retries until it gets a result and *then* drm_do_get_edid retries until it gets a result it is happy with. All in all, that is a lot of processor intensive looping in cases where we do not expect and cannot get valid data - for example on Intel with disconnected hardware we will busy-spin until we hit the i2c timeout. This is then repeated for every connector when querying the current status of outputs.
Sadly, this doesn't seem to make any difference to my case. My xrandr stays at 0.555s even with this patch.
Perhaps a stupid question, but does you tree has http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-next&id=9292f37... from Dave's drm-next?
If it has, it would be the 1st time that I see xrandr take longer than .5s with that patch on an Intel GPU. We even added a check for this into intel-gpu-tools to warn us if any machine takes that long, and none had hit it so far. So if this is the case here, there is something Mac Mini-specific indeed to investigate.
-Eugeni
On Thu, Feb 23, 2012 at 1:36 PM, Eugeni Dodonov eugeni.dodonov@intel.com wrote:
Perhaps a stupid question, but does you tree has http://cgit.freedesktop.org/~airlied/linux/commit/?h=drm-next&id=9292f37... from Dave's drm-next?
If it has, it would be the 1st time that I see xrandr take longer than .5s with that patch on an Intel GPU. We even added a check for this into intel-gpu-tools to warn us if any machine takes that long, and none had hit it so far. So if this is the case here, there is something Mac Mini-specific indeed to investigate.
Yup, the tree I tested was current -git with the two commits mentioned in this thread:
drm: Reduce the number of retries whilst reading EDIDs drm: give up on edid retries when i2c bus is not responding
applied separately on top of it.
Linus
On Thu, 2012-02-23 at 12:15 -0800, Linus Torvalds wrote:
Btw, clearly X does *not* cache the EDID results, at least not for this case. So the explicit xrandr example is probably pretty close to what wine does. Maybe the proper fix is to just make X.org force caching when clients do this (because it's definitely X that does the drm_mode_getconnector() thing - xrandr itself spends zero time on this, it just does an X request and waits for the result).
RANDR exposes two requests here for a reason. We used to only have RRGetScreenResources which always pulled data afresh. We added RRGetScreenResourcesCurrent to get the cached version. If you want xrandr(1) to use the latter say xrandr --current. Arguably xrandr should do things the other way around and require you to say --reprobe or something.
Even then, the kernel should cache if it can. And any competent hardware has plug interrupts on DP and DVI, so we really should get this right.
- ajax
dri-devel@lists.freedesktop.org