I'm stuck at home with just my i5 laptop due to the office being shut due to the ongoing floods. But I've booted and ran this for a few hours and it seems to be better than the current tree. It contains a couple of patches to fix DMAR interaction issues I see on this laptop on top of Chris's pull.
Dave.
The following changes since commit 4162cf64973df51fc885825bc9ca4d055891c49f:
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 (2011-01-11 16:32:41 -0800)
are available in the git repository at:
ssh://master.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6.git drm-fixes
Chris Wilson (25):
      drm/i915/sdvo: Defer detection of output capabilities until probing
      drm/i915/panel: Only record the backlight level when it is enabled
      drm/i915/lvds: Always use 0 to disable the pfit controller
      drm/i915: Use the mappable sizes determined by GTT for consistency.
      drm/i915: Workaround erratum on i830 for TAIL pointer within last 2 cachelines
      agp/intel: Flush the chipset write buffers when changing GTT base
      drm/i915: add 'reset' parameter
      drm/i915: Remove impossible test
      drm/i915: Enforce write ordering through the GTT
      drm/i915: Handle ringbuffer stalls when flushing
      drm/i915: Mask USER interrupts on gen6 (until required)
      drm/i915/debugfs: Show the per-ring IMR
      drm/i915/ringbuffer: Simplify the ring irq refcounting
      drm/i915: Make the ring IMR handling private
      drm/i915: Propagate error from flushing the ring
      drm/i915: Include TLB miss overhead for computing WM
      drm/i915: Record the error batchbuffer on each ring
      drm/i915/gtt: Unmap the PCI pages after unbinding them from the GTT
      drm/i915: Periodically flush the active lists and requests
      drm/i915: Record AGP memory type upon error
      drm/i915/debugfs: Show all objects in the gtt
      drm/i915/execbuffer: Correctly clear the current object list upon EFAULT
      drm/i915/evict: Ensure we completely cleanup on failure
      drm/i915: If we hit OOM when allocating GTT pages, clear the aperture
      drm/i915/execbuffer: Reorder binding of objects to favour restrictions
Dave Airlie (3):
      Merge branch 'drm-intel-fixes' of ssh://master.kernel.org/.../ickle/drm-intel
      i915/gtt: fix ordering issues with status setup and DMAR
      i915/gtt: fix ordering causing DMAR errors on object teardown.
David Müller (1):
      drm/i915/crt: Check for a analog monitor in case of DVI-I
Jesse Barnes (9):
      drm/i915: check eDP encoder correctly when setting modes
      drm/i915: make DP training try a little harder
      drm/i915: support overclocking on Sandy Bridge
      drm/i915: support low power watermarks on Ironlake
      drm/i915: avoid reading non-existent PLL reg on Ironlake+
      drm/i915: re-enable rc6 support for Ironlake+
      drm/i915: fix rc6 enabling around suspend/resume
      drm/i915: cleanup rc6 code
      drm/i915: detect & report PCH display error interrupts
Yuanhan Liu (2):
      drm/i915: fix calculation of eDP signal levels on Sandybridge
      drm/i915: fix the wrong latency value while computing wm0
 drivers/char/agp/intel-agp.h | 2 +
 drivers/char/agp/intel-gtt.c | 17 +-
 drivers/gpu/drm/i915/i915_debugfs.c | 87 +++++-
 drivers/gpu/drm/i915/i915_dma.c | 8 -
 drivers/gpu/drm/i915/i915_drv.c | 9 +
 drivers/gpu/drm/i915/i915_drv.h | 24 +-
 drivers/gpu/drm/i915/i915_gem.c | 156 +++++++---
 drivers/gpu/drm/i915/i915_gem_evict.c | 9 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 119 +++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 10 +-
 drivers/gpu/drm/i915/i915_irq.c | 269 +++++++-----------
 drivers/gpu/drm/i915/i915_reg.h | 95 ++++++-
 drivers/gpu/drm/i915/i915_suspend.c | 8 +-
 drivers/gpu/drm/i915/intel_crt.c | 30 ++-
 drivers/gpu/drm/i915/intel_display.c | 434 ++++++++++++++++------------
 drivers/gpu/drm/i915/intel_dp.c | 50 +++-
 drivers/gpu/drm/i915/intel_drv.h | 3 +
 drivers/gpu/drm/i915/intel_fb.c | 20 +-
 drivers/gpu/drm/i915/intel_lvds.c | 14 +-
 drivers/gpu/drm/i915/intel_panel.c | 31 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 255 ++++++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.h | 36 ++-
 drivers/gpu/drm/i915/intel_sdvo.c | 33 +--
 23 files changed, 1082 insertions(+), 637 deletions(-)
On Tue, Jan 11, 2011 at 10:03 PM, Dave Airlie airlied@linux.ie wrote:
I'm stuck at home with just my i5 laptop due to the office being shut due to the ongoing floods. But I've booted and ran this for a few hours and it seems to be better than the current tree. It contains a couple of patches to fix DMAR interaction issues I see on this laptop on top of Chris's pull.
Hmm. I'm not seeing the screensaver issue any more, but there's something wrong with video. At least the TED ones (I'm not seeing it on a youtube video i tried). See for example
http://www.ted.com/talks/lang/eng/david_gallo_shows_underwater_astonishments...
and when there is fast movement in the video (like when the octopus is spooked), I get these odd lines of noise.
In fact, while I noticed the lines in the video itself, it's actually most repeatably noticeable in the buttons underneath while the video is playing: make your mouse go back-and-forth between the "rate" and "share" buttons, and they get corrupted (and it also corrupts the progress bar).
It looks a bit like the noise you get with insufficient memory bandwidth, but I doubt that's the case here. Perhaps just some motion-comp problem?
Any ideas?
Linus
On Wed, 12 Jan 2011 11:22:58 -0800 Linus Torvalds torvalds@linux-foundation.org wrote:
On Tue, Jan 11, 2011 at 10:03 PM, Dave Airlie airlied@linux.ie wrote:
I'm stuck at home with just my i5 laptop due to the office being shut due to the ongoing floods. But I've booted and ran this for a few hours and it seems to be better than the current tree. It contains a couple of patches to fix DMAR interaction issues I see on this laptop on top of Chris's pull.
Hmm. I'm not seeing the screensaver issue any more, but there's something wrong with video. At least the TED ones (I'm not seeing it on a youtube video i tried). See for example
http://www.ted.com/talks/lang/eng/david_gallo_shows_underwater_astonishments...
and when there is fast movement in the video (like when the octopus is spooked), I get these odd lines of noise.
In fact, while I noticed the lines in the video itself, it's actually most repeatably noticeable in the buttons underneath while the video is playing: make your mouse go back-and-forth between the "rate" and "share" buttons, and they get corrupted (and it also corrupts the progress bar).
It looks a bit like the noise you get with insufficient memory bandwidth, but I doubt that's the case here. Perhaps just some motion-comp problem?
Any ideas?
Since I doubt we're actually offloading to our video decode kernels for Flash video on your machine, it could very well be a memory bw issue. Can you try this small patch to see if one of the low power watermarks is giving you trouble (note: cut & pasted)?
It could also be the normal power watermarks though too; you could just make plane-wm and cursor_wm higher to test that.
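For readers unfamiliar with display watermarks, the idea behind raising them can be sketched roughly as follows. This is a generic model, not the i915 driver's code: the function name, the 64-byte cacheline, and the guard value are all assumptions made for illustration. A watermark is the number of FIFO entries the display plane needs buffered so that scanout can continue while a memory request is outstanding; a higher value gives more headroom against underruns.

```python
def fifo_watermark(pixel_clock_khz, bytes_per_pixel, latency_ns,
                   cacheline_bytes=64, guard_entries=2):
    """Estimate the FIFO entries needed to cover one memory latency window.

    All parameters and defaults are illustrative assumptions, not values
    taken from the i915 driver.
    """
    # Bytes scanned out during the latency window:
    #   (kHz * 1e3 pixels/s) * bytes/pixel * (ns * 1e-9 s)
    #   = kHz * bytes/pixel * ns / 1e6
    numerator = pixel_clock_khz * bytes_per_pixel * latency_ns
    bytes_needed = -(-numerator // 1_000_000)      # ceiling division
    # The FIFO is filled in whole cachelines, so round up again.
    entries = -(-bytes_needed // cacheline_bytes)
    return entries + guard_entries


if __name__ == "__main__":
    # A 148.5 MHz pixel clock at 32 bpp with an assumed 700 ns latency.
    print(fifo_watermark(148_500, 4, 700))
    # Doubling the assumed latency roughly doubles the requirement.
    print(fifo_watermark(148_500, 4, 1400))
```

If the corruption were a bandwidth underrun, a larger watermark would be expected to reduce or remove it, which is why the test of simply raising the values is a useful discriminator.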
On Wed, Jan 12, 2011 at 11:46 AM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
Since I doubt we're actually offloading to our video decode kernels for Flash video on your machine
It's the latest 64-bit beta flash player, so maybe it does use hw acceleration.
it could very well be a memory bw issue.
Can you try this small patch to see if one of the low power watermarks is giving you trouble (note: cut & pasted)?
No difference.
It could also be the normal power watermarks though too; you could just make plane-wm and cursor_wm higher to test that.
I multiplied them by two, no difference. The patch I used attached.
Does nobody else see this?
Linus
On Wed, Jan 12, 2011 at 12:27 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 11:46 AM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
Since I doubt we're actually offloading to our video decode kernels for Flash video on your machine
It's the latest 64-bit beta flash player, so maybe it does use hw acceleration.
it could very well be a memory bw issue. Can you try this small patch to see if one of the low power watermarks is giving you trouble (note: cut & pasted)?
No difference.
Oh, and I'm also seeing corruption on my sandybridge machine. No video involved, the gdm login screen is already corrupted this way. Similar odd shifted lines etc, so I'd assume it's related.
Linus
On Wed, 12 Jan 2011 13:28:53 -0800 Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 12:27 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 11:46 AM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
Since I doubt we're actually offloading to our video decode kernels for Flash video on your machine
It's the latest 64-bit beta flash player, so maybe it does use hw acceleration.
it could very well be a memory bw issue. Can you try this small patch to see if one of the low power watermarks is giving you trouble (note: cut & pasted)?
No difference.
Oh, and I'm also seeing corruption on my sandybridge machine. No video involved, the gdm login screen is already corrupted this way. Similar odd shifted lines etc, so I'd assume it's related.
Ah, ok. So it could be our internal FDI link is underrunning; it goes between the CPU and PCH and carries display bits.
Are these both desktop type machines with DVI attached monitors?
If it's an FDI or transcoder problem, something like the below may give us more info.
Can you take a picture of the corruption? If I see it I can try to reproduce it here by messing with FDI, transcoder, and DP link settings to see if they're the problem.
On Wed, Jan 12, 2011 at 2:22 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
Ah, ok. So it could be our internal FDI link is underrunning; it goes between the CPU and PCH and carries display bits.
I'm not sure it's an underrun or anything like that: the corruption is long-term in the non-video case. So I take back the "looks like memory bandwidth problems", because it really looks more like a corrupted blit operation there.
Are these both desktop type machines with DVI attached monitors?
DVI on the Core i5, plain analog VGA on the sandybridge one (I can hear you asking "Why?". Because the silly intel motherboard doesn't _have_ DVI out, and I didn't have a hdmi cable)
If it's an FDI or transcoder problem, something like the below may give us more info.
See above. It's long-term, it was just the video behavior that made me originally think it was temporary.
Can you take a picture of the corruption?
Will do. I'll have to reboot to the broken kernel (my bisection ended in a non-broken case)
Linus
On Wed, 12 Jan 2011 14:31:33 -0800 Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 2:22 PM, Jesse Barnes jbarnes@virtuousgeek.org wrote:
Ah, ok. So it could be our internal FDI link is underrunning; it goes between the CPU and PCH and carries display bits.
I'm not sure it's an underrun or anything like that: the corruption is long-term in the non-video case. So I take back the "looks like memory bandwidth problems", because it really looks more like a corrupted blit operation there.
Ah ok if it's long running then yeah it's more likely to be a rendering issue. It could also be the FDI link getting its timings messed up though, and consistently delivering the wrong bits; that could show up in the same place on the screen each time, or it might move in a pattern across the screen (usually from top to bottom).
Will do. I'll have to reboot to the broken kernel (my bisection ended in a non-broken case)
Great, thanks.
On Wed, Jan 12, 2011 at 1:28 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
Oh, and I'm also seeing corruption on my sandybridge machine. No video involved, the gdm login screen is already corrupted this way. Similar odd shifted lines etc, so I'd assume it's related.
Hmm. I bisected it down to
commit 6fe4f14044f181e146cdc15485428f95fa541ce8
Author: Chris Wilson chris@chris-wilson.co.uk
Date: Mon Jan 10 17:35:37 2011 +0000
drm/i915/execbuffer: Reorder binding of objects to favour restrictions
on my sandybridge machine. Chris?
Linus
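The bisection used here is a binary search over the commit history for the first commit that shows the bug. A minimal sketch of the idea, assuming a linear history and a reliable good/bad test (real `git bisect` also handles merges, which this does not); `first_bad` and `is_bad` are hypothetical names:

```python
def first_bad(commits, is_bad):
    """Return the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # the culprit is at mid or earlier
        else:
            good = mid     # the culprit is after mid
    return commits[bad]


if __name__ == "__main__":
    history = list(range(40))           # stand-in for 40 commits
    print(first_bad(history, lambda c: c >= 27))
```

Each test halves the remaining range, so a pull of about 40 commits needs only around five or six boots to isolate the culprit.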
On Wed, 12 Jan 2011 14:24:17 -0800, Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 1:28 PM, Linus Torvalds torvalds@linux-foundation.org wrote:
Oh, and I'm also seeing corruption on my sandybridge machine. No video involved, the gdm login screen is already corrupted this way. Similar odd shifted lines etc, so I'd assume it's related.
Hmm. I bisected it down to
commit 6fe4f14044f181e146cdc15485428f95fa541ce8
Author: Chris Wilson chris@chris-wilson.co.uk
Date: Mon Jan 10 17:35:37 2011 +0000
drm/i915/execbuffer: Reorder binding of objects to favour restrictions
on my sandybridge machine. Chris?
Wow. That should have had zero visible impact upon the rendering. All it should have done is reorder the sequence in which we pin the buffers into the GTT before applying the relocations, just to allow some pathological execbuffers.
Just the SNB machine?
-Chris
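The intent described above, binding objects with placement restrictions before unrestricted ones, can be modelled as a stable partition of the object list. This is only a sketch of the idea, not the code from the commit; `BufferObject`, `needs_mappable`, and `needs_fence` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class BufferObject:
    name: str
    needs_mappable: bool = False   # hypothetical: must sit in the CPU-visible aperture
    needs_fence: bool = False      # hypothetical: needs a fence register


def order_for_binding(objects):
    """Put restricted objects first, keeping submission order within each group."""
    # sorted() is stable and False sorts before True, so restricted objects
    # (key False) come first without being shuffled among themselves.
    return sorted(objects, key=lambda o: not (o.needs_mappable or o.needs_fence))


if __name__ == "__main__":
    objs = [
        BufferObject("a"),
        BufferObject("b", needs_mappable=True),
        BufferObject("c"),
        BufferObject("d", needs_fence=True),
    ]
    print([o.name for o in order_for_binding(objs)])
```

Only the order in which objects are placed changes. Any later step that indexes objects by their original position has to keep using the original list, otherwise it will pair an entry with the wrong object.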
On Wed, Jan 12, 2011 at 2:40 PM, Chris Wilson chris@chris-wilson.co.uk wrote:
Wow. That should have had zero visible impact upon the rendering. All it should have done is reorder the sequence in which we pin the buffers into the GTT before applying the relocations, just to allow some pathological execbuffers.
Just the SNB machine?
No. I just checked. Reverting that commit on my other machine makes that TED video on my Core i5 machine look fine too.
So it's definitely the same bug on both Sandybridge and Core-i5 (I guess that's "Ironlake" in the crazy intel codename naming), just two slightly different symptoms. And I worried a bit that my bisect was bogus, but with the revert clearing it up on the other machine, I'm confident the bisect was good too.
On my sandybridge machine, the corruption happens already at the gdm login screen, which is why I used that one to bisect things. I'm including a (bad) photo taken with my cellphone of what the corruption looks like - see how the "sandybridge.linux-foundation.org" machine name text has been corrupted, and obviously my name (and the "e" in Other). And that blue rounded rectangle should contain "Log in as torvalds" or something like that, but instead it's clear.
Linus
On Wed, 12 Jan 2011 15:05:36 -0800, Linus Torvalds torvalds@linux-foundation.org wrote:
On Wed, Jan 12, 2011 at 2:40 PM, Chris Wilson chris@chris-wilson.co.uk wrote:
Just the SNB machine?
No. I just checked. Reverting that commit on my other machine makes that TED video on my Core i5 machine look fine too.
So it's definitely the same bug on both Sandybridge and Core-i5 (I guess that's "Ironlake" in the crazy intel codename naming), just two slightly different symptoms. And I worried a bit that my bisect was bogus, but with the revert clearing it up on the other machine, I'm confident the bisect was good too.
On my sandybridge machine, the corruption happens already at the gdm login screen, which is why I used that one to bisect things. I'm including a (bad) photo taken with my cellphone of what the corruption looks like - see how the "sandybridge.linux-foundation.org" machine name text has been corrupted, and obviously my name (and the "e" in Other). And that blue rounded rectangle should contain "Log in as torvalds" or something like that, but instead it's clear.
Yes, that looks consistent with using the wrong relocation entry or GTT offset within the batch.
Thanks,
-Chris
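To see why a wrong relocation entry or GTT offset produces this kind of corruption, consider a toy model of relocation. A batch contains placeholder words that must be patched with the final GTT address of the buffer they refer to. The data layout and names below (`apply_relocations`, `index`, `target`, `delta`) are assumptions for illustration and do not mirror the kernel's structures.

```python
def apply_relocations(batch, relocs, gtt_offsets):
    """Return a copy of batch with each relocated word set to offset + delta.

    batch:       list of ints standing in for command words
    relocs:      list of dicts with 'index', 'target', and 'delta' keys
    gtt_offsets: dict mapping buffer name to its GTT offset
    """
    patched = list(batch)
    for r in relocs:
        patched[r["index"]] = gtt_offsets[r["target"]] + r["delta"]
    return patched


if __name__ == "__main__":
    batch = [0xDEAD, 0x0, 0xBEEF]
    relocs = [{"index": 1, "target": "dst", "delta": 0x10}]

    final = {"dst": 0x20000}    # where the buffer actually ended up
    stale = {"dst": 0x8000}     # an offset from before it was moved

    print(hex(apply_relocations(batch, relocs, final)[1]))
    print(hex(apply_relocations(batch, relocs, stale)[1]))
```

With the stale offset the command still executes, but it reads or writes the wrong memory, which would show up as persistent misplaced or missing rendering rather than a transient glitch.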
dri-devel@lists.freedesktop.org