Hi everyone,
(please Cc)
I am running 3.7-rc2 and have recently been hit a few times (under rc1, too) by i915 DRM hangs while doing large I/O operations.
The effect in the dmesg:
[13193.297751] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[13193.297758] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[13193.302728] [drm:init_ring_common] *ERROR* failed to set render ring head to zero ctl 00000000 head 85a05e3c tail 00000000 start 00003000
[13193.357584] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head 85a05e3c tail 00000000 start 00003000
[13194.861769] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[13194.861838] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[13194.861840] [drm:i915_reset] *ERROR* Failed to reset chip.
I captured the i915_error_state and uploaded it here: http://www.logic.at/people/preining/drm_i915_error_state.gz
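For anyone wanting to grab the error state the same way, here is a minimal sketch. It assumes debugfs is mounted at /sys/kernel/debug (the kernel message shortens the path to /debug/dri/0/i915_error_state); the helper name save_error_state and the default output file name are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: compress a captured GPU error state so it can be attached to a report.
# $1: source file (assumed default: debugfs mounted at /sys/kernel/debug)
# $2: output file (illustrative default name)
save_error_state() {
    src="${1:-/sys/kernel/debug/dri/0/i915_error_state}"
    out="${2:-i915_error_state.gz}"
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is debugfs mounted, are you root?)" >&2
        return 1
    fi
    # Write a gzip-compressed copy and print its name on success.
    gzip -c "$src" > "$out" && echo "$out"
}
```

Running save_error_state as root shortly after the hang, before rebooting, should leave a compressed copy in the current directory.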
The hangs have usually been triggered by an svn up or a git checkout in a very big repository, or something similar.
Otherwise the system is Debian/unstable. The above output and error state are from after a reboot without any suspends or other tricks in between, uptime 3.5h.
Best wishes and thanks for any suggestions
Norbert ------------------------------------------------------------------------ Norbert Preining preining@{jaist.ac.jp, logic.at, debian.org} JAIST, Japan TeX Live & Debian Developer DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094 ------------------------------------------------------------------------ CORRIEMOILLIE (n.) The dreadful sinking sensation in a long passageway encounter when both protagonists immediately realise they have plumped for the corriedoo (q.v.) much too early as they are still a good thirty yards apart. They were embarrassed by the pretence of corriecravie (q.v.) and decided to make use of the corriedoo because they felt silly. This was a mistake as corrievorrie (q.v.) will make them seem far sillier. --- Douglas Adams, The Meaning of Liff
> (please Cc)
> I am running 3.7-rc2 and have recently been hit a few times (under rc1, too) by i915 DRM hangs while doing large I/O operations.
Does booting with i915.i915_enable_rc6=0 help?
(Daniel, looks like an ironlake).
Dave.
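A small sketch for checking whether the parameter suggested above actually made it onto the kernel command line. The helper name rc6_from_cmdline is invented here; it only parses text (by default the contents of /proc/cmdline) and does not query the driver itself.

```shell
#!/bin/sh
# Print the value of i915.i915_enable_rc6 found in a kernel command line,
# or "unset" if the parameter is absent. If it appears more than once,
# the last occurrence is reported.
# $1: command line string (defaults to the running kernel's /proc/cmdline)
rc6_from_cmdline() (
    set -f                                   # no glob expansion while word-splitting
    cmdline="${1:-$(cat /proc/cmdline)}"
    value="unset"
    for word in $cmdline; do
        case "$word" in
            i915.i915_enable_rc6=*) value="${word#i915.i915_enable_rc6=}" ;;
        esac
    done
    echo "$value"
)
```

Calling rc6_from_cmdline without arguments after a reboot shows what the running kernel was booted with.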
Hi Dave,
(switched to freedesktop for dri-devel)
> Does booting with i915.i915_enable_rc6=0 help?
Will try immediately.
> (Daniel, looks like an ironlake).
Sorry, I forgot that one ... how stupid.
From XOrg.0.log:
...
[ 13535.841] (II) intel(0): Integrated Graphics Chipset: Intel(R) Arrandale
[ 13535.841] (--) intel(0): Chipset: "Arrandale"
...

00:02.0 0300: 8086:0046 (rev 02) (prog-if 00 [VGA controller])
        Subsystem: 17aa:215a
        Flags: bus master, fast devsel, latency 0, IRQ 42
        Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
        Memory at d0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 1800 [size=8]
        Expansion ROM at <unassigned> [disabled]
        Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [a4] PCI Advanced Features
        Kernel driver in use: i915

00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 02)
Does that make any difference?
Best wishes
Norbert
On Tue, 23 Oct 2012 14:38:30 +0900, Norbert Preining preining@logic.at wrote:
Looks like fallout from a missing ILK rc6 workaround - it looks like the write to the ring tail never landed and so the command streamer hung.
See https://bugs.freedesktop.org/show_bug.cgi?id=55984 and http://cgit.freedesktop.org/~danvet/drm/log/?h=ilk-wa-pile of which I think http://cgit.freedesktop.org/~danvet/drm/commit/?h=ilk-wa-pile&id=0d5fed2... is the missing ingredient.
-Chris
Hi Dave, hi Chris,
thanks for your answers.
On Tue, 23 Oct 2012, Dave Airlie wrote:
> Does booting with i915.i915_enable_rc6=0 help?
No, I booted with that and it happened again on a completely idle system (well, I believe completely idle, I was doing the dishes ;-)
[12437.995026] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[12437.995034] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[12438.000213] [drm:init_ring_common] *ERROR* failed to set render ring head to zero ctl 00000000 head 5ee06f14 tail 00000000 start 00003000
[12438.054894] [drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head 5ee06f14 tail 00000000 start 00003000
[12439.583064] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[12439.583176] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[12439.583182] [drm:i915_reset] *ERROR* Failed to reset chip.
The new output is here: http://www.logic.at/people/preining/i915_error_state.gz
> http://cgit.freedesktop.org/~danvet/drm/commit/?h=ilk-wa-pile&id=0d5fed2... is the missing ingredient.
I am compiling a kernel with this patch based on current git now. Should I still use the above kernel cmd argument (i915...rc6=0) or try without it?
Best wishes
Norbert
On Wed, 24 Oct 2012 09:36:59 +0900, Norbert Preining preining@logic.at wrote:
> New output see here: http://www.logic.at/people/preining/i915_error_state.gz
That has a very similar look to it, so reasonable to assume that is the same issue.
>> http://cgit.freedesktop.org/~danvet/drm/commit/?h=ilk-wa-pile&id=0d5fed2... is the missing ingredient.
> I am compiling a kernel with this patch based on current git now. Should I still use the above kernel cmd argument (i915...rc6=0) or try without it?
Without any rc6 parameter would be best. But if rc6=0 wasn't the solution for you, then I may have identified the wrong w/a. Can I ask you to try the patches in that branch until you find one (or more perhaps) that stabilise your system?
-Chris
Hi Chris,
I haven't answered sooner because several reboots were necessary (sometimes I have to work on Win***), and there was no effect, but ..
On Wed, 24 Oct 2012, Chris Wilson wrote:
> Without any rc6 parameter would be best. But if rc6=0 wasn't the solution for you, then I may have identified the wrong w/a. Can I ask you to try the patches in that branch until you find one (or more perhaps) that stabilise your system?
I pulled the whole branch into my compile branch, removed everything regarding rc6 from the kernel command line, and got:
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[drm:i915_reset] *ERROR* Failed to reset chip.
The new i915_error_state.gz is at the same place.
So it seems that the patches in the ilk-wa-pile branch do not help.
All the best
Norbert
On Sun, 28 Oct 2012 11:47:53 +0900, Norbert Preining preining@logic.at wrote:
> So it seems that the patches in the ilk-wa-pile branch do not help.
Yeah, looks like we have another issue to contend with, so can you please file a bug on bugzilla.freedesktop.org (or bugzilla.kernel.org) so that we don't lose track of it.
If you have the option, can you switch the ddx between using SNA and UXA? They stress different paths through the driver, which may provide a clue.
Thanks,
-Chris
Hi Chris,
> so can you please file a bug on bugzilla.freedesktop.org (or bugzilla.kernel.org) so that we don't lose track of it.
Will do when I'm back from the mountains.
> If you have the option, can you switch the ddx between using SNA and UXA.
??? Is that a BIOS option? Or kernel? I can try both.
Norbert
(on mobile)
On Sun, Oct 28, 2012 at 21:32:53 +0900, Norbert Preining wrote:
>> If you have the option, can you switch the ddx between using SNA and UXA.
> ??? Is that a BIOS option? Or kernel? I can try both.
It is an option in the Intel Xorg driver. Which one is used by default depends on the options given to the configure script at build time. You can see the current state in your Xorg.0.log.
Here is an example of /etc/X11/xorg.conf which enforces SNA:
Section "Device"
    Option "AccelMethod" "SNA"
    Identifier "Card0"
    Driver "intel"
EndSection
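To check which backend is active, one can search the Xorg log as suggested above. A minimal sketch, assuming the log lives at /var/log/Xorg.0.log; the helper name accel_lines is made up, and the exact wording of the matching lines differs between driver versions, so this simply prints every line that mentions SNA or UXA.

```shell
#!/bin/sh
# Print the lines of an Xorg log that mention the SNA or UXA acceleration backends.
# $1: log file (assumed default: /var/log/Xorg.0.log)
accel_lines() {
    log="${1:-/var/log/Xorg.0.log}"
    grep -E 'SNA|UXA' "$log"
}
```

If nothing is printed, the log gives no hint and the driver's build-time default applies.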
Regards, Tino
On Mon, 29 Oct 2012, Tino Keitel wrote:
> Section "Device"
>     Option "AccelMethod" "SNA"
>     Identifier "Card0"
>     Driver "intel"
> EndSection
Thanks, running now with SNA. Let us see what happens.
Best wishes
Norbert
On Tue, Oct 30, 2012 at 10:49 AM, Norbert Preining preining@logic.at wrote:
> Thanks, running now with SNA. Let us see what happens.
Please don't. We ain't going to find the bug any quicker by changing variables: if the only thing that changed on your system was the kernel, then we need to figure out which kernel changes caused it and remove them. Changing userspace only complicates things and makes it less likely we'll ever find the regression. Once we find the regression, changing userspace options to help understand it is more reasonable.
How long does it take you to reproduce, and does it happen when in actual use? On my laptop I've noticed I come back to it sometimes and gnome-shell is dead. This never happened pre 3.7-rc's. But for me it's a 3-4 day window so far for it to die, which makes bisecting it a bit of a major problem, and I've just finished bisecting the last Ironlake regression, which took over a month.
I would suggest starting a bisect on drivers/gpu/drm/i915 from 3.6 final to 3.7-rc1 or maybe -rc2.
Dave.
Hi Dave,
On Tue, 30 Oct 2012, Dave Airlie wrote:
>> Thanks, running now with SNA. Let us see what happens.
> Please don't, we ain't going to find the bug any quicker changing variables, if the only thing that changed on your system was the
Sorry, I didn't know. I understood from Chris's email that I should try it "to stress different code paths" ... anyway, disabling it again.
> How long does it take you to reproduce, and does it happen when in
Very hard to say; most of the time it is on a scale of a few days, though once it also happened after a few hours.
> actual use. On my laptop I've noticed I come back to it sometimes and
Concerning actual use: I had instances on several occasions. Just 30min ago it happened while working with shotwell on my photo collection, tagging photos. So there should not have been much disk activity, but a lot of screen redraws etc. when going through the photos. At other times it was a locked screen without a screen saver.
Concerning coming back: for me it never worked. I always have to reboot to get a working state again. OK, to be more specific: GNOME 3 is dead. I can close windows normally with keyboard shortcuts and some mouse interaction, but I cannot open new windows, move windows, etc.
> gnome-shell is dead. This never happened pre 3.7-rc's. But for me its a 3-4 day window so far for it to die, which makes bisecting it a bit
That sounds pretty much like my case, but since I often don't use the laptop for 2 days or so, it might be a bit longer.
> of a major problem. and I'm just finished bisecting the last Ironlake regression that took over a month.
Ouch ...
> I would suggest starting a bisect on drivers/gpu/drm/i915 from 3.6 final to 3.7-rc1 or maybe -rc2.
OK, thanks. I will try.
Best wishes
Norbert
On Tue, 30 Oct 2012 10:01:38 +0900 Norbert Preining preining@logic.at wrote:
Hi Norbert. In addition to the above, if this truly appears to be related to i/o, can we try to decrease the time to failure with some serious i/o tests? Off the top of my head I am not sure what's available, but surely Google should be able to find something.
On Mon, 29 Oct 2012, Ben Widawsky wrote:
> Hi Norbert. In addition to the above, if this truly appears to be related to i/o, can we try to decrease the time to failure with some
I am *not* sure. As I said, the last time it happened during shotwell photo editing. There might be some I/O while loading the photos, but after that they are in the cache, and the only thing done is lots of displaying.
> serious i/o tests? Off the top of my head I am not sure what's
Anyway, that is my idea. I don't think I need Google: a simple svn up on my 15 GB svn repository creates enough I/O, and doing a git pull or so on similarly sized repositories in parallel brings the laptop to its knees anyway (actually, badly to its knees).
Best wishes
Norbert
Hi all,
On Tue, 30 Oct 2012, Dave Airlie wrote:
> I would suggest starting a bisect on drivers/gpu/drm/i915 from 3.6 final to 3.7-rc1 or maybe -rc2.
Sorry for my ignorance ... on the master branch I did

$ git checkout v3.7-rc1
...
$ git bisect start drivers/gpu/drm/i915
$ git bisect bad
$ git bisect good v3.6
Bisecting: 121 revisions left to test after this (roughly 7 steps)
[25c5b2665fe4cc5a93edd29b62e7c05c15dddd26] drm/i915: implement new set_mode code flow
$

and after that I am back somewhere around 3.6.0-rc2 ???
Am I doing something wrong? I thought I was bisecting between 3.6 and 3.7-rc2. How can I end up back at 3.6.0-rc2?
Best wishes
Norbert
On Sun, Nov 4, 2012 at 10:44 AM, Norbert Preining preining@logic.at wrote:
Yeah, that's fine. Bisecting works by going to where commits were originally committed, so drm-intel-next was at 3.6.0-rc2 at some point and was only merged into Linus' tree later.
Dave.
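The point above can be reproduced on a toy repository. The sketch below (all branch and tag names are invented, and it needs git installed) creates a topic branch that forks off at a tag v1-rc2 and is merged into the mainline only after v1 has been tagged. git describe then reports the topic commit as based on the older tag, even though it first appears in the mainline under the later one, which is the situation a bisect between 3.6 and 3.7-rc1 lands in.

```shell
#!/bin/sh
# Toy history: topic branch forks at "v1-rc2", mainline is tagged "v1",
# and only afterwards is the topic branch merged and tagged "v2-rc1".
# Runs in a subshell so the directory change and "set -e" do not leak.
demo_topic_history() (
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git symbolic-ref HEAD refs/heads/mainline   # fixed branch name, whatever the git default is
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

    git commit -q --allow-empty -m "base"
    git tag v1-rc2

    git checkout -q -b topic
    git commit -q --allow-empty -m "driver work on topic branch"
    topic_commit=$(git rev-parse HEAD)

    git checkout -q mainline
    git commit -q --allow-empty -m "release"
    git tag v1

    git merge -q --no-ff -m "merge topic" topic
    git tag v2-rc1

    # Nearest tag behind the topic commit: the version the tree claims when checked out.
    echo "based on:  $(git describe --tags "$topic_commit")"
    # A tag that contains the topic commit: where it reached the mainline.
    echo "merged in: $(git describe --contains "$topic_commit")"
)
```

Running demo_topic_history prints two lines; the first names the old tag and the second the later one, for the same commit.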
On Sun, 04 Nov 2012, Dave Airlie wrote:
> Yeah, that's fine. Bisecting works by going to where commits were originally committed, so drm-intel-next was at 3.6.0-rc2 at some point and was only merged into Linus' tree later.
Ok, thanks, didn't know that. Have started the bisect game now, coming back in about 1 year ;-)
Best wishes
Norbert
On Sunday 04 November 2012 16:08:47 Dave Airlie wrote:
> Yeah, that's fine. Bisecting works by going to where commits were originally committed, so drm-intel-next was at 3.6.0-rc2 at some point and was only merged into Linus' tree later.
As I mentioned on https://bugs.freedesktop.org/show_bug.cgi?id=55984, I also hit this bug. The first time was on branch drm-intel-next-2012-09-20 of Daniel Vetter's drm-intel git tree.
I guess it has something to do with low memory. To reproduce the bug on my laptop with 8GB RAM and an i5-460M, I did:
1. Boot (I use KDE).
2. Start glxspheres (from http://virtualgl.org/, but glxgears might work too, not tested).
3. Copy a 1.2 GiB Linux source tree to /dev/shm and /tmp (both tmpfs), 5 times. This uses 6 GiB of RAM. I used this bash script:

   #!/bin/bash
   # Start five parallel copies of the source tree into tmpfs, then wait for all of them.
   for i in /tmp/hang-l1 /tmp/hang-l2 /tmp/hang-l3 \
            /dev/shm/hang-l1 /dev/shm/hang-l2; do
       cp -ra ~/Linux-src/linux "$i" &
   done
   wait

4. When the copy is almost done, watch the machine become sluggish and eventually print the "[drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung" message to the kernel log. Until the machine is rebooted, all OpenGL applications will fail to load.
On kernels where it was working fine, there is no lag when the copy is almost finished.
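To spot the moment the hang is reported while running a reproducer like this, the kernel log can be filtered for the hangcheck message. A minimal sketch; the helper name count_gpu_hangs is made up, and the message text is the one quoted in this thread.

```shell
#!/bin/sh
# Count hangcheck reports in kernel log text read from stdin.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function usable in scripts that run with "set -e".
count_gpu_hangs() {
    grep -cF 'Hangcheck timer elapsed... GPU hung' || true
}
```

Typical use would be dmesg | count_gpu_hangs, repeated while the copy is running.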
504c7267a1e84b157cbd7e9c1b805e1bc0c2c846 is the first bad commit
commit 504c7267a1e84b157cbd7e9c1b805e1bc0c2c846
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Aug 23 13:12:52 2012 +0100

    drm/i915: Use cpu relocations if the object is in the GTT but not mappable

    This prevents the case of unbinding the object in order to process the
    relocations through the GTT and then rebinding it only to then proceed
    to use cpu relocations as the object is now in the CPU write domain. By
    choosing to use cpu relocations up front, we can therefore avoid the
    rebind penalty.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 090ed3d52b4f3210b988877f747b6ff86e123385 1d48be89ded4777a543b693db833de64877059c4 M drivers
Regards, Peter
Hi Chris,
On Sun, 28 Oct 2012, Chris Wilson wrote:
> Yeah, looks like we have another issue to contend with, so can you please file a bug on bugzilla.freedesktop.org (or bugzilla.kernel.org) so that we don't lose track of it.
I have seen this one: https://bugs.freedesktop.org/show_bug.cgi?id=55984
Does it make sense to open a new bug for that?
Best wishes
Norbert
On Tue, 30 Oct 2012 09:39:43 +0900, Norbert Preining preining@logic.at wrote:
> I have seen this one: https://bugs.freedesktop.org/show_bug.cgi?id=55984
> Does it make sense to open a new bug for that?
I was fearing it was something different, but since Dave has now found that rc6=0 was not sufficient in his case, it is probably the same. The issue surrounding cpu-relocs was never explained and I suspect that we are still being bitten by that root cause. Along those lines:
commit 86a1ee26bb60e1ab8984e92f0e9186c354670aed
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sat Aug 11 15:41:04 2012 +0100

    drm/i915: Only pwrite through the GTT if there is space in the aperture

is the most contentious patch in 3.7-rc.
-Chris
dri-devel@lists.freedesktop.org