Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at usb-0000:00:1d.0-1.2, CDC Ethernet Device, 22:1b:e4:4e:56:f5
[ 3190.767227] [drm] GPU HANG: ecode 6:0:0xbb409fff, in chromium [4597], reason: Hang on render ring, action: reset
[ 3190.767311] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 3190.767313] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 3190.767315] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 3190.767317] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 3190.767320] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 3190.767427] drm/i915: Resetting chip after gpu hang
[ 3228.329384] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
[ 3228.329604] cdc_ether 2-1.2:1.0 usb0: kevent 12 may have been dropped
[ 3877.246261] perf: interrupt took too long (3142 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
[ 4802.784478] drm/i915: Resetting chip after gpu hang
[ 4810.784851] drm/i915: Resetting chip after gpu hang
[ 4829.829795] drm/i915: Resetting chip after gpu hang
[ 4837.826154] drm/i915: Resetting chip after gpu hang
[ 5125.026814] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308257 end=308258) time 203 us, min 763, max 767, scanline start 761, end 771
[ 5125.192602] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe B (start=307385 end=307386) time 204 us, min 1073, max 1079, scanline start 1071, end 1086
[ 5125.309992] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308274 end=308275) time 203 us, min 763, max 767, scanline start 758, end 768
[ 5125.460013] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308283 end=308284) time 204 us, min 763, max 767, scanline start 761, end 771
[ 5125.493340] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308285 end=308286) time 202 us, min 763, max 767, scanline start 761, end 771
[ 5125.526684] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308287 end=308288) time 204 us, min 763, max 767, scanline start 762, end 772
[ 5125.593245] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308291 end=308292) time 203 us, min 763, max 767, scanline start 758, end 768
[ 5125.676636] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308296 end=308297) time 202 us, min 763, max 767, scanline start 762, end 772
[ 5125.709960] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308298 end=308299) time 203 us, min 763, max 767, scanline start 762, end 772
[ 5126.093109] [drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=308321 end=308322) time 204 us, min 763, max 767, scanline start 759, end 770
[ 5647.879171] drm/i915: Resetting chip after gpu hang
[ 5655.879507] drm/i915: Resetting chip after gpu hang
[ 5850.864464] drm/i915: Resetting chip after gpu hang
[ 5858.864853] drm/i915: Resetting chip after gpu hang
[ 5904.850879] drm/i915: Resetting chip after gpu hang
[ 5912.851252] drm/i915: Resetting chip after gpu hang
[ 5949.876973] drm/i915: Resetting chip after gpu hang
[ 5957.877460] drm/i915: Resetting chip after gpu hang
[ 6018.872153] drm/i915: Resetting chip after gpu hang
[ 6030.872646] drm/i915: Resetting chip after gpu hang
[ 7108.362610] perf: interrupt took too long (3935 > 3927), lowering kernel.perf_event_max_sample_rate to 50750
[ 9670.047072] drm/i915: Resetting chip after gpu hang
[ 9678.047415] drm/i915: Resetting chip after gpu hang
[10408.064806] drm/i915: Resetting chip after gpu hang
[10416.097168] drm/i915: Resetting chip after gpu hang
[10416.097181] [drm:i915_reset] *ERROR* GPU recovery failed
pavel@duo:/data/film$
Umm. Dmesg wants me to attach card0/error, but it looks like it contains quite a lot of data. If it contains actual framebuffer content, it may not be wise to post it to the mailing list...
Best regards, Pavel
On Tue, Feb 28, 2017 at 03:34:53PM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
...
Umm. Dmesg wants me to attach card0/error, but it looks like it contains quite a lot of data. If it contains actual framebuffer content, it may not be wise to post it to the mailing list...
It contains command and register states. No pixel data unless userspace got particularly creative with its memory corruption. -Chris
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
Pavel
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
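For readers unfamiliar with the terminology: in a command ring the CPU publishes work by advancing TAIL and the GPU consumes it by advancing HEAD, so HEAD should never pass TAIL. The following is a minimal toy model in C of that invariant, not i915 code; `toy_ring`, `RING_SIZE` and both helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 64u /* bytes, power of two; hypothetical toy value */

/* Toy model of a command ring: the CPU writes at tail, the GPU reads at head. */
struct toy_ring {
	uint32_t head; /* next byte the consumer (GPU) will read */
	uint32_t tail; /* end of the valid commands published by the producer (CPU) */
};

/* Bytes of valid, not yet consumed commands between head and tail. */
static uint32_t toy_ring_pending(const struct toy_ring *r)
{
	return (r->tail - r->head) & (RING_SIZE - 1);
}

/*
 * Advance head by n bytes as the consumer would. Returns false, leaving
 * head untouched, if that would move head past tail, i.e. the consumer
 * would start reading bytes the producer never published.
 */
static bool toy_ring_consume(struct toy_ring *r, uint32_t n)
{
	if (n > toy_ring_pending(r))
		return false; /* head would overtake tail */
	r->head = (r->head + n) & (RING_SIZE - 1);
	return true;
}
```

In this sketch the overtaking case is simply rejected; real hardware has no such check, which is why a wrong TAIL value can show up as a hang.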
On Mon, Mar 06, 2017 at 11:15:28AM +0000, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
Hmm. request->tail is not set until i915_gem_request_submit(). Uh oh. -Chris
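The concern can be sketched with a toy model. It assumes, per the remark above, that the request's tail only gets its final value inside the submit step; every name (`toy_request`, `toy_submit`, `toy_write_tail`, `toy_tail_register`) is a hypothetical stand-in, and none of this is the real i915 code.

```c
#include <assert.h>
#include <stdint.h>

/* All names here are hypothetical stand-ins, not the real i915 structures. */
struct toy_request {
	uint32_t tail;     /* only meaningful once toy_submit() has run */
	uint32_t new_tail; /* value the submit step will store into tail */
};

static uint32_t toy_tail_register; /* stand-in for the hardware TAIL register */

/* Submit step: this is where the request's tail gets its final value. */
static void toy_submit(struct toy_request *rq)
{
	rq->tail = rq->new_tail;
}

/* Register write: copies whatever rq->tail currently holds. */
static void toy_write_tail(const struct toy_request *rq)
{
	toy_tail_register = rq->tail;
}

/* Original order: submit first, then write the register. */
static uint32_t toy_submit_then_write(struct toy_request *rq)
{
	toy_submit(rq);
	toy_write_tail(rq);
	return toy_tail_register;
}

/* Order from the proposed patch: write the register, then submit. */
static uint32_t toy_write_then_submit(struct toy_request *rq)
{
	toy_write_tail(rq);
	toy_submit(rq);
	return toy_tail_register;
}
```

With the reordered sequence the register receives whatever `tail` held before submission (0 if the request was zero-initialised in this sketch), so under these assumptions the hardware would never be told about the newly written commands.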
On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
I applied it as:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
Hmm. But your next mail suggests that it may not be smart to try to boot it? :-).
Pavel
On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
I applied it as:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
Hmm. But your next mail suggests that it may not be smart to try to boot it? :-).
Don't bother, it'll promptly hang. -Chris
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
I applied it as:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
Hmm. But your next mail suggests that it may not be smart to try to boot it? :-).
Don't bother, it'll promptly hang.
Any news here?
Is there something I can revert to get back to working system?
Thanks, Pavel
On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
...
static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
Hmm. But your next mail suggests that it may not be smart to try to boot it? :-).
Don't bother, it'll promptly hang.
Any news here? Is there a chance this is fixed in -rc4? Pavel
On Mon 2017-03-06 12:23:41, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 01:10:48PM +0100, Pavel Machek wrote:
On Mon 2017-03-06 11:15:28, Chris Wilson wrote:
On Mon, Mar 06, 2017 at 12:01:51AM +0100, Pavel Machek wrote:
Hi!
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
I came to the conclusion that #99671 is the ring HEAD overtaking the TAIL, and under the presumption that your bug matches (as the symptoms do):
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 4ffa35faff49..62e31a7438ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -782,10 +782,10 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	GEM_BUG_ON(!IS_ALIGNED(request->tail, 8));
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req, u32 *cs)
I applied it as:
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 91bc4ab..9c49c7a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1338,9 +1338,9 @@ static void i9xx_submit_request(struct drm_i915_gem_request *request)
 {
 	struct drm_i915_private *dev_priv = request->i915;
 
-	i915_gem_request_submit(request);
-
 	I915_WRITE_TAIL(request->engine, request->tail);
+
+	i915_gem_request_submit(request);
 }
 
 static void i9xx_emit_breadcrumb(struct drm_i915_gem_request *req,
Hmm. But your next mail suggests that it may not be smart to try to boot it? :-).
Don't bother, it'll promptly hang.
Any news here? 4.11-rc5 is actually usable on the hardware (unlike -rc1), not sure what changed.
On 06.03.2017 00:01, Pavel Machek wrote:
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
@Pavel, @Chris: What's the status of this?
I added this report to the list of regressions for Linux 4.11. I'll try to watch this thread for further updates on this issue to document progress in my weekly reports. Please let me know in case the discussion moves to a different place (bugzilla or another mail thread for example). tia!
Ciao, Thorsten
On Tue 2017-03-14 10:08:23, Thorsten Leemhuis wrote:
On 06.03.2017 00:01, Pavel Machek wrote:
mplayer stopped working after a while. Dmesg says:
[ 3000.266533] cdc_ether 2-1.2:1.0 usb0: register 'cdc_ether' at
Now I'm pretty sure it is a regression in v4.11-rc0. Any ideas what to try? Bisect will be slow and nasty :-(.
@Pavel, @Chris: What's the status of this?
I added this report to the list of regressions for Linux 4.11. I'll try to watch this thread for further updates on this issue to document progress in my weekly reports. Please let me know in case the discussion moves to a different place (bugzilla or another mail thread for example). tia!
We know where the bug is, but there's no fix for it. There was one patch, but it was quickly withdrawn.
Pavel
dri-devel@lists.freedesktop.org