4.2.0-rc1-00201-g59c3cb5 introducued a null pointer derefence and a system freeze when Xorg is started ( 4.2.0-rc1-00062-gc4b5fd3 was fine) :
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb PGD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6 Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012 task: ffff8802149d6c00 ti: ffff880206df4000 task.ti: ffff880206df4000 RIP: 0010:[<ffffffffbd3447bb>] [<ffffffffbd3447bb>] 0xffffffffbd3447bb RSP: 0018:ffff880206df7b08 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88021578f480 RCX: ffff88021578f4d0 RDX: 0000000000000000 RSI: ffff88021630b000 RDI: ffff880214a68000 RBP: ffff88021630b000 R08: ffff88021578f4e0 R09: ffff88021578f4f0 R10: 0000000000003c18 R11: 00000000fffffff2 R12: ffff880214a68000 R13: ffff88021634e800 R14: 0000000000000000 R15: 0000000000000000 FS: 00007ff3caa60880(0000) GS:ffff88021f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000009 CR3: 0000000206e07000 CR4: 00000000001407e0 Stack: ffff880200010000 ffff880200010000 ffff880200000000 ffff880200000001 ffff88021578f500 ffffffffbd2df135 ffff880213f71c00 ffff880214a68000 0000000000000000 ffff880214a70000 0000000000000001 ffff880214a68000 Call Trace: [<ffffffffbd2df135>] ? 0xffffffffbd2df135 [<ffffffffbd2b6ca8>] ? 0xffffffffbd2b6ca8 [<ffffffffbd33cc7e>] ? 0xffffffffbd33cc7e [<ffffffffbd343673>] ? 0xffffffffbd343673 [<ffffffffbd2d0728>] ? 0xffffffffbd2d0728 [<ffffffffbd2d088e>] ? 0xffffffffbd2d088e [<ffffffffbd2d10c5>] ? 0xffffffffbd2d10c5 [<ffffffffbd2c6976>] ? 0xffffffffbd2c6976 [<ffffffffbd2d0fe0>] ? 0xffffffffbd2d0fe0 [<ffffffffbd0c6a1f>] ? 0xffffffffbd0c6a1f [<ffffffffbd0e79e1>] ? 0xffffffffbd0e79e1 [<ffffffffbd0e7ed1>] ? 0xffffffffbd0e7ed1 [<ffffffffbd6df557>] ? 0xffffffffbd6df557 Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44 24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41> 80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00 RIP [<ffffffffbd3447bb>] 0xffffffffbd3447bb RSP <ffff880206df7b08> CR2: 0000000000000009 ---[ end trace dd0931f7f0d43d12 ] ---
2015-07-12 10:03 GMT+02:00 Jörg Otte jrg.otte@gmail.com:
4.2.0-rc1-00201-g59c3cb5 introducued a null pointer derefence and a system freeze when Xorg is started ( 4.2.0-rc1-00062-gc4b5fd3 was fine) :
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb PGD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6 Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012 task: ffff8802149d6c00 ti: ffff880206df4000 task.ti: ffff880206df4000 RIP: 0010:[<ffffffffbd3447bb>] [<ffffffffbd3447bb>] 0xffffffffbd3447bb RSP: 0018:ffff880206df7b08 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88021578f480 RCX: ffff88021578f4d0 RDX: 0000000000000000 RSI: ffff88021630b000 RDI: ffff880214a68000 RBP: ffff88021630b000 R08: ffff88021578f4e0 R09: ffff88021578f4f0 R10: 0000000000003c18 R11: 00000000fffffff2 R12: ffff880214a68000 R13: ffff88021634e800 R14: 0000000000000000 R15: 0000000000000000 FS: 00007ff3caa60880(0000) GS:ffff88021f280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000009 CR3: 0000000206e07000 CR4: 00000000001407e0 Stack: ffff880200010000 ffff880200010000 ffff880200000000 ffff880200000001 ffff88021578f500 ffffffffbd2df135 ffff880213f71c00 ffff880214a68000 0000000000000000 ffff880214a70000 0000000000000001 ffff880214a68000 Call Trace: [<ffffffffbd2df135>] ? 0xffffffffbd2df135 [<ffffffffbd2b6ca8>] ? 0xffffffffbd2b6ca8 [<ffffffffbd33cc7e>] ? 0xffffffffbd33cc7e [<ffffffffbd343673>] ? 0xffffffffbd343673 [<ffffffffbd2d0728>] ? 0xffffffffbd2d0728 [<ffffffffbd2d088e>] ? 0xffffffffbd2d088e [<ffffffffbd2d10c5>] ? 0xffffffffbd2d10c5 [<ffffffffbd2c6976>] ? 0xffffffffbd2c6976 [<ffffffffbd2d0fe0>] ? 0xffffffffbd2d0fe0 [<ffffffffbd0c6a1f>] ? 0xffffffffbd0c6a1f [<ffffffffbd0e79e1>] ? 0xffffffffbd0e79e1 [<ffffffffbd0e7ed1>] ? 0xffffffffbd0e7ed1 [<ffffffffbd6df557>] ? 0xffffffffbd6df557 Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44 24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41> 80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00 RIP [<ffffffffbd3447bb>] 0xffffffffbd3447bb RSP <ffff880206df7b08> CR2: 0000000000000009 ---[ end trace dd0931f7f0d43d12 ] ---
I can fix the problem for me by reverting:
commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2 Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Tue Jul 7 11:15:47 2015 +0200
drm/i915: Use crtc_state->active in primary check_plane func Since commit 8c7b5ccb729870e606321b3703e2c2e698c49a95 Author: Ander Conselvan de Oliveira ander.conselvan.de.oliveira@intel.com Date: Tue Apr 21 17:13:19 2015 +0300 drm/i915: Use atomic helpers for computing changed flags
Thanks, Jörg
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
- if (intel_crtc->active) { + if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Linus
Op 12-07-15 om 18:52 schreef Linus Torvalds:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Linus
More symbols would be nice.
With the transitional helpers when crtc_state == NULL you don't want to update the scalers or funny things happen. Fix is probably something like this:
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ba9321998a41..830e07b23a15 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane, if (ret) return ret;
- if (crtc_state->base.active) { + if (crtc_state ? crtc_state->base.active || crtc->state->active) { struct intel_plane_state *old_state = to_intel_plane_state(plane->state);
On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Indeed embarrassing. I've missed that we still have 1 caller left that's using the transitional helpers, and those don't fill out plane_state->state backpointers to the global atomic update since there is no global atomic update for transitional helpers. Below diff should fix this - we need to preferentially check crts_state->active and if that's not set intel_crtc->active should yield the right result for the one remaining caller (it's in the crtc_disable paths).
For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision to transition to a different QA team ("we'll do this in 1 week without upfront planing") I essentially don't have proper QA support for 1-2 months by now. The other trouble in this area specifically is that this code is already completely changed in -next again, so any testing done on integration trees (like -next or drm-intel-nightly) won't test any patches for 4.2. -Daniel
Oh and Signed-off-by: Daniel Vetter daniel.vetter@intel.com in case you decide to apply this right away. --- diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ba9321998a41..85ac6d85dc39 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -13276,7 +13276,7 @@ intel_check_primary_plane(struct drm_plane *plane, if (ret) return ret;
- if (crtc_state->base.active) { + if (crtc_state ? crtc_state->base.active : intel_crtc->active) { struct intel_plane_state *old_state = to_intel_plane_state(plane->state);
Op 13-07-15 om 08:22 schreef Daniel Vetter:
On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Indeed embarrassing. I've missed that we still have 1 caller left that's using the transitional helpers, and those don't fill out plane_state->state backpointers to the global atomic update since there is no global atomic update for transitional helpers. Below diff should fix this - we need to preferentially check crts_state->active and if that's not set intel_crtc->active should yield the right result for the one remaining caller (it's in the crtc_disable paths).
For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision to transition to a different QA team ("we'll do this in 1 week without upfront planing") I essentially don't have proper QA support for 1-2 months by now. The other trouble in this area specifically is that this code is already completely changed in -next again, so any testing done on integration trees (like -next or drm-intel-nightly) won't test any patches for 4.2. -Daniel
Oh and Signed-off-by: Daniel Vetter daniel.vetter@intel.com in case you decide to apply this right away.
Well your version has the benefit of compiling without errors. :-)
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
2015-07-13 9:23 GMT+02:00 Maarten Lankhorst maarten.lankhorst@linux.intel.com:
Op 13-07-15 om 08:22 schreef Daniel Vetter:
On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Indeed embarrassing. I've missed that we still have 1 caller left that's using the transitional helpers, and those don't fill out plane_state->state backpointers to the global atomic update since there is no global atomic update for transitional helpers. Below diff should fix this - we need to preferentially check crts_state->active and if that's not set intel_crtc->active should yield the right result for the one remaining caller (it's in the crtc_disable paths).
For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision to transition to a different QA team ("we'll do this in 1 week without upfront planing") I essentially don't have proper QA support for 1-2 months by now. The other trouble in this area specifically is that this code is already completely changed in -next again, so any testing done on integration trees (like -next or drm-intel-nightly) won't test any patches for 4.2. -Daniel
Oh and Signed-off-by: Daniel Vetter daniel.vetter@intel.com in case you decide to apply this right away.
Well your version has the benefit of compiling without errors. :-)
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Just noticed another problem: On each resume I get the following error: -----------[ cut here ]------------ WARNING: CPU: 2 PID: 2663 at /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319 0xffffffff9a33d5e9() WARN_ON(!crtc->state->enable) CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15 ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012 orkqueue: events_unbound 0xffffffff9a055750 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000 ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000 ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664 all Trace: [<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2 [<ffffffff9a03c416>] ? 0xffffffff9a03c416 [<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5 [<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9 [<ffffffff9a343ac3>] ? 0xffffffff9a343ac3 [<ffffffff9a34444a>] ? 0xffffffff9a34444a [<ffffffff9a345518>] ? 0xffffffff9a345518 [<ffffffff9a3246f0>] ? 0xffffffff9a3246f0 [<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8 [<ffffffff9a236170>] ? 0xffffffff9a236170 [<ffffffff9a38b28d>] ? 0xffffffff9a38b28d [<ffffffff9a38b784>] ? 0xffffffff9a38b784 [<ffffffff9a38baa4>] ? 0xffffffff9a38baa4 [<ffffffff9a05577d>] ? 0xffffffff9a05577d [<ffffffff9a04dc47>] ? 0xffffffff9a04dc47 [<ffffffff9a04dfab>] ? 0xffffffff9a04dfab [<ffffffff9a04dea0>] ? 0xffffffff9a04dea0 [<ffffffff9a05331c>] ? 0xffffffff9a05331c [<ffffffff9a053260>] ? 0xffffffff9a053260 [<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f [<ffffffff9a053260>] ? 0xffffffff9a053260 --[ end trace 1b6d28ee34071679 ]---
Nervertheless resume works, so it doesn't hurt me.
BTW: I get also up to 40..50! compile warnings like: i915/i915_drv.h: In function 'i915_debugfs_connector_add': i915/i915_drv.h:3119:53: warning: no return statement in function returning non-void [-Wreturn-type]
which may cause yet uncovered troubles.
Thanks, Jörg
Op 13-07-15 om 09:42 schreef Jörg Otte:
2015-07-13 9:23 GMT+02:00 Maarten Lankhorst maarten.lankhorst@linux.intel.com:
Op 13-07-15 om 08:22 schreef Daniel Vetter:
On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Indeed embarrassing. I've missed that we still have 1 caller left that's using the transitional helpers, and those don't fill out plane_state->state backpointers to the global atomic update since there is no global atomic update for transitional helpers. Below diff should fix this - we need to preferentially check crts_state->active and if that's not set intel_crtc->active should yield the right result for the one remaining caller (it's in the crtc_disable paths).
For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision to transition to a different QA team ("we'll do this in 1 week without upfront planing") I essentially don't have proper QA support for 1-2 months by now. The other trouble in this area specifically is that this code is already completely changed in -next again, so any testing done on integration trees (like -next or drm-intel-nightly) won't test any patches for 4.2. -Daniel
Oh and Signed-off-by: Daniel Vetter daniel.vetter@intel.com in case you decide to apply this right away.
Well your version has the benefit of compiling without errors. :-)
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Just noticed another problem: On each resume I get the following error: -----------[ cut here ]------------ WARNING: CPU: 2 PID: 2663 at /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319 0xffffffff9a33d5e9() WARN_ON(!crtc->state->enable) CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15 ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012 orkqueue: events_unbound 0xffffffff9a055750 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000 ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000 ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664 all Trace: [<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2 [<ffffffff9a03c416>] ? 0xffffffff9a03c416 [<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5 [<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9 [<ffffffff9a343ac3>] ? 0xffffffff9a343ac3 [<ffffffff9a34444a>] ? 0xffffffff9a34444a [<ffffffff9a345518>] ? 0xffffffff9a345518 [<ffffffff9a3246f0>] ? 0xffffffff9a3246f0 [<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8 [<ffffffff9a236170>] ? 0xffffffff9a236170 [<ffffffff9a38b28d>] ? 0xffffffff9a38b28d [<ffffffff9a38b784>] ? 0xffffffff9a38b784 [<ffffffff9a38baa4>] ? 0xffffffff9a38baa4 [<ffffffff9a05577d>] ? 0xffffffff9a05577d [<ffffffff9a04dc47>] ? 0xffffffff9a04dc47 [<ffffffff9a04dfab>] ? 0xffffffff9a04dfab [<ffffffff9a04dea0>] ? 0xffffffff9a04dea0 [<ffffffff9a05331c>] ? 0xffffffff9a05331c [<ffffffff9a053260>] ? 0xffffffff9a053260 [<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f [<ffffffff9a053260>] ? 0xffffffff9a053260 --[ end trace 1b6d28ee34071679 ]---
Nervertheless resume works, so it doesn't hurt me.
BTW: I get also up to 40..50! compile warnings like: i915/i915_drv.h: In function 'i915_debugfs_connector_add': i915/i915_drv.h:3119:53: warning: no return statement in function returning non-void [-Wreturn-type]
which may cause yet uncovered troubles.
Thanks, Jörg
kallsyms please!
Looks like intel_crtc_disable being called with a mode change on a already disabled crtc, it's gone in 4.3 because of the atomic rework.
Does something like below work?
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ba9321998a41..725d2b727704 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc) struct drm_connector *connector; struct drm_i915_private *dev_priv = dev->dev_private;
- /* crtc should still be enabled when we disable it. */ - WARN_ON(!crtc->state->enable); - intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc); dev_priv->display.off(crtc); @@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc, continue;
if (!crtc_state->enable) { - intel_crtc_disable(crtc); + if (crtc->state->enable) + intel_crtc_disable(crtc); } else if (crtc->state->enable) { intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc);
2015-07-13 9:58 GMT+02:00 Maarten Lankhorst maarten.lankhorst@linux.intel.com:
Op 13-07-15 om 09:42 schreef Jörg Otte:
2015-07-13 9:23 GMT+02:00 Maarten Lankhorst maarten.lankhorst@linux.intel.com:
Op 13-07-15 om 08:22 schreef Daniel Vetter:
On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte jrg.otte@gmail.com wrote:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000009 IP: [<ffffffffbd3447bb>] 0xffffffffbd3447bb
Ugh. Please enable KALLSYMS to get sane symbols.
But yes, "crtc_state->base.active" is at offset 9 from "crtc_state", so it's pretty clearly just that change frm
if (intel_crtc->active) {
if (crtc_state->base.active) {
and "crtc_state" is NULL.
And the code very much knows that crtc_state can be NULL, since it's initialized with
crtc_state = state->base.state ? intel_atomic_get_crtc_state(state->base.state,
intel_crtc) : NULL;
Tssk. Daniel? Should I just revert that commit dec4f799d0a4 ("drm/i915: Use crtc_state->active in primary check_plane func") for now, or is there a better fix? Like just checking crtc_state for NULL?
Indeed embarrassing. I've missed that we still have 1 caller left that's using the transitional helpers, and those don't fill out plane_state->state backpointers to the global atomic update since there is no global atomic update for transitional helpers. Below diff should fix this - we need to preferentially check crts_state->active and if that's not set intel_crtc->active should yield the right result for the one remaining caller (it's in the crtc_disable paths).
For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision to transition to a different QA team ("we'll do this in 1 week without upfront planing") I essentially don't have proper QA support for 1-2 months by now. The other trouble in this area specifically is that this code is already completely changed in -next again, so any testing done on integration trees (like -next or drm-intel-nightly) won't test any patches for 4.2. -Daniel
Oh and Signed-off-by: Daniel Vetter daniel.vetter@intel.com in case you decide to apply this right away.
Well your version has the benefit of compiling without errors. :-)
Reviewed-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Just noticed another problem: On each resume I get the following error: -----------[ cut here ]------------ WARNING: CPU: 2 PID: 2663 at /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319 0xffffffff9a33d5e9() WARN_ON(!crtc->state->enable) CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15 ardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012 orkqueue: events_unbound 0xffffffff9a055750 0000000000000000 ffffffff9a98ea28 ffffffff9a6d84d2 0000000000000000 ffffffff9a03c416 ffff88020951c4e0 0000000000000000 0000000000000000 ffff8802141cb800 ffff88021630c000 ffffffff9a03c4d5 ffffffff9a9c3664 all Trace: [<ffffffff9a6d84d2>] ? 0xffffffff9a6d84d2 [<ffffffff9a03c416>] ? 0xffffffff9a03c416 [<ffffffff9a03c4d5>] ? 0xffffffff9a03c4d5 [<ffffffff9a33d5e9>] ? 0xffffffff9a33d5e9 [<ffffffff9a343ac3>] ? 0xffffffff9a343ac3 [<ffffffff9a34444a>] ? 0xffffffff9a34444a [<ffffffff9a345518>] ? 0xffffffff9a345518 [<ffffffff9a3246f0>] ? 0xffffffff9a3246f0 [<ffffffff9a2e1ce8>] ? 0xffffffff9a2e1ce8 [<ffffffff9a236170>] ? 0xffffffff9a236170 [<ffffffff9a38b28d>] ? 0xffffffff9a38b28d [<ffffffff9a38b784>] ? 0xffffffff9a38b784 [<ffffffff9a38baa4>] ? 0xffffffff9a38baa4 [<ffffffff9a05577d>] ? 0xffffffff9a05577d [<ffffffff9a04dc47>] ? 0xffffffff9a04dc47 [<ffffffff9a04dfab>] ? 0xffffffff9a04dfab [<ffffffff9a04dea0>] ? 0xffffffff9a04dea0 [<ffffffff9a05331c>] ? 0xffffffff9a05331c [<ffffffff9a053260>] ? 0xffffffff9a053260 [<ffffffff9a6dfa0f>] ? 0xffffffff9a6dfa0f [<ffffffff9a053260>] ? 0xffffffff9a053260 --[ end trace 1b6d28ee34071679 ]---
Nervertheless resume works, so it doesn't hurt me.
BTW: I get also up to 40..50! compile warnings like: i915/i915_drv.h: In function 'i915_debugfs_connector_add': i915/i915_drv.h:3119:53: warning: no return statement in function returning non-void [-Wreturn-type]
which may cause yet uncovered troubles.
Thanks, Jörg
kallsyms please!
Looks like intel_crtc_disable being called with a mode change on a already disabled crtc, it's gone in 4.3 because of the atomic rework.
Does something like below work?
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index ba9321998a41..725d2b727704 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc) struct drm_connector *connector; struct drm_i915_private *dev_priv = dev->dev_private;
/* crtc should still be enabled when we disable it. */
WARN_ON(!crtc->state->enable);
intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc); dev_priv->display.off(crtc);
@@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc, continue;
if (!crtc_state->enable) {
intel_crtc_disable(crtc);
if (crtc->state->enable)
intel_crtc_disable(crtc); } else if (crtc->state->enable) { intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc);
The patch works for me.
Thanks, Jörg
When resuming with dpms off, the following warn can happen:
[ 118.334082] ------------[ cut here ]------------ [ 118.334105] WARNING: CPU: 2 PID: 2274 at drivers/gpu/drm/i915/intel_display.c:6319 __intel_set_mode+0xae5/0xb90 [i915]() [ 118.334106] WARN_ON(!crtc->state->enable) [ 118.334137] Modules linked in: i915 [ 118.334139] CPU: 2 PID: 2274 Comm: kworker/u16:117 Not tainted 4.2.0-rc2-fixes+ #4148 [ 118.334140] Hardware name: LENOVO 2349AV8/2349AV8, BIOS G1ETA5WW (2.65 ) 04/15/2014 [ 118.334144] Workqueue: events_unbound async_run_entry_fn [ 118.334147] ffffffffc017eef0 ffff8800ada93998 ffffffff817aa62a 0000000080000001 [ 118.334149] ffff8800ada939e8 ffff8800ada939d8 ffffffff810807e1 ffff8800ada939c8 [ 118.334151] ffff8800cea3b3d8 0000000000000000 ffff8800ad86b008 ffff880117705668 [ 118.334151] Call Trace: [ 118.334155] [<ffffffff817aa62a>] dump_stack+0x4f/0x7b [ 118.334157] [<ffffffff810807e1>] warn_slowpath_common+0x81/0xc0 [ 118.334158] [<ffffffff81080861>] warn_slowpath_fmt+0x41/0x50 [ 118.334173] [<ffffffffc0120375>] __intel_set_mode+0xae5/0xb90 [i915] [ 118.334188] [<ffffffffc0121312>] ? intel_modeset_compute_config+0x52/0xb40 [i915] [ 118.334191] [<ffffffff8144de53>] ? drm_atomic_set_fb_for_plane+0x63/0x80 [ 118.334205] [<ffffffffc01269d9>] intel_set_mode+0x29/0x60 [i915] [ 118.334219] [<ffffffffc012730a>] intel_crtc_restore_mode+0x13a/0x1f0 [i915] [ 118.334232] [<ffffffffc0101160>] ? gen6_write16+0x250/0x250 [i915] [ 118.334246] [<ffffffffc01283ec>] intel_modeset_setup_hw_state+0x89c/0xcd0 [i915] [ 118.334248] [<ffffffff8137d260>] ? pci_pm_thaw+0x90/0x90 [ 118.334255] [<ffffffffc00ac11b>] i915_drm_resume+0xcb/0x160 [i915] [ 118.334262] [<ffffffffc00ac1d2>] i915_pm_resume+0x22/0x30 [i915] [ 118.334263] [<ffffffff8137d2c3>] pci_pm_resume+0x63/0xa0 [ 118.334266] [<ffffffff81467550>] dpm_run_callback+0x70/0x420 [ 118.334267] [<ffffffff81467cbd>] device_resume+0x9d/0x1c0 [ 118.334269] [<ffffffff814673d0>] ? initcall_debug_start+0x60/0x60 [ 118.334270] [<ffffffff81467dfc>] async_resume+0x1c/0x50 [ 118.334271] [<ffffffff810a6a94>] async_run_entry_fn+0x34/0xd0 [ 118.334273] [<ffffffff8109d4ad>] process_one_work+0x1dd/0x7e0 [ 118.334275] [<ffffffff8109d41a>] ? process_one_work+0x14a/0x7e0 [ 118.334276] [<ffffffff8109daf9>] worker_thread+0x49/0x450 [ 118.334278] [<ffffffff8109dab0>] ? process_one_work+0x7e0/0x7e0 [ 118.334280] [<ffffffff810a3cb9>] kthread+0xf9/0x110 [ 118.334282] [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90 [ 118.334284] [<ffffffff817b414f>] ret_from_fork+0x3f/0x70 [ 118.334286] [<ffffffff810a3bc0>] ? insert_kthread_work+0x90/0x90 [ 118.334287] ---[ end trace 01f2cf6371b82d7a ]---
This warn is harmless, and can be fixed by not calling intel_crtc_disable when the crtc is already disabled.
Reported-and-Tested-by: Jörg Otte jrg.otte@gmail.com Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com --- diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 85ac6d85dc39..30e0f54ba19d 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -6315,9 +6315,6 @@ static void intel_crtc_disable(struct drm_crtc *crtc) struct drm_connector *connector; struct drm_i915_private *dev_priv = dev->dev_private;
- /* crtc should still be enabled when we disable it. */ - WARN_ON(!crtc->state->enable); - intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc); dev_priv->display.off(crtc); @@ -12591,7 +12588,8 @@ static int __intel_set_mode(struct drm_crtc *modeset_crtc, continue;
if (!crtc_state->enable) { - intel_crtc_disable(crtc); + if (crtc->state->enable) + intel_crtc_disable(crtc); } else if (crtc->state->enable) { intel_crtc_disable_planes(crtc); dev_priv->display.crtc_disable(crtc);
dri-devel@lists.freedesktop.org