Lockdep splat in nouveau, 2.6.37-rc1+ - dri-devel - freedesktop.org experimental mailing list

5 Nov 2010


Complete dmesg attached, but the highlight is:
[ 6100.827158] [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
[ 6100.827158] 2.6.37-rc1-00027-gff8b16d #1
[ 6100.827158] ------------------------------------------------------
[ 6100.827158] kwin_opengl_tes/8245 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 6100.827158]  (&(&dev_priv->ramin_lock)->rlock){+.+...}, at: [<ffffffff8161b9ff>] nouveau_gpuobj_del+0x6f/0x150
[ 6100.827158]
[ 6100.827158] and this task is already holding:
[ 6100.827158]  (&(&dev_priv->context_switch_lock)->rlock){-.....}, at: [<ffffffff816185c9>] nouveau_channel_free+0xf9/0x2c0
[ 6100.827158] which would create a new lock dependency:
[ 6100.827158]  (&(&dev_priv->context_switch_lock)->rlock){-.....} -> (&(&dev_priv->ramin_lock)->rlock){+.+...}
{snip /}
with a gpu lockup mixed in...
[ 6100.827158] [drm] nouveau 0000:01:00.0: GPU lockup - switching to software fbcon
[ 6100.827158] [<ffffffff81b689c3>] do_IRQ+0x73/0xf0
[ 6100.827158]   [<ffffffff81b682d3>] ret_from_intr+0x0/0xf
[ 6100.827158]   [<ffffffff81b62d48>] printk+0x41/0x43
[ 6100.827158]   [<ffffffff81618fd3>] nouveau_channel_alloc+0x703/0x7a0
{snip /}
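For anyone reading the splat: the braces after each lock name hold lockdep's usage string, and its first character is the hardirq state of that lock class ('-' means the lock has been taken in hardirq context, '+' means it has been taken with hardirqs enabled, '?' means both, '.' means neither). Below is a minimal userspace sketch of the check behind the "HARDIRQ-safe -> HARDIRQ-unsafe" message. It is a toy model, not lockdep's actual code, and the function names are made up for illustration.

```c
#include <stdbool.h>

/*
 * Toy model only: these helpers are hypothetical and are not kernel APIs.
 * `usage` is the string lockdep prints between braces, e.g. "-....." or
 * "+.+...". Only its first character (hardirq state) is inspected here.
 */

/* The lock class has been taken from hardirq context ("HARDIRQ-safe"). */
static bool used_in_hardirq(const char *usage)
{
    return usage[0] == '-' || usage[0] == '?';
}

/* The lock class has been taken with hardirqs enabled ("HARDIRQ-unsafe"). */
static bool taken_with_hardirqs_on(const char *usage)
{
    return usage[0] == '+' || usage[0] == '?';
}

/*
 * True when acquiring `wanted` while holding `held` would add a
 * HARDIRQ-safe -> HARDIRQ-unsafe dependency, the pattern reported above.
 */
static bool unsafe_irq_dependency(const char *held, const char *wanted)
{
    return used_in_hardirq(held) && taken_with_hardirqs_on(wanted);
}
```

With the strings from this report, unsafe_irq_dependency("-.....", "+.+...") is true: context_switch_lock is held (hardirq-safe) while ramin_lock (hardirq-unsafe) is requested. The risk is that another CPU holding ramin_lock with interrupts enabled takes an interrupt whose handler spins on context_switch_lock, while the holder of context_switch_lock spins on ramin_lock.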
The chipset is an NV50, so I've been putting up with occasional lockups anyways, but with v2.6.36-rc and above, they've become much more common (minutes between lockups rather than hours). I can't tell for sure if they are the same kind of lockup. I rebuilt my kernel with lockdep enabled to try to find something useful.
One other note:  this gpu lockup leaves the cursor functional for a short time, during which I can use sysrq to blindly sync my drives and reboot.  If I wait, the cursor locks up too, and sysrq ceases to work.
I'm willing to take debug patches if anyone has any ideas.
Regards,
Phil Turmel