On Wed, 2011-04-27 at 23:20 -0600, Alex Williamson wrote:
We're often using a shared interrupt line for nouveau, so we have to be prepared for the interrupt handler to be called at any point in time. If we've suspended the device via vga switcheroo and get a stray interrupt on the line from another device, we'll read back -1 from the device and head down all sorts of strange paths, most of which eventually lock the system.
On my system (Asus UL30VT) the interrupt line is shared with USB. Attempting to disable the USB bluetooth device seems to trigger a stray interrupt that ends up in nv04_fifo_isr() where we eventually hit the "PFIFO still angry after 100 spins, halt", which kills the system.
Using free_irq/request_irq around the suspend seems to be a reliable fix. Attempting to flag the device state in nouveau_irq_handler(), similar to the intel_lid_notify() fix, is too racy since we can power off the device as an interrupt is being processed.
The actual solution is to check whether we read back all Fs (0xffffffff) and, if so, return from the irq handler straight away. Robust irq handlers are generally considered a good idea, especially around race conditions at suspend/resume time.
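
Roughly what I mean, as a userspace sketch rather than real nouveau code. Everything here (struct fake_dev, fake_read_status(), fake_irq_handler(), the IRQ_RESULT_* values) is made up for illustration and only stands in for the driver's status register read and the kernel's irqreturn_t; the all-ones value models the -1 read described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_result { IRQ_RESULT_NONE = 0, IRQ_RESULT_HANDLED = 1 };

/* Hypothetical device model; 'powered' is false while suspended. */
struct fake_dev {
	bool powered;
	uint32_t intr_status;	/* pending interrupt bits when powered */
	unsigned int serviced;	/* number of interrupts actually serviced */
};

/* Assumption: a powered-off device reads back as all ones (-1). */
static uint32_t fake_read_status(const struct fake_dev *dev)
{
	return dev->powered ? dev->intr_status : UINT32_C(0xffffffff);
}

/* Shared-line handler: bail out early on "not ours" or "device gone". */
static enum irq_result fake_irq_handler(struct fake_dev *dev)
{
	uint32_t status = fake_read_status(dev);

	/* 0 means another device on the line raised the interrupt;
	 * all Fs means our device is not answering, so touch nothing. */
	if (status == 0 || status == UINT32_C(0xffffffff))
		return IRQ_RESULT_NONE;

	/* Normal path: service and acknowledge the pending bits. */
	dev->serviced++;
	dev->intr_status = 0;
	return IRQ_RESULT_HANDLED;
}
```

The point of doing the check on the value just read is that there is no separate "suspended" flag to race against: if the device goes away mid-handler, the next read simply comes back as all Fs.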
Dave.