On Wed, 26 Oct 2011, Michel [ISO-8859-1] D�nzer wrote:
Does it really need drm_global_mutex at all, as opposed to e.g. dev->struct_mutex?
It doesn't. Actually, it would probably suffice to have a mutex that is one-per-CRTC, because vlbank waits on different CRTCs don't step on each other, but I was going for a conservative fix. I tried to avoid having to hunt around in other sections of code for lines that possibly assume that a global lock is held while touching vblank counter structures (one of concerns raised by Daniel which is a non-issue because I didn't change the mutex being used).
The "urgent" part of this patch is to make sure we don't go to sleep while holding the mutex. In my response to Daniel, I sent an example of a simple userland application that can hang up the whole windowing system. That's a big deal for which we have to get the fix in quickly.
After that we can follow up with more polishing (e.g. use per-CRTC mutex, review and potentially fix other parts of the code that deal with vblank and maybe assume global mutex protection).
I agree with Daniel's sentiment on this. AFAICT add_wait_queue() protects against concurrent access to the wait queue, so I think it would be better to just drop the mutex explicitly before calling DRM_WAIT_ON.
The problem is that if I release the mutex just before calling DRM_WAIT_ON, the task can be scheduled away right at that point. Then another task can go all the way into the DRM_WAIT_ON and enter the queue. Then the first task wakes up and enters the queue.
Now the last task in the queue is *not* the task that updated the dev->last_vblank_wait[crtc] value last.
If you or someone who knows better than me can tell me that this potential reordering is guaranteed harmless, I'll gladly drop the mutex earlier and then I no longer need the DRM_WAIT_ON_LOCKED macro.
However, if the preservation of the order is actually important, then it will be an ugly race condition to track later.
-- Ilija