On Tue, Aug 25, 2015 at 09:36:47AM +0200, Lukas Wunner wrote:
Hi Dave,
as requested I've pushed the MacBook Pro GPU switching stuff to GitHub to ease browsing/reviewing, latest branch is gmux-v5: https://github.com/l1k/linux/commits/gmux-v5
(I had applied for a freedesktop.org account in April with bug 89906 but it's still pending, hence GitHub.)
I've overhauled locking once more because previously all EDID reads were serialized even on machines which don't use vga_switcheroo at all. Seems it's impossible to fix this with mutexes and still be race-free and deadlock-free, so I've changed vgasr_mutex to an rwlock: https://github.com/l1k/linux/commit/ffa4546d85244a53541eed6799f93e8eea7f29e8
There are a number of vga_switcheroo functions added by Takashi Iwai and yourself which don't lock anything at all, I believe this was in part to avoid deadlocks (vga_switcheroo_runtime_resume_hdmi_audio locks vgasr_mutex, wakes up the GPU with vga_switcheroo_runtime_resume which locks vgasr_mutex once again). After changing vgasr_mutex to an rwlock we can safely use locking in those functions as well: https://github.com/l1k/linux/commit/cdea8f5c92039329dde47cadf20b67c7d9af0d62
With this change, on machines which don't use vga_switcheroo, the overhead in drm_get_edid() is exactly 1 read_lock + 1 read_unlock and EDID reads can happen in parallel. Thierry Reding and Alex Deucher are in favor of adding a separate drm_get_edid_switcheroo() and changing the relevant drivers (i915, nouveau, radeon, amdgpu) to use that in lieu of drm_get_edid() so that drivers which don't use vga_switcheroo avoid the lock_ddc/unlock_ddc calls. Performance-wise these additional calls should be negligible, so I think the motivation can only be better readability/clarity. One should also keep in mind that drivers which don't use vga_switcheroo currently might do so in the future, e.g. if mobile GPUs use a big.LITTLE concept like ARM CPUs already do.
Just a quick aside: Switching to rwlocks to make lockdep happy is a fallacy - lockdep unfortunately doesn't accurately track read lock depencies. Which means very likely you didn't fix the locking inversions but just made lockdep shut up about them. Example:
thread A grabs read-lock 1 and mutex 2.
thread B grabs mutex 2 and write-lock 1.
lockdep won't complain here, but if thread A has tries to get mutex 2 while thread B tries to write-lock 1 then there's an obvious deadlock.
I'd highly suggest you try fixing the vga-switcheroo locking without resorting to rw lock primitives - that just means we need to manually prove the locking for many cases which is tons of work. -Daniel