On Tue, Oct 20, 2020 at 1:24 PM Viresh Kumar viresh.kumar@linaro.org wrote:
On 20-10-20, 12:56, Daniel Vetter wrote:
Yeah that's bad practice. Generally you shouldn't need to hold locks in setup/teardown code, since there's no other thread which can possible hold a reference to anything your touching anymore. Ofc excluding quickly grabbing/dropping a lock to insert/remove objects into lists and stuff.
The other reason is that especially with anything related to sysfs or debugfs, the locking dependencies you're pulling in are enormous: vfs locks pull in mm locks (due to mmap) and at that point there's pretty much nothing left you're allowed to hold while acquiring such a lock. For simple drivers this is no issue, but for fancy drivers (like gpu drivers) which need to interact with core mm) this means your subsystem is a major pain to use.
Usually the correct fix is to only hold your subsystem locks in setup/teardown when absolutely required, and fix any data inconsistency issues by reordering your setup/teardown code: When you register as the last step and unregister as the first step, there's no need for any additional locking. And hence no need to call debugfs functions while holding your subsystem locks.
The catch phrase I use for this is "don't solve object lifetime issues with locking". Instead use refcounting and careful ordering in setup/teardown code.
This is exactly what I have done in the OPP core, the locks were taken only when really necessary, though as we have seen now I have missed that at a single place and that should be fixed as well. Will do that, thanks.
Excellent. If the fix is small enough can you push it into 5.10? That way drm/msm doesn't have to carry the temporary solution for 5.11 (the issue only pops up with the locking rework, which teaches lockdep a few more things about what's going on as a side effect). -Daniel