On Wed, May 04, 2022 at 11:17:02AM -0700, Linus Torvalds wrote:
On Wed, May 4, 2022 at 1:19 AM Byungchul Park <byungchul.park@lge.com> wrote:
Hi Linus and folks,
I've been developing a tool for detecting deadlock possibilities by tracking wait/event rather than lock(?) acquisition order to try to cover all synchronization mechanisms.
So what is the actual status of reports these days?
I'd like to mention one important thing here. Reporting gets stronger as more wait-event pairs get tagged everywhere DEPT can work.
Anything, e.g. a HW-SW interface, any retry logic and so on, can be a wait-event pair as long as it acts as a wait or an event. For example, polling on an IO-mapped read register and kicking the HW to produce the event can also be a pair. Those definitely make DEPT more useful.
---
The way to use the APIs:
1. Define SDT (Simple Dependency Tracker)

   DEFINE_DEPT_SDT(my_hw_event); <- add this

2. Tag on the waits

   sdt_wait(&my_hw_event); <- add this
   ... retry logic until my hw work done ... <- the original code

3. Tag on the events

   sdt_event(&my_hw_event); <- add this
   run_my_hw(); <- the original code
---
That is all we need to do. I believe DEPT would be a very useful tool once all wait-event pairs get tagged by the developers in all subsystems and device drivers.
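As a rough illustration, the three steps can be put together in one small file. This is only a sketch under stated assumptions: the DEFINE_DEPT_SDT(), sdt_wait() and sdt_event() definitions below are hypothetical user-space stand-ins that merely count calls so the file builds outside the kernel, and struct sdt_stub, my_hw_done(), start_my_hw() and wait_for_my_hw() are made-up names for a fake device. The real DEPT definitions may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for the DEPT macros (assumption: the real
 * ones live in the kernel and do dependency tracking). These only
 * count how often each tag was hit.
 */
struct sdt_stub {
	const char *name;
	int waits;
	int events;
};

#define DEFINE_DEPT_SDT(x) struct sdt_stub x = { #x, 0, 0 }

static void sdt_wait(struct sdt_stub *s)
{
	assert(s != NULL);
	s->waits++;
	printf("wait tagged on %s\n", s->name);
}

static void sdt_event(struct sdt_stub *s)
{
	assert(s != NULL);
	s->events++;
	printf("event tagged on %s\n", s->name);
}

/* 1. Define the tracker for this wait-event pair. */
DEFINE_DEPT_SDT(my_hw_event);

/* Fake device: reports "done" on the third poll after being started. */
static int hw_polls_left = -1;	/* -1 means the HW was never started */

static void run_my_hw(void)
{
	hw_polls_left = 3;
}

static bool my_hw_done(void)
{
	if (hw_polls_left > 0)
		hw_polls_left--;
	return hw_polls_left == 0;
}

/* 3. Tag the event side: the code that makes the HW produce the event. */
void start_my_hw(void)
{
	sdt_event(&my_hw_event);	/* <- added tag */
	run_my_hw();			/* <- the original code */
}

/* 2. Tag the wait side: the retry loop polling for completion. */
int wait_for_my_hw(int max_retries)
{
	sdt_wait(&my_hw_event);		/* <- added tag */
	for (int i = 0; i < max_retries; i++)	/* <- the original code */
		if (my_hw_done())
			return 0;
	return -1;	/* gave up: the event never arrived */
}
```

Calling start_my_hw() and then wait_for_my_hw(10) hits each tag once. In this sketch the only change to the original driver code is the one added line per side.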
Byungchul
Last time I looked at some reports, it gave a lot of false positives due to mis-understanding prepare_to_sleep().
For this all to make sense, it would need to not have false positives (or at least a very small number of them together with a way to sanely get rid of them), and also have a track record of finding things that lockdep doesn't.
Maybe such reports have been sent out with the current situation, and I haven't seen them.
Linus