Quoting Daniel Vetter (2020-02-21 15:17:24)
On Fri, Feb 21, 2020 at 3:38 PM Chris Wilson <chris@chris-wilson.co.uk> wrote:
dma_fence_get_rcu() is used to acquire a reference to a dma-fence under racy conditions -- a perfect recipe for disaster. As we know the caller may be handling stale memory, use kasan to confirm the dma-fence, or rather its memory block, is valid before attempting to acquire a reference. This should help us to identify lost races more quickly and clearly.
Hm ... I'm a bit lost on the purpose, and what this does. Fences need to be rcu-freed, and I honestly have no idea how kasan treats those. Are we throwing false positives because kasan thinks the stuff is freed while we're still accessing it? (The grace period hasn't passed at that point, so anything freed is still guaranteed to be at least in the slab cache somewhere.)
I'm not seeing how this catches lost races quicker, since the refcount should get to 0 way before we get to the kfree. So the refcount check on the next line should catch strictly more races than the kasan check.
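A minimal userspace sketch of the ordering being argued about here, assuming C11 atomics in place of the kernel's kref. All names (fence_model, fence_model_get_rcu, the valid hook, fence_model_poisoned) are hypothetical stand-ins for illustration, not kernel API; the hook only marks where the proposed kasan check would sit in front of the get-unless-zero step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence; only the refcount matters here. */
struct fence_model {
	atomic_uint refcount;
};

/* Hypothetical hook standing in for a memory-validity check on the pointer. */
typedef bool (*fence_model_valid_fn)(const void *ptr, size_t size);

/* Example hook that treats every pointer as stale, to exercise the early exit. */
static bool fence_model_poisoned(const void *ptr, size_t size)
{
	(void)ptr;
	(void)size;
	return false;
}

/* Models get-unless-zero: take a reference only if the count is still nonzero. */
static bool fence_model_get_unless_zero(struct fence_model *f)
{
	unsigned int old = atomic_load(&f->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current count and we retry. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Validity check first (if a hook is given), then the refcount check. */
static struct fence_model *
fence_model_get_rcu(struct fence_model *f, fence_model_valid_fn valid)
{
	if (valid && !valid(f, sizeof(*f)))
		return NULL;
	if (fence_model_get_unless_zero(f))
		return f;
	return NULL;
}

/* Drop a reference; returns true when the count reaches zero, which is the
 * point where a real implementation would queue the deferred free. */
static bool fence_model_put(struct fence_model *f)
{
	unsigned int old = atomic_fetch_sub(&f->refcount, 1);

	assert(old != 0); /* dropping a reference that was never held is a bug */
	return old == 1;
}
```

In this model the count reaches zero before any free would be queued, so a late fence_model_get_rcu() on the same object is already refused by the refcount step; the hook can only add something when the pointer itself does not refer to a live fence_model.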
It's not about the fence itself, but those pointing to the fence. That's where we may find garbage, and by returning NULL the kernel keeps working for a bit longer as you try to piece together the race.
-Chris