On Sat, 02 Oct 2021, Hugh Dickins <hughd@google.com> wrote:
On Sat, 2 Oct 2021, Linus Torvalds wrote:
On Sat, Oct 2, 2021 at 5:17 AM Steven Rostedt <rostedt@goodmis.org> wrote:
On Sat, 2 Oct 2021 03:17:29 -0700 (PDT) Hugh Dickins <hughd@google.com> wrote:
Yes (though bisection doesn't work right on this one): the fix
Interesting, as it appeared to be very reliable. But I didn't do the "try before / after" on the patch.
Well, even the before/after might well have worked, since the problem depended on how that sw_fence_dummy_notify() function ended up aligned. So random unrelated changes could re-align it just by mistake.
Yup.
Patch applied directly.
Great, thanks a lot.
Thanks & sorry, really looks like we managed to drop this between the cracks. :(
I'd also like to point out how that BUG_ON() actually made things worse, and made this harder to debug. If it had been a WARN_ON_ONCE(), this would presumably not even have needed bisecting, it would have been obvious.
BUG_ON() really is pretty much *always* the wrong thing to do. It only results in problems being harder to see because you end up with a dead machine and the message is often hidden.
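To illustrate the difference being described, here is a minimal userspace sketch, not the kernel's actual macros. MY_BUG_ON, MY_WARN_ON_ONCE, my_warn_once(), check_notify_alignment() and the two-bit mask are all hypothetical stand-ins chosen for the example; the real check in the driver may look different. The point is only that the warning variant reports once and lets the caller bail out, while the BUG variant kills the process (in the kernel: the machine) on the spot.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, NOT the kernel's BUG_ON/WARN_ON_ONCE. */

static int warn_count; /* how many warnings were actually printed */

/* BUG-style check: report and die immediately, nothing can recover. */
#define MY_BUG_ON(cond) \
	do { \
		if (cond) { \
			fprintf(stderr, "BUG: %s\n", #cond); \
			abort(); \
		} \
	} while (0)

/* WARN-once helper: print on the first hit only, always return cond. */
static bool my_warn_once(bool cond, bool *warned, const char *expr,
			 const char *file, int line)
{
	if (cond && !*warned) {
		*warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}

#define MY_WARN_ON_ONCE(cond, warned) \
	my_warn_once((cond), (warned), #cond, __FILE__, __LINE__)

/*
 * Assumed check for illustration: the low two bits of a callback
 * address must be zero. Returns false so the caller can fail
 * gracefully instead of taking the whole system down.
 */
static bool check_notify_alignment(uintptr_t fn_addr)
{
	static bool warned; /* one flag per check site */

	if (MY_WARN_ON_ONCE((fn_addr & 3u) != 0, &warned))
		return false;
	return true;
}
```

With this shape, a misaligned address produces one visible warning and an error return, and the system stays up long enough for the message to be read.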
Jani made the same point. But I guess they then went so far off into the weeds of how to recover when warning that the fix itself did not progress.
Yes. That, as well as removing the entire alignment thing to reuse a couple of bits for flags. Too fragile for its own good.
BR, Jani.
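
For readers unfamiliar with the alignment trick discussed in this thread, the following is a rough sketch of the general technique of storing flags in the low bits of an aligned address. It is not the driver's code: FLAG_BITS, pack_tagged(), tagged_addr(), tagged_flags() and tag_roundtrips() are hypothetical names, and the choice of two flag bits is an assumption. It shows why the scheme is fragile: packing only works if the address happens to have its low bits clear, which for a function depends on where the compiler and linker place it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: two low bits are reserved for flags. */
#define FLAG_BITS 2u
#define FLAG_MASK ((uintptr_t)((1u << FLAG_BITS) - 1))

/*
 * Pack flags into the low bits of addr. Fails (returns false) if the
 * address is not sufficiently aligned or the flags do not fit.
 */
static bool pack_tagged(uintptr_t addr, unsigned int flags, uintptr_t *out)
{
	if ((addr & FLAG_MASK) != 0 || flags > FLAG_MASK)
		return false;
	*out = addr | flags;
	return true;
}

/* Recover the original address by masking off the flag bits. */
static uintptr_t tagged_addr(uintptr_t packed)
{
	return packed & ~FLAG_MASK;
}

/* Recover the flags from the low bits. */
static unsigned int tagged_flags(uintptr_t packed)
{
	return (unsigned int)(packed & FLAG_MASK);
}

/* Convenience check: does (addr, flags) survive a pack/unpack cycle? */
static bool tag_roundtrips(uintptr_t addr, unsigned int flags)
{
	uintptr_t packed;

	if (!pack_tagged(addr, flags, &packed))
		return false;
	return tagged_addr(packed) == addr && tagged_flags(packed) == flags;
}
```

An address such as 0x1000 round-trips fine, while 0x1002 cannot be packed at all, so an unrelated change that shifts a function by two bytes is enough to flip the outcome.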