(In reply to Tom St Denis from comment #15)
> If you can't reproduce on a newer version of mesa then it's "been fixed" :-)

My (probably incorrect) understanding is roughly this:

        +-------+-------+
1.)     |  Application  |
        +-------+-------+
                |
                | Possibly sending bad commands/calls to Mesa
                v
        +-------+-------+
2.)     |     Mesa      |
        +-------+-------+
                |
                | Passing on bad calls from the application
                |   or
                | There is a bug in Mesa itself where it is sending bad calls/commands to the kernel
                v
        +-------+-------+
3.)     | Kernel/amdgpu |
        +-------+-------+
                |
                | amdgpu puts the physical device in a bad state due to bad commands from Mesa
                v
        +-------+-------+
4.)     |      GPU      |
        +-------+-------+

Given that mesa 18.3.3+ "fixes" the issue, it sounds like a specific case of mesa sending garbage to the kernel (step 2 to 3) has been fixed.

But in general shouldn't the kernel driver (ideally) be able to handle mesa passing malformed/bad commands rather than freezing the device (step 3 to 4)?

I understand not every case can be covered, and I also understand that GPU resets need to be supported in user space for seamless recovery, but shouldn't the driver "unstick" itself enough so the computer can be rebooted normally?

Thanks for your time and patience.