(In reply to Tom St Denis from comment #15)
> If you can't reproduce on a newer version of mesa then it's "been fixed" :-)

My (probably incorrect) understanding is roughly this:

    +-------+-------+
1.) |  Application  |
    +-------+-------+
       |
       | Possibly sending bad commands/calls to Mesa
       |
       v
    +------+---------+
2.) |     Mesa       |
    +------+---------+
       |
       | Passing on bad calls from the application
       |     or
       | There is a bug in Mesa itself where it is sending bad
       | calls/commands to the kernel
       v
    +--------+--------+
3.) |  Kernel/amdgpu  |
    +--------+--------+
       |
       | amdgpu puts the physical device in a bad state due to bad
       | commands from Mesa
       v
    +--------+--------+
4.) |       GPU       |
    +--------+--------+

Given that Mesa 18.3.3+ "fixes" the issue, it sounds like one specific case of
Mesa sending garbage to the kernel (step 2 to 3) has been fixed.

But in general, shouldn't the kernel driver (ideally) be able to handle Mesa
passing malformed/bad commands rather than freezing the device (step 3 to 4)?
I understand not every case can be covered, and I also understand that GPU
resets need to be supported in user space for seamless recovery, but shouldn't
the driver "unstick" itself enough that the computer can be rebooted normally?
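
To make the question concrete, here is a toy sketch of the difference I have in
mind between step 3 forwarding everything and step 3 rejecting bad input. It is
purely illustrative: every name in it (Gpu, VALID_OPCODES, submit_unchecked,
submit_checked) is made up, and it is not meant to describe how amdgpu or Mesa
actually work.

```python
# Toy model only: all names are hypothetical, none of this is real amdgpu code.
from dataclasses import dataclass

VALID_OPCODES = {"DRAW", "COPY", "FENCE"}  # made-up command set


@dataclass
class Gpu:
    """Stand-in for the device in step 4."""
    hung: bool = False

    def execute(self, opcode: str) -> None:
        # In this toy, the device hangs on any command it does not understand.
        if opcode not in VALID_OPCODES:
            self.hung = True


def submit_unchecked(gpu: Gpu, commands: list[str]) -> None:
    """Step 3 passes everything from user space straight to the device."""
    for opcode in commands:
        gpu.execute(opcode)


def submit_checked(gpu: Gpu, commands: list[str]) -> None:
    """Step 3 rejects the whole submission before the device sees any of it."""
    bad = [opcode for opcode in commands if opcode not in VALID_OPCODES]
    if bad:
        raise ValueError(f"rejected submission, bad opcodes: {bad}")
    for opcode in commands:
        gpu.execute(opcode)
```

With submit_unchecked a single garbage command leaves the toy device hung,
while submit_checked returns an error to the caller and the device is never
touched. I realize this only covers the easy case where the input is visibly
malformed.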

Thanks for your time and patience.