https://bugs.freedesktop.org/show_bug.cgi?id=83500
Priority: medium
Bug ID: 83500
Assignee: dri-devel@lists.freedesktop.org
Summary: si_dma_copy_tile causes GPU hangs
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: greg@chown.ath.cx
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Drivers/Gallium/radeonsi
Product: Mesa
Created attachment 105745 --> https://bugs.freedesktop.org/attachment.cgi?id=105745&action=edit Workaround
Async DMA linear-to-tiled copies cause GPU hangs in some cases. On Cape Verde, I can easily trigger this as described in [1]. The game Brutal Legend also triggers similar hangs when it streams assets during gameplay.
Disabling usage of this function and using the resource_copy_region fallback instead fixes all hangs. The attached patch does that.
[1] https://bugs.freedesktop.org/show_bug.cgi?id=79980#c124
--- Comment #1 from Marek Olšák maraeo@gmail.com --- Thank you very much for tracking this down.
--- Comment #2 from Grigori Goronzy greg@chown.ath.cx --- Created attachment 105755 --> https://bugs.freedesktop.org/attachment.cgi?id=105755&action=edit Better fix
This is a possibly better fix that only disables DMA if 1D tiling is involved. Please give it a try.
--- Comment #3 from Michel Dänzer michel@daenzer.net --- Maybe we need to determine the other tiling parameters differently for 1D tiling? IIRC Marek fixed things like that before.
Anyway, in the command stream dump you provided before, it looked like the tiling parameters were totally bogus, mostly all 0. I suspect this needs more investigation.
--- Comment #4 from Christian König deathsimple@vodafone.de --- (In reply to comment #3)
> Anyway, in the command stream dump you provided before, it looked like the tiling parameters were totally bogus, mostly all 0. I suspect this needs more investigation.
Yeah, that's what I noticed immediately as well. Maybe attach gdb to X, set a breakpoint on si_dma_copy_tile and check what those parameters usually look like.
If they aren't usually all zero (which is likely) we should figure out why they are zero in this special case.
--- Comment #5 from Michel Dänzer michel@daenzer.net --- You can just make the breakpoint conditional, e.g.:
b si_dma.c:228 if array_mode == 0
--- Comment #6 from Grigori Goronzy greg@chown.ath.cx --- The tiling parameters don't look bogus and they certainly aren't zero. In the dumped IB there's mt = 1, num_banks = 3, tile_split = 3 in DW 7. Looking at DW 3, there is array_mode = 2, bankw = 0, bankh = 0, mtilea = 0. That might well be completely wrong for the given surface, but it's not bogus in the sense that the values are invalid.
At first I thought bankw/bankh/mtilea being all set to zero was strange, but this seems to match how libdrm sets up 1D tiled surfaces.
For reference, this is the IB we're talking about: http://pastebin.com/jFWk9bU5
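When reading a dumped IB like this one, it can help to decode the packed dwords with a small script instead of by eye, since a misread shift is easy to make. The following is only a minimal sketch: the bit positions and widths in DW3_LAYOUT and DW7_LAYOUT are assumptions made up for illustration (they are not taken from this thread or verified against the hardware documentation), and decode_fields / encode_fields are hypothetical helper names.

```python
# Hypothetical field layouts as (name, shift, width).
# ASSUMPTION: these bit positions are illustrative only; check them against
# the actual packet encoding in the driver before relying on them.
DW3_LAYOUT = [
    ("mtilea", 16, 2),
    ("bankw", 18, 3),
    ("bankh", 21, 3),
    ("array_mode", 27, 4),
]
DW7_LAYOUT = [
    ("tile_split", 21, 3),
    ("num_banks", 25, 2),
    ("mt", 27, 5),
]


def decode_fields(dword, layout):
    """Extract every (name, shift, width) field from a 32-bit dword."""
    return {
        name: (dword >> shift) & ((1 << width) - 1)
        for name, shift, width in layout
    }


def encode_fields(values, layout):
    """Pack named field values into a dword; missing fields default to 0."""
    dword = 0
    for name, shift, width in layout:
        value = values.get(name, 0)
        if value < 0 or value >> width:
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        dword |= value << shift
    return dword


if __name__ == "__main__":
    # Round-trip the values quoted in comment #6 through the assumed layouts.
    dw3 = encode_fields({"array_mode": 2}, DW3_LAYOUT)
    dw7 = encode_fields({"mt": 1, "num_banks": 3, "tile_split": 3}, DW7_LAYOUT)
    print(hex(dw3), decode_fields(dw3, DW3_LAYOUT))
    print(hex(dw7), decode_fields(dw7, DW7_LAYOUT))
```

Keeping the layout in a table rather than in hand-written shifts means a corrected bit position only has to be changed in one place, and a nonzero array_mode can no longer be mistaken for 0 by misjudging the shift.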
--- Comment #7 from Michel Dänzer michel@daenzer.net --- Yeah, sorry, I misread DW 3 as having array_mode == 0 when it's actually 2.
--- Comment #8 from Christian König deathsimple@vodafone.de --- Oh, yeah, you're right, I got that wrong as well.
But what I had also hoped to get checked: it sounded like this copy command works correctly in about 90% of all cases and locks up only in a minority.
Is that correct? Or in other words what makes this special case lock up? Does it work for other resolutions? etc...
--- Comment #9 from Michel Dänzer michel@daenzer.net --- Do the fixes from http://lists.freedesktop.org/archives/mesa-dev/2014-September/068738.html help?
Michel Dänzer michel@daenzer.net changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |RESOLVED
Resolution|---     |FIXED
--- Comment #10 from Michel Dänzer michel@daenzer.net ---
Module: Mesa
Branch: master
Commit: ae4536b4f71cbe76230ea7edc7eb4d6041e651b4
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=ae4536b4f71cbe76230ea7edc7e...
Author: Michel Dänzer michel.daenzer@amd.com
Date: Tue Nov 11 16:10:20 2014 +0900
radeonsi: Disable asynchronous DMA except for PIPE_BUFFER