https://bugs.freedesktop.org/show_bug.cgi?id=60929
Priority: medium
Bug ID: 60929
Assignee: dri-devel@lists.freedesktop.org
Summary: [r600-llvm] mono games with opengl are blocking on start
Severity: normal
Classification: Unclassified
OS: All
Reporter: lordheavym@gmail.com
Hardware: Other
Status: NEW
Version: git
Component: Drivers/DRI/R600
Product: Mesa
Mono games using OpenGL (Rochard, Bastion, Splice, ...) hang on start when LLVM is in use. The only way to make them start is to set R600_LLVM=0.
* Mesa from git
* LLVM from tstellar's repo
* Radeon HD6870
I've tried to trace the problem with strace, MONO_LOG_LEVEL=debug and apitrace, without much success. For Bastion, it always seems to be stuck in 'glXGetCurrentContext'.
Laurent carlier lordheavym@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
Hardware|Other   |x86-64 (AMD64)
OS      |All     |Linux (All)
Andreas Boll andreas.boll.dev@gmail.com changed:
What     |Removed          |Added
----------------------------------------------------------------------------
Component|Drivers/DRI/R600 |Drivers/Gallium/r600
--- Comment #1 from Laurent carlier lordheavym@gmail.com ---
This seems to be fixed with --enable-shared-llvm.
Laurent carlier lordheavym@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |hadack@gmx.de
--- Comment #2 from Laurent carlier lordheavym@gmail.com ---
*** Bug 64788 has been marked as a duplicate of this bug. ***
--- Comment #3 from Tom Stellard tstellar@gmail.com ---
Is this still an issue?
--- Comment #4 from Laurent carlier lordheavym@gmail.com ---
Yes, it's still an issue. I have to disable R600_LLVM to play Mono games.
--- Comment #5 from hadack@gmx.de ---
Any news on this one? It's really bad on radeonsi, since there's no workaround. It still happens with LLVM and Mesa from git.
--- Comment #6 from Michel Dänzer michel@daenzer.net ---
(In reply to comment #5)
> Any news on this one? It's really bad on radeonsi, since there's no workaround.
Doesn't --enable-shared-llvm work for you?
Any pointers to freely downloadable games for testing?
--- Comment #7 from Laurent carlier lordheavym@gmail.com ---
(In reply to comment #6)
> (In reply to comment #5)
> > Any news on this one? It's really bad on radeonsi, since there's no workaround.
> Doesn't --enable-shared-llvm work for you?
> Any pointers to freely downloadable games for testing?
It's reproducible with the Surgeon Simulator 2013 demo: http://downloads.bossastudios.com/ss2013/surgeonsimulator2013_linux.zip
--- Comment #8 from hadack@gmx.de ---
(In reply to comment #6)
> Doesn't --enable-shared-llvm work for you?
> Any pointers to freely downloadable games for testing?
--enable-shared-llvm doesn't make a difference. Here is a small and free game: http://www.desura.com/games/battlemass/download
--- Comment #9 from Tom Stellard tstellar@gmail.com ---
I'm not really sure what's happening here, but I don't think these closed source games are good enough test cases to diagnose the problem. Could you try to find a very simple Open Source mono program that will reproduce this bug?
--- Comment #10 from Laurent carlier lordheavym@gmail.com ---
(In reply to comment #9)
> I'm not really sure what's happening here, but I don't think these closed source games are good enough test cases to diagnose the problem. Could you try to find a very simple Open Source mono program that will reproduce this bug?
It seems I'm able to reproduce the problem with the OpenTK OpenGL examples: http://www.opentk.com/ http://sourceforge.net/projects/opentk/
--- Comment #11 from Tom Stellard tstellar@gmail.com ---
(In reply to comment #10)
> (In reply to comment #9)
> > I'm not really sure what's happening here, but I don't think these closed source games are good enough test cases to diagnose the problem. Could you try to find a very simple Open Source mono program that will reproduce this bug?
> It seems I'm able to reproduce the problem with the OpenTK OpenGL examples: http://www.opentk.com/ http://sourceforge.net/projects/opentk/
Can you point me to instructions for how to compile this code? There are no makefiles, only Visual Studio project files.
--- Comment #12 from Laurent carlier lordheavym@gmail.com ---
(In reply to comment #11)
> > It seems I'm able to reproduce the problem with the OpenTK OpenGL examples: http://www.opentk.com/ http://sourceforge.net/projects/opentk/
> Can you point me to instructions for how to compile this code? There are no makefiles, only Visual Studio project files.
I've used the "package" from AUR: https://aur.archlinux.org/packages/opentk/, where you can find a tarball with the source package.
--- Comment #13 from hadack@gmx.de ---
It is described in the "building from source" section here: http://www.opentk.com/doc/chapter/1/linux
I did this in the opentk folder:
  xbuild OpenTK.sln /p:Configuration=Debug
  cd Binaries/OpenTK/Debug
  mono Examples.exe
Some OpenGL examples show similar symptoms: they just stop at some point. Others quit with a time-limit-exceeded message.
--- Comment #14 from Torsten Kaiser x11@ariolc.dyndns.org ---
Created attachment 84467
  --> https://bugs.freedesktop.org/attachment.cgi?id=84467&action=edit
apitrace from hanging startup of OpenRA
I'm seeing the same problem with the Mono game OpenRA from http://open-ra.org/ on an RV730 PRO [Radeon HD 4650] with mesa-9.2-rc1 (but earlier Mesa versions showed the same behaviour).
On Gentoo I'm able to switch R600_LLVM via a USE flag, but as soon as I use a Mesa build with it enabled, OpenRA will no longer start. It just displays a black window; the loading symbols never appear.
Running apitrace gives (full apitrace as attachment):
10 glXChooseVisual(dpy = 0x15fbef0, [snip]) = &{visual = 0x1661f58, [snip]}
11 glXCreateContext(dpy = 0x15fbef0, vis = &{visual = 0x1661f58, [snip]) = 0x16734e0
12 glXMakeCurrent(dpy = 0x15fbef0, drawable = 20971535, ctx = 0x16734e0) = True
43 glXMakeCurrent(dpy = 0x15fbef0, drawable = 20971535, ctx = 0x16734e0) = True
Trying gdb, it seems one of the Mono threads gets stuck in radeon_drm_cs_emit_ioctl(); the other 7 threads look like Mono-internal things relating to its garbage collector.
strace gives:
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 7
connect(7, {sa_family=AF_LOCAL, sun_path=@"/tmp/.X11-unix/X0"}, 20) = 0
[snip]
open("/dev/dri/card0", O_RDWR|O_CLOEXEC) = 9
[snip]
ioctl(9, 0xc010640b, 0x7fffeb471ea0) = 0
ioctl(9, 0xc00c6469, 0x7fffeb471ec0) = 0
ioctl(9, 0xc020645d, 0x7fffeb471d10) = 0
ioctl(9, 0xc020645d, 0x7fffeb471b10) = 0
ioctl(9, 0xc020645e, 0x7fffeb471b20) = 0
mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_SHARED, 9, 0x112992000) = 0x7f921dfe9000
ioctl(9, 0xc020645d, 0x7fffeb471b20) = 0
ioctl(9, 0xc020645e, 0x7fffeb471b30) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, 9, 0x1129a2000) = 0x7f921dfe8000
ioctl(9, VIDIOC_INT_RESET, 0x24460b0) = 0
ioctl(9, 0xc020645d, 0x7fffeb471db0) = 0
Then some more interactions with fd=7 until it gets stuck with:
futex(0x984280, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
At that point only kill -9 helps.
Do you have anything I should try or any info I should provide?
--- Comment #15 from Nicholas Miell nmiell@gmail.com ---
r600g initializes LLVM without first setting the llvm::DisablePrettyStackTrace variable to true. If this variable is false (the default), LLVM will register a bunch of signal handlers, including for SIGXCPU and SIGPWR, both of which are used by Mono's garbage collector.
gallivm correctly sets llvm::DisablePrettyStackTrace to true, but it runs after r600g has already started calling into LLVM and the signal handlers have been registered.
If you set a breakpoint on r600_create_context, manually set llvm::DisablePrettyStackTrace to true and then continue, the application will function correctly. I tested this using Fractal (a Unity game which deadlocks in sem_wait on startup), Bastion (a MonoGame, also deadlocks in sem_wait), and RepetierHost (an OpenTK app which dies in the SIGXCPU handler at startup).
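To illustrate the conflict described in comment #15, here is a minimal C sketch, not taken from Mesa, LLVM or Mono. It assumes a POSIX system; the names runtime_handler, library_handler, install_handler and clobber_demo are hypothetical stand-ins. It shows that a second sigaction() call on the same signal silently replaces the first handler for the whole process, so the runtime never sees the signal it relies on.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Records which handler ran last: 1 = "runtime", 2 = "library". */
static volatile sig_atomic_t last_handler = 0;

/* Stand-in for a handler a managed runtime installs for its GC. */
static void runtime_handler(int signo) { (void)signo; last_handler = 1; }

/* Stand-in for a crash handler a library installs later. */
static void library_handler(int signo) { (void)signo; last_handler = 2; }

/* Install fn as the process-wide handler for signo; returns 0 on success. */
static int install_handler(int signo, void (*fn)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = fn;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* The runtime claims SIGXCPU first, then the library overrides it.
 * Returns the id of the handler that actually ran, or -1 on error. */
static int clobber_demo(void)
{
    last_handler = 0;
    if (install_handler(SIGXCPU, runtime_handler) != 0) return -1;
    if (install_handler(SIGXCPU, library_handler) != 0) return -1;
    raise(SIGXCPU); /* handled before raise() returns */
    return last_handler;
}
```

Calling clobber_demo() reports that only the later ("library") handler ran, which is the situation the Mono garbage collector ends up in when it waits for a signal that another component has taken over.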
--- Comment #16 from Nicholas Miell nmiell@gmail.com ---
Created attachment 84675
  --> https://bugs.freedesktop.org/attachment.cgi?id=84675&action=edit
temporary workaround patch
Here's a temporary workaround patch. Not for merging, obviously.
--- Comment #17 from Tom Stellard tstellar@gmail.com ---
(In reply to comment #15)
> r600g initializes LLVM without first setting the llvm::DisablePrettyStackTrace variable to true. If this variable is false (the default), LLVM will register a bunch of signal handlers, including for SIGXCPU and SIGPWR, both of which are used by Mono's garbage collector.
> gallivm correctly sets llvm::DisablePrettyStackTrace to true, but it runs after r600g has already started calling into LLVM and the signal handlers have been registered.
> If you set a breakpoint on r600_create_context, manually set llvm::DisablePrettyStackTrace to true and then continue, the application will function correctly. I tested this using Fractal (a Unity game which deadlocks in sem_wait on startup), Bastion (a MonoGame, also deadlocks in sem_wait), and RepetierHost (an OpenTK app which dies in the SIGXCPU handler at startup).
Thanks for tracking this down. I think we'll need to extend the LLVM C API in order to get access to this variable. However, looking through the LLVM code it looks like the PrettyStackTrace handler is registered by a static initializer, so I wonder if setting this variable is enough and if we can guarantee that r600g will set this variable before the handler is initialized.
Also, this seems to me like it is a bug in LLVM. Is it common practice for libraries to override signal handlers of applications?
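One way to check when a handler actually gets registered is to compare the signal disposition before and after a given call. The following is a rough C sketch under the assumption of a POSIX system; init_changes_handler, quiet_init and noisy_init are hypothetical names, and the two init functions merely stand in for a library entry point that does or does not touch SIGXCPU.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static void dummy_handler(int signo) { (void)signo; }

/* Hypothetical init routine that leaves signal handlers alone. */
static void quiet_init(void) { }

/* Hypothetical init routine that registers its own SIGXCPU handler. */
static void noisy_init(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = dummy_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGXCPU, &sa, NULL);
}

/* Returns 1 if calling init() changed the handler for signo,
 * 0 if it did not, and -1 if the disposition could not be read.
 * Passing NULL as the new action makes sigaction() a pure query. */
static int init_changes_handler(int signo, void (*init)(void))
{
    struct sigaction before, after;
    if (sigaction(signo, NULL, &before) != 0) return -1;
    init();
    if (sigaction(signo, NULL, &after) != 0) return -1;
    /* Only plain sa_handler handlers are compared in this sketch. */
    return before.sa_handler != after.sa_handler;
}
```

Wrapping a suspect call this way gives a yes/no answer about whether that call is the one that replaces the handler, without needing a debugger.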
--- Comment #18 from Nicholas Miell nmiell@gmail.com ---
(In reply to comment #17)
> Thanks for tracking this down. I think we'll need to extend the LLVM C API in order to get access to this variable. However, looking through the LLVM code it looks like the PrettyStackTrace handler is registered by a static initializer, so I wonder if setting this variable is enough and if we can guarantee that r600g will set this variable before the handler is initialized.
I don't think this is true. IIRC, all the stack traces I saw were the result of one of the runOnFunction methods (either BBPassManager or FPPassManager, I wasn't paying attention) creating a PassManagerPrettyStackEntry object.
> Also, this seems to me like it is a bug in LLVM. Is it common practice for libraries to override signal handlers of applications?
Common enough that both Mono and LLVM stomp on each other, but it's unambiguously wrong for a shared library to globally modify signal handlers. (Temporarily setting a new handler on entry to your library and later restoring the saved handler before returning is fine, but that only works in the single-threaded case since handlers aren't per-thread. Arguably modern applications shouldn't use any signals at all.)
Mono (generally) gets away with it because it uses crazy signals that applications never touch (SIGPWR is only sent to PID 1 by the kernel on power failure, SIGXCPU is a relic of obsolete job billing infrastructure that nobody uses), but had the bad luck of LLVM deciding to future-proof itself against all possible fatal signals.
If I were to be prescriptive, llvm::DisablePrettyStackTrace should be true by default, should only ever be set by clang, and shouldn't be a global variable.
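The save-and-restore pattern mentioned in comment #18 can be sketched in C roughly as follows. This is only an illustration assuming a POSIX system and a single-threaded caller, which is the case the comment says it is limited to; app_handler, lib_handler, library_call and restore_demo are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t app_hits = 0; /* calls into the application's handler */
static volatile sig_atomic_t lib_hits = 0; /* calls into the library's handler */

static void app_handler(int signo) { (void)signo; app_hits++; }
static void lib_handler(int signo) { (void)signo; lib_hits++; }

/* Hypothetical library entry point: borrows SIGXCPU while it runs and
 * puts the caller's disposition back before returning. */
static int library_call(void)
{
    struct sigaction mine, saved;
    memset(&mine, 0, sizeof mine);
    mine.sa_handler = lib_handler;
    sigemptyset(&mine.sa_mask);
    if (sigaction(SIGXCPU, &mine, &saved) != 0) return -1; /* save old handler */
    raise(SIGXCPU); /* stand-in for a signal arriving during library work */
    return sigaction(SIGXCPU, &saved, NULL); /* restore old handler */
}

/* Returns 0 if each handler ran exactly once, 1 if not, -1 on error. */
static int restore_demo(void)
{
    struct sigaction app;
    app_hits = 0;
    lib_hits = 0;
    memset(&app, 0, sizeof app);
    app.sa_handler = app_handler;
    sigemptyset(&app.sa_mask);
    if (sigaction(SIGXCPU, &app, NULL) != 0) return -1;
    if (library_call() != 0) return -1;
    raise(SIGXCPU); /* should reach the application's handler again */
    return (lib_hits == 1 && app_hits == 1) ? 0 : 1;
}
```

Because signal dispositions are per-process rather than per-thread, another thread could observe the library's handler while library_call() is running, so this pattern does not help a multi-threaded runtime such as Mono.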
--- Comment #19 from hadack@gmx.de ---
I can confirm that changing DisablePrettyStackTrace to true globally in LLVM fixes the startup hang. Tested with different Mono-based games (Expedition Conquistador, Rochard, Bastion) on radeonsi. And I have to say I'm quite happy with the performance in the games. Thanks, guys!
Johannes Obermayr johannesobermayr@gmx.de changed:
What    |Removed |Added
----------------------------------------------------------------------------
See Also|        |http://llvm.org/bugs/show_bug.cgi?id=12109,
        |        |https://bugzilla.novell.com/show_bug.cgi?id=839074
Laurent carlier lordheavym@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |openproggerfreak@gmail.com
--- Comment #20 from Laurent carlier lordheavym@gmail.com ---
*** Bug 70650 has been marked as a duplicate of this bug. ***
--- Comment #21 from Radist Morse radist.morse@gmail.com ---
I can confirm the bug with several Unity3D games.
The quick fix works.
--- Comment #22 from Heiko P zuxez@cs.tu-berlin.de ---
Created attachment 88452
  --> https://bugs.freedesktop.org/attachment.cgi?id=88452&action=edit
Another approach
I can confirm the issue with Mono's signal handling and the hanging of the applications. After digging around Mesa, I came up with the attached patch. Used it successfully with Mesa 9.2.{0,1,2} and git. Probably the fine semantics are still up to the devs, though the static llvm variable is set in a static context and thus hopefully early enough.
Anyway I guess the default llvm value for that flag should probably be inverted.
Kai kai@dev.carbon-project.org changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |kai@dev.carbon-project.org
--- Comment #23 from Kai kai@dev.carbon-project.org ---
I can confirm this bug with radeonsi with various Unity-based games. With attachment 88452 applied everything works.
Stack:
GPU: "PITCAIRN" (ChipID = 0x6819)
Linux: 3.11.6
libdrm: 2.4.47
LLVM: SVN:trunk/r193475
libclc: Git:master/4c18120c1a
Mesa: Git:master/fa8b1514d3
GLAMOR: Git:master/ba209eeef2
DDX: Git:master/f1dc677e79
--- Comment #24 from Laurent carlier lordheavym@gmail.com ---
Fixed since llvm-3.4svn rev193971. The default behavior in LLVM is now to have PrettyStackTrace disabled.
Mesa also needs the following patches to build:
http://lists.freedesktop.org/archives/mesa-dev/2013-November/047501.html
http://lists.freedesktop.org/archives/mesa-dev/2013-November/047625.html
Laurent carlier lordheavym@gmail.com changed:
What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |RESOLVED
Resolution|---     |FIXED