https://bugs.freedesktop.org/show_bug.cgi?id=64600
Priority: medium
Bug ID: 64600
Assignee: dri-devel@lists.freedesktop.org
Summary: r600g pyrit OpenCL issue on HD6850
Severity: normal
Classification: Unclassified
OS: Linux (All)
Reporter: spamjunkeater@gmail.com
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Drivers/Gallium/r600
Product: Mesa
Using openSUSE 12.3 with a 3.9.2 kernel, mesa-trunk and llvm-trunk, Pyrit generates this output on a Radeon HD6850:
pyrit benchmark
Pyrit 0.4.1-dev (svn r308) (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Calibrating... LLVM ERROR: Not supported instr: <MCInst 206 <MCOperand Reg:1046> <MCOperand Reg:1031> <MCOperand Imm:0> <MCOperand Imm:0>>
Detailed debug log is here: https://bugs.freedesktop.org/attachment.cgi?id=78839
--- Comment #1 from Tom Stellard tstellar@gmail.com --- This patch should fix the error: http://lists.freedesktop.org/archives/mesa-dev/2013-May/039375.html
However, there is still another bug that causes pyrit to hang, even with this patch.
--- Comment #2 from darkbasic darkbasic@linuxsystems.it --- Any news? With the latest stack and an HD 7950 I get:
~ $ pyrit benchmark
Pyrit 0.4.0 (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Calibrating... 0x7f981cb8b690: i64 = ExternalSymbol'__muldi3'
Stack dump:
0. Running pass 'Function Pass Manager' on module 'radeon'.
1. Running pass 'AMDGPU DAG->DAG Pattern Instruction Selection' on function '@opencl_pmk_kernel'
Segmentation fault
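For readers unfamiliar with the symbol: `__muldi3` is the compiler runtime (libgcc / compiler-rt) helper for 64-bit integer multiplication, so the dump suggests the backend fell back to a library call for an i64 multiply that it could not select. The sketch below is only an illustration of how such a multiply can be built from 32-bit operations; it is not the LLVM lowering, and the names `mul64_via_32` and `check_mul64` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit multiply (mod 2^64) built from 32-bit halves. */
uint64_t mul64_via_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Full 32x32 -> 64 bit product of the low halves. */
    uint64_t lo = (uint64_t)a_lo * b_lo;

    /* Cross terms only affect the upper 32 bits, so 32-bit wraparound is fine. */
    uint32_t cross = a_lo * b_hi + a_hi * b_lo;

    /* a_hi * b_hi would land at bit 64 and above, so it is dropped. */
    return lo + ((uint64_t)cross << 32);
}

/* Small self-check against values that are easy to verify by hand. */
void check_mul64(void)
{
    assert(mul64_via_32(3, 5) == 15);
    assert(mul64_via_32(0xFFFFFFFFull, 0xFFFFFFFFull) == 0xFFFFFFFE00000001ull);
}
```

Calling `check_mul64()` from a host program is enough to exercise the sketch.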
--- Comment #3 from Erdem U. Altınyurt spamjunkeater@gmail.com --- My HD6850 stalls at Calibrating...
death@triQuad:/home/compile/svn/pyrit_svn> pyrit benchmark
Pyrit 0.4.1-dev (svn r308) (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Calibrating...
--- Comment #4 from Tom Stellard tstellar@gmail.com --- Created attachment 86005 --> https://bugs.freedesktop.org/attachment.cgi?id=86005&action=edit Possible Fix
Can you try this patch and, if it doesn't work, post the output of R600_DEBUG=cs?
--- Comment #5 from darkbasic darkbasic@linuxsystems.it --- It hangs at calibrating now:
~ $ R600_DEBUG=cs pyrit benchmark
Pyrit 0.4.0 (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
This code is distributed under the GNU General Public License v3+
Calibrating... ^CTerminated
Where should I find the output of R600_DEBUG=cs? There is nothing even in dmesg.
--- Comment #6 from Tom Stellard tstellar@gmail.com --- (In reply to comment #5)
> It hangs at calibrating now:
>
> ~ $ R600_DEBUG=cs pyrit benchmark
> Pyrit 0.4.0 (C) 2008-2011 Lukas Lueg http://pyrit.googlecode.com
> This code is distributed under the GNU General Public License v3+
> Calibrating... ^CTerminated
>
> Where should I find the output of R600_DEBUG=cs? There is nothing even in dmesg.
This bug was filed against Evergreen/NI GPUs. If you are using SI, you will need to use RADEON_DUMP_SHADERS=1 instead.
Otherwise, R600_DEBUG=cs will print output to stderr.
--- Comment #7 from darkbasic darkbasic@linuxsystems.it --- Created attachment 86011 --> https://bugs.freedesktop.org/attachment.cgi?id=86011&action=edit pyrit debug
Here it is, thanks.
--- Comment #8 from Tom Stellard tstellar@gmail.com --- (In reply to comment #7)
> Created attachment 86011 [details]
> pyrit debug
>
> Here it is, thanks.
There are no compute shaders in this output. You should let the program run longer, if possible, to give it time to dump all the output.
--- Comment #9 from darkbasic darkbasic@linuxsystems.it --- Created attachment 86014 --> https://bugs.freedesktop.org/attachment.cgi?id=86014&action=edit pyrit debug 10 minutes
I let it run for 10 minutes and still no compute shaders.
--- Comment #10 from Erdem U. Altınyurt spamjunkeater@gmail.com --- Created attachment 86118 --> https://bugs.freedesktop.org/attachment.cgi?id=86118&action=edit Debug output
I am attaching debug output generated with R600_DEBUG=cs pyrit benchmark 2> debug.txt
without the patch (attachment 86005).
With the latest proposed patch there is no debug output at all; it just stalls. I don't know why. Thanks.
--- Comment #11 from darkbasic darkbasic@linuxsystems.it --- Created attachment 86166 --> https://bugs.freedesktop.org/attachment.cgi?id=86166&action=edit debug radeonsi nopatch
Here is my
RADEON_DUMP_SHADERS=1 pyrit benchmark 2> debug.txt
with radeonsi (HD 7950) without the patch.
--- Comment #12 from Tom Stellard tstellar@gmail.com --- These patches should fix the crash on Evergreen/NI GPUs: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131007/190695....
--- Comment #13 from Tom Stellard tstellar@gmail.com --- (In reply to comment #12)
> These patches should fix the crash on Evergreen/NI GPUs:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131007/190695.html
These patches won't apply to llvm master any more. Here they are in a branch:
http://cgit.freedesktop.org/~tstellar/llvm/log/?h=r600-private-mem-fixes
--- Comment #14 from Peter Wu lekensteyn@gmail.com --- Created attachment 87536 --> https://bugs.freedesktop.org/attachment.cgi?id=87536&action=edit piglit test kernel with pyrit
The patches did avoid the crash, but the results are still invalid.
For convenience, I have added some test vectors and glue to the pyrit kernel. I don't know if such a complex testcase is acceptable for piglit, hence I am posting it here. Let me know if such a case is suitable for piglit and if it should be posted to the piglit list too.
The test passes with the POCL implementation but not with the R600 one.
Without the patches the output is:
## Test: Pyrit WPA2-PSK accelerator (/src/piglit/tests/cl/program/program-tester.c) ##
# Running on:
# Platform: Default
# Device: AMD BARTS
# OpenCL version: 1.1
# OpenCL C version: 1.1
# Build options: -cl-std=CL1.1
Program has been built successfully
Running kernel test:
Using kernel opencl_pmk_kernel
Setting kernel arguments...
Running the kernel...
cl-program-tester: /src/llvm/include/llvm/MC/MCRegisterInfo.h:65: unsigned int llvm::MCRegisterClass::getRegister(unsigned int) const: Assertion `i < getNumRegs() && "Register number out of range!"' failed.
Stack dump:
0. Running pass 'Function Pass Manager' on module 'radeon'.
1. Running pass 'R600 Handle indirect addressing' on function '@opencl_pmk_kernel'
Aborted (core dumped)
With branch tstellar/r600-private-mem-fixes, up to commit cf5d9a2:
[..]
Running the kernel...
Validating results...
Expecting 3201012614 (0xbecb9386) with tolerance 0, but got 844928435 (0x325c95b3)
Error at uint[0]
Argument 1: FAIL
PIGLIT:subtest {'' : 'fail'}
Some or all of the tests FAILED
# Result:
PIGLIT: {'result': 'fail' }
--- Comment #15 from Peter Wu lekensteyn@gmail.com --- Created attachment 87607 --> https://bugs.freedesktop.org/attachment.cgi?id=87607&action=edit Piglit: test copying a struct (with at least three fields)
This is a smaller test case that highlights one specific issue: copying structures (consisting of at least three integers) fails.
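As a host-side illustration of what such a test checks, here is a minimal C sketch of the expected semantics. It is not the attached piglit test: the struct `triple_t` and the functions `copy_triple`, `triples_equal` and `check_struct_copy` are hypothetical names, and the copy goes through a temporary on the assumption that the kernel copies through private memory in a similar way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct with three integer fields, the smallest failing shape. */
typedef struct {
    uint32_t a;
    uint32_t b;
    uint32_t c;
} triple_t;

/* Copy via a temporary, the way a kernel might copy through private memory. */
void copy_triple(triple_t *out, const triple_t *in)
{
    triple_t tmp = *in; /* whole-struct assignment into a local */
    *out = tmp;         /* whole-struct assignment back out */
}

/* Field-by-field comparison, so padding bytes never matter. */
int triples_equal(const triple_t *x, const triple_t *y)
{
    return x->a == y->a && x->b == y->b && x->c == y->c;
}

/* After the copy, every field of the output must match the input. */
void check_struct_copy(void)
{
    triple_t in = { 1u, 2u, 3u };
    triple_t out = { 0u, 0u, 0u };
    copy_triple(&out, &in);
    assert(triples_equal(&in, &out));
}
```

A correct OpenCL implementation should give the same result for the equivalent kernel.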
--- Comment #16 from Peter Wu lekensteyn@gmail.com --- Created attachment 87609 --> https://bugs.freedesktop.org/attachment.cgi?id=87609&action=edit R600_DEBUG=cs cl-program-tester struct-copy.cl
The parameter test passes when the struct contains one member, but the struct assignment still fails. Attached is the shader debug output for the original test case.
LLVM: r600-private-mem-fixes rebased on master (01436ba3066b99547c1138edf5c36ef2ad467e71, SVN rev 192587)
Mesa: git snb-magic-18531-ge6c2afa
--- Comment #17 from Tom Stellard tstellar@gmail.com --- Thanks for adding the tests. It would be great if you could send these as patches to piglit@lists.freedesktop.org, so we can add them to the test suite.
This patch should fix the struct copy tests: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20131014/191163....
--- Comment #18 from Peter Wu lekensteyn@gmail.com --- Created attachment 87706 --> https://bugs.freedesktop.org/attachment.cgi?id=87706&action=edit piglit test (messy, with comments) that fails on R600
I'll await the feedback in your thread before submitting it to piglit.
So the struct bug is fixed, yet I encountered another issue that seems to be related to alignment. Please find the attached piglit test; it fails on R600 but passes with POCL and a C wrapper.
The Pyrit kernel has been stripped, only the sha1_process function remains (with one macro expanded and every macro thereafter undef'd). See the comments in the test.
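For comparison on the host side, a reference for the SHA-1 block-processing step could look like the sketch below. It is written from the standard SHA-1 definition and is neither the Pyrit kernel nor the attached test; the names `sha1_compress_ref` and `sha1_abc` and their signatures are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

/* Hypothetical reference: process one 64-byte block given as 16 big-endian words. */
void sha1_compress_ref(uint32_t state[5], const uint32_t block[16])
{
    uint32_t w[80];
    uint32_t a, b, c, d, e;
    int i;

    /* Message schedule: 16 input words expanded to 80. */
    for (i = 0; i < 16; i++)
        w[i] = block[i];
    for (i = 16; i < 80; i++)
        w[i] = rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

    a = state[0]; b = state[1]; c = state[2]; d = state[3]; e = state[4];

    for (i = 0; i < 80; i++) {
        uint32_t f, k, t;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999u; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1u; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDCu; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6u; }
        t = rotl32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rotl32(b, 30); b = a; a = t;
    }

    state[0] += a; state[1] += b; state[2] += c; state[3] += d; state[4] += e;
}

/* Hash the single padded block of the standard test message "abc". */
void sha1_abc(uint32_t out[5])
{
    uint32_t block[16] = { 0x61626380u }; /* 'a' 'b' 'c' plus 0x80 pad, rest zero */
    block[15] = 24u;                      /* message length in bits */

    out[0] = 0x67452301u; out[1] = 0xEFCDAB89u; out[2] = 0x98BADCFEu;
    out[3] = 0x10325476u; out[4] = 0xC3D2E1F0u;
    sha1_compress_ref(out, block);
}
```

The well-known SHA-1 digest of "abc" is a9993e36 4706816a ba3e2571 7850c26c 9cd0d89d, so comparing the five words from `sha1_abc` against it gives a quick sanity check before comparing against GPU results.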
--- Comment #19 from Tom Stellard tstellar@gmail.com --- (In reply to comment #18)
> Created attachment 87706 [details]
> piglit test (messy, with comments) that fails on R600
>
> I'll await the feedback in your thread before submitting it to piglit.
>
> So the struct bug is fixed, yet I encountered another issue that seems to be related to alignment. Please find the attached piglit test; it fails on R600 but passes with POCL and a C wrapper.
>
> The Pyrit kernel has been stripped, only the sha1_process function remains (with one macro expanded and every macro thereafter undef'd). See the comments in the test.
I have a lot of patches that are on the mailing list waiting for review. Here is a branch containing all of these patches: http://cgit.freedesktop.org/~tstellar/llvm/log/?h=master-testing I will try to keep this branch updated as I submit more patches. The sha-process test you posted passes for me with this branch.
--- Comment #20 from Tom Stellard tstellar@gmail.com --- Created attachment 87757 --> https://bugs.freedesktop.org/attachment.cgi?id=87757&action=edit Additional Bug fix - Apply to my master-testing branch
Peter Wu lekensteyn@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
--- Comment #21 from Peter Wu lekensteyn@gmail.com --- Works now with the latest mesa and llvm. If the performance is still bad, you are probably affected by a bad workgroup size. In that case, try Tom's pyrit-perf patches [1][2] on top of ac81b6f2be8779022e8641984b09118b57263128.
The second patch[2] is already upstreamed, but patch [1] does not apply on the current master because of other changes in the area.
[1]: http://people.freedesktop.org/~tstellar/pyrit-perf/0001-XXX-clover-Calculate...
[2]: http://people.freedesktop.org/~tstellar/pyrit-perf/0001-radeon-llvm-Specify-...