https://bugs.freedesktop.org/show_bug.cgi?id=27901
Summary: GLSL cos/sin functions broken on Mesa R600 driver Product: Mesa Version: git Platform: All OS/Version: All Status: NEW Severity: normal Priority: medium Component: Drivers/DRI/R600 AssignedTo: dri-devel@lists.freedesktop.org ReportedBy: alain.perrot@gmail.com
As I was playing with OpenGL examples from Joe Groff's blog at http://duriansoftware.com/joe/An-intro-to-modern-OpenGL.-Table-of-Contents.h..., I noticed that the cos and sin GLSL functions are broken on the Mesa R600 driver. Other trigonometric functions may be broken as well.
The blog entries provide two versions of a sample program that blend two images together using GLSL. The blend factor is computed from a timestamp using the sin function.
There's no problem with the example from chapter 2 where the blend factor is computed on the CPU (source code at http://github.com/jckarter/hello-gl).
Mesa R600 driver fails to properly run the example from chapter 3 where the blend factor is computed on the GPU using the GLSL sin function. Mesa software renderer has no problem running this example (source code at http://github.com/jckarter/hello-gl-ch3).
Looking at GLSL 1.10 spec, R600 ISA doc and R600 driver code (function assemble_TRIG in r700_assembler.c), I guess there is two problems :
1. both the GLSL cos/sin functions and R600 hardware expects the angle to be specified in radians, but the assemble_TRIG function divides the specified angle by 2 * PI (the result is no more radians).
2. R600 hardware expects the angle to be in the range [-PI, PI] while GLSL does not specify a range, and the assemble_TRIG function does not clip the angle in the required range.
A quick and dirty hack to make the example from chapter 3 work with R600 driver is to clip the timestamp to the [-PI, PI] range and then to multiply it by 2 * PI :
--- hello-gl.c.old 2010-04-29 20:55:55.000000000 +0200 +++ hello-gl.c 2010-04-29 20:56:12.000000000 +0200 @@ -202,7 +202,10 @@ static void update_timer(void) { int milliseconds = glutGet(GLUT_ELAPSED_TIME); - g_resources.timer = (float)milliseconds * 0.001f; + g_resources.timer = fmodf((float)milliseconds * 0.001f, 2.0f * 3.1415926535897f); + if(g_resources.timer > 3.1415926535897f) + g_resources.timer -= 2.0f * 3.1415926535897f; + g_resources.timer *= 2.0f * 3.1415926535897f; glutPostRedisplay(); }
My config: RV670 (Radeon HD 3870) Kubuntu 10.04 "Lucid Lynx" 64-bit Linux kernel 2.6.34-rc5 from Ubuntu Mainline Kernel PPA libdrm, radeon, Mesa, Xorg from Xorg-edgers PPA
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #1 from Conn Clark conn.o.clark@gmail.com 2010-05-17 15:10:30 PDT --- A proper fix would be to fix the r700_assembler.c code so that it generates code to translate the radians input to a corresponding value between PI and -PI before passing it to the SIN or COS instructions.
I *think* that this could be achieved through the following r600 instructions
inst count unit opcode DEST SRC0, SRC1, SRC3 0 x: MUL tmp.x, ang, 1/(2.0 * pi) 1 x: FRACT tmp.x, tmp.x 2 x: ADD tmp.x, tmp.x, -0.5F y: MULADD tmp.y, tmp.x, (2.0 * pi), -(2.0 * pi) z: MUL tmp.z, tmp.x, (2.0 * pi) 3 x: CMOVEGT tmp.x, tmp.x, tmp.y, tmp.z 4 t: SIN or COS dest, tmp.x
Unfortunately I can't quite figure out how to do it in the code yet to try and see.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #2 from Conn Clark conn.o.clark@gmail.com 2010-05-18 19:25:02 PDT --- Created an attachment (id=35743) View: https://bugs.freedesktop.org/attachment.cgi?id=35743 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35743
r700_assembler.c Trig function patch
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #3 from Conn Clark conn.o.clark@gmail.com 2010-05-18 19:33:26 PDT --- Alain,
Okay, The patch I just posted might fix this bug. It doesn't cause any additional errors in piglit either. I think its working right with http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it works for you.
Note: it could probably be done so its faster but I'm still not to sure on how everything works yet in the software
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #4 from Alain Perrot alain.perrot@gmail.com 2010-05-19 15:58:12 PDT --- (In reply to comment #3)
Alain,
Okay, The patch I just posted might fix this bug. It doesn't cause any additional errors in piglit either. I think its working right with http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it works for you.
Note: it could probably be done so its faster but I'm still not to sure on how everything works yet in the software
Conn
Thanks for your help.
Unfortunately, your patch does not fix the cos/sin functions (at least on my Radeon HD 3870 / RV670). The hello-gl-ch3 example works better but still not as expected, you can compare with Mesa software rendering.
I tried to play with your patch to make it work without success for now. A note is that there may be a mistake on the last operand to the CNDGT instruction :
+ setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE); + pAsm->S[1].src.rtype = DST_REG_TEMPORARY; + pAsm->S[1].src.reg = tmp2; + setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
should probably be :
+ setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE); + pAsm->S[2].src.rtype = DST_REG_TEMPORARY; + pAsm->S[2].src.reg = tmp2; + setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
But this update does not make the cos/sin functions work.
I will try again tomorrow.
On Wed, May 19, 2010 at 3:58 PM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #4 from Alain Perrot alain.perrot@gmail.com 2010-05-19 15:58:12 PDT --- (In reply to comment #3)
Alain,
Okay, The patch I just posted might fix this bug. It doesn't cause any additional errors in piglit either. I think its working right with http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it works for you.
Note: it could probably be done so its faster but I'm still not to sure on how everything works yet in the software
Conn
Thanks for your help.
Unfortunately, your patch does not fix the cos/sin functions (at least on my Radeon HD 3870 / RV670). The hello-gl-ch3 example works better but still not as expected, you can compare with Mesa software rendering.
I tried to play with your patch to make it work without success for now. A note is that there may be a mistake on the last operand to the CNDGT instruction :
- setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
- pAsm->S[1].src.rtype = DST_REG_TEMPORARY;
- pAsm->S[1].src.reg = tmp2;
- setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
should probably be :
- setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
- pAsm->S[2].src.rtype = DST_REG_TEMPORARY;
- pAsm->S[2].src.reg = tmp2;
- setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
But this update does not make the cos/sin functions work.
I will try again tomorrow.
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Okay I'll look at it some more tonight. At least I know I'm on the right track. Thanks for testing.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #5 from Conn Clark conn.o.clark@gmail.com 2010-05-19 16:46:22 PDT --- On Wed, May 19, 2010 at 3:58 PM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #4 from Alain Perrot alain.perrot@gmail.com 2010-05-19 15:58:12 PDT --- (In reply to comment #3)
Alain,
Okay, The patch I just posted might fix this bug. It doesn't cause any additional errors in piglit either. I think its working right with http://github.com/jckarter/hello-gl-ch3 too. Of course I have only tested it on my RadeonHD 3100 and its my first attempt at r600 assembly so let me know if it works for you.
Note: it could probably be done so its faster but I'm still not to sure on how everything works yet in the software
Conn
Thanks for your help.
Unfortunately, your patch does not fix the cos/sin functions (at least on my Radeon HD 3870 / RV670). The hello-gl-ch3 example works better but still not as expected, you can compare with Mesa software rendering.
I tried to play with your patch to make it work without success for now. A note is that there may be a mistake on the last operand to the CNDGT instruction :
- setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
- pAsm->S[1].src.rtype = DST_REG_TEMPORARY;
- pAsm->S[1].src.reg = tmp2;
- setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
should probably be :
- setaddrmode_PVSSRC(&(pAsm->S[2].src), ADDR_ABSOLUTE);
- pAsm->S[2].src.rtype = DST_REG_TEMPORARY;
- pAsm->S[2].src.reg = tmp2;
- setswizzle_PVSSRC(&(pAsm->S[2].src), SQ_SEL_X);
But this update does not make the cos/sin functions work.
I will try again tomorrow.
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Okay I'll look at it some more tonight. At least I know I'm on the right track. Thanks for testing.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Conn Clark conn.o.clark@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #35743|0 |1 is obsolete| |
--- Comment #6 from Conn Clark conn.o.clark@gmail.com 2010-05-20 12:39:38 PDT --- Created an attachment (id=35776) --> (https://bugs.freedesktop.org/attachment.cgi?id=35776) r700_assembler.c Trig function & sincos patch
This patch fixes the assemble_TRIG function and eliminates two errors in piglit when running the glean/glsl1 test case.
I took the liberty of applying the same change to the assemble_SCS function as well since it to would suffer from the same problem.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #7 from Conn Clark conn.o.clark@gmail.com 2010-05-20 12:46:03 PDT --- Alain,
The patch I just posted fixes the problem on my machine completely. The problem with the previous version of the patch (other than the source mistake you found ;) )has something to do me coding in the 0.5 special constant. Replacing it with a literal 0.5 constant works.
I also extended the definition of PI 3 places to avoid drift of the values when it gets an input significantly greater or less than PI and -PI.
Let me know how it works for you.
Thanks.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Joshua Roys roysjosh@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #35776|application/octet-stream |text/plain mime type| | Attachment #35776|0 |1 is patch| |
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #8 from Alain Perrot alain.perrot@gmail.com 2010-05-20 17:40:21 PDT --- Created an attachment (id=35777) View: https://bugs.freedesktop.org/attachment.cgi?id=35777 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35777
Alternative assemble_TRIG fix
I can confirm that your patch seems to work for me too.
By the way, you beat me at posting a working patch here :-)
I also figured out that the 0.5 special constant was an issue in your patch, and I managed to get a working assemble_TRIG function which implements the following instruction sequence (lightly different of yours) to normalize the angle:
MULADD tmp.x, angle, 1/(2*PI), 0.5 FRACT tmp.x, tmp.x ADD tmp.y, tmp.x, 1 CNDGE tmp.x, tmp.x, tmp.x, tmp.y MULADD tmp.x, tmp.x, 2*PI, -PI
I don't known if it is better or worse than yours beside the fact that it use only one helper variable.
I attached my patch (updated to use the same extended value of PI than yours) which fix the assemble_TRIG function, but not the assemble_SCS one.
On Thu, May 20, 2010 at 5:40 PM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #8 from Alain Perrot alain.perrot@gmail.com 2010-05-20 17:40:21 PDT --- Created an attachment (id=35777) View: https://bugs.freedesktop.org/attachment.cgi?id=35777 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35777
Alternative assemble_TRIG fix
I can confirm that your patch seems to work for me too.
By the way, you beat me at posting a working patch here :-)
I also figured out that the 0.5 special constant was an issue in your patch, and I managed to get a working assemble_TRIG function which implements the following instruction sequence (lightly different of yours) to normalize the angle:
MULADD tmp.x, angle, 1/(2*PI), 0.5 FRACT tmp.x, tmp.x ADD tmp.y, tmp.x, 1 CNDGE tmp.x, tmp.x, tmp.x, tmp.y MULADD tmp.x, tmp.x, 2*PI, -PI
I don't known if it is better or worse than yours beside the fact that it use only one helper variable.
I attached my patch (updated to use the same extended value of PI than yours) which fix the assemble_TRIG function, but not the assemble_SCS one.
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Its a tough call on who's is the better solution. Yours uses one less temp reg and mine will allow for a couple of operations to be done in parallel in the future. I guess we both deserve a pat on the back and leave it to someone more experienced to make the call on which one to choose.
Good job
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #9 from Conn Clark conn.o.clark@gmail.com 2010-05-20 18:08:28 PDT --- On Thu, May 20, 2010 at 5:40 PM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #8 from Alain Perrot alain.perrot@gmail.com 2010-05-20 17:40:21 PDT --- Created an attachment (id=35777)
View: https://bugs.freedesktop.org/attachment.cgi?id=35777 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35777
View: https://bugs.freedesktop.org/attachment.cgi?id=35777 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35777
Alternative assemble_TRIG fix
I can confirm that your patch seems to work for me too.
By the way, you beat me at posting a working patch here :-)
I also figured out that the 0.5 special constant was an issue in your patch, and I managed to get a working assemble_TRIG function which implements the following instruction sequence (lightly different of yours) to normalize the angle:
MULADD tmp.x, angle, 1/(2*PI), 0.5 FRACT tmp.x, tmp.x ADD tmp.y, tmp.x, 1 CNDGE tmp.x, tmp.x, tmp.x, tmp.y MULADD tmp.x, tmp.x, 2*PI, -PI
I don't known if it is better or worse than yours beside the fact that it use only one helper variable.
I attached my patch (updated to use the same extended value of PI than yours) which fix the assemble_TRIG function, but not the assemble_SCS one.
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Its a tough call on who's is the better solution. Yours uses one less temp reg and mine will allow for a couple of operations to be done in parallel in the future. I guess we both deserve a pat on the back and leave it to someone more experienced to make the call on which one to choose.
Good job
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #10 from Alain Perrot alain.perrot@gmail.com 2010-05-21 00:48:34 PDT ---
Alain,
Its a tough call on who's is the better solution. Yours uses one less temp reg and mine will allow for a couple of operations to be done in parallel in the future. I guess we both deserve a pat on the back and leave it to someone more experienced to make the call on which one to choose.
Good job
Conn
You're probably right (and I known nothing about parallelization on GPUs).
In any way, many thanks for your help.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #11 from Conn Clark conn.o.clark@gmail.com 2010-05-21 09:32:33 PDT --- On Fri, May 21, 2010 at 12:48 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #10 from Alain Perrot alain.perrot@gmail.com 2010-05-21 00:48:34 PDT ---
Alain,
Its a tough call on who's is the better solution. Yours uses one less temp reg and mine will allow for a couple of operations to be done in parallel in the future. I guess we both deserve a pat on the back and leave it to someone more experienced to make the call on which one to choose.
Good job
Conn
You're probably right (and I known nothing about parallelization on GPUs).
In any way, many thanks for your help.
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
For the current code I think your patch a tad better. It uses one less instruction. Short of benchmarking the two solutions I think yours is the one that should go in. Would you please submit a patch that includes the assemble_SCS function as well. After you submit it with that change and a notice that you have singed off on it. I'll nominate it over my own to go in.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Alain Perrot alain.perrot@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #35777|0 |1 is obsolete| |
--- Comment #12 from Alain Perrot alain.perrot@gmail.com 2010-05-21 11:13:05 PDT --- Created an attachment (id=35787) View: https://bugs.freedesktop.org/attachment.cgi?id=35787 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35787
Alternative assemble_TRIG and assemble_SCS fix
Conn,
Attached is the updated patch which includes the assemble_SCS function. If it is ok for you, I will submit it (I guess that it should be sent to the dri-devel mailing list ?)
On Fri, May 21, 2010 at 11:13 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Alain Perrot alain.perrot@gmail.com changed:
What |Removed |Added
Attachment #35777|0 |1 is obsolete| |
--- Comment #12 from Alain Perrot alain.perrot@gmail.com 2010-05-21 11:13:05 PDT --- Created an attachment (id=35787) View: https://bugs.freedesktop.org/attachment.cgi?id=35787 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35787
Alternative assemble_TRIG and assemble_SCS fix
Conn,
Attached is the updated patch which includes the assemble_SCS function. If it is ok for you, I will submit it (I guess that it should be sent to the dri-devel mailing list ?)
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Its on the mailing list.
I'll inform them to merge it after I run piglit and verify it works.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #13 from Conn Clark conn.o.clark@gmail.com 2010-05-21 11:56:36 PDT --- On Fri, May 21, 2010 at 11:13 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Alain Perrot alain.perrot@gmail.com changed:
What |Removed |Added
Attachment #35777|0 |1 is obsolete| |
--- Comment #12 from Alain Perrot alain.perrot@gmail.com 2010-05-21 11:13:05 PDT --- Created an attachment (id=35787)
View: https://bugs.freedesktop.org/attachment.cgi?id=35787 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35787
View: https://bugs.freedesktop.org/attachment.cgi?id=35787 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=35787
Alternative assemble_TRIG and assemble_SCS fix
Conn,
Attached is the updated patch which includes the assemble_SCS function. If it is ok for you, I will submit it (I guess that it should be sent to the dri-devel mailing list ?)
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Alain,
Its on the mailing list.
I'll inform them to merge it after I run piglit and verify it works.
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Conn Clark conn.o.clark@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED
--- Comment #14 from Conn Clark conn.o.clark@gmail.com 2010-05-21 13:45:19 PDT --- Alain,
Your patch passes piglit.
I'm going to change the status of this bug now to FIXED
Conn
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #15 from Alain Perrot alain.perrot@gmail.com 2010-05-21 14:04:24 PDT --- (In reply to comment #14)
Alain,
Your patch passes piglit.
I'm going to change the status of this bug now to FIXED
Conn
Ok, thanks.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Conn Clark conn.o.clark@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED Resolution|FIXED |
--- Comment #16 from Conn Clark conn.o.clark@gmail.com 2010-06-07 09:17:41 PDT --- Reopened because patch has not yet been applied
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Alex Deucher agd5f@yahoo.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |amaasikas@gmail.com
--- Comment #17 from Alex Deucher agd5f@yahoo.com 2010-06-07 09:30:18 PDT --- Andre wanted to review this before applying.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #18 from Andre Maasikas amaasikas@gmail.com 2010-06-08 05:53:36 PDT --- dont' have much net this week to review/test:( but i'm ok with it if you make last mul conditional on r700 as it has -1..1 range it seems, also amd shader analyzer gives this difference:
RV610 hd2400
; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(8) 0 y: MULADD R123.y, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 z: MOV R0.z, 0.0f w: MOV R0.w, 1.0f 1 x: FRACT ____, PV0.y 2 z: MULADD R123.z, PV1.x, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 t: SIN R0.x, PV2.z 01 EXP_DONE: PIX0, R0.xzzw END_OF_PROGRAM
4870 RV770 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) 0 y: MOV R0.y, 0.0f z: MOV R0.z, 1.0f w: MULADD R123.w, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 1 y: FRACT ____, PV0.w 2 x: MULADD R123.x, PV1.y, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 z: MUL ____, PV2.x, (0x3E22F983, 0.1591549367f).x 4 t: SIN R0.x, PV3.z 01 EXP_DONE: PIX0, R0.xyyz END_OF_PROGRAM
On Tue, Jun 8, 2010 at 5:53 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #18 from Andre Maasikas amaasikas@gmail.com 2010-06-08 05:53:36 PDT --- dont' have much net this week to review/test:( but i'm ok with it if you make last mul conditional on r700 as it has -1..1 range it seems, also amd shader analyzer gives this difference:
RV610 hd2400
; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(8) 0 y: MULADD R123.y, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 z: MOV R0.z, 0.0f w: MOV R0.w, 1.0f 1 x: FRACT ____, PV0.y 2 z: MULADD R123.z, PV1.x, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 t: SIN R0.x, PV2.z 01 EXP_DONE: PIX0, R0.xzzw END_OF_PROGRAM
4870 RV770 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) 0 y: MOV R0.y, 0.0f z: MOV R0.z, 1.0f w: MULADD R123.w, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 1 y: FRACT ____, PV0.w 2 x: MULADD R123.x, PV1.y, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 z: MUL ____, PV2.x, (0x3E22F983, 0.1591549367f).x 4 t: SIN R0.x, PV3.z 01 EXP_DONE: PIX0, R0.xyyz END_OF_PROGRAM
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
This is very very strange that amd would change the instruction. I wonder if it is a bug in their code.... Perhaps we need someone with an r700 to run the sin and cos tests in piglit . The proposed patch passes on my rs780 (rv610) .
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #19 from Conn Clark conn.o.clark@gmail.com 2010-06-08 09:09:49 PDT --- On Tue, Jun 8, 2010 at 5:53 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #18 from Andre Maasikas amaasikas@gmail.com 2010-06-08 05:53:36 PDT --- dont' have much net this week to review/test:( but i'm ok with it if you make last mul conditional on r700 as it has -1..1 range it seems, also amd shader analyzer gives this difference:
RV610 hd2400
; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(8) 0 y: MULADD R123.y, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 z: MOV R0.z, 0.0f w: MOV R0.w, 1.0f 1 x: FRACT ____, PV0.y 2 z: MULADD R123.z, PV1.x, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 t: SIN R0.x, PV2.z 01 EXP_DONE: PIX0, R0.xzzw END_OF_PROGRAM
4870 RV770 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) 0 y: MOV R0.y, 0.0f z: MOV R0.z, 1.0f w: MULADD R123.w, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 1 y: FRACT ____, PV0.w 2 x: MULADD R123.x, PV1.y, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 z: MUL ____, PV2.x, (0x3E22F983, 0.1591549367f).x 4 t: SIN R0.x, PV3.z 01 EXP_DONE: PIX0, R0.xyyz END_OF_PROGRAM
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
This is very very strange that amd would change the instruction. I wonder if it is a bug in their code.... Perhaps we need someone with an r700 to run the sin and cos tests in piglit . The proposed patch passes on my rs780 (rv610) .
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #20 from Conn Clark conn.o.clark@gmail.com 2010-06-08 09:26:01 PDT --- On Tue, Jun 8, 2010 at 9:09 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #19 from Conn Clark conn.o.clark@gmail.com 2010-06-08 09:09:49 PDT --- On Tue, Jun 8, 2010 at 5:53 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #18 from Andre Maasikas amaasikas@gmail.com 2010-06-08 05:53:36 PDT --- dont' have much net this week to review/test:( but i'm ok with it if you make last mul conditional on r700 as it has -1..1 range it seems, also amd shader analyzer gives this difference:
RV610 hd2400
; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(8) 0 y: MULADD R123.y, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 z: MOV R0.z, 0.0f w: MOV R0.w, 1.0f 1 x: FRACT ____, PV0.y 2 z: MULADD R123.z, PV1.x, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 t: SIN R0.x, PV2.z 01 EXP_DONE: PIX0, R0.xzzw END_OF_PROGRAM
4870 RV770 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) 0 y: MOV R0.y, 0.0f z: MOV R0.z, 1.0f w: MULADD R123.w, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 1 y: FRACT ____, PV0.w 2 x: MULADD R123.x, PV1.y, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 z: MUL ____, PV2.x, (0x3E22F983, 0.1591549367f).x 4 t: SIN R0.x, PV3.z 01 EXP_DONE: PIX0, R0.xyyz END_OF_PROGRAM
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
This is very very strange that amd would change the instruction. I wonder if it is a bug in their code.... Perhaps we need someone with an r700 to run the sin and cos tests in piglit . The proposed patch passes on my rs780 (rv610) .
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Nope Andre, is right.
the evergreen docs describe the sine function as follows
Scalar sine. Input must be normalized from radians by dividing by 2*PI. The valid input domain is [-256, +256], which corresponds to an un-normalized input domain [-512*PI, +512*PI]. Out-of-range input results in float 0.
while the r600 describes the sine function as follows
Scalar sine. Valid input domain [-PI, +PI].
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #21 from Conn Clark conn.o.clark@gmail.com 2010-06-08 09:36:57 PDT --- On Tue, Jun 8, 2010 at 9:25 AM, Conn Clark conn.o.clark@gmail.com wrote:
On Tue, Jun 8, 2010 at 9:09 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #19 from Conn Clark conn.o.clark@gmail.com 2010-06-08 09:09:49 PDT --- On Tue, Jun 8, 2010 at 5:53 AM, bugzilla-daemon@freedesktop.org wrote:
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #18 from Andre Maasikas amaasikas@gmail.com 2010-06-08 05:53:36 PDT --- dont' have much net this week to review/test:( but i'm ok with it if you make last mul conditional on r700 as it has -1..1 range it seems, also amd shader analyzer gives this difference:
RV610 hd2400
; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(8) 0 y: MULADD R123.y, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 z: MOV R0.z, 0.0f w: MOV R0.w, 1.0f 1 x: FRACT ____, PV0.y 2 z: MULADD R123.z, PV1.x, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 t: SIN R0.x, PV2.z 01 EXP_DONE: PIX0, R0.xzzw END_OF_PROGRAM
4870 RV770 ; -------- Disassembly -------------------- 00 ALU: ADDR(32) CNT(10) 0 y: MOV R0.y, 0.0f z: MOV R0.z, 1.0f w: MULADD R123.w, R0.x, (0x3E22F983, 0.1591549367f).x, 0.5 1 y: FRACT ____, PV0.w 2 x: MULADD R123.x, PV1.y, (0x40C90FDB, 6.283185482f).y, -(0x40490FDB, 3.141592741f).x 3 z: MUL ____, PV2.x, (0x3E22F983, 0.1591549367f).x 4 t: SIN R0.x, PV3.z 01 EXP_DONE: PIX0, R0.xyyz END_OF_PROGRAM
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
This is very very strange that amd would change the instruction. I wonder if it is a bug in their code.... Perhaps we need someone with an r700 to run the sin and cos tests in piglit . The proposed patch passes on my rs780 (rv610) .
-- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Nope Andre, is right.
the evergreen docs describe the sine function as follows
Scalar sine. Input must be normalized from radians by dividing by 2*PI. The valid input domain is [-256, +256], which corresponds to an un-normalized input domain [-512*PI, +512*PI]. Out-of-range input results in float 0.
while the r600 describes the sine function as follows
Scalar sine. Valid input domain [-PI, +PI].
--
Conn O. Clark
Observation: In formal computer science advances are made by standing on the shoulders of giants. Linux has proved that if there are enough of you, you can advance just as far by stepping on each others toes.
I guess I'll modify Alain's patch tonight. If r700 has a valid range -256 to 256 we can get away with just using a FRACT instruction with 2*PI.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #22 from Andre Maasikas amaasikas@gmail.com 2010-07-29 07:58:26 PDT --- Created an attachment (id=37437) View: https://bugs.freedesktop.org/attachment.cgi?id=37437 Review: https://bugs.freedesktop.org/review?bug=27901&attachment=37437
patch to try
If you could try the attached patch on r600, it passes fine for me on r700.
https://bugs.freedesktop.org/show_bug.cgi?id=27901
--- Comment #23 from Alain Perrot alain.perrot@gmail.com 2010-07-29 09:12:45 PDT --- I have just tested your patch with my RV670 GPU, and the hello-gl-ch3 example program works fine here.
The piglit glean/glsl1-cos(vec4) and glean/glsl1-sin(vec4) tests are also fixed with this patch.
Thanks!
https://bugs.freedesktop.org/show_bug.cgi?id=27901
Alain Perrot alain.perrot@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|REOPENED |RESOLVED Resolution| |FIXED
--- Comment #24 from Alain Perrot alain.perrot@gmail.com 2010-08-03 10:05:39 PDT --- I've seen that Andre's patch has been committed to Mesa repository (commit d6a5f94ea4d03b05c434fcad125d1f9c50c638e8), so I'm closing this bug as fixed.
Thanks again!
dri-devel@lists.freedesktop.org