| Summary: | we missed some important fp ops | ||
|---|---|---|---|
| Product: | Libre-SOC's first SoC | Reporter: | Jacob Lifshay <programmerjake> |
| Component: | Specification | Assignee: | Jacob Lifshay <programmerjake> |
| Status: | RESOLVED FIXED | ||
| Severity: | major | CC: | libre-soc-isa, lkcl |
| Priority: | --- | ||
| Version: | unspecified | ||
| Hardware: | PC | ||
| OS: | Linux | ||
| URL: | https://libre-soc.org/openpower/transcendentals | ||
| See Also: | https://bugs.libre-soc.org/show_bug.cgi?id=127 https://bugs.libre-soc.org/show_bug.cgi?id=961 | ||
| NLnet milestone: | --- | total budget (EUR) for completion of task and all subtasks: | 0 |
| budget (EUR) for this task, excluding subtasks' budget: | 0 | parent task for budget allocation: | |
| child tasks for budget allocation: | The table of payments (in EUR) for this task; TOML format: | ||
| Bug Depends on: | |||
| Bug Blocks: | 1027, 899 | ||
|
Description
Jacob Lifshay
2022-09-08 08:50:21 BST
technically there's also x86's minss/maxss operations; minss implements the C function:
float f(float a, float b) {
    return a < b ? a : b;
}
specifically, if either input is a NaN, or if both inputs are equal, or if both inputs are zero of either sign, they always return b. They never convert a signalling NaN to a quiet NaN.
If we also add that, it would fill out the min/max/minmag/maxmag variants to 8, fitting nicely in a 3-bit mode field. Or, if we decide we don't want minmag/maxmag, it would fill out the variants to 4, fitting in a 2-bit mode field.
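For reference, a minimal C sketch of the x86-style minimum described above, checking the NaN and signed-zero cases. The names `min_x86_style` and `check_min_x86_style` are made up for illustration and are not taken from the spec; the sketch assumes the compiler keeps IEEE 754 comparison semantics (i.e. no `-ffast-math`).

```c
#include <assert.h>
#include <math.h>

/* x86-style minimum: returns a only when a compares strictly less than b;
 * in every other case (NaN in either input, equal inputs, zeros of either
 * sign) the second operand b is returned unchanged. */
static float min_x86_style(float a, float b) {
    return a < b ? a : b;
}

/* Hypothetical self-check for the cases listed in the comment above. */
void check_min_x86_style(void) {
    /* ordinary case */
    assert(min_x86_style(1.0f, 2.0f) == 1.0f);
    /* NaN as first input: the comparison is false, so b comes back */
    assert(min_x86_style(NAN, 1.0f) == 1.0f);
    /* NaN as second input: the comparison is false, so b (the NaN) comes back */
    assert(isnan(min_x86_style(1.0f, NAN)));
    /* +0 and -0 compare equal, so b is returned with its own sign */
    assert(signbit(min_x86_style(0.0f, -0.0f)));
    assert(!signbit(min_x86_style(-0.0f, 0.0f)));
    /* for contrast, C's fminf returns the non-NaN operand */
    assert(fminf(1.0f, NAN) == 1.0f);
}
```

Calling `check_min_x86_style()` from a test harness should pass silently; the asymmetry between the two NaN cases is the part that differs from `fminf`.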
(In reply to Jacob Lifshay from comment #1)
> technically there's also x86's minss/maxss operations; minss implements the C
> function:
> float f(float a, float b) {
>     return a < b ? a : b;
> }

as best i can tell that's fsel - p168 v3.0B 4.6.9

fsel FRT,FRA,FRC,FRB (Rc=0)
fsel. FRT,FRA,FRC,FRB (Rc=1)

if (FRA) >= 0.0 then FRT <- (FRC)
else FRT <- (FRB)

The floating-point operand in register FRA is compared to the value zero. If the operand is greater than or equal to zero, register FRT is set to the contents of register FRC. If the operand is less than zero or is a NaN, register FRT is set to the contents of register FRB. The comparison ignores the sign of zero (i.e., regards +0 as equal to -0).

(In reply to Luke Kenneth Casson Leighton from comment #2)
> (In reply to Jacob Lifshay from comment #1)
> > technically there's also x86's minss/maxss operations; minss implements the C
> > function:
> > float f(float a, float b) {
> >     return a < b ? a : b;
> > }
>
> as best i can tell that's fsel - p168 v3.0B 4.6.9

it isn't actually, x86 minss/maxss compare the inputs with each other, not against zero.

(In reply to Jacob Lifshay from comment #0)
> reading through the opencl list of ops, I realized we forgot to add some fp
> ops:
> fmax
> fmin
> fmod
> maxmag
> minmag
> remainder

i'd like to add these new ops as part of doing the initial implementation of fptrans, there's space.

what do you think?

(In reply to Jacob Lifshay from comment #3)
> (In reply to Luke Kenneth Casson Leighton from comment #2)
> > (In reply to Jacob Lifshay from comment #1)
> > > technically there's also x86's minss/maxss operations; minss implements the C
> > > function:
> > > float f(float a, float b) {
> > >     return a < b ? a : b;
> > > }
> >
> > as best i can tell that's fsel - p168 v3.0B 4.6.9
>
> it isn't actually, x86 minss/maxss compare the inputs with each other, not
> against zero.

ahh yes. sigh. ok.
let's take a look and see if the others are there.

(In reply to Jacob Lifshay from comment #4)
> (In reply to Jacob Lifshay from comment #0)
> > reading through the opencl list of ops, I realized we forgot to add some fp
> > ops:
> > fmax
> > fmin
> > fmod
> > maxmag
> > minmag
> > remainder
>
> i'd like to add these new ops as part of doing the initial implementation of
> fptrans, there's space.
>
> what do you think?

yes good idea, search for them first though, and in VSX as well, the section has to say "this is in VSX as {vxxxxxx} but not in scalar"

hang on... p181 v3.1 there's notes

if a >= b then x <- y    fsub fs,fa,fb
else x <- z              fsel fx,fs,fy,fz

so no, we can't add fmaxss or fminss, they'll get rejected because of the macro-fusion advice.

the only reason to add fminss/fmaxss/fmin/fmax would be because updating to IEEE754-2019. which is probably good enough.

----

https://stackoverflow.com/questions/30618991/simd-minmag-and-maxmag

minmag(a,b) = |a|<|b| ? a : b
maxmag(a,b) = |a|>|b| ? a : b

not seeing anything like this - good idea to add them.

----

fmod

https://codebrowser.dev/glibc/glibc/sysdeps/ieee754/flt-32/e_fmodf.c.html

blerk! that's awful! (i mean, the software). yep, good call.

----

remainder

https://stackoverflow.com/questions/26671975/why-do-we-need-ieee-754-remainder

blerk. i don't get it. but i can understand other people do :)

----

yep all good here.

I added all the ops to the spec, I'll leave opcode allocation to #899:
https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=3e363081ee0142d6948d5b6c66523d833d0a7711

I found x86-style min/max ops in VSX (xsmincdp), so I'll take that as sufficient justification to add scalar ops. I named them fminc/fmaxc to mirror the VSX ops and to avoid sticking x86 in the opcode mnemonics. |