| Summary: | add v3.0B BCD instructions to simulator | ||
|---|---|---|---|
| Product: | Libre-SOC's first SoC | Reporter: | Luke Kenneth Casson Leighton <lkcl> |
| Component: | Source Code | Assignee: | Dmitry Selyutin <ghostmansd> |
| Status: | RESOLVED FIXED | ||
| Severity: | enhancement | CC: | libre-soc-bugs, maciej.pijanowski, programmerjake |
| Priority: | --- | ||
| Version: | unspecified | ||
| Hardware: | PC | ||
| OS: | Linux | ||
| See Also: | https://bugs.libre-soc.org/show_bug.cgi?id=657 | ||
| NLnet milestone: | NLNet.2019.10.046.Standards | total budget (EUR) for completion of task and all subtasks: | 200 |
| budget (EUR) for this task, excluding subtasks' budget: | 200 | parent task for budget allocation: | 241 |
| child tasks for budget allocation: | The table of payments (in EUR) for this task; TOML format: |
ghostmansd={amount=100, paid=2021-11-01}
maciej_3mdeb=100
|
|
| Bug Depends on: | 657 | ||
| Bug Blocks: | |||
|
Description
Luke Kenneth Casson Leighton
2021-07-14 12:50:22 BST
dmitry the pseudocode for cdtbcd and cbcdtd uses a couple of helper functions
which need to be added before they can "work". addg6s on the other hand
you can do straight away without this.
i found in "v3.0B p787 Book I Appendix B.1 and B.2" the following
pseudocode:
B.1 BCD-to-DPD Translation
The translation from a 3-digit BCD number to a 10-bit
DPD can be performed through the following Boolean
operations.
p = (f & a & i & ¬e) | (j & a & ¬i) | (b & ¬a)
q = (g & a & i & ¬e) | (k & a & ¬i) | (c & ¬a)
r = d
s = (j & ¬a & e & ¬i) | (f & ¬i & ¬e) |
(f & ¬a & ¬e) | (e & i)
t = (k & ¬a & e & ¬i) | (g & ¬i & ¬e) |
(g & ¬a & ¬e) | (a & i)
u = h
v = a | e | i
w = (¬e & j & ¬i) | (e & i) | a
x = (¬a & k & ¬i) | (a & i) | e
y = m
B.2 DPD-to-BCD Translation
The translation from a 10-bit DPD to a 3-digit BCD
number can be performed through the following Bool-
ean operations.
a = (¬s & v & w) | (t & v & w & s) | (v & w & ¬x)
b = (p & s & x & ¬t) | (p & ¬w) | (p & ¬v)
c = (q & s & x & ¬t) | (q & ¬w) | (q & ¬v)
d = r
e = (v & ¬w & x) | (s & v & w & x) |
(¬t & v & x & w)
f = (p & t & v & w & x & ¬s) | (s & ¬x & v) |
(s & ¬v)
g = (q & t & w & v & x & ¬s) | (t & ¬x & v) |
(t & ¬v)
h = u
i = (t & v & w & x) | (s & v & w & x) |
(v & ¬w & ¬x)
j = (p & ¬s & ¬t & w & v) | (s & v & ¬w & x) |
(p & w & ¬x & v) | (w & ¬v)
k = (q & ¬s & ¬t & v & w) | (t & v & ¬w & x) |
(q & v & w & ¬x) | (x & ¬v)
m = y
after removing the silliness of *yet another* non-ASCII symbol for "¬"
being used, those will need to go into openpower/isafunctions/bcd.mdwn,
along with comments saying where they come from. see here for example
to cut/paste:
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=openpower/isafunctions/fpfromint.mdwn;hb=HEAD
then you can run "pyfnwriter" to generate the actual python code
then, in pywriter.py you will need to *add* those helper functions
*manually* (at the moment, really they should be done automatically)
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/decoder/pseudo/pywriter.py;hb=HEAD
at line 33, add
from openpower.decoder.isafunctions.bcd import BCD_TO_DPD, DPD_TO_BCD
then re-run pywriter noall bcd in order to get the decoder/isa/bcd.py
to be re-compiled with the import of the helper function added.
or, cheat, and manually edit decoder/isa/bcd.py :)
(In reply to Luke Kenneth Casson Leighton from comment #0) > to get the carry-out bit, the pseudocode for addg6s will need to be > rewritten into a "programmatic" way rather than a "here's the general > idea of how to get the 65th bit from a 64-bit add" > > do i = 0 to 15 > dc[i] <- carry_out(RA[4*i:63] + RB[4*i:63]) the above pseudocode is copied from the spec and I'm assuming it is correct. > will need to be changed to something like: > > temp <- (0b0 || RA) + (0b0 || RB) > dc[16] <- [0]*16 > do i = 0 to 15 > dc[i] <- temp[4*i] That rewrite is incorrect, since if you add 0x10 + 0x20, it sets dc[15] = 1 even though 0 + 0 has no carry out. corrected code (afaict): dc <- [0]*16 carry <- 0b0 do i = 15 to 0 sum <- (0b0 || RA[4*i:4*i+3]) + (0b0 || RB[4*i:4*i+3]) + (0b0000 || carry) carry <- sum[0] dc[i] <- carry (In reply to Jacob Lifshay from comment #2) > do i = 15 to 0 > sum <- (0b0 || RA[4*i:4*i+3]) + (0b0 || RB[4*i:4*i+3]) + (0b0000 || > carry) note 4*i: >>>63<<< not 4*i: 4*i+3 > > do i = 0 to 15 > > dc[i] <- carry_out(RA[4*i:63] + RB[4*i:63]) therefore, each of the selections of RA and RB parts must be expanded in length by 1 bit: temp <- (0b0 || RA[4*i:63]) + (0b0 || RB[4*i:63]) and because MSB0 order you take the bit numbered ZERO to get the MSB: dc[i] <- temp[0] (In reply to Luke Kenneth Casson Leighton from comment #3) > (In reply to Jacob Lifshay from comment #2) > > > do i = 15 to 0 > > sum <- (0b0 || RA[4*i:4*i+3]) + (0b0 || RB[4*i:4*i+3]) + (0b0000 || > > carry) > > note 4*i: >>>63<<< > not 4*i: 4*i+3 the code I wrote intentionally uses RA[4*i:4*i+3] since it's doing the add 4-bits at a time and manually propagating the carries rather than doing sequence of 60, 56, 52, 48, ... 8, and 4 bit adds and relying on the add to propagate carry implicitly. So, yes, 4*i+3 is correct here. It uses a different algorithm than in the openpower spec. (In reply to Jacob Lifshay from comment #4) > the code I wrote intentionally uses RA[4*i:4*i+3] since it's doing the add > 4-bits at a time and manually propagating the carries rather than doing > sequence of 60, 56, 52, 48, ... 8, and 4 bit adds and relying on the add to > propagate carry implicitly. So, yes, 4*i+3 is correct here. > > It uses a different algorithm than in the openpower spec. then given that we are using the pseudocode as a recommendation for changes to the OpenPOWER spec it is inadviseable to make unnecessary changes regardless of "efficiency". making such a change should go through the OPF ISA WG where we pick it up from there once authorised and approved. otherwise we have yet another discrepancy that requires yet more time and yet more resources. the reason why we had to make changes to mul, div and mod pseudocode is because they were plain unuseable. the existing pseudocode is useable. if you want to propose a change to the Authorised ISA Spec through the process soon to be established please feel free to do so, however i would very much prefer that the LibreSOC pseudocode not contain discrepancies that place a larger burden of time and energy on us when the existing spec pseudocode generates the right values. (In reply to Luke Kenneth Casson Leighton from comment #5) > (In reply to Jacob Lifshay from comment #4) > > It uses a different algorithm than in the openpower spec. > > then given that we are using the pseudocode as a recommendation for > changes to the OpenPOWER spec it is inadviseable to make unnecessary > changes regardless of "efficiency". well, my reason for changing algorithms was understandability due to not having different bitwidths for every iteration...that said, I did kinda forget about trying to keep the same pseudocode (In reply to Jacob Lifshay from comment #6) > well, my reason for changing algorithms was understandability due to not > having different bitwidths for every iteration...that said, I did kinda > forget about trying to keep the same pseudocode in the HDL it's a different matter, we're free there to do whatever we see fit. So, to summarize, does the following work for us?
dc[16] = [0]*16
do i = 0 to 15
temp <- (0b0 || RA[4*i:63]) + (0b0 || RB[4*i:63])
dc[i] <- temp[0]
c <- ([dc[0]]*4 || [dc[1]]*4 || [dc[2]]*4 || [dc[3]]*4 ||
[dc[4]]*4 || [dc[5]]*4 || [dc[6]]*4 || [dc[7]]*4 ||
[dc[8]]*4 || [dc[9]]*4 || [dc[10]]*4 || [dc[11]]*4 ||
[dc[12]]*4 || [dc[13]]*4 || [dc[14]]*4 || [dc[15]]*4)
RT <- (¬c) & 0x6666_6666_6666_6666
(In reply to dmitry.selyutin from comment #8) > So, to summarize, does the following work for us? > > dc[16] = [0]*16 > do i = 0 to 15 > temp <- (0b0 || RA[4*i:63]) + (0b0 || RB[4*i:63]) > dc[i] <- temp[0] > c <- ([dc[0]]*4 || [dc[1]]*4 || [dc[2]]*4 || [dc[3]]*4 || > [dc[4]]*4 || [dc[5]]*4 || [dc[6]]*4 || [dc[7]]*4 || > [dc[8]]*4 || [dc[9]]*4 || [dc[10]]*4 || [dc[11]]*4 || > [dc[12]]*4 || [dc[13]]*4 || [dc[14]]*4 || [dc[15]]*4) > RT <- (¬c) & 0x6666_6666_6666_6666 Yes, it should for now, however it will likely be a total pain later when we generate C code due to `temp` changing its type each iteration. The pseudocode I suggested doesn't have that issue. https://libre-soc.org/irclog/%23libre-soc.2021-07-28.log.html#t2021-07-28T19:18:02 dmitry: we have no idea how these instructions work. therefore, unit tests can be written which bounce off of qemu, by way of comparison. you can experiment with random values and/or fixed values, using this as a base: https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/simulator/test_sim.py;hb=HEAD that code will co-simulate the instruction against both qemu *and* the simulator ISACaller, extracting the full contents of the register files *and* memory, performing a byte-level comparison. if the results are different, you know one of two things: 1) qemu is f*****d. 2) ISACaller is borked. we have actually had (1) occur, it has actually produced the wrong answers in certain circumstances. if that happens, we will need to investigate further. diff --git a/openpower/isa/bcd.mdwn b/openpower/isa/bcd.mdwn
index 30bb161..04753ed 100644
--- a/openpower/isa/bcd.mdwn
+++ b/openpower/isa/bcd.mdwn
@@ -33,13 +33,15 @@ XO-Form
Pseudo-code:
+ dc[16] = [0]*16
do i = 0 to 15
this is a comparison rather than an assignment (can't count the number
of times i've made that error), and it should be dc <- [0]*16.
"dc[16] <- [0]*16" will be trying to assign a 16-bit value into an
implicit 17th bit of dc, which is only 1 bit wide
SelectableInt is quite specific about bit-width matching its inputs
and outputs, so would throw a runtime exception.
Hi Luke, thanks for the fix! It'll probably take me a while until I get used to the notation. I've pushed the fix; do we have some syntax-related test? Or `=` instead of `<-` is considered to be acceptable in this context? Ah, I misread part of your comment.
> "dc[16] <- [0]*16" will be trying to assign a 16-bit value into an implicit 17th bit of dc, which is only 1 bit wide.
Now it's clear. Thank you again!
no problem. it's quite minimalist and spartan: if this was gcc there would be massive amounts of error checking, nice error messages etc. we don't have a huge userbase, nor the resources for that. you have to adjust accordingly, take care, make minimal changes and carefully inspect the output and rely on unit tests to ensure the pseudocode is correct. (In reply to dmitry.selyutin from comment #13) > Ah, I misread part of your comment. > > > "dc[16] <- [0]*16" will be trying to assign a 16-bit value into an implicit 17th bit of dc, which is only 1 bit wide. > > Now it's clear. Thank you again! the assignment operator is now being correctly used, however this is still trying to assign 16 bits of zeros into one single bit (the 17th) of dc. * one bit wide, 17th bit: dc[16] * 16 bits wide (zeros) : [0]*16 * 16 bits trying to be assigned into the 17th bit: dc[16] <- [0]*16 what you want is just dc <- [0] * 16 and the compiler is smart enough (note, only for very simple assignments like this) to note the bit-width of the assignment and automatically create a variable, dc, of width 16 bit. you would have found this when writing the unit test which executed this code. diff --git a/openpower/isa/bcd.mdwn b/openpower/isa/bcd.mdwn index 04753ed..63c42b2 100644 --- a/openpower/isa/bcd.mdwn +++ b/openpower/isa/bcd.mdwn @@ -33,7 +33,7 @@ XO-Form Pseudo-code: - dc[16] = [0]*16 + dc[16] <- [0]*16 do i = 0 to 15 temp <- (0b0 || RA[4*i:63]) + (0b0 || RB[4*i:63]) dc[i] <- temp[0] Spec Appendix B p 787 states:
which supports the representation of decimal integers
of arbitrary length. Translation operates on three
Binary Coded Decimal (BCD) digits at a time com-
pressing the 12 bits into 10 bits with an algorithm that
can be applied or reversed using simple Boolean oper-
ations. In the following examples, a 3-digit BCD num-
ber is represented as (abcd)(efgh)(ijkm), a 10-bit DPD
number is represented as (pqr)(stu)(v)(wxy), and the
Boolean operations, & (AND), | (OR), and ¬ (NOT) are
used.
therefore, in the "Translation" boolean stuff, let us assume that
the input (val) is 12 bits:
def BCD_TO_DPD(val):
# first digit, let us assume (guess) that the digits are in
# left-is-Most-significant, right-is-least-significant order
# i.e it is:
# ....represented as (abcd)(efgh)(ijkm)
# Top Middle Least significant digit
# therefore:
# top BCD digit
a = val[0] # MSB0 order
b = val[1] ...
c = val[2]
d = val[3]
...
# least-significant BCD digit
i = val[8]
j = val[9]
k = val[10]
l = val[11]
and that likewise p q r .... are similarly ordered such that
* p is MSB0-ordered bit zero (0)
* q is MSB0-ordered bit one
* ...
* y is MSB0-ordered bit nine (9)
that should be sufficient to complete the function *WHICH MUST BE IN
PSEUDOCODE NOT CONVERTED TO PYTHON*.
repeat: the PSEUDOCODE from section B.1 shall be placed into a
markdown file.
NOT, repeat, NOT: "take the pseudocode and waste time converting it
by hand to a python function".
added SINGLE function to double2single.mdwn: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=55163c2d1f5487e4d5213b256652527843ce002b add it into pywriter.py so that the compiler uses it: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=2fae1ac20baefa05c1f1aac00fa7f0e88d68242c remove the function which should never have been converted by hand in the first place from pseudocode manually to python: https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=4c83f3216f802de3c6eb8dd2729107ba0a499dea Pushed the changes into the master. Hopefully we can move to 657. Luke, should I mark it as resolved, or this should be done by you? (In reply to dmitry.selyutin from comment #19) > Pushed the changes into the master. Hopefully we can move to 657. Luke, > should I mark it as resolved, or this should be done by you? once resolved we "lose" it in searches, and it makes tracking payments very difficult. to help deal with that, we create user pages, see http://libre-soc.org/lkcl for an example (to template), and that helps when creating the RFPs (Requests for Payment) when the unit tests pass, and it's cross-ref'd on the wiki, this one can be marked "resolved". (In reply to Luke Kenneth Casson Leighton from comment #20) > (In reply to dmitry.selyutin from comment #19) > > Pushed the changes into the master. Hopefully we can move to 657. Luke, > > should I mark it as resolved, or this should be done by you? > > once resolved we "lose" it in searches, and it makes tracking payments > very difficult. to help deal with that, we create user pages, > see http://libre-soc.org/lkcl for an example (to template), > and that helps when creating the RFPs (Requests for Payment) Actually, for the payment stuff, marking it as resolved with dmitry as the assignee and some money assigned to this bug will make it show up on the markdown generated by budget-sync (in /task_db/mdwn/ on the website) next time someone runs and uploads budget-sync's output. Example generated markdown (exact bugs out-of-date): ## Completed but not yet added to payees list * [Bug #145](https://bugs.libre-soc.org/show_bug.cgi?id=145): reference FP emulation using algebraic numbers explanation of why addg6s is useful: https://libre-soc.org/irclog/%23libre-soc.2021-08-01.log.html#t2021-08-01T10:10:13 > addg6s is used for doing a BCD addition, like x86's daa instruction: > https://www.felixcloutier.com/x86/daa > afaict to do a bcd add of a and b, you do `add r, a, b` then > `addg6s t, a, b` then `add r, r, t` (In reply to Jacob Lifshay from comment #21) > Actually, for the payment stuff, marking it as resolved with dmitry as the > assignee and some money assigned to this bug will make it show up on the > markdown generated by budget-sync (in /task_db/mdwn/ on the website) next > time someone runs and uploads budget-sync's output. this forces me, as the site administrator, to spend my time being the one to run that script and run the upload. this is not a reasonable expectation. additionally, the user page can be used by each team member to contain work-in-progress notes, where the sync program is completely devoid of any such information, containing (sterile) information solely and exclusively about payments. (In reply to Luke Kenneth Casson Leighton from comment #23) > (In reply to Jacob Lifshay from comment #21) > > > Actually, for the payment stuff, marking it as resolved with dmitry as the > > assignee and some money assigned to this bug will make it show up on the > > markdown generated by budget-sync (in /task_db/mdwn/ on the website) next > > time someone runs and uploads budget-sync's output. > > this forces me, as the site administrator, to spend my time being the one > to run that script and run the upload. > > this is not a reasonable expectation. True, hence why I think the script should be automatically run, say in a daily/hourly cron job. Normally this would be taken care of by CI (GitLab CI can be set to run on a time-based schedule), but we're still waiting on getting mailing lists set up: see bug #190 Is this task considered finished? According to Dmitry it is (correct me if I'm wrong please). Luke, what is the procedure then to close the task? (In reply to Maciej Pijanowski from comment #25) > Is this task considered finished? > According to Dmitry it is (correct me if I'm wrong please). > > Luke, what is the procedure then to close the task? see comment #20, i have a mild preference for it to be closed when the unit tests confirm that it's correct. |