Bug 78

Summary: IEEE754 FP "div" needed
Product: Libre-SOC's first SoC
Reporter: Luke Kenneth Casson Leighton <lkcl>
Component: ALU (including IEEE754 16/32/64-bit FPU)
Assignee: Luke Kenneth Casson Leighton <lkcl>
Status: PAYMENTPENDING FIXED    
Severity: enhancement
CC: libre-soc-bugs, programmerjake
Priority: ---    
Version: unspecified   
Hardware: PC   
OS: Linux   
NLnet milestone: NLnet.2019.02.012
total budget (EUR) for completion of task and all subtasks: 1000
budget (EUR) for this task, excluding subtasks' budget: 1000
parent task for budget allocation: 48
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
"lkcl"={amount=1000, paid=2019-06-16}
Bug Depends on:    
Bug Blocks: 48    

Description Luke Kenneth Casson Leighton 2019-04-26 21:43:37 BST
an IEEE754 compliant "DIV" unit is needed, as an FSM (state machine)
rather than a pipeline.  It must still conform to the pipeline API.
Must support FP16/32/64.
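
For illustration only, a minimal sketch of the kind of iteration an FSM-based divider could step through: radix-2 restoring division on integer mantissas, producing one quotient bit per loop pass (each pass standing in for one clock cycle). The name `restoring_divide` and its parameters are hypothetical and are not taken from the Libre-SOC code; exponent handling, normalisation and rounding are deliberately left out.

```python
def restoring_divide(dividend, divisor, quotient_bits):
    """Radix-2 restoring division, one quotient bit per step.

    Assumes 0 <= dividend < divisor, so the result fits in quotient_bits.
    Returns (quotient, remainder), where
    quotient = floor(dividend * 2**quotient_bits / divisor).
    """
    if not 0 <= dividend < divisor:
        raise ValueError("expected 0 <= dividend < divisor")
    remainder = dividend
    quotient = 0
    for _ in range(quotient_bits):  # each pass models one FSM step
        remainder <<= 1             # bring in the next (zero) dividend bit
        quotient <<= 1
        if remainder >= divisor:    # trial subtraction succeeds
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


q, r = restoring_divide(1, 3, 8)
print(q, r)  # prints: 85 1
```

The number of steps grows linearly with the number of quotient bits wanted, which is why wider formats take longer in an FSM. A non-zero final remainder tells a later rounding step that the result was inexact.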
Comment 1 Luke Kenneth Casson Leighton 2019-06-16 13:52:37 BST
operational again (after elaboratable rework)
Comment 2 Luke Kenneth Casson Leighton 2019-06-16 14:57:26 BST
> FDIV is often implemented with a FRECIP
> (reciprocal) followed by a FMUL.

we *might* need a pipelined fdiv, yet to be evaluated.  where in a
standard processor, time is not really critical, for a GPU it definitely
is.

FSQRT and ISQRT are definitely going to be done as pipelines, jacob knows
if we need FDIV to be pipelined.

if an FRECIP can be tracked down and it can be done as a pipeline
(that's if we need DIV to be pipelined), that would be good.
Comment 3 Jacob Lifshay 2019-06-16 21:51:32 BST
(In reply to Luke Kenneth Casson Leighton from comment #2)
> > FDIV is often implemented with a FRECIP
> > (reciprocal) followed by a FMUL.
We can't use FRECIP followed by FMUL without additional intermediate precision, since the RISC-V spec requires FDIV to produce correctly rounded results.
> 
> we *might* need a pipelined fdiv, yet to be evaluated.  where in a
> standard processor, time is not really critical, for a GPU it definitely
> is.
> 
> FSQRT and ISQRT are definitely going to be done as pipelines, jacob knows
> if we need FDIV to be pipelined.
having a pipelined fdiv is more important than sqrt or rsqrt, since divisions are much more common (every pixel needs at least 1 division)
> 
> if an FRECIP can be tracked down and it can be done as a pipeline
> (that's if we need DIV to be pipelined), that would be good.
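
On the correct-rounding point in comment 3, a small sketch that can be run to see the effect. It uses Python's built-in floats (IEEE754 doubles on common platforms) rather than FP32 hardware, and the helper name `find_mismatches` is hypothetical. It compares a direct division against reciprocal-then-multiply, where both the reciprocal and the product are rounded to the same precision, and collects the operand pairs where the two results differ.

```python
def find_mismatches(limit=100):
    """Return (a, b) pairs where a / b differs from a * (1 / b).

    Both paths round to double precision at every operation, so the
    reciprocal path incurs two roundings instead of one.
    """
    mismatches = []
    for a in range(1, limit):
        for b in range(1, limit):
            x, y = float(a), float(b)
            direct = x / y              # single, correctly rounded division
            via_recip = x * (1.0 / y)   # rounded reciprocal, then rounded multiply
            if direct != via_recip:
                mismatches.append((a, b))
    return mismatches


pairs = find_mismatches()
print(len(pairs), "mismatching operand pairs, e.g.", pairs[:5])
```

Any non-empty result shows that a same-precision reciprocal followed by a multiply does not always match the correctly rounded quotient, which is the reason extra intermediate precision would be needed.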
Comment 4 Luke Kenneth Casson Leighton 2019-06-17 03:01:01 BST
(In reply to Jacob Lifshay from comment #3)

> > FSQRT and ISQRT are definitely going to be done as pipelines, jacob knows
> > if we need FDIV to be pipelined.
> having a pipelined fdiv is more important than sqrt or rsqrt, since
> divisions are much more common (every pixel needs at least 1 division)

rats.  ok i should be able to knock something together quite quickly,
however it's going to need a whopping *fourteen* stages even if done as
a 4x combinatorial chain of 14x pipelines.

or it could be *26* pipeline stages of 2x combinatorial blocks.  that's just
for 32-bit FP.  64-bit FP would be a staggering 56 pipeline stages.  luckily
we don't need that, as the focus isn't 64-bit.

will raise a separate bug report for this one.