| Summary: | gcc compiler, binutils and assembly macros for OpenPOWER-SV | ||
|---|---|---|---|
| Product: | Libre-SOC's first SoC | Reporter: | Luke Kenneth Casson Leighton <lkcl> |
| Component: | Source Code | Assignee: | Luke Kenneth Casson Leighton <lkcl> |
| Status: | RESOLVED FIXED | ||
| Severity: | enhancement | CC: | ghostmansd, libre-soc-bugs, oliva, veerakumar.r |
| Priority: | --- | ||
| Version: | unspecified | ||
| Hardware: | Other | ||
| OS: | Linux | ||
| See Also: |
https://bugs.libre-soc.org/show_bug.cgi?id=558 https://bugs.libre-soc.org/show_bug.cgi?id=615 https://bugs.libre-soc.org/show_bug.cgi?id=871 https://bugs.libre-soc.org/show_bug.cgi?id=211 https://bugs.libre-soc.org/show_bug.cgi?id=917 https://bugs.libre-soc.org/show_bug.cgi?id=836 |
||
| NLnet milestone: | NLNet.2019.10.032.Formal | total budget (EUR) for completion of task and all subtasks: | 12000 |
| budget (EUR) for this task, excluding subtasks' budget: | 925 | parent task for budget allocation: | 158 |
| child tasks for budget allocation: | 550 578 834 847 857 | The table of payments (in EUR) for this task; TOML format: |
ghostmansd = { amount = 525, submitted = 2022-09-25, paid = 2022-10-06 }
veera = {amount=400, submitted=2022-09-29, paid=2022-10-04}
|
| Bug Depends on: | 579, 836, 550, 578, 834, 907 | ||
| Bug Blocks: | 158 | ||
|
Description
Luke Kenneth Casson Leighton
2021-01-19 17:04:20 GMT
in issue #615 i am keeping notes from various conversations with the ppc binutils and gcc maintainers, as well as OPF. summary of OPF advice: an architectural fork inside gcc will not be well received, due to the implication of ecosystem fragmentation.

one idea came up from David: use the same trick intended for v3.1. there they intend to mark entries in rs6000.md as "v3.1prefixableto64bit", and David said he would have no problem with us doing the same thing: set an attribute "svp64vectoriseable". for us this would indicate that, at assembly output, a special 32-bit EXT01 assembly instruction is output in front of any instruction marked with the attribute.

Segher then suggested *redefining* the underlying data structure that the macro system uses to represent registers. this combination effectively gives all svp64-marked macro patterns a massive additional set of matching capabilities. on registers alone this would be:

* RT=s RA=s RB=s
* RT=v RA=s RB=s
* ....
* RT=v RA=v RB=v

when element-width overrides are introduced these permutations multiply by four for the source elwidth override *and by another* four for the dest elwidth override. when additional capabilities such as saturation are also added, the thought of creating a macro file (even an autogenerated one) with all these permutations listed explicitly *per macro* is, at best, described as insane and, frankly, stupid.

a little intelligent thought shows that the pattern-matching can be done implicitly (using existing rs6000.md patterns) when marked with an appropriate attribute. this will allow us to do very basic (and i mean very basic) matching between vector patterns and svp64-attribute-marked rs6000.md macros: anything that is not part of a conditional if/else computation, for example straight unconditional for-loops.

where it gets more complicated is anything computed that is then used for a branch decision. this requires predication (as used in 32-bit ARM), which is not a "normal" part of ppc except in very special, unique circumstances. avoiding that situation for now and simply doing unconditional for-loop expansion would still be a huge leap forward.

I observe a change with lfs.
.desc = {
.in1 = SVP64_IN1_SEL_RA_OR_ZERO,
- .in2 = SVP64_IN2_SEL_CONST_SVD,
- .in3 = SVP64_IN3_SEL_RC,
+ .in2 = SVP64_IN2_SEL_CONST_SI,
+ .in3 = SVP64_IN3_SEL_NONE,
.out = SVP64_OUT_SEL_FRT,
- .out2 = SVP64_OUT_SEL_NONE,
+ .out2 = SVP64_OUT_SEL_FRT,
.cr_in = SVP64_CR_IN_SEL_NONE,
.cr_out = SVP64_CR_OUT_SEL_NONE,
.sv_ptype = SVP64_PTYPE_P2,
- .sv_etype = SVP64_ETYPE_EXTRA3,
- .sv_in1 = SVP64_EXTRA_IDX1,
+ .sv_etype = SVP64_ETYPE_EXTRA2,
+ .sv_in1 = SVP64_EXTRA_NONE,
.sv_in2 = SVP64_EXTRA_NONE,
.sv_in3 = SVP64_EXTRA_NONE,
.sv_out = SVP64_EXTRA_IDX0,
This breaks the remapping algorithm, which was not at all ready for such a change. Apparently I am missing how to remap this. Ideas/suggestions?
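
To put a number on the permutation count mentioned in the description, here is a minimal C sketch. The names `svp64_variant_count` and `svp64_print_reg_combos` are hypothetical helpers written only for this illustration; they are not part of gcc, binutils or the Libre-SOC tooling. The sketch assumes each register operand is independently scalar (s) or vector (v), and that each elwidth override has four settings, as the description states. The `other_modes` factor is a placeholder for further capabilities such as saturation, whose exact count is not given here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: number of explicit variants a single pattern would
   need if every combination were written out by hand.  */
unsigned long
svp64_variant_count (unsigned num_regs, unsigned src_elwidths,
                     unsigned dst_elwidths, unsigned other_modes)
{
  /* Guard the shift below; real instructions have only a few operands.  */
  assert (num_regs < 32);

  /* Each register operand is either scalar or vector: 2^num_regs.  */
  unsigned long reg_combos = 1UL << num_regs;

  return reg_combos * src_elwidths * dst_elwidths * other_modes;
}

/* Hypothetical helper: list the scalar/vector combinations for a
   three-operand instruction (RT, RA, RB), one per line.  */
void
svp64_print_reg_combos (FILE *out)
{
  static const char *const names[] = { "RT", "RA", "RB" };

  for (unsigned mask = 0; mask < 8; mask++)
    {
      /* Bit i of mask selects vector ('v') or scalar ('s') for names[i].  */
      for (unsigned i = 0; i < 3; i++)
        fprintf (out, "%s%s=%c", i ? " " : "", names[i],
                 ((mask >> i) & 1) ? 'v' : 's');
      fputc ('\n', out);
    }
}
```

Under these assumptions, `svp64_variant_count (3, 1, 1, 1)` gives 8 register combinations and `svp64_variant_count (3, 4, 4, 1)` gives 128 once both elwidth overrides are included. Each further mode multiplies that again, which is the argument for matching implicitly through an attribute rather than listing every variant per macro.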
|