| Summary: | svp64 vector loads: sub-dword selection before or after byte-reversal | ||
|---|---|---|---|
| Product: | Libre-SOC's first SoC | Reporter: | Alexandre Oliva <oliva> |
| Component: | Specification | Assignee: | Luke Kenneth Casson Leighton <lkcl> |
| Status: | RESOLVED INVALID | ||
| Severity: | enhancement | CC: | libre-soc-isa |
| Priority: | --- | ||
| Version: | unspecified | ||
| Hardware: | PC | ||
| OS: | Other | ||
| See Also: | https://bugs.libre-soc.org/show_bug.cgi?id=571 | ||
| NLnet milestone: | --- | total budget (EUR) for completion of task and all subtasks: | 0 |
| budget (EUR) for this task, excluding subtasks' budget: | 0 | parent task for budget allocation: | |
| child tasks for budget allocation: | The table of payments (in EUR) for this task; TOML format: | ||
| Bug Depends on: | |||
| Bug Blocks: | 213 | ||
|
Description
Alexandre Oliva
2021-01-06 21:58:34 GMT
(In reply to Alexandre Oliva from comment #0)
> Last night, while going over
> https://libre-soc.org/simple_v_extension/appendix/ with a particular focus
> on ld's operation with an elwidth overrider for the src, I missed various
> details in the specification.

apologies i should have mentioned, that was the older version of the spec, relevant exclusively to RISC-V. given that RV had BE removed at around version 3 (RISC-III) it was not discussed, at all. the reason i referred to that older spec was to illustrate to Cole that there did exist walkthroughs for twin element width overrides.

> whole register. this suggested to me that the sub-indexing of the value
> loaded memory would take place after byte-reversion.

the pseudocode as modified and derived in the other bugreport will be the correct pseudocode. what you are looking at in the RV variant is unfortunately not relevant, i apologise, as far as bytereversal is concerned, only elwidths and extension.

given that this is the case can i recommend closing this one and starting again, from the pseudocode listed in 567

https://bugs.libre-soc.org/show_bug.cgi?id=567#c2

unfortunately almost all of what you wrote is invalid when viewed without the addition of the (fully OpenPOWER v3.0B Compliant) ld/brx-LE/BE bytereversing that gets everything into NEON-style internal representation and ordering as far as Vectorisation is concerned.

i will go over it thoroughly and make sure nothing was missed but, realistically, we need to close this one as invalid. sorry about that.

(will raise a new one, immediately)
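
for anyone landing here from the bug title: a minimal sketch of why the ordering question matters at all. this is plain Python, not the SVP64 pseudocode (the authoritative version is the one in bug 567 comment 2), and every name in it (`byterev`, `select_sub`, the sample `dword`) is a hypothetical helper made up for illustration. it only shows that selecting a sub-dword element before byte-reversal and selecting it after byte-reversal pick different bytes.

```python
def byterev(value: int, nbytes: int) -> int:
    """Reverse the byte order of an unsigned value that is nbytes wide."""
    return int.from_bytes(value.to_bytes(nbytes, "little"), "big")


def select_sub(value: int, index: int, elwidth_bytes: int) -> int:
    """Return sub-element `index` (0 = least significant) of the given width."""
    mask = (1 << (8 * elwidth_bytes)) - 1
    return (value >> (8 * elwidth_bytes * index)) & mask


# hypothetical 64-bit value as it arrives from memory (assumption, not from the spec)
dword = 0x0102030405060708

# order A: byte-reverse the whole dword, then select 16-bit element 0
order_a = select_sub(byterev(dword, 8), 0, 2)

# order B: select 16-bit element 0 first, then byte-reverse only that element
order_b = byterev(select_sub(dword, 0, 2), 2)

print(hex(order_a), hex(order_b))
```

the two orders disagree on the same input, which is why the spec has to pin one of them down. which one is correct for SVP64 is decided by the pseudocode in 567, not by this sketch.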