Porting SBCL to the RISC-V (rhodes.io)
133 points by pome on Aug 12, 2018 | 38 comments


If there is any assembly-language programming needed, or a code generator, then I suggest you start with the code for MIPSle. Many of the instructions and mnemonics are the same.

The biggest difference is that immediate arithmetic and load/store offsets are 12-bit on RISC-V vs 16-bit on MIPS. To compensate, LUI loads 20 bits on RISC-V vs 16 bits on MIPS. So it's only immediates or offsets with magnitudes between 2K and 32K that are handled differently.
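
To make the immediate-width point concrete, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the signed 12-bit I-type immediate on RISC-V and the signed 16-bit immediate that MIPS uses for addiu and load/store offsets. It only answers "does this value fit in a single instruction's immediate field?".

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value fits in a two's-complement field of `bits` bits."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def single_instruction_immediate(value: int) -> dict:
    """Report whether value fits one immediate field on each ISA.

    Assumption: 12-bit signed on RISC-V, 16-bit signed on MIPS.
    """
    return {"riscv": fits_signed(value, 12), "mips": fits_signed(value, 16)}


# Values in the +/-2K..+/-32K band fit on MIPS but need an extra
# instruction (e.g. LUI + ADDI) on RISC-V.
for v in (100, 2047, 2048, -2049, 32767, 32768):
    print(v, single_instruction_immediate(v))
```

A code generator ported from MIPS would mostly need this check wherever it currently assumes a 16-bit field.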

Also, RISC-V can compare two registers for ordering and branch in one instruction, which older MIPS can't do.


From my (slow) hobby work on a V8 port, I can say that there are differences in loading large (48-64 bit) arbitrary constants without a pool (it takes many instructions) or a scratch register. The FPU is also used differently (it's more janky and bolted-on with MIPSel). It's nice not having exposed delay slots, and the additional pc-relative addressing range is very convenient (since, for example, in V8 there is a maximum code heap size known at compile time, and the addresses are contiguous, so you can use pc-relative immediate addressing for anything in that heap [as long as you keep track of sources for relocation] at a penalty of one word [which is not a big deal in the grand scheme of things]).


> I can say that there are differences in loading large (48-64 bit) arbitrary constants without a pool.

How arbitrary is arbitrary? ARMv7's Thumb2 format immediates are composed of an 8-bit field shifted by up to 5 bits. So you can form any 32-bit variable, but with limited precision.

ARMv8 modified immediates can describe a contiguous run of ones followed by a contiguous run of zeros, and SWAR variations of the same. So you can describe things like a repeating 0x3f... for example.
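
As a rough illustration of that second format, the sketch below tests whether a value looks like an A64 logical ("bitmask") immediate. It is written from my understanding of the encoding (an element of 2, 4, 8, 16, 32 or 64 bits holding a run of ones, possibly rotated, replicated across the register, with all-zeros and all-ones excluded). The function names are hypothetical and this is not taken from any assembler.

```python
def _is_rotated_run(elem: int, size: int) -> bool:
    """True if elem (size bits wide) is a rotation of 0...01...1."""
    emask = (1 << size) - 1
    if elem == 0 or elem == emask:
        return False
    for r in range(size):
        rot = ((elem >> r) | (elem << (size - r))) & emask
        if (rot & (rot + 1)) == 0:  # rot has the form 2**k - 1
            return True
    return False


def is_bitmask_immediate(value: int, width: int = 64) -> bool:
    """Assumed rule: replicated element containing a rotated run of ones."""
    full = (1 << width) - 1
    value &= full
    if value == 0 or value == full:
        return False
    size = 2
    while size <= width:
        elem = value & ((1 << size) - 1)
        replicated = 0
        for shift in range(0, width, size):
            replicated |= elem << shift
        if replicated == value and _is_rotated_run(elem, size):
            return True
        size *= 2
    return False


for v in (0x3F3F3F3F3F3F3F3F, 0xFF00, 0x1234):
    print(hex(v), is_bitmask_immediate(v))
```

Tag masks for pointer tagging tend to be single runs of ones, so they would usually pass this test; arbitrary entry-point addresses usually would not.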

Do either of those formats encompass the kinds of literals that you need in the V8 JIT?

> so you can use pc-relative addressing ... at a penalty of one word

Since the RISC-V PC-relative addressing capabilities are similar to ARMv8 (adrp) and x86-64 (rip-relative addressing), I would have thought that this is basically a non-problem. You pay one more live register to hold the page address, but you also get more registers, so I would think it mostly washes out. Where do you pay a penalty?


> How arbitrary is arbitrary? ARMv7's Thumb2 format immediates are composed of an 8-bit field shifted by up to 5 bits. So you can form any 32-bit variable, but with limited precision.

When it comes to encoding the address of an entry point, every bit of precision you lose (above the first two or three) in the address loses you memory compactness (and adds a certain amount of complexity to compilation and relocation).

On the ARM and AArch64 V8 ports they use a constant pool for target addresses; on RISC-V you can probably just use AUIPC to compute the target address in place with no pool address register. You can, of course, do the exact same thing on RISC-V that they do in the ARM ports, but RISC-V has the considerable advantage of four extra bits (totalling 20) in U immediates vs. MIPS (16-bit U immediates), and eight extra bits vs. ARM in some cases (though ARM's immediate encodings are various and sundry, and produce a huge variety of corner cases and microoptimizations which are mostly useless to JITs [in my mostly amateur opinion]; to a lesser extent MIPS also has some interesting features for loading immediates which make up for the shortfalls in AOT code, but are harder [it seems to me, an amateur] to use effectively in a JIT).


AArch64 uses adrp in almost exactly the same way that RISC-V uses auipc to access literal pools. It isn't a strongly distinguishing feature between those architectures.

The difference between them is that ADRP computes a 4 kB page-aligned pc-relative address, which complements the 12-bit unsigned address offsets in its base+disp addressing mode to get a uniform +(2 GB -1) to -2GB reach. RISC-V doesn't compute a page-aligned address, in order to partially compensate for the use of signed offsets in its base+disp12 addressing mode. I say partially because RISC-V's PC-relative reach remains asymmetric +(2 GB - 2k - 1) to -(2 GB + 2k), but that probably doesn't matter much as long as you establish an appropriate red zone.

ARM distinguishes immediate operands used for data processing and immediate operands used for address generation. The alternative formats I was referring to are mostly just available for the logical operations (although Thumb2 sometimes also uses them for arithmetic). I was thinking that they might make pointer tagging a smidge easier to deal with.

*edit: Whoops, 12-bit signed, not 10-bit signed on the asymmetry of the RISC-V reach.


Your figures seem a little off there. Yes, RISC-V is a little asymmetric, with the AUIPC being able to subtract exactly 2 GB from the PC or add 2GB-4KB to it, and then a jalr/lb/sb can subtract an additional 2 KB or add 2KB-1.
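
The arithmetic is easy to check. A small sketch, assuming RV64, a sign-extended 20-bit auipc immediate shifted left by 12, and a sign-extended 12-bit offset in the following jalr/load/store (the constant and function names are just for illustration):

```python
KB = 1 << 10
GB = 1 << 30

# auipc: sign-extended 20-bit immediate, shifted left by 12, added to pc
AUIPC_MIN = -(1 << 19) << 12
AUIPC_MAX = ((1 << 19) - 1) << 12

# jalr / load / store: sign-extended 12-bit offset
LO12_MIN = -(1 << 11)
LO12_MAX = (1 << 11) - 1


def riscv_pcrel_reach() -> tuple:
    """Lowest and highest byte offset from pc reachable by auipc + lo12."""
    return AUIPC_MIN + LO12_MIN, AUIPC_MAX + LO12_MAX


low, high = riscv_pcrel_reach()
print(low, high)
```

That gives -(2 GB + 2 KB) on the low side and 2 GB - 2 KB - 1 on the high side.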

But the AArch64 adrp is also asymmetric because the relative reach depends on where in the 4 KB page the original PC is. It's only symmetric if the PC is 4 KB aligned. If you're part way through the page then there is more -ve reach and less +ve reach.
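
Here is a sketch of that dependence on the position within the page. It assumes the A64 encoding as I understand it: adrp takes a signed 21-bit page count, which works out to roughly +/-4 GB from the truncated pc rather than the 2 GB figure quoted upthread, and the following instruction adds an unsigned 12-bit byte offset. The function name is hypothetical.

```python
PAGE = 1 << 12


def adrp_reach_from(pc: int) -> tuple:
    """Lowest/highest byte offset relative to the *actual* pc.

    Assumption: adrp page immediate is signed 21 bits, followed by an
    unsigned 12-bit byte offset (0..4095).
    """
    page_base = pc & ~(PAGE - 1)  # adrp truncates pc to its 4 KB page
    lowest = page_base + (-(1 << 20) << 12)
    highest = page_base + (((1 << 20) - 1) << 12) + (PAGE - 1)
    return lowest - pc, highest - pc


print(adrp_reach_from(0x10000))  # pc at the start of a page
print(adrp_reach_from(0x10800))  # pc halfway through the page
```

With the pc halfway through a page, both ends of the window slide down by 2 KB relative to the pc, which is the "more -ve reach and less +ve reach" effect.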

A couple of KB fuzziness in what is basically a 32 bit reach in a 64 bit address space is pretty much completely irrelevant in both cases.


ADRP operates on the 4kB page of the pc (by truncation), not the entire pc. RISC-V could have implicitly added 2k in auipc and balanced out the bias. But they didn't.


I know how ADRP operates. It's symmetric about the truncated PC. It's not symmetric about the actual PC.

As I said in the last post.


I think you are misunderstanding the benefit of having symmetric reach by the page.

On AArch64, you can define a 2 GB contiguous slice of address space, built up out of whatever page size you find convenient for your system and plant a relocatable binary into it, up to 2 GB of size. Any instruction anywhere in the last page can reach any address in the first page, and vice versa.

In RISC-V, if you try to do the same thing, you'd find that any instruction in the last page can reach any address in the first page with room to spare, but some instructions in the first page cannot reach portions of the last page.

Sure, it doesn't matter most of the time. It isn't ever really an obstacle in practice for the feller writing application code for the platform. But the linker has to be aware of it in the 'medium' code model as a special case for just this particular platform. Somebody had to write that special case to work around the hardware.


The linker code that calculates the necessary auipc immediate and remaining offset for a relocation, and does something else when out of bounds, was written years ago, is two lines of code, and no doubt took less time than this conversation.
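
For anyone curious, the calculation looks roughly like this. It is a sketch of the usual hi20/lo12 split, not the actual code from any linker, and the function name is made up:

```python
def split_pcrel(offset: int):
    """Split a pc-relative offset into (hi20, lo12) for auipc + lo12.

    Returns None when the offset is out of range, so the caller can
    "do something else". Relies on (hi20 << 12) + lo12 == offset with
    lo12 sign-extended from 12 bits.
    """
    hi20 = (offset + 0x800) >> 12  # round so lo12 lands in [-2048, 2047]
    lo12 = offset - (hi20 << 12)
    if not -(1 << 19) <= hi20 < (1 << 19):
        return None
    return hi20, lo12


print(split_pcrel(0x12345FFF))
print(split_pcrel((1 << 31) - 2048))  # one byte past the positive limit
```

The +0x800 rounding is what compensates for the low 12 bits being signed.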

I don't even know of any application that has 1 GB of code, let alone 2 GB minus 2 KB (2,147,481,600 bytes).


> AArch64 uses adrp in almost exactly the same way that RISC-V uses auipc to access literal pools.

I meant that you could forego the constant pool (ARM calls them literal pools?) for branch targets in some use cases due to the added range of RISC-V U immediates, and in more cases than you can with ARMv8 (which only now offers 16-bit immediates in some cases, AFAIK [ARM's infocenter timed out, and I couldn't find it in the manual with search, so I'm basing this on some forum post by some dude]). With auipc you can compute an address 1GiB two ways from the pc, to the byte, and with no base register to be set in your JIT entry frames and allocated around.

In the V8 "arm64" port, it seems they manage 128MB as a practical matter:

https://github.com/v8/v8/blob/master/src/arm64/constants-arm...


For the most part, ARMv8 immediates are 12 bits for [base + displacement] addressing. However, most of the modes also allow you to use natural scaling of the displacement (or index in [base + index] modes). So you can reach the 4096th byte, or the 4096th pointer in a struct (and other sizes in between).
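
A quick sketch of the scaling rule, assuming the unsigned-offset form where the 12-bit field counts units of the access size (the function name is hypothetical):

```python
def encodable_unsigned_offset(offset: int, access_size: int) -> bool:
    """True if a byte offset fits the scaled, unsigned 12-bit field.

    Assumption: the field holds offset / access_size, so the offset must
    be non-negative, a multiple of the access size, and below 4096 units.
    """
    if offset < 0 or offset % access_size != 0:
        return False
    return offset // access_size < 4096


print(encodable_unsigned_offset(4095, 1))      # last reachable byte
print(encodable_unsigned_offset(4095 * 8, 8))  # last reachable 8-byte slot
print(encodable_unsigned_offset(4096 * 8, 8))  # one slot too far
```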

That 128 MB limit is interesting. It's the limit for direct pc-relative calls in ARMv8 without resorting to adrp.
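
The number falls out of the branch encoding. A sketch, assuming the A64 B/BL form with a signed 26-bit immediate counted in 4-byte instructions (names are for illustration only):

```python
MB = 1 << 20
IMM_BITS = 26    # assumed width of the B/BL immediate field
INSN_BYTES = 4   # the immediate counts instructions, not bytes


def direct_branch_reach() -> tuple:
    """Lowest and highest byte offset reachable by a direct branch."""
    lowest = -(1 << (IMM_BITS - 1)) * INSN_BYTES
    highest = ((1 << (IMM_BITS - 1)) - 1) * INSN_BYTES
    return lowest, highest


low, high = direct_branch_reach()
print(low // MB, (high + INSN_BYTES) // MB)
```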


Looks like the disparaging ARM-fronted website about RISC-V might have been https://riscv-basics.com/ which has disappeared down the memory hole. One mention at http://www.osnews.com/comments/30562 sheds a bit of light.

edited to add: here's HN's discussion at the time: https://news.ycombinator.com/item?id=17489504


ARM got sort of a Streisand Effect: this project probably wouldn't have happened if ARM hadn't drawn attention to its new competitor.


We changed the URL from http://christophe.rhodes.io/notes/tag/riscv/ to the introductory article in the series. The other article listed there is http://christophe.rhodes.io/notes/blog/posts/2018/first_risc....


This has got to be the first processor for which the software is available before a complete computer exists; RISC-V 19” rack mountable servers remain distant science fiction.


Nah, certainly not the first. Lots of software was ported to AArch64 before any chips existed, only ARM's "Foundation Model" (a rather slow emulator).


Not to mention, there is actual RISC-V hardware capable of running this software; and you can buy it right now for a known public price (which is more than could be said for AArch64 for a long time, and almost to this day) and integrate it with standard peripherals (PCIe, SATA, USB, etc.).

Granted, the hardware is somewhat limited for now, since it's only in-order.


> you can buy it right now for a known public price

Huh, the HiFive Unleashed has a "buy now" button on the website. I thought it's still in crowdfunding stage. Nice.

> almost to this day

PINE A64 was kickstarted in December 2015 with a known public price of $15. The RPi3 and SoftIron Overdrive 1000 appeared in 2016, SolidRun MACCHIATOBin in 2017, and various Rockchip RK3399 boards (and Chromebooks!) are shipping this year.

Oh, and you could buy an iPhone 5S back in 2013. No standard peripherals on that one though :D


Servers, I explicitly wrote “19” rack mountable servers”!

Where can they be bought? Link please!!!


You could mount a HiFive Unleashed into a rack case I guess. It's not a standard form factor though, so you'd need some custom mounting hardware (or, well, hot glue :D)


That would be hacking. I couldn’t build datacenters with that. I’m a professional engineer, not a hacker.


Ada Lovelace disagrees.


She most certainly does not, since she is dead. And she helped Babbage construct the first mechanical computer so I most certainly disagree with your implication, since it is wrong.


Yeah she’s dead. Well spotted.

Here’s some documentation https://en.m.wikipedia.org/wiki/Babbage_Machine

Feel free to provide the year the machine was actually completed by Babbage and Lovelace. A photo or two would be nice.

Trolls these days are still cute to their mothers.


Wow. One can construct something on paper without actually building it, but since you’re content to argue, I expect no less than you now arguing on the semantics of “to construct”. I’ll discard the troll implication as immature, since this is “Hacker News”, so I can’t say I’m surprised.


I’m content knowing you won’t back up your claims because you know you’re wrong.

Have interesting times.


“Have interesting times.”

Always, but that’s not for you to tell me.


Linux has supported x86-64 in long mode since 2001, well before the processors were available two years later.


Interestingly, LISP for the PDP-6 was being written while the machine was still being built.


Very interesting indeed! Do you have any sources?


Here [1]: Richard Greenblatt wrote a Lisp interpreter/compiler for the PDP-6 in 1963 (it would become MacLisp), while the computer was still being designed.

[1] http://www.softwarepreservation.org/projects/LISP/maclisp_fa...


Great, thanks!


Itanium and AArch64 both had emulators years before they were released and OSes were ported before any chips were available.


I've got two real RISC-V machines (HiFive Unleashed) under my desk.


“HiFive Unleashed is the ultimate RISC-V developer board.”

A development board is light years away from a 19” rack mountable server with out-of-band management (lights out management) and everything else that goes into such a design. You know, the kind of hardware that is actually usable in production and not just a cool toy that might or might not spawn production systems some day. I for one am not content to just tinker with toys; I want servers so I can get some real work done.


Erm yes, so do I. In fact I gave a talk about this and I'm involved in the software side. https://www.youtube.com/watch?v=HxbpJzU2gkw

These things don't happen by force of nature. They happen because people do the work, and we're doing the work, not moaning on HN about things that haven't happened yet.


Sorry but you (plural) are moaning about a very badly thought out processor architecture for which no hardware other than tinkertoys exists, to the point where one could equate it with the Rust evangelism strike force. I’m really sorry that is the case, especially when I think back on how awful the assembler mnemonics of RISC-V are, and how all the work being done could have made so much more difference if it had been put into OpenSPARC, as if we didn’t have enough ego trips and repetition in the computer hardware space throughout the decades already. Why reuse all the work and refine one CPU design when one can just sink countless hours of one’s life into inventing it again, right? Sorry but that is just awful. You (now in singular) will never get that time back. I silently weep for the wastefulness of humanity inside of me when I think upon this debacle that is RISC-V.



