Well, we’re talking about emulation here, not a bytecode interpreter for a high ...

kragen · on July 14, 2024

i think it's unusual to hit 3-5 ipc sustained in normal code, but it seems like we're in broad agreement about quantitatively what kind of performance an interpreter can achieve. i mentioned three interpretive systems that do reach about 15% of native single-threaded performance with very different approaches: the ocaml bytecode interpreter, threaded interpretive code like gforth, and numpy

it's just that i call 15% of native performance 'very slow' and you call it 'very fast', because you're comparing the moped to the walker you're used to, and i'm comparing it to a sports car because that's what you're paying for

instruction set jitting getting a speedup instead of a slowdown goes back to last millennium with hp's dynamo https://dl.acm.org/doi/pdf/10.1145/349299.349303 and was central to transmeta's business plan. qemu's jit is sort of simpleminded to make it easier to maintain

ant6n · on July 14, 2024

A cpu emulator will only get to 15-20% with a lot of tricks. Usually a cpu interpreter would get like 5%. It’s not like Numpy, where you can increase performance by trying to offload as much execution as possible into native code. When emulating cpu instructions, the atomic instructions are tiny. But as I said, python is a poor comparison because the interpreter is very slow and there is a lot of overhead.

kragen · on July 14, 2024

although i've written compilers and interpreters, i don't think i've ever written an emulator for a real cpu. i'm interested to hear about your experience