Does anyone know how this compares with VLIW designs like the original Yale/Mult...

Symmetry · on Feb 8, 2014

Well, this seems to fall within the VLIW tradition and has an exposed pipeline like the original VLIW, but there are a bunch of differences. In the original VLIW every instruction pipeline was conceptually a different processor while the Mill is very much unified around it's single belt, though I wonder if you could have a similar design with separate integer and floating point belts.

And instead of having a fixed instruction format the Mill has variable length bundles, which is good. Instruction cache pressure is certainly a traditional weakness of VLIW. So maybe you could say Mill:VLIW::CISC:RISC? But the most important part of RISC was separating memory access from operations and the Mill still certainly does that.

outside1234 · on Feb 7, 2014

Or more recently, how does this compare to the Itanic from Intel?

jmz92 · on Feb 8, 2014

Some of the memory ideas are similar--Itanium had some good ideas about "hoisting" loads [1] which I think are more flexible than the Mill's solution. In general, this is a larger departure from existing architectures than Itanium was. Comparing it with Itanium, I doubt it will be successful in the marketplace for these reasons:

-Nobody could write a competitive compiler for Itanium, in large part because it was just different (VLIW-style scheduling is hard). The Mill is stranger still. -Itanium failed to get a foothold despite a huge marketing effort from the biggest player in the field. -Right now, everybody's needs are being met by the combination of x86 and ARM (with some POWER, MIPS, and SPARC on the fringes). These are doing well enough right now that very few people are going to want to go through the work to port to a wildly new architecture.

[1] http://en.wikipedia.org/wiki/Advanced_load_address_table

rcxdude · on Feb 8, 2014

The compiler part seems to be a core part of the mill's strategy: the representation and design seems to be oriented towards making it easy to compile for (the guy who gives the talks is a compiler writer). If the performance gains are half as good as advertised, and porting is not a complete pain (and it seems it won't be too bad), then they will have little difficulty attracting market share, even if only in niche applications at first.

jitl · on Feb 8, 2014

   > -Right now, everybody's needs are being met by the
   > combination of x86 and ARM (with some POWER, MIPS, and
   > SPARC on the fringes). These are doing well enough
   > right now that very few people are going to want to go
   > through the work to port to a wildly new architecture.

That's not true at all. The biggest high-performance compute is being done on special parallel architectures from Nvidia [1] (Tesla). Intel trying to bring X86 back into the race with its Xeon Phi co-processer boards [2].

[1] http://www.top500.org/lists/2013/11/

[2] http://www.intel.com/content/www/us/en/processors/xeon/xeon-...

Scaevolus · on Feb 8, 2014

The Mill aims to be good at general purpose computation. HPC is not general purpose computation, and is a tiny fraction of the market.

leoc · on Feb 8, 2014

> Right now, everybody's needs are being met by the combination of x86 and ARM (with some POWER, MIPS, and SPARC on the fringes).

I'm not sure. I think that a hard port to a new architecture must look a lot more like a worthwhile effort now that the wait-six-months Plan A no longer works, especially for single-threaded workloads. Provided the new architecture can actually deliver the goods, of course.

zanny · on Feb 8, 2014

Back when Itanium hit the market we didn't have LLVM, I wonder how hard it would be to write an assembler for Mill with it.

jmz92 · on Feb 8, 2014

LLVM intermediate representation and Mill code are going to be pretty different. The LLVM machine model is a register based machine (with an arbitrary number of registers--the backends do the work of register allocation). Basically, an easier RISC-ish assembly.

So, while LLVM would be helpful for porting things to the Mill, as it's largely a "solve once use everywhere" problem, it's still not trivial. It could take a lot of effort to make it competitive.

auvrw · on Feb 8, 2014

as someone who knows next to nothing about cpu architecture but has watched most of the videos, it seems as though all the concepts are broadly familiar ones to experienced architecture people, but the details of every corner are slightly rearranged.

the position of the translation lookahead buffer is one example of this. that portion of the memory talk goes something like

Ivan: usually the TLB is located here [points to slide]. in the mill it's here [flips to next slide].

Audience: gasp!

solarexplorer · on Feb 7, 2014

It sure does look like a Multiflow on steroids...