On one hand this is impressive, and I've been wondering when something like this would appear. On the other hand, I am -- like others here have expressed -- saddened by the impact this has on real musicians. Music is human, music theory is deeply mathematical and fascinating -- "solving" it with a big hammer like generative AI is rather unsatisfying.
The other very real aspect here is that the "training data" has to come from somewhere, and the copyright implications of this are far from solved.
In the past I worked on real algorithmic music composition: an algorithmic sequencer paired with hardware or software synthesizers. I could give it feedback and it'd evolve the composition, all without training data. It was computationally cheap, didn't infringe anyone's copyright, and a human still had very real creative influence (which instruments, scale, tempo, etc.). Message me if you're still interested in "dumb" AI like that. :-)
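The feedback-driven evolution described above can be sketched in a few lines. This is a toy illustration, not the original system; the scale, mutation scheme, and the fixed stand-in rating are all my own assumptions:

```python
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, as MIDI note numbers

def random_melody(length=16):
    # Start from a random melody constrained to the chosen scale.
    return [random.choice(SCALE) for _ in range(length)]

def mutate(melody, rate=0.2):
    # Randomly replace some notes, always staying inside the scale.
    return [random.choice(SCALE) if random.random() < rate else n for n in melody]

def evolve(melody, rating):
    # rating in 0..1 from the human listener; low ratings mutate more aggressively.
    return mutate(melody, rate=1.0 - rating)

melody = random_melody()
for _ in range(5):
    rating = 0.5  # stand-in for interactive human feedback
    melody = evolve(melody, rating)
```

No training data anywhere: the human's ratings are the only "signal", and the constraints (scale, length) carry the creative choices.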
Computer-assisted music is nothing new, but taking away the creativity completely is turning music into noise -- noise that sounds like music.
> "solving" it with a big hammer like generative AI is rather unsatisfying.
The reason is greed. They jump on the bandwagon to get rich, not to bring art. They don't care about long term effects on creativity. If it means that it kills motivation to create new music, or even learn how to play an instrument, that's fine by these people. As long as they get their money.
If our sole goal was to get rich we would have pivoted to some b2bsaas thing as many suggested to us. What we’ve actually seen is so much new creativity from people who otherwise would never have made music.
Nothing was stopping them from making music before other than laziness.
I’m so sick of hearing this excuse. “I can’t draw so I use AI,” as if the people who can draw were born that way.
No, they spent countless hours practicing and that’s what makes it art. Because it’s the product of hours of decision making and learning. You can not skip ahead in line. Full stop.
I think it's the opposite. They are not saying "those people shouldn't draw [using AI]", they are saying "those people should've been drawing all this time".
Describing music to an AI is not "making music", the same way that hiring a musician and asking them to write you a rock song about a breakup is not "making music".
I don't see any contact info in your profile, but I have an email in mine. I am interested in hearing more about your process and if you have music for sale anywhere, I like to support electronic artists doing interesting stuff.
Anyone with ears can find music satisfying. You don't need an artist's backstory or blessing for that. By all means use slow AI to get the same point fast AI can get to, but don't ask me to value it differently.
And AI doesn’t make satisfying music. Human-made music is partially derivative; AI music is entirely derivative. That’s why it sounds like shit to reasonable ears.
I'm genuinely curious and really like that you're giving this a shot, but I'm very sceptical, as hardly any idea in computer architecture is new: dig enough and you'll find it has been tried and failed. You have to understand why it failed. If it was timing, maybe you can succeed today; if it wasn't, you'll need to avoid repeating the same mistakes. It would be great to see more comparisons.
Firstly, your claims about virtual memory in general-purpose CPUs are misleading: its purpose is memory isolation, and I wouldn't want a system without it in the presence of multiple processes (how can you trust every process not to shoot down another by accidentally accessing the wrong memory location?).
Ultimately, our hardware will become more specialized/heterogeneous, and we'll have many accelerators for various tasks, but there will likely always be a general purpose CPU at the heart of the system (that will have virtual memory, caches, etc.); for an overview, I enjoyed [1]. I see what you're building as another accelerator for inherently parallel latency-insensitive workloads (like you find in HPC). In a way, GPUs (+ Xeon Phi) cater to these markets today (benchmarks against these would be useful).
Second, I remember the previous post [2], where you claimed the system you are building relies on a RISC ISA, but now you claim it has changed to VLIW. You said yourself before
"[...] stick to RISC, instead of some crazy VLIW or very long pipeline scheme. In doing this, we limit compiler complexity while still having very simple/efficient core design, and thus hopefully keeping every core's pipeline full and without hazards [...]"
What is the rationale behind this? Do you think you'll be able to manage compiler complexity now?
I would be just as skeptical as you, and everyone else should be skeptical of our claims. While I have talked informally about our architecture to many people on and offline, we have not posted much about the actual architecture we have proceeded with to silicon (which is very close to, but not exactly, what we will eventually bring to market). I honestly don't expect any random person to take us seriously based on a couple of online postings, but I will say that most people are decently convinced (or at least intrigued enough to withhold immediate doubt) after a rundown of the architecture.
As for why we think "this time is different": it's a mix of good ideas and timing. I 100% agree with you that in the 50 years of von Neumann derivatives, basically all the low-hanging fruit (and much higher up) has been attempted, and thankfully I can say I've learned from a lot of those attempts. Rather than being an entirely new concept, we have gone back to some fairly old ideas, back to the time before hardware-managed caches, and thought about simplicity in terms of what it actually takes to accomplish computational goals. A lot of the hardware complexity that started being added in the mid/late 80's around the memory system (our big focus at REX) came before much attention was put into the compiler. While I am proud of what we have done on the hardware side, I think most of the credit will go to the compiler and software tools if we are successful, as that is what enables us to have such a powerful and efficient architecture. Ergo, we have the advantage of ~30 years of compiler advancements (plus a good amount of our own), which gives us the luxury to remake the decision in favor of software complexity over hardware complexity... plus 30 years of fabrication improvements. Couple that with Intel's declining revenues, the end of easy CMOS scaling, and established portability tools (e.g. LLVM, which we have used as the basis for our toolchain), and I think this is the best time possible for us.
When it comes to virtual memory: why would you need your memory space to be virtualized (which requires address translation) in order to have segmentation? We use physical addresses since it saves a lot of time and energy at the hardware level, but that doesn't mean software can't implement the same features and benefits that virtual memory, garbage collection, etc. provide. The way our memory system as a whole (and in particular our Network on Chip) behaves, and its system of constraints, plays a very large role in this, but I can't/don't want to go into the details of that publicly right now. It may seem a bit hand-wavy, but we do not see this as a limitation or a real concern for us, and unless you want to write everything in assembly, the toolchain will make this no different from C/C++ code running on today's machines.
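REX hasn't published the details, but the general idea of getting segmentation-style protection in software over raw physical addresses can be sketched. This toy model shows the kind of bounds check a toolchain could insert; all names here are illustrative, not REX's actual mechanism:

```python
# Toy model: a flat physical memory with "segments" enforced purely in
# software (checks a compiler/toolchain could insert), rather than by an
# MMU performing address translation.

class SegmentFault(Exception):
    pass

class Segment:
    def __init__(self, memory, base, limit):
        self.memory, self.base, self.limit = memory, base, limit

    def load(self, offset):
        if not 0 <= offset < self.limit:
            raise SegmentFault(f"offset {offset} outside segment")
        return self.memory[self.base + offset]  # direct physical address

    def store(self, offset, value):
        if not 0 <= offset < self.limit:
            raise SegmentFault(f"offset {offset} outside segment")
        self.memory[self.base + offset] = value

phys_mem = [0] * 1024
seg_a = Segment(phys_mem, base=0, limit=256)
seg_b = Segment(phys_mem, base=256, limit=256)

seg_a.store(10, 42)
try:
    seg_a.load(300)  # would stray into seg_b's physical range
except SegmentFault:
    pass  # isolation preserved without any address translation
```

The point is only that isolation is a policy, and the policy can live in the toolchain instead of in TLB hardware, trading translation energy for compile-time (or cheap runtime) checks.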
In the case of GPGPUs for HPC, we have the advantage of being truly MIMD rather than SIMD, plus a big improvement in power efficiency, programmability, and cost. We'd win in the same areas (I guess tie on programmability) against the Xeon Phi for benchmarks like LINPACK and STREAM, but the one benchmark I am especially looking forward to is HPCG (and anything else that stresses the memory system along with compute). While NVIDIA and Intel systems on the TOP500 list struggle to get 2% of their LINPACK score on HPCG, we should be performing 25x+ better[0]. Based on our simulations, we should perform roughly equally across all 3 BLAS levels, which has been unheard of in HPC since the days of the original (Seymour designed) Cray machines.
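For anyone wondering why LINPACK and HPCG diverge so wildly: it comes down to arithmetic intensity (flops per element of memory traffic) at each BLAS level. A back-of-envelope sketch, with the usual textbook operation counts:

```python
# Rough arithmetic intensity (flops per element of memory traffic) for
# the three BLAS levels at problem size n. LINPACK is dominated by
# level-3 GEMM (cache-friendly, high reuse); HPCG behaves like
# level-1/2 sparse work, which hammers the memory system instead.

def intensity(flops, elements_moved):
    return flops / elements_moved

n = 1000

# Level 1: dot product. 2n flops, 2n elements read.
blas1 = intensity(2 * n, 2 * n)            # ~1 flop per element

# Level 2: matrix-vector multiply. 2n^2 flops, n^2 + 2n elements moved.
blas2 = intensity(2 * n**2, n**2 + 2 * n)  # ~2 flops per element

# Level 3: matrix-matrix multiply. 2n^3 flops, ~3n^2 elements moved.
blas3 = intensity(2 * n**3, 3 * n**2)      # grows with n: huge data reuse
```

A cache hierarchy only helps when there is reuse to exploit, which is why level-3-heavy benchmarks flatter conventional designs and level-1/2-heavy ones expose the memory system.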
Of course, my naivety from 2 years ago haunts me now ;) When the linked comment was written, I had yet to "see the light". Only once I understood (through my co-founder, the brilliant Paul Sebexen) the 'magic' that is possible when a toolchain has enough information to make good compilation decisions did I realize that the simplicity of a VLIW decoding system made the most sense (and gave us a lot of extra abilities). About ~3 months after I made that comment we started down this path, with early prototyping applied to existing VLIW and scratchpad-based systems leading to our DARPA and later seed funding. It is only because our hardware is so simple (and mathematically elegant in its organization) that the compiler can efficiently schedule instructions and memory movement. While I've only lived through a small fraction of the last 50 years of computer architecture, I think of myself as a very avid historian of it, and it really shocks me that no one has gone about thinking of the memory system quite like we have. I totally agree with my younger self on long pipelines though.
TL;DR: We think we'll succeed because we are combining old hardware ideas with new software ideas to make (in our opinion) the best architecture, plus this is the best time for a new fabless semiconductor startup. We have actually built the mythical "sufficiently smart compiler" due to some very clever (but simple) hardware that enables people to actually effectively program for this. We think we will be more energy efficient, performant, and easier to program for than our competition in our target areas (HPC, high end DSP).
I wish you and your project all the best. Hardware, and especially CPUs and alike are tough and rare. We haven't seen much new competitors (any) in that area, especially relevant ones.
When you say you rest your high hopes on the toolchain, aren't you a bit scared of what happened to Itanium? Intel had the toolchain under their own R&D, and it failed because they couldn't deliver. I'm interested to hear more about the mythical "sufficiently smart compiler" and how it relates to your architecture.
Based on our software results so far, I wouldn't say I'm scared, but am definitely anxious. Since our main focus up to this point has been building the first test chip along with software tool prototyping, our progress in compiling "real" libraries and small applications is fairly early, but we're happy with the results. Now that we've taped out, we can devote more resources, and once we have real hardware, we will be able to test our applications ~1000x faster than the cycle accurate software simulation capabilities we have right now.
All that being said, we have good reason to believe that our approach is valid and won't suffer the flaws of "Itanic" that I've mentioned on this page and many times elsewhere. Unlike any prior VLIW (Intel called their bastardized version implemented in Itanium "EPIC"), our hardware was built with an emphasis on hard real time guarantees and strict determinism at every level of the design, which allows for a level of optimization that is impossible on any other architecture.
Basically, if the compiler has to make worst-case assumptions almost all the time to prevent control and data hazards (as Itanium's did, due to a very convoluted design), how do you expect compiler-generated programs to be at all performant/efficient?
Does this mean that users have to recompile the world for every CPU generation because of microarchitectural changes? I.e., is the pipeline exposed? Are you planning a Mill-like intermediate-level bytecode?
Yes, and in certain cases even within the same generation of chip (e.g. same microarchitecture but fewer cores and/or less memory per core; no problem if you compiled for fewer cores/less memory and it runs on a "bigger" chip), as the compiler needs to remap the program and data locations based on the global address map.
It is a very simple pipeline, and we expose the exact latencies required for all operations, along with things like branches with delay slots. As I have mentioned ad infinitum, determinism is a key part of our architecture, and having a fixed pipeline is necessary for it. Plus, we want to give anyone crazy and skilled enough to hand-write assembly the freedom to be crazy ;)
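To make the exposed-latency idea concrete: when the hardware guarantees exact, fixed latencies, the compiler can fill the wait cycles (delay slots) with independent work at compile time instead of relying on hardware interlocks. A toy static scheduler, with made-up ops and latencies (nothing here is REX's actual ISA):

```python
# Toy static scheduler: given exact fixed latencies, greedily issue the
# first instruction whose source registers are already available, filling
# otherwise-wasted wait cycles with independent instructions.

LATENCY = {"load": 3, "mul": 2, "add": 1}  # illustrative latencies

def schedule(ops):
    """ops: list of (name, dest_reg, source_regs). Returns {cycle: op}."""
    ready_at = {}   # register -> cycle its value becomes available
    pending = list(ops)
    timeline = {}
    cycle = 0
    while pending:
        for op in pending:
            name, dest, sources = op
            if all(ready_at.get(s, 0) <= cycle for s in sources):
                timeline[cycle] = op
                ready_at[dest] = cycle + LATENCY[name]
                pending.remove(op)
                break  # one issue slot per cycle in this toy model
        cycle += 1
    return timeline

prog = [
    ("load", "r1", []),      # r1 ready 3 cycles later
    ("add",  "r3", ["r1"]),  # must wait for the load
    ("mul",  "r2", []),      # independent: can fill a wait cycle
]
timeline = schedule(prog)
```

Because every latency is architecturally guaranteed, the schedule computed at compile time is the schedule that executes, with no dynamic hazard detection needed.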
For the applications we are targeting (HPC and DSP-like stuff), source code is always available, there are very long periods between recompiles driven by source changes, and optimization is a key factor. Our customers aren't just accepting of recompiling for every new generation of hardware; they expect it, and they want to be able to take advantage of any new improvements the compiler can make.