Doesn’t that make sense, though? Each one manipulates a different layer in the memory hierarchy, letting the programmer control the latency and throughput implications. I see it as a good thing.
I wonder if some Apple-made software, like Final Cut, makes use of all of those "duplicated" instructions at the same time to get better performance...
I know the multitasking nature of the OS probably makes this happen across different programs anyway, but it would nonetheless be pretty cool!
Would it be possible to use all of them at the same time? Not necessarily in a practical way, just for fun? Could the different CPU-side approaches be executed, to some extent, by one core at the same time, given that it's superscalar?
I inferred that by "neural accelerators" they meant the Neural Engine cores, or it could be a bigger/different AMX (which really should become a standard, btw).
1. CPU, via SIMD/NEON instructions (just dot products)
2. CPU, via AMX coprocessor (entire matrix multiplies, M1-M3)
3. CPU, via SME (M4)
4. GPU, via Metal (compute shaders + simdgroup-matrix + MPS matrix kernels)
5. Neural Engine, via CoreML (advisory)
Apple also appears to be adding a “Neural Accelerator” to each core on the M5?
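For anyone curious what paths 1 and 2 look like from user code, here's a minimal sketch of my own (not from the thread): path 1 uses NEON intrinsics directly for a dot product, while path 2 calls Accelerate's BLAS, which is the supported way to reach the undocumented AMX coprocessor on M1–M3 (Apple doesn't expose AMX instructions publicly, so whether a given call lands on AMX is an assumption about Accelerate's internals).

```c
// Minimal sketch: two of the matrix-math paths on Apple Silicon.
// Compile on macOS with: clang -O2 demo.c -framework Accelerate
#include <arm_neon.h>
#include <Accelerate/Accelerate.h>
#include <stdio.h>

// Path 1: CPU SIMD/NEON -- a 4-wide fused-multiply-add dot product.
static float neon_dot(const float *a, const float *b, int n) {
    float32x4_t acc = vdupq_n_f32(0.0f);
    int i = 0;
    for (; i + 4 <= n; i += 4)
        acc = vfmaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    float sum = vaddvq_f32(acc);            // horizontal add of the 4 lanes
    for (; i < n; i++) sum += a[i] * b[i];  // scalar tail
    return sum;
}

int main(void) {
    enum { N = 8 };
    float a[N * N], b[N * N], c[N * N];
    for (int i = 0; i < N * N; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    // Path 2: Accelerate's BLAS does a whole matrix multiply; on M1-M3
    // this is the sanctioned route to the matrix coprocessor.
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                N, N, N, 1.0f, a, N, b, N, 0.0f, c, N);

    printf("dot = %f, gemm[0] = %f\n", neon_dot(a, b, N * N), c[0]);
    return 0;
}
```

Paths 3–5 need different entry points entirely (SME intrinsics/streaming mode, Metal compute shaders, and a CoreML model respectively), which is part of why they can coexist without stepping on each other.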