
A single V100 board is vastly smaller than a TPU module with 4 TPU chips.

The closest comparison in terms of size would be 1 Volta DGX-1 (8x V100s) compared to 2 TPU modules (8x TPU2 chips).

Volta DGX-1: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...

TPU Module: https://storage.googleapis.com/gweb-uniblog-publish-prod/ima...

And for completeness, this is the size of a single V100: https://cdn.arstechnica.net/wp-content/uploads/2017/05/34446...

You can see that 8x V100s are still more computationally dense than 8x TPU2 chips. Density is a very important factor in datacenter design.
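As a rough back-of-envelope (using the vendor-claimed peak numbers from 2017: NVIDIA quoted ~120 TFLOPS of Tensor Core throughput per V100, and Google quoted 180 TFLOPS per 4-chip TPU2 module; treat both as peak marketing figures, not measured throughput):

```python
# Peak-throughput comparison for 8 chips of each type, vendor numbers circa 2017.
V100_TFLOPS = 120          # NVIDIA's claimed Tensor Core peak per V100 chip
TPU2_MODULE_TFLOPS = 180   # Google's claimed peak per 4-chip TPU2 module

dgx1_peak = 8 * V100_TFLOPS                 # one DGX-1: 8x V100
two_modules_peak = 2 * TPU2_MODULE_TFLOPS   # two TPU modules: 8x TPU2 chips

print(f"DGX-1 (8x V100):        {dgx1_peak} TFLOPS peak")
print(f"2 TPU modules (8x TPU2): {two_modules_peak} TFLOPS peak")
```

On those claimed numbers, 8 V100s come out well ahead of 8 TPU2 chips, which is the density point above.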



From my read of that Volta picture (here's a close-up glamour shot of the V100 module: https://devblogs.nvidia.com/parallelforall/wp-content/upload... ), that's just the chip module itself. Note how none of the interconnect hardware is present? The V100 supports NVLink just like the P100. When you actually assemble that chip module onto a board, it looks more like this:

http://hothardware.com/ContentImages/NewsItem/37034/content/...

(That's a P100 server, obviously, but it's the same kind of design. The P100 module looks a lot like the V100 module.)


The interconnect hardware is present on the DGX-1 board.

The point of my previous comment is that it doesn't make sense to compare an entire TPU module (containing 4x TPUs) to a V100 board.

The closest comparison to a TPU module (with 4x TPUs) would be a DGX-1 board, which contains the NVLink bus you mention, but also 8x V100 boards. That's why in my previous post I said you should compare the compute performance of one Volta DGX-1 (8x V100s) to two TPU modules (8x TPUs).

At the end of the day it's simple: in a given area of space, you get more compute performance by provisioning that area with V100s (in the form of DGX-1s) than with TPUs (in the form of TPU modules).


> Density is a very important factor in datacenter design.

That's true if you're putting the chips in your own data center, because density affects TCO.

But assuming that Google does not sell TPU hardware, this metric matters to no one other than Google. The important question is: what does the curve plotting "dollars spent" versus "time spent waiting for my model to train" look like?

The shape of that curve is affected by TCO, and TCO is certainly affected by density. But there's a lot more to it than that.


Sure, no argument there. For Google, producing and using TPUs is probably cheaper than buying Nvidia hardware.



