
From the link:

> Tortoise is a bit tongue in cheek: this model is insanely slow. It leverages both an autoregressive decoder and a diffusion decoder; both known for their low sampling rates. On a NVidia Tesla K80, expect to generate a medium sized sentence every 2 minutes.

I suspect that for a real(-ish)-time TTS system, something else is needed. OTOH, if you want to record some voice acting for a game or other multimedia product, it may still be more cost-effective than recording a bunch of live humans.

(K80 = NVidia Tesla K80 GPU; $800-900 for a 24GB version right now.)
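
For a sense of what "a medium sized sentence every 2 minutes" means in practice, here's a minimal sketch of driving Tortoise from Python and timing one generation. It assumes the tortoise-tts package is installed and follows the API from the project's README (TextToSpeech, load_voice, tts_with_preset); 'tom' is one of the bundled demo voices.

    import time
    import torchaudio
    from tortoise.api import TextToSpeech
    from tortoise.utils.audio import load_voice

    tts = TextToSpeech()  # downloads model weights on first run
    voice_samples, conditioning_latents = load_voice('tom')

    start = time.time()
    gen = tts.tts_with_preset(
        "Thanks for reading this article all the way to the end.",
        voice_samples=voice_samples,
        conditioning_latents=conditioning_latents,
        preset='fast',  # also: 'ultra_fast', 'standard', 'high_quality'
    )
    print(f"generated in {time.time() - start:.0f}s")

    # Tortoise outputs 24kHz audio
    torchaudio.save('generated.wav', gen.squeeze(0).cpu(), 24000)

The 'fast' preset trades sample quality for fewer autoregressive/diffusion steps; on a K80 even that should land in the minutes-per-sentence range the README quotes.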



I see 24GB Tesla K80s on eBay for $90... what am I missing?


A K80 is extremely old by now, so I'd expect a modern card to run this maybe an order of magnitude faster.


Would it still require a 3080 to run adequately, that is, with 1-2 seconds of delay? I've no idea what consumer-grade hardware works well for ML loads.


I haven't tried it, but the K80 is about 6 years old / 5 generations back. There have been massive leaps since then.


6 years old is nowadays more like 3 generations, and it's definitely not an order of magnitude (10x) of difference.


Kepler, Maxwell, Pascal, Volta, Turing, Ampere, Ada Lovelace, Hopper: counting the microarchitectures, it's about 6 generations old. That would be about a 10x improvement.
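
(For the curious: a quick way to see which microarchitecture a card belongs to is its CUDA compute capability. A minimal PyTorch sketch, with the capability-to-architecture mapping from NVIDIA's standard numbering in the comments:)

    import torch

    # Compute capability -> microarchitecture (NVIDIA's numbering):
    # 3.x Kepler (the K80 is 3.7), 5.x Maxwell, 6.x Pascal, 7.0 Volta,
    # 7.5 Turing, 8.0/8.6 Ampere, 8.9 Ada Lovelace, 9.0 Hopper.
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")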


Oh, if it's Kepler, absolutely. I thought 6 years, thus Ampere.



