Hacker News
syntaxing
1 day ago
on:
Pocket TTS: A high quality TTS that gives your CPU...
Is there something similar for STT? I'm using distilled Whisper models and they work OK, but sometimes they get what I say completely wrong.
daemonologist
1 day ago
Parakeet is not really more accurate than Whisper, but it's much faster - faster than realtime even on CPU: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3. You have to use NeMo though, or mess around with third-party conversions. (It also has a big brother, Canary: https://huggingface.co/nvidia/canary-1b-v2. There's also the confusingly named/positioned Nemotron speech: https://huggingface.co/nvidia/nemotron-speech-streaming-en-0...)
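A rough way to check a "faster than realtime" claim on your own machine is to compute the real-time factor: wall-clock transcription time divided by audio duration, where anything below 1.0 is faster than realtime. The sketch below is only an illustration: `real_time_factor` and `wav_duration_seconds` are hypothetical helper names, `transcribe` is any callable you supply (nothing here is tied to a specific STT library), and it assumes an uncompressed PCM WAV input.

```python
import time
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def real_time_factor(transcribe: Callable[[str], str], path: str) -> float:
    """Time one call to `transcribe` and divide by the audio duration.

    `transcribe` is a placeholder for whatever STT call you want to measure.
    A result below 1.0 means the call ran faster than realtime.
    """
    duration = wav_duration_seconds(path)
    if duration <= 0:
        raise ValueError("audio file has zero duration")
    start = time.perf_counter()
    transcribe(path)  # the transcript itself is ignored; only timing matters
    elapsed = time.perf_counter() - start
    return elapsed / duration
```

To compare models, pass each one's transcription call as `transcribe` on the same file. A single run includes any lazy model loading and warm-up, so averaging a few runs after the first gives a fairer number.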
satvikpendem
1 day ago
Keep in mind Parakeet is pretty limited in the number of languages it supports compared to Whisper.
jokethrowaway
1 day ago
Parakeet feels much more accurate in practice than Whisper. It was a real "a-ha" moment for me.
Of course, it's English only.
phoronixrly
1 day ago
From the other day: https://github.com/cjpais/Handy