WellSaid Labs uses deep generative models to create hyperrealistic voice-overs for high-quality media content like books (i.e. audiobooks), videos, assistive devices, call centers, video games, resurrected celebrities, etc. The voice-over market alone is worth $5 billion.
We have also secured substantial seed funding from top-tier VCs and are building out our founding team. Finally, we are a spin-out from the Allen Institute for Artificial Intelligence (a.k.a. Paul Allen's AI lab).
You'll work in one of these roles:
- Full stack engineer (React / Node.js / GCP)
- Deep learning engineer/researcher (PyTorch / Python)
- Deep learning performance engineer (C++)
With WellSaid Labs, you'll help build one of the first commercial core deep learning products.
For context, it's important to know that these are probably cherry-picked samples. The authors make no mention of attempting to select these samples randomly. For as long as text-to-speech has existed, there have been impressive demos backed by cherry-picking.
The three Dessa team members probably did not create anything innovative in three months of work. Rayhane Mamah, one of the Dessa team members, had previously published a Tacotron 2 implementation (Google's 2017 research; https://github.com/Rayhane-mamah/Tacotron-2) that has noise/distortion and intonation/prosody issues similar to their "RealTalk" model.
Following on from the above, Google's TTS research had already demonstrated human parity, as measured by MOS, in early 2018. That research was deployed as Google Duplex in mid-2018.
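For readers unfamiliar with MOS (Mean Opinion Score): listeners rate each audio sample on a 1-to-5 scale, and the MOS is the mean rating, typically reported with a 95% confidence interval. A minimal sketch of the computation, using made-up listener ratings:

```python
import math
import statistics

# Hypothetical listener ratings for one TTS sample (1 = bad, 5 = excellent)
ratings = [4, 5, 4, 4, 3, 5, 4, 4, 5, 4]

# MOS is simply the mean rating
mos = statistics.mean(ratings)

# 95% confidence interval via the normal approximation
ci95 = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))

print(f"MOS = {mos:.2f} +/- {ci95:.2f}")  # MOS = 4.20 +/- 0.39
```

"Human parity" in this context means the model's MOS is statistically indistinguishable from the MOS that human recordings receive from the same pool of listeners.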
Google's TTS research also showed the deficiencies of this technology. Short of the invention of AGI, TTS models do not understand the underlying text; therefore, they are unable to do more complex things with intonation/prosody. Furthermore, the models suffer from overfitting: performance degrades significantly when performing TTS on text unlike what is typically seen in the training data.
WellSaid Labs uses deep generative models to create hyperrealistic voice-overs for books (i.e. audiobooks), videos, assistive devices, call centers, video games, etc.
We have also secured substantial seed funding from top-tier VCs and are building out our founding team. We are a spin-out from the Allen Institute for Artificial Intelligence (a.k.a. Paul Allen's AI lab).
You'll work in one of these roles:
- Full stack engineer
- Infrastructure engineer
- Deep learning engineer / researcher
- Deep learning performance engineer
You'll pioneer the first commercial editor built on deep generative models.
Email michael[at]wellsaidlabs[dot]com to apply.
----------------------------------
PRESS:
https://techcrunch.com/2019/03/07/wellsaid-aims-to-make-natu....
https://www.geekwire.com/2019/ai2s-incubator-gives-birth-wel....