Pretty cool. Rough and fuzzy around the edges, some of the usual speech synthesis cadence issues. Any insight into the techniques you're using to do this?
This uses HTS, the HMM-based Speech Synthesis System (http://hts.sp.nitech.ac.jp). Yes, it is expected to sound robotic, since no recorded audio is used during synthesis. I believe some improvements can be made (a larger training set and some post-processing), but I don't think they will change the result dramatically. Thanks for trying it.
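To give a rough feel for why parametric output tends to sound buzzy, here is a toy sketch that builds a waveform from per-frame pitch values alone, with no recorded audio involved. This is not HTS code and not how HTS is implemented: the function name `synthesize`, its parameters, and the single-pole filter standing in for a real vocoder are all assumptions made up for illustration.

```python
def synthesize(f0_per_frame, sample_rate=16000, frame_len=80, pole=0.95):
    """Toy parametric synthesis: pulse-train excitation through a one-pole filter.

    f0_per_frame: pitch in Hz for each frame; 0 marks an unvoiced frame.
    Hypothetical helper for illustration only, not part of HTS.
    """
    samples = []
    since_pulse = 0   # samples elapsed since the last pitch pulse
    prev = 0.0        # previous filter output
    for f0 in f0_per_frame:
        # Pitch period in whole samples; 0 means no pulses for this frame.
        period = round(sample_rate / f0) if f0 > 0 else 0
        for _ in range(frame_len):
            excitation = 0.0
            if period:
                since_pulse += 1
                if since_pulse >= period:
                    excitation = 1.0
                    since_pulse = 0
            # Simplification: unvoiced frames get silence here, whereas a
            # real vocoder would typically use noise excitation.
            prev = excitation + pole * prev
            samples.append(prev)
    return samples


# Ten frames at a flat 100 Hz: perfectly regular pulses, hence the "robotic" quality.
wave = synthesize([100.0] * 10, frame_len=160)
```

Every pitch period in this output is identical, which is part of what natural speech lacks. A real system smooths and varies these parameters far more carefully, but the waveform is still generated rather than recorded.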