Hacker News | mgaudet's comments

Eep.

So, on my M1 Mac, I ran `uvx pocket-tts serve` and plugged in:

> It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way—in short, the period was so far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only

(the beginning of A Tale of Two Cities)

but the problem is that the Javert voice skips over parts of sentences! E.g., it starts:

> "It was the best of times, it was the worst of times, it was the age of wisdom, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the spring of hope, it was the winter of despair, we had everything before us, ..."

Notice how it skips over "it was the age of foolishness," and "it was the season of Darkness,".

Which... doesn't exactly inspire faith in a TTS system.

(Marius seems better; posted https://github.com/kyutai-labs/pocket-tts/issues/38)


All the models I tried have similar problems. When batch-generating a whole audiobook, the only reliable approach is to run the TTS, then run a speech-to-text model over the output and check that you get the same text back.
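The round-trip check described above boils down to diffing the input text against the transcript. A minimal sketch of that comparison step (assuming you already have a transcript from some STT model; `find_skipped` is a hypothetical helper, not part of any package mentioned here):

```python
import difflib
import re


def normalize(text):
    # Lowercase and keep only word characters so punctuation and
    # capitalization differences don't register as errors.
    return re.findall(r"[a-z']+", text.lower())


def find_skipped(source, transcript):
    """Return word spans present in the source text but missing
    (or replaced) in the transcript, via a word-level diff."""
    src, out = normalize(source), normalize(transcript)
    matcher = difflib.SequenceMatcher(a=src, b=out, autojunk=False)
    skipped = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            skipped.append(" ".join(src[i1:i2]))
    return skipped


source = ("it was the age of wisdom, it was the age of foolishness, "
          "it was the epoch of belief")
transcript = "it was the age of wisdom, it was the epoch of belief"
print(find_skipped(source, transcript))
```

The diff boundaries won't always align with clause boundaries (a dropped phrase may absorb neighboring repeated words like "it was the"), but it's enough to flag which chunks need regenerating.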

Yeah, Javert mangled those sentences for me as well; it skipped whole parts and also moved words around:

- "its noisiest superlative insisted on its being received"

Win10 RTX 5070 Ti


Václav from Kyutai here. Thanks for the bug report! A workaround for now is to chunk the text into smaller parts where the model is more reliable. We already do some chunking in the Python package. There is also a fancier way to do this chunking that ensures the stitched-together parts continue well (teacher-forcing), but we haven't implemented that yet.
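The package's actual chunking isn't shown here, but the basic idea is to split on clause boundaries and greedily pack clauses up to a length the model handles reliably. A rough sketch (`chunk_text` and the `max_words` threshold are illustrative assumptions, not the package's API):

```python
import re


def chunk_text(text, max_words=40):
    # Split on clause-ending punctuation, keeping the punctuation
    # attached to the clause via a lookbehind.
    clauses = re.split(r"(?<=[,.;!?])\s+", text.strip())
    chunks, current = [], []
    for clause in clauses:
        # Flush the current chunk if adding this clause would
        # push it past the word budget.
        if current and len(" ".join(current + [clause]).split()) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.append(clause)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk is then synthesized separately and the audio concatenated; the teacher-forcing refinement mentioned above would condition each chunk's generation on the previous chunk so the prosody doesn't reset at every seam.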

Is this just sort of expected for these models? Should users expect only truncation, or can hallucinated bits happen too?

I also find that Javert in particular seems to put in huge gaps and pauses... a side effect of the voice?


Using your first text block, 'Eponine' skips "we had nothing before us" and doesn't speak the final "that some of its noisiest".

I wonder what's going wrong in there


Interesting; it skipped "we had everything before us," in my test. Yeah, not a good sign.


It's too bad the title prefix "75x Faster" got dropped.


I stumbled a bit using jj on a big repo [1], but I too am very interested in seeing it grow and evolve.

I plan to return to my experiment sometime; I would love tips on large repos and making it more manageable.

[1]: https://www.mgaudet.ca/technical/2023/11/23/exploring-jujits...


For speed on large repos, you can try using `watchman`. It's briefly documented at https://martinvonz.github.io/jj/latest/config/#filesystem-mo....
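Per the linked docs, enabling watchman is a one-line config change. As of recent jj versions it looks roughly like this (verify against the docs above, since the filesystem-monitor integration has been experimental):

```toml
# In your jj config (e.g. via `jj config edit --user`)
[core]
fsmonitor = "watchman"
```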


https://www.mgaudet.ca/ for general interest stuff, https://www.mgaudet.ca/technical for tech stuff.

Some good ones IMO:

- Implementing Private Fields for JavaScript https://www.mgaudet.ca/technical/2021/5/4/implementing-priva...

- Histories, by Herodotus: https://www.mgaudet.ca/blog/2020/11/26/histories-by-herodotu...


Big thumbs up to the Landmark Histories project (http://thelandmarkancienthistories.com/). Excellent production values make reading these histories much more manageable -- you're 100% right on the frequent maps and good appendixes.

Don't have Anabasis, but Herodotus was excellent.


Would also be interesting to see if this is a Java issue or a JVM implementation issue: i.e., does an OpenJ9-based JVM (available under the weird IBM Semeru name here [1]) have similar behaviour?

[1]: https://developer.ibm.com/languages/java/semeru-runtimes/


Interesting. OpenJ9 has something similar in preview: https://blog.openj9.org/tag/jitserver/


So I thought this whole thing was bullshit, then I read the blog post from OpenJ9.

I changed my mind; this image is really what they ought to be showing. It gets the point across:

https://i0.wp.com/blog.openj9.org/wp-content/uploads/2021/09...

What you're looking at is the allocation of multiple apps inside nodes in a cluster. With a JIT server in each node, the memory required for each instance of the app is reduced, so more instances fit in the same node size than before.
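The packing effect is easy to see with a toy calculation (all numbers below are made up for illustration, not measurements from the OpenJ9 post):

```python
# Toy numbers (made up) illustrating how offloading JIT compilation
# to a shared per-node server can increase app instances per node.
NODE_MEM_MB = 8192
APP_MB_WITH_LOCAL_JIT = 1500   # heap + local JIT compiler scratch space
APP_MB_WITH_JITSERVER = 1100   # heap only; compilation happens remotely
JITSERVER_MB = 1000            # one shared JIT server instance per node

local = NODE_MEM_MB // APP_MB_WITH_LOCAL_JIT
remote = (NODE_MEM_MB - JITSERVER_MB) // APP_MB_WITH_JITSERVER
print(local, remote)  # prints: 5 6
```

The shared server "costs" one slot's worth of memory but shrinks every app enough to come out ahead, which is exactly the RPG-equipment analogy below.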

It reminds me of "bonuses" on equipment in RPGs. You lose 1 equipment slot to have it taken up by the item, but in return the rest of your equipment gets a bonus that more than makes up for the slot you can't use now.


I suppose ahead-of-time compiling everything so there is no compiler in any application node is a technique that is still some years away.


The article actually addresses this. For a language like Java, the JIT is the source of a ton of its performance.

Think of it more like continuous PGO, but the delta perf improvement is much higher.


The problem is that this requires a closed-world approach, or constrained usage of reflection.

Yes, there have been AOT compilers since around 2000; however, you will notice that they target specific deployment cases and also offer JIT caches as an alternative.

Actually that is also how Android rebooted their AOT efforts in version 7.

You can see it in GraalVM and native image as well.


I too would love this icon, and would take a PR from teams who can speak to that.

It's often hard to judge!


Yeah, I posted a blog post [1] that explains my motivation in curating this list. Seeing the broad range of companies that do this kind of work was really helpful early in my career.

[1]: https://www.mgaudet.ca/technical/2019/12/10/compiler-jobs

