Major improvements to audio at Zencoder, and why this matters (zencoder.com)
63 points by jon_dahl on July 20, 2011 | 23 comments


General nitpick for articles like this: if you improved X about your service by switching from application A to application B, simply say it. Phrasing like that used in this article implies (falsely) that Zencoder wrote their own encoder.

"We made our web server 5 times faster!" might sound amazing, but if you really mean "we switched from Apache to Nginx", it's misleading.

Also, FAAC is widely known to be pretty bad. Even LAME MP3 is quite a bit better at 128kbps, to say nothing of Vorbis, good AAC encoders like Nero or Apple, and Opus/CELT.


Yep, we didn't write our own encoder - we evaluated and licensed a third party encoder. Article edited to make this more clear. The problem (at this point at least) is that our software licensing agreement with the third party includes confidentiality, so we can't name the software without permission. Stupid, I know. But you can probably guess, or figure out, what it is. ;)


I've seen confidentiality the other way (where a vendor can't say who their customers are) but why would someone not want a customer to tell the world how great their software is?


Only reason I can think is that if the customer messes up the implementation and creates a subpar service with it then it reflects poorly on the source provider.

Actually another reason might be that the source provider doesn't want other users to know about each other to compare licensing agreements and such.


I'd guess it was written by a lawyer who was more concerned about a customer telling the world how great their software isn't. And a blanket "do not mention" statement is probably safer than a "do not mention negatively".


I get what you're saying, but Zencoder's product isn't an encoder; it's a cloud-based encoding service (we're happy customers). As a consumer, the actual implementation is a black box. As an engineer, I appreciate the process they went through, and I also appreciate that they explained how they got there.

If Heroku switched from Apache to Nginx and saw a marked improvement in performance, I'm fine with them saying "We're serving pages 5x faster".


I thought it was a well-known fact that FAAC produced very bad output. Or at least well known by anyone who might be working for a company for whom AV compression is all they do.


I'd say this is well known with anyone working with open source video tools. And yet it's still in use all over the place, sadly.


So after all that, what encoder are you using?


Which AAC encoder does iTunes use? Is it considered pretty good?


Yes - iTunes uses Quicktime, and Quicktime has one of the better AAC encoders. Unfortunately, it's not easily executable (or licensable) on Linux.


Can you comment on which actual encoder you're now using, or do you consider that a secret?


Yep, unfortunately we can't right now. Maybe down the road?


Looks like a secret: http://news.ycombinator.com/item?id=2787570 (a comment elsewhere on this page).


They use their own encoder. Apple's encoder is considered one of the best (slightly better than Nero).


Apple licensed an encoder. I forget the company that made it - it's mentioned in the dll somewhere.


I found the ringing in the new 16kHz sample to be a bit bothersome. It sounds like there's a tonal peak in the frequency response near the upper end of the frequency range. The peak would be less bothersome if it were spread out more.

Edit: is the new resampling done by something like Shibatch SSRC or libsamplerate's SRC?


It sounds a bit better outside of Flash, actually. Flash prefers 22050 or 44100 audio and doesn't do a great job of resampling others up to 44100 (which I think is what happens behind the scenes). Give it a listen in Chrome or some such and see what you think.

We aren't using those libraries right now, but we'll check them out. How do they compare to something like SoX?
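To illustrate the point above about non-standard sample rates: going from 22050 Hz to 44100 Hz is an exact doubling, while 16000 Hz to 44100 Hz needs a fractional ratio, which is harder to do well. The sketch below is only an illustration under stated assumptions. The function names `resample_ratio` and `linear_resample` are hypothetical, and the naive linear interpolation shown is not claimed to be what Flash, SoX, or Zencoder actually do; it just shows the kind of cheap resampling that tends to produce audible artifacts.

```python
from fractions import Fraction


def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Reduced ratio of output samples to input samples (hypothetical helper)."""
    return Fraction(target_hz, source_hz)


def linear_resample(samples: list[float], source_hz: int, target_hz: int) -> list[float]:
    """Naive linear-interpolation resampler, with no filtering at all.

    Assumption: this is a deliberately low-quality illustration, not the
    algorithm used by any particular player or encoder.
    """
    if not samples:
        return []
    ratio = resample_ratio(source_hz, target_hz)
    out_len = int(len(samples) * ratio)
    out = []
    for n in range(out_len):
        pos = Fraction(n) / ratio          # position measured in input samples
        i = int(pos)                       # input sample at or before pos
        frac = float(pos - i)              # distance past that sample, 0 <= frac < 1
        nxt = samples[min(i + 1, len(samples) - 1)]  # clamp at the last sample
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


if __name__ == "__main__":
    print(resample_ratio(22050, 44100))  # exact integer ratio
    print(resample_ratio(16000, 44100))  # fractional ratio
```

Higher-quality converters such as the ones discussed in this thread replace the linear interpolation step with carefully designed filters, which is where the trade-offs in passband width and time-domain response come from.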


I haven't used SoX extensively since Mandriva was called Mandrake. It looks like the newest version of SoX has a very good converter. The comparison graphs I just found and linked below show that the different converters seem to optimize for different parameters; Shibatch SSRC appears to focus on passband width, while libsamplerate's Secret Rabbit Code has better time domain response. If you're using SoX VHQ, you probably can't do much better, unless you find that its time domain response causes undesirable artifacts.

As for the 16kHz sample, listening to the .m4a files directly did give much better results than playing them through Flash.

Anyway, good luck with Zencoder.

http://sox.sourceforge.net/SoX/Resampling

http://src.infinitewave.ca/


This is awesome Jon! What audio processing tricks did you use?


Sorry for the slow response. For the most part, it's not tricks, at least for now - more like using the best available software for resampling, channel mapping, etc. We don't (for instance) automatically mess with normalization/gain, though users can do that through our API.


Can I just have lossless audio, please?


Sure. FLAC is great, and it's on our roadmap.

We're in a funny place right now where both higher bitrate and lower bitrate video/audio are becoming increasingly important. Low bitrate because of mobile video and adaptive bitrate streaming, and high bitrate because bandwidth is getting cheaper.



