
That's the principle, but not how it really works. As you said, it's only an analogy, but one that comes close enough to sounding like the real thing to be potentially confusing.

Audio codecs don't try to find functions that describe waves, they sample the waves.

What you get from a microphone or an electric guitar is a changing voltage, a wave of some sort. To digitize that wave (i.e. to save it using numbers and not vinyl discs or magnetic tape) you take samples (i.e. measure the voltage) at regular intervals. You typically might do that 48k times or 96k times per second (i.e. have a sampling frequency of 48kHz or 96kHz).
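A minimal sketch of that sampling step (the 440 Hz sine standing in for the changing voltage is my own example, not from the comment above):

```python
import math

SAMPLE_RATE = 48_000  # 48 kHz, i.e. 48,000 measurements per second
FREQ = 440.0          # hypothetical input: a 440 Hz sine "voltage"

# One second of samples, taken at regular intervals.
samples = [math.sin(2 * math.pi * FREQ * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]

print(len(samples))  # 48000 samples for one second of sound
```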

What you do then with those samples is quantize them, i.e. map each sample to an integer depending on how high or low the voltage is, for example a 16-bit number.
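Sketched in code, with the usual convention that the analog voltage has been scaled into [-1.0, 1.0] (an assumption on my part, not stated above):

```python
# Map a voltage in [-1.0, 1.0] to one of 65,536 signed 16-bit levels.
def quantize_16bit(x: float) -> int:
    q = round(x * 32767)               # round to the nearest level
    return max(-32768, min(32767, q))  # clamp to the signed 16-bit range

print(quantize_16bit(1.0))   # 32767: full positive scale
print(quantize_16bit(-1.0))  # -32767
print(quantize_16bit(0.0))   # 0
```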

That's how an analog-digital converter works on a basic level (you also need a filter to get rid of all the frequencies you don't want to digitize). This process is of course lossy. It's not possible to reconstruct the wave that was digitized perfectly, it's only an approximation. All analog-digital converters are lossy (in some sense of the word). If you sample at 48kHz all frequencies higher than half of that (24kHz) are irrevocably lost (look up the Nyquist-Shannon Sampling Theorem if you want to know more). Quantization also means that you have to round the voltage you are measuring up or down; the audible effect this has on the sound is that it adds noise.
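The rounding error can be made concrete (the particular voltage value here is an arbitrary example of mine):

```python
# Rounding during quantization throws information away: de-quantizing
# gives back something close to, but not exactly, the original voltage.
x = 0.1234567                 # hypothetical measured voltage, scaled to [-1, 1]
q = round(x * 32767)          # quantize to a 16-bit level
x_back = q / 32767            # reconstruct the voltage from the integer
error = abs(x - x_back)
print(error < 1 / 32767)      # True: the error is bounded by one step
```

That per-sample error is exactly the quantization noise mentioned above.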

The parameters are, however, usually chosen in such a way that a human ear can't tell the difference. A young, healthy ear can hear frequencies up to about 20kHz; that's why the music on CDs is sampled at 44.1kHz (Nyquist requires at least 2 × 20kHz = 40kHz, plus some safety margin).
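As a quick check of that arithmetic:

```python
SAMPLE_RATE = 44_100       # CD audio
nyquist = SAMPLE_RATE / 2  # highest frequency that survives sampling
print(nyquist)             # 22050.0, above the ~20 kHz limit of human hearing
```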

But that's only the conversion and it has not much to do with lossy codecs.

A lossless codec takes the result of this digitization (let's say 48k 16-bit numbers for one second of sound) and encodes them in such a way that those exact same 48k 16-bit numbers can be reconstructed after decoding.
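A sketch of that bit-exact contract, using zlib as a stand-in for a real lossless audio codec (FLAC and friends are tuned for audio, but the guarantee is the same):

```python
import struct
import zlib

# Hypothetical one-second clip: 48,000 signed 16-bit samples (a simple ramp).
samples = list(range(-24_000, 24_000))
raw = struct.pack(f"<{len(samples)}h", *samples)

# Lossless round-trip: decode must return the exact same numbers.
compressed = zlib.compress(raw)
restored = list(struct.unpack(f"<{len(samples)}h", zlib.decompress(compressed)))

print(restored == samples)  # True: every sample comes back bit-identical
```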

Re-encoding a sound file with a lossless codec can consequently still be lossy: If you change the parameters (e.g. you halve the sampling rate) you are still losing information - but it's at least possible to preserve the result of the original digitization perfectly.

A lossy codec doesn't aim for perfect reconstruction of all the numbers. It uses quirks in human hearing to be more or less precise about how those numbers are reconstructed. For example: A really loud sound masks quieter sounds. It's not possible for humans to hear those quieter sounds, so why bother with all that precision for them? That's where, for example, MP3 can dynamically decrease the quality. It's quite ingenious, really.
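A toy illustration of that "spend fewer bits where the ear won't notice" idea. Real codecs like MP3 do this per frequency band after a psychoacoustic analysis; this per-sample version is only a sketch of mine, not how MP3 actually works:

```python
# Quiet or masked samples get a coarser quantization step: fewer distinct
# levels means fewer bits needed, at the cost of precision nobody can hear.
def lossy_quantize(sample: int, masked: bool) -> int:
    step = 256 if masked else 1   # coarser steps for masked material
    return round(sample / step) * step

print(lossy_quantize(12345, masked=False))  # 12345: preserved exactly
print(lossy_quantize(12345, masked=True))   # 12288: close, but not exact
```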

It is of course inevitable that during that process information is lost. It's not possible to get those original 48k numbers back, but you at least get something that sounds more or less like them to a human ear.



> that's why the music on CDs is sampled at 44.1kHz (40kHz with some safety margin)

A nice explanation of the 44.1kHz is here: http://www.cs.columbia.edu/~hgs/audio/44.1.html

Initially, digital audio was recorded on video tape, so the sample rate was constrained by the commonly used video formats and the recording equipment that came with them.



