The Inqualified Scientist: Digital Music
So did you hear the news? By “hear”, of course, I’m referring to the conversion of vibrations in the air or other conducting medium to electrical impulses in your brain. Outside of your ear (an appropriate sensor responsible for the conversion), this sound travels as waves of vibrations.
Imagine one of those humidity and temperature detectors in museums, where a needle records the current temperature on a moving strip of paper. Instead of measuring temperature, our device will measure the vibrations of the air — the needle will draw the movement of the air particles on very fast moving paper.
If I were to play a very pure tone, the paper would show a smooth wave — the needle would move up and down smoothly, faster for higher tones and larger waves for louder tones. If (more likely) I were to belch at the device, the sound wave I produce would look a lot more random and jagged. It would be relatively flat, or gently waving while I inhaled quietly, then a sharp burst with the attack of my belch decaying into a long and regular afterburp.
The shape the needle draws on the paper (often called a wave even if it isn’t smooth and regular like a pure tone) captures the sound I made. This is called a time-domain representation because the moving paper shows the intensity of the sound as time passes.
All I need to do to play back the sound is to build a machine that scrolls through my belch paper and emits vibrations according to the intensity shown by the wave. This is the same principle behind the speaker on a record player, where the sound is stored in jagged grooves on a vinyl disc.
Going back to the device in the museum that stores daily temperature on a strip of paper, imagine that all the museums in the world want to share their results as part of ongoing research into the effects of temperature variation on their oldest works. Given that they are only connected by voice phone lines, how can they share the precise shape of the temperature waves?
A good answer is that they decide what precision is important to their research, then they read the results over the phone party line — at noon, it was 19.5 degrees, at one o’clock it was 19.6 degrees, at two it was 19.8 degrees, etc. The researchers at the other end can write the numbers down and sketch the approximate shape. The more precise the temperature readings (19.821 vs 19.8) and the more frequently they’re read (every minute instead of every hour), the more precise the sketch at the other end. The sketch can never exceed the original in accuracy, but using the same numbers, all of the museums can approximate the original wave sufficiently for their needs.
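The phone-line scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperature curve `temperature_at` is invented for the example, and the researchers at the other end are assumed to draw their sketch by joining consecutive readings with straight lines.

```python
import math

def temperature_at(hour):
    """Hypothetical museum temperature curve in degrees, invented for illustration."""
    return 19.5 + 0.8 * math.sin(2 * math.pi * hour / 24)

def read_over_phone(interval_hours, decimals, total_hours=24):
    """Return (time, rounded reading) pairs, as they would be read aloud."""
    count = int(round(total_hours / interval_hours)) + 1
    return [(i * interval_hours, round(temperature_at(i * interval_hours), decimals))
            for i in range(count)]

def sketch(readings, hour):
    """Rebuild the curve at `hour` by joining neighbouring readings with a straight line."""
    for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
        if t0 <= hour <= t1:
            return v0 + (v1 - v0) * (hour - t0) / (t1 - t0)
    raise ValueError("hour is outside the recorded range")

def worst_error(readings, total_hours=24, checks=2400):
    """Largest gap found between the sketch and the true curve."""
    hours = [total_hours * k / checks for k in range(checks + 1)]
    return max(abs(sketch(readings, h) - temperature_at(h)) for h in hours)

# Coarse: one decimal place every three hours. Fine: three decimals every 15 minutes.
coarse = read_over_phone(interval_hours=3, decimals=1)
fine = read_over_phone(interval_hours=0.25, decimals=3)
print(worst_error(coarse), worst_error(fine))
```

Under these assumptions the fine readings give a noticeably smaller worst-case error than the coarse ones, but neither sketch is ever better than the original curve.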
This is the difference between analog and digital. At some point, all of the data is converted to numbers.
If someone were to convert my belch wave into numbers, they could get a reasonable accuracy by dividing the intensity (equivalent to the temperature) into 256 different levels, and taking readings 8000 times a second. BRAAP. This is equivalent to telephone quality sound.
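Here is a rough sketch of that conversion in Python, using a pure tone instead of a belch because a tone is easier to generate. It assumes the 256 levels are evenly spaced, which keeps the example simple (real telephone systems space their levels unevenly). The function names are made up for the illustration.

```python
import math

SAMPLE_RATE = 8000   # readings per second (telephone quality, per the text)
LEVELS = 256         # distinct intensity levels, which fit in 8 bits

def sample_tone(frequency_hz, duration_s):
    """Sample a pure tone and store each reading as an integer from 0 to 255."""
    samples = []
    for n in range(int(SAMPLE_RATE * duration_s)):
        # True intensity of the wave at this instant, between -1.0 and 1.0.
        intensity = math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
        # Map -1.0..1.0 onto the nearest of the 256 evenly spaced levels.
        samples.append(round((intensity + 1) / 2 * (LEVELS - 1)))
    return samples

def to_intensity(level):
    """Convert a stored level back to an approximate intensity in -1.0..1.0."""
    return level / (LEVELS - 1) * 2 - 1

tone = sample_tone(frequency_hz=440, duration_s=1)
print(len(tone), tone[:8])
```

One second of sound becomes 8000 small integers. Converting them back with `to_intensity` recovers the wave only to within half a level, which is the price of using numbers.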
A higher quality specification is to divide the height of the wave into 65,536 levels (16 bits) and take readings 44100 times a second. Drop those numbers onto a laser-etchable substrate, and you have an audio CD.
Hooray for audio CDs! They’re a great source of high quality, raw audio data. Let’s do some calculations: 16 bits per sample, 44.1 thousand samples per second and 2 channels (stereo music delivers a separate signal to each ear) is about 1,411 thousand bits per second, or 1,411 kbps (kilobits per second). This is the bitrate.
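The arithmetic is easy to check in Python. The helper name `bitrate_bps` is made up for this sketch, and the telephone line assumes a single channel at 8 bits (256 levels) per sample.

```python
def bitrate_bps(bits_per_sample, samples_per_second, channels):
    """Raw (uncompressed) bitrate in bits per second."""
    return bits_per_sample * samples_per_second * channels

cd_bps = bitrate_bps(bits_per_sample=16, samples_per_second=44100, channels=2)
phone_bps = bitrate_bps(bits_per_sample=8, samples_per_second=8000, channels=1)

print(cd_bps)            # 1411200 bits per second
print(cd_bps / 1000)     # 1411.2 kbps
print(phone_bps / 1000)  # 64.0 kbps
```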
Note that 74 minutes (4440 seconds) at that rate is about 6.3 billion bits, or about 750 megabytes, which is somewhat more than the 650 megabytes we expect a standard 74-minute data CD to hold. The difference comes from the extra error correcting codes and filesystem information on a data CD.
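For anyone who wants to verify the totals, here is the same calculation as a short Python sketch. It counts a megabyte as 2^20 bytes, which is an assumption about how the figure is usually quoted.

```python
BITS_PER_SAMPLE = 16
SAMPLES_PER_SECOND = 44100
CHANNELS = 2
MINUTES = 74

bitrate = BITS_PER_SAMPLE * SAMPLES_PER_SECOND * CHANNELS  # bits per second
seconds = MINUTES * 60                                     # 4440 seconds
total_bits = bitrate * seconds
total_bytes = total_bits // 8
megabytes = total_bytes / 2**20                            # assuming 1 MB = 2**20 bytes

print(total_bits)           # 6265728000, about 6.3 billion bits
print(total_bytes)          # 783216000 bytes
print(round(megabytes, 1))  # roughly 747
```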
There you have it. Are you interested in learning why your MP3 files approach CD quality using one tenth the bitrate?
GKarlsen
Holy Smokes!!!
You know freaking everything!!
Genius!
I wonder what prompted Professor Skraba to open his office? Well, while you’ve got your door open…
Could you go past MP3s, OGGs and Fourier Transforms, and explain what’s the difference between a Fourier Transform and a Wavelet Transform? I’m interested in the whole JPEG2000 proposition and what some of the next generation video codecs are up to (Wavelets are all the rage these days).
In other news – get a load of this little gem: http://linuxdevices.com/articles/AT7933085076.html
Convergence is here! I can’t wait to get my hands on one. (you may have out-Slashdotted me, and already seen this though)
Thank you for that illuminating lesson, Prof Skraba! You are a genius indeed.
And yes, I would like to hear a technical explanation of the reason that a 128 kbps bitrate sounds like crap while 192 kbps sounds great. Have techies/doctors attempted to find out where the bitrate quality “threshold” is for the average human ear?