Wednesday, February 26, 2014

Mix: The Emperor's New Sampling Rate

Audiophiles have been singing the praises of various high-definition audio formats (SACD, DVD-A, 96kHz sampling, 192kHz sampling, 24-bit, etc.) for years, while others have claimed that it's all hype and wishful thinking.

A double-blind study (from 2008) demonstrates (and to me, proves) that the nay-sayers are right and the audiophiles are wrong.

...
According to a remarkable new study, however, the failure of new audio formats — at least the ones that claim superiority thanks to higher sample rates — to succeed commercially may in reality be meaningless. The study basically says that (with apologies to Firesign Theatre) everything you, I, Moorer and everyone else know about how much better high-sample-rate audio sounds is wrong.

The study was published in this past September's Journal of the Audio Engineering Society under the title "Audibility of a CD-Standard A/D/A Loop Inserted Into High-Resolution Audio Playback."
...
It was designed to show whether real people, with good ears, can hear any differences between “high-resolution” audio and the 44.1kHz/16-bit CD standard. And the answer Moran and Meyer came up with, after hundreds of trials with dozens of subjects using four different top-tier systems playing a wide variety of music, is, “No, they can't.”
...

The article goes on to describe the testing methodology, point out some of the criticism it has received, and explain why some people believe that these high-definition formats sound better.

The conclusions don't surprise me one bit either. They align with what I've said for many years. They (and I) conclude that the reason purchased SACD and DVD-A material sounds so much better than a CD is not the encoding format, but the fact that the engineers mixing and mastering the discs take much more care to get it right. They're not being pressured to ship it out the door as quickly as possible, or to compress the dynamic range to make it seem louder when played on the radio. You can hear this for yourself if you buy an audiophile mix (on CD) from a label like Mobile Fidelity and compare that CD to the more common CD release of the same album.

The article also points out something that I intuitively knew, but never thought much about - that the sound you hear from any system can vary greatly based on the acoustics of the room and where you are sitting. Moving your ears even a few inches can tremendously affect the sound you hear. So even if there are audible differences between two sources, they can be completely overwhelmed by the acoustics of the room itself and the precise location where you're sitting in it.

All this having been said, there is still a place for audio with high sample rates and 24-bit precision, but it's not in consumer media. It's in the studio. When digitally processing audio, every processing stage you apply adds some error to the data. With high-quality software, these errors will reside in the least-significant bits and in the highest frequencies, but some error is always added. When your source material is at a very high resolution, the errors end up in bits/frequencies that get eliminated as part of the final conversion to CD format (44.1kHz, 16-bit). If your source starts out in CD format, however, then the bits/frequencies containing the errors can't be eliminated without reducing the audio resolution to levels where the differences are audible. Since you usually don't want to force your listeners to hear low-resolution audio, and you don't want them to hear processing errors, it is best to record and process audio at higher resolutions/sample rates and cut it back to CD quality as the very last step of processing (whether that is making a CD master or a digital audio file for customers to download).
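As a rough illustration of the idea (a toy example of my own, not anything from the study), here is a short numpy sketch that compares rounding to 16 bits after every processing stage against keeping full floating-point precision and rounding only once at the end:

```python
# Toy comparison: quantize to 16 bits after every processing step vs.
# keep full precision and quantize only once, as the final "cut back
# to CD quality" step. The gain values are arbitrary stand-ins for
# mixing/mastering stages.
import numpy as np

fs = 44100
t = np.arange(fs) / fs                     # one second of audio
x = 0.5 * np.sin(2 * np.pi * 1000 * t)     # 1 kHz test tone

def quantize16(signal):
    """Round to the nearest 16-bit step (no dither, for simplicity)."""
    return np.round(signal * 32767) / 32767

gains = [0.8, 1.2, 0.9, 1.1, 0.95]         # arbitrary processing chain

lowres = x.copy()    # re-quantized after every stage (CD-resolution workflow)
hires = x.copy()     # kept in floating point (high-resolution workflow)
for g in gains:
    lowres = quantize16(lowres * g)
    hires = hires * g

# Both end up as 16-bit "consumer" copies; compare against the ideal
# result, which is quantized exactly once.
reference = quantize16(x * np.prod(gains))
err_low = np.sqrt(np.mean((quantize16(lowres) - reference) ** 2))
err_hi = np.sqrt(np.mean((quantize16(hires) - reference) ** 2))
print(f"RMS error, quantized at every stage:  {err_low:.2e}")
print(f"RMS error, quantized only at the end: {err_hi:.2e}")
```

The exact numbers depend on the material and the processing chain, but the workflow that quantizes only once generally lands much closer to the ideal result.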

None of this should come as a surprise to anyone with experience with audio. In the analog world, every stage of processing adds a little noise to the signal - similar to the errors introduced by digital processing. Good equipment adds very little noise, but some is always added. If you work with very strong signals (and have equipment that can handle them cleanly, of course), then that noise will be completely overwhelmed by the audio signal. When you trim the signal strength down to the levels required for your final media (be it vinyl, cassette or CD,) then that noise will be reduced to levels where it can not be heard (and where the media may even be incapable of reproducing it.)
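To put rough numbers on that (made up, but in the right ballpark for decent gear), a few lines of arithmetic show why recording hot and trimming at the end wins:

```python
# Back-of-the-envelope gain staging with made-up numbers: each analog
# stage adds the same small amount of noise, so working "hot" and
# trimming at the very end leaves a better signal-to-noise ratio than
# working at the final, lower level the whole way through.
import numpy as np

stage_noise = 10 ** (-90 / 20)    # each stage adds noise 90 dB below full scale
n_stages = 4

def snr_db(signal_level):
    """SNR after n_stages, each adding uncorrelated noise of fixed level."""
    total_noise = np.sqrt(n_stages) * stage_noise   # noise powers add
    return 20 * np.log10(signal_level / total_noise)

hot = 1.0                    # work at full scale, trim by 20 dB at the end
quiet = 10 ** (-20 / 20)     # work 20 dB down the whole time

# Trimming at the end attenuates the signal and the accumulated noise
# together, so the SNR earned by working hot is preserved.
print(f"SNR working hot, trimmed at the end: {snr_db(hot):.1f} dB")
print(f"SNR working 20 dB down throughout:   {snr_db(quiet):.1f} dB")
```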

As an interesting side point, this analog technique (record the strongest signal you can and trim it at playback) is the basis of the Dolby noise-reduction system used on most cassette tapes. Although there are a lot of subtle details, the overall concept of the Dolby system is that the source signal is boosted at the frequencies where tape naturally produces noise ("hiss"), and this boosted signal is recorded. At playback time, those frequencies are cut, restoring the levels of the original mix. That cut also reduces the level of the tape's inherent hiss, sometimes to the point where it is no longer audible.
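The core boost-then-cut idea can be sketched in a few lines (this is not the actual Dolby circuit, which is a level-dependent compander; the shelving filter and corner frequencies here are invented, and white noise stands in for hiss):

```python
# Greatly simplified "boost then cut" noise reduction. High frequencies
# are boosted before the "tape", hiss is added on the tape, and the
# exact inverse filter at playback restores the music while cutting the
# hiss. Not the real Dolby design - just the core idea.
import numpy as np
from scipy.signal import bilinear, lfilter

rng = np.random.default_rng(0)
fs = 44100
t = np.arange(fs) / fs
music = 0.3 * np.sin(2 * np.pi * 440 * t)      # stand-in for the program
hiss = 0.01 * rng.standard_normal(fs)          # stand-in for tape hiss

# First-order shelf: roughly 10 dB of boost above a few kHz, plus its inverse.
w1, w2 = 2 * np.pi * 1000, 2 * np.pi * 3160
pre_b, pre_a = bilinear([1 / w1, 1], [1 / w2, 1], fs)   # encode: boost highs
de_b, de_a = bilinear([1 / w2, 1], [1 / w1, 1], fs)     # decode: cut highs

plain = music + hiss                                    # no noise reduction
taped = lfilter(pre_b, pre_a, music) + hiss             # hiss added on "tape"
decoded = lfilter(de_b, de_a, taped)                    # playback de-emphasis

def residual_hiss_power(signal):
    return np.mean((signal - music) ** 2)

print(f"hiss power, no NR:   {residual_hiss_power(plain):.2e}")
print(f"hiss power, with NR: {residual_hiss_power(decoded):.2e}")
```

One reason the real system only boosts quieter passages is that boosting everything, as this sketch does, would risk saturating the tape on loud ones.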

Ironically, vinyl records must use the opposite technique in order to compensate for the limits of that medium. Phonograph records cannot record very strong bass frequencies, because they would cause the grooves to get too wide, making the stylus skip on all but the best turntables. To compensate, all records are cut using "RIAA curve" equalization, which reduces those frequencies to levels that can be reliably played back. At playback, a pre-amplifier (whether standalone, part of the turntable, or built into an amplifier) applies the reverse equalization to boost those bass frequencies back to the level of the original mix. This is why, if you try to digitize the raw signal from a record, it sounds wrong. You must use hardware or software to apply the RIAA playback equalization curve in order to make it come out right.
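For what it's worth, the playback curve is defined by three standard time constants (3180 µs, 318 µs and 75 µs), so applying it in software is straightforward. The sketch below (my own, not taken from any particular phono package) builds the de-emphasis filter with a bilinear transform and normalizes it to unity gain at 1 kHz, as is conventional; dedicated phono software usually takes more care near the top of the band, where this simple approach drifts slightly:

```python
# Sketch of the RIAA playback (de-emphasis) curve applied in software to
# a raw, un-equalized phono capture. The curve is defined by the standard
# time constants 3180 us, 318 us and 75 us.
import numpy as np
from scipy.signal import bilinear, lfilter, freqz

fs = 96000                           # capture the raw signal at a high rate
t1, t2, t3 = 3180e-6, 318e-6, 75e-6

# Analog playback response: H(s) = (1 + s*t2) / ((1 + s*t1) * (1 + s*t3))
num = [t2, 1.0]
den = np.polymul([t1, 1.0], [t3, 1.0])
b, a = bilinear(num, den, fs)

# Normalize to unity gain at 1 kHz, per convention.
_, h1k = freqz(b, a, worN=[1000.0], fs=fs)
b = b / abs(h1k[0])

def riaa_playback(raw_capture):
    """Boost the bass (and cut the treble) back to the original mix levels."""
    return lfilter(b, a, raw_capture)

# Sanity check: roughly +19 dB at 20 Hz and about -20 dB at 20 kHz,
# relative to 0 dB at 1 kHz.
freqs = [20.0, 1000.0, 20000.0]
_, h = freqz(b, a, worN=freqs, fs=fs)
for f, gain_db in zip(freqs, 20 * np.log10(np.abs(h))):
    print(f"{f:>8.0f} Hz: {gain_db:+.1f} dB")
```

You would run riaa_playback() over a capture made through a flat (non-phono) preamp; a capture made through a normal phono input already has the curve applied in hardware.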

2 comments:

Drew said...

It's kind of sad to think that in a world of MP3 and AAC, CD quality is what we are now striving to achieve again with Apple Lossless, FLAC, etc. I think one argument for those high sampling rates and other formats is to get the version right off the mastering engineer's computer and not a downsampled version. Whether you can notice the dithering errors or not doesn't matter; it's just one step closer to the live performance.

Shamino said...

If you can't hear the difference, then what does it get you, aside from bigger files?

I think that what this really proves is that the CD format is pretty darn good for its purpose of distributing music for the public to play. Unless you want to use the CDs as source material for further editing/mixing, the extra bits are not going to help. Better quality playback equipment (DACs, amplifiers, speakers, etc.) will contribute much more to what you hear than simply adding more bits to the files.

And this is when talking about an ideal listening environment. In an environment with lots of ambient noise, like a car, a public place, or even just out walking when there's some wind, that noise is likely to mask most of the differences between a CD and much older technologies, like cassette tape. For those environments, a low-bitrate encoding that would sound terrible in your living room may sound just fine. Of course, how low you want to push it will depend on what you're willing to listen to - some people are more sensitive than others to the artifacts from data compression.

I have one relative who transcoded his music collection down to 24Kbps MP3 (!!) in order to cram insane numbers of songs onto a single CD. He says that when he plays it in the car, he can't tell the difference between that and the original CD. I think he's exaggerating somewhat, but I do believe him when he says that he doesn't have any problem with what he's hearing.

Ultimately, most people are just interested in being able to take their entire music collection with them wherever they go, and quality, past a minimum acceptable threshold, is not very important. Different people have different minimum standards, but very few will insist on studio-master quality.

And I have to agree. I don't rip music at 24K, but I use compressed audio all the time. My CDs are all ripped at 128K VBR AAC in order to cram as much as possible into a 4GB iPod nano. I know there are losses, and I can sometimes hear subtle differences when I play the iPod in my living room where the good stereo is, but most of the time I'm not listening there - I'm in an environment where I really don't notice.

BTW, ALAC and FLAC are, by definition, indistinguishable from the original source. They are "lossless" in the sense of data compression - after decompression, the file is bit-for-bit identical to the original. It's just like unpacking a zip file. There can be no audio loss in such an environment, unless the playback software has bugs. This is also why those formats don't get more than 2:1 compression - because mathematically, that's usually the best you can do without throwing away some data.
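FLAC and ALAC get their ratios from audio-specific prediction, so a generic compressor won't match them, but the bit-for-bit round trip itself is easy to demonstrate with any lossless compressor, e.g. zlib:

```python
# Round-trip a block of 16-bit PCM through a generic lossless compressor.
# zlib knows nothing about audio, so its ratio is worse than FLAC/ALAC
# would achieve, but the decompressed data is bit-for-bit identical -
# that's all "lossless" means.
import zlib
import numpy as np

fs = 44100
t = np.arange(fs) / fs
pcm = (16383 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)  # 1 s test tone

raw = pcm.tobytes()
packed = zlib.compress(raw, level=9)
unpacked = zlib.decompress(packed)

print(f"compression ratio: {len(raw) / len(packed):.2f}:1")
print(f"identical after round trip: {unpacked == raw}")
```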