A Speaker Maker's Journey: Digital Audio - Upsampling and Oversampling Explained

Many types of digital sources, accessories and Digital to Analog Converters (DACs) provide some sort of sample data magic called oversampling or upsampling. Put simply, it means you end up with more digital data than you started with.

There are some benefits, but none of these methods truly gets you closer to the original music. They are all just ways of trying to make the experience more pleasant. Think of it as looking out your window through a screen. You may take a picture and find that you can see the screen itself in the image, or you can do some editing with GIMP or Photoshop and remove it. The new image can't possibly contain more true-to-life data than you started with, but the picture should be much more pleasant to look at.

Many audiophiles have been led to believe that this kind of digital math can do things like you might see on the TV shows CSI or NCIS. Somehow four pixels on a grainy satellite image can be processed over and over again until the criminal's face is clearly visible. It's just not true.

Looking at it another way, up- and oversampling do not change the frequency response. A 44.1 kHz file is not going to have 30 kHz content created after 4x upsampling. The frequency range and content density are unchanged. What may happen is that digital filtering becomes smoother and easier on the ears, or that jitter is improved somewhat by the use of higher data rates.

Differences Explained

Let's take some original data. Since digital music is always integer, I'll imagine two consecutive samples with convenient values of 24 and 28. Now let's see what happens at 4x up- or oversampling. If the original data was 44.1 kHz/16 bits, the DAC will now see a sample rate of 176.4 kHz, but the bit depth may or may not change. So, just to be thorough, here is our original data:

24
28

At a sample rate of 44.1 kHz these two samples represent:

2 samples / 44,100 samples per second = about 45 microseconds of music.

Remember that we are adding samples in between the time slots, so we don't want to stretch out our time. That would result in pitch changes. Instead we increase the rate (samples / second) at which we feed the DAC, keeping the pitch constant.
So, instead of 2 samples, we have 8, but with a new sample rate. Let's redo the math:

8 samples / 176,400 samples per second = about 45 microseconds of music.

That's great, because if that didn't work the sound would be 4 times slower. :)
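If you want to check that arithmetic yourself, here is a minimal Python sketch. The function name duration_us is just something I made up for illustration; the sample counts and rates are the ones used above.

```python
def duration_us(num_samples: int, sample_rate_hz: float) -> float:
    """Playing time of num_samples at sample_rate_hz, in microseconds."""
    return num_samples / sample_rate_hz * 1_000_000


original = duration_us(2, 44_100)     # two samples at 44.1 kHz
upsampled = duration_us(8, 176_400)   # eight samples at 176.4 kHz

# Both print as 45.35: four times the samples at four times the rate
# takes exactly the same amount of time.
print(f"{original:.2f} us, {upsampled:.2f} us")
```

Same duration, so same pitch. Only the number of time slots inside those 45 microseconds has changed.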

Oversampling

This is the oldest trick in the book. Almost immediately after CD players became commercially available, oversampling became a buzzword. I am no longer sure, but this may have only worked with so-called Delta-Sigma or 1-bit DACs.

It's so simple you don't think it should work. Take a sample, and repeat it several times. It's that simple. It does not attempt to provide any more data but may shift some noise far above the Nyquist frequency. No math is involved, just counting. With 4x oversampling, our original two samples become:

24, 24, 24, 24, 28, 28, 28, 28

It's weird that it helps, but it does. In fact, with oversampling, only 1 sample really matters at a time.
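The repeat-and-count idea described here can be sketched in a few lines of Python. This is only an illustration of the description in this post, not how any particular DAC chip does it, and the name repeat_samples is my own.

```python
def repeat_samples(samples, factor):
    """Emit each input sample `factor` times in a row (no math, just counting)."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out


# The two example samples at 4x oversampling.
print(repeat_samples([24, 28], 4))
# [24, 24, 24, 24, 28, 28, 28, 28]
```

Notice the function never looks at the next sample, which is the sense in which only one sample matters at a time.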

Upsampling

Bit Perfection

One of the objections to upsampling is that the signal is no longer bit-perfect. The DAC no longer gets the original facts, but the original facts plus a lot more. That "lot more" is pure mathematical conjecture. However, there are some real benefits to be had.

Things get even more muddled when upsampling is used for ASRC (Asynchronous Sample Rate Conversion), but it also becomes more beneficial, as ASRC is one of the best ways to reduce jitter. More on that in a future post.

Upsampling is technically and mathematically more challenging than oversampling, and there are two general approaches. To take the best advantage of it, it's better if the bit depth increases beyond the original. So if the original was 16 bits, 24 or 32 bits will provide better resolution. However, remember that this doesn't really make it more true to life. It just makes some things easier to do and helps us keep more of our results. There are some VERY nice 32-bit DAC chips out there though, so taking full advantage of them may also get us much closer to true 24-bit resolution. That's a topic for someone else.

Linear Interpolation

Imagine two points on a chart. Draw a straight line between them. That's simple interpolation. It's no more complicated than simple algebra. Calculate the rise, divide it by the number of intervening samples, and add that much for each "new" sample. For linear interpolation, the sample rate converter needs to know two samples at a time in order to figure out the rate at which the intermediate samples should change.

Again, consider our original two samples, 24 and 28. The rise is 4 per original sample. Spread over 4 new samples, that is 4/4 = 1 per step. Now the DAC gets:

24 +1 =
25 +1 =
26 +1 =
27 +1 =
28

We'll just assume there are no bit-depth changes, or that in this case no extra resolution was required. Of course, I chose 24 and 28 to make the math here easy.
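Here is the same rise-over-run calculation as a small Python sketch. The function name linear_upsample is made up for this post, and I'm using floating point so that less convenient sample values would still work.

```python
def linear_upsample(samples, factor):
    """Draw a straight line between each pair of samples and read off
    `factor` evenly spaced points per original sample period."""
    out = []
    for a, b in zip(samples, samples[1:]):
        step = (b - a) / factor          # rise divided by number of steps
        out.extend(a + step * i for i in range(factor))
    out.append(float(samples[-1]))       # the final original sample
    return out


print(linear_upsample([24, 28], 4))
# [24.0, 25.0, 26.0, 27.0, 28.0]
```

Try it with 24 and 27 instead and the step becomes 0.75, which is exactly the kind of result that needs extra bits if you want to keep it.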

Spline

A much more advanced way to create more samples is by using what are called splines. Remember the "French Curve" tools you may have used in drawing school?

Technically you only need 2 samples for a spline, but the result is the same as linear interpolation, so we'll ignore that case. With spline math we take a number of samples, usually under 20, to draw a much softer curve. Wadia was the first company I know of that introduced this concept. In this case it really helps to have more bits, as the extra bits help with finer-grained results. As you might imagine, the math and CPU power required are greatest for this example.

If this were floating-point math, our working data set would be:

(nine samples before)
24.000
25.185
26.355
27.888
28.000
(nine samples after)

Remember that what's really going on is that the algorithm is taking more samples into account than our original two in order to fit the curve properly. So why the third interpolated sample is 27.888 instead of 27.978 or 26.500 has to do with the nine samples in the original file before the first (24) and after the last (28) shown here. It is believed, without a lot of proof, that this method may provide the most natural resulting sound.
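To make the idea of "the neighbors shape the curve" concrete, here is a rough Python sketch using a Catmull-Rom spline. To be clear about the assumptions: Catmull-Rom is my choice for illustration and only looks at four samples at a time, not the nine on each side described above; I have no idea what any commercial product actually uses; and the neighboring samples 18 and 30 are invented, so the numbers will not match the table above.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the Catmull-Rom curve between p1 and p2, for 0 <= t <= 1.
    p0 and p3 are the neighbours that bend the curve."""
    return 0.5 * (
        2 * p1
        + (p2 - p0) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
        + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3
    )


def spline_upsample(samples, factor):
    """Read `factor` curve points per original sample period."""
    # Repeat the end samples so the first and last pairs have neighbours.
    padded = [samples[0]] + list(samples) + [samples[-1]]
    out = []
    for i in range(1, len(padded) - 2):
        p0, p1, p2, p3 = padded[i - 1 : i + 3]
        out.extend(catmull_rom(p0, p1, p2, p3, k / factor) for k in range(factor))
    out.append(float(samples[-1]))
    return out


# Hypothetical neighbours 18 and 30 around our 24 and 28.
result = spline_upsample([18, 24, 28, 30], 4)
print(result[4:9])   # the stretch from 24 to 28
# [24.0, 25.1875, 26.25, 27.1875, 28.0]
```

Compare that with the straight-line 25, 26, 27: the original samples are untouched, but the in-between values bow slightly because of what comes before and after. Change the 18 or the 30 and the middle values move too.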

Are Splines Really Better?

Splines are very cool, but it may be argued, convincingly, that we are not doing much more than you could achieve with a capacitor and resistor with the proper time constants. In other words, it's a lot of math and hardware for what could be done with $2 or less in parts. The real potential benefit of this advanced math, though, is in custom algorithms. You can be as creative as you want to in your algorithms.
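For the curious, the resistor-and-capacitor argument can be imitated in software too. The sketch below is a simple discrete-time approximation of an RC low-pass filter applied to repeated samples. The part values (1 kilohm and 10 nF) and the function name rc_lowpass are purely hypothetical picks for illustration, not a recommended design.

```python
def rc_lowpass(samples, sample_rate_hz, r_ohms, c_farads):
    """Discrete-time approximation of a single RC low-pass filter."""
    dt = 1.0 / sample_rate_hz
    alpha = dt / (r_ohms * c_farads + dt)   # how far we move toward each input
    out = []
    y = float(samples[0])                   # start settled at the first sample
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


# Our two samples, each repeated four times, at 176.4 kHz.
held = [24] * 4 + [28] * 4
smoothed = rc_lowpass(held, 176_400, r_ohms=1_000, c_farads=10e-9)
print([round(v, 3) for v in smoothed])
```

The output sits at 24 and then eases up toward 28 instead of jumping, which is the smoothing the $2 in parts would give you. What it cannot do is look ahead at upcoming samples, which is where the custom algorithms come in.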

What About Sound Quality?

Personally, I have come to believe that the analog output stages matter much more than interpolating algorithms, sample rates or bit depth, but the devil is in the implementation details. As always, buy what you like, and what is most pleasant to your ears. Don't buy algorithms or chips. Buy results, and spend no money that isn't pleasing to you.


Monday, July 11, 2016
