Delving into audio programming: DSP 101

I’ve been taking the leap into the world of audio programming. After a couple of years of working on my C++ chops, I’ve reached a point where I feel comfortable putting them to good use, and what better way than building software audio instruments and effects (aka VSTs or plugins)? This has been sitting on my personal backlog for a long time, so I’m excited to get into it! I’m going in with some fundamentals, but not much, so I hope to bring all of you along on the journey with me.

A real lack of resources…

First off, there is not much out there. Like, wow. This particular subject seems to be a little too niche, even for the almighty everything-ness of the internet. BUT, I have found some resources, and I’ll gladly share them here. I’ll also share my findings and teach what I learn along the way, so this info exists in at least one more place. Should I make some YouTube videos as well? Maybe. Definite maybe. I mean, yeah, I should, but actually doing it is a totally different thing.

Where to start

There are so many different ways to tackle this. There are different libraries for different formats, so we should start there. I’m not really familiar with the AAX or RTAS SDKs, so I can’t speak much to those, but the SDK for VST2/3 is written by Steinberg and is extremely thorough. So thorough, it absolutely made my head spin. So, depending on your skill level in C++, you could download the SDK directly from here and go to town. Godspeed.

There is a C++ framework known as WDL (pronounced “whittle”) or “iPlug” that aids in building plugins in all three formats, or even standalone applications. It seems to be commonly used and very popular, so it’s definitely an option, but alas not one I have quite looked at yet. You can find that one here. I haven’t found much in terms of tutorials or help with WDL/iPlug beyond the documentation, which is pretty good.

For a more graphical and feature-rich platform, a really common option is JUCE. It can do a lot of things, not just audio plugins or standalone apps (in all three formats), but other graphical applications as well. It’s very feature-rich, but a tad expensive when you get into the licensed versions. This also seems to be the most popular option to be taught: the majority of the tutorials and courses I have found are built around JUCE. Check it out. I have played around with it, but don’t plan on going this route either.

I tend to be the “hard mode” kind of guy. More than anything, I like to understand how things work at a deep level, hence my passion for C++. So, personally, I’d like to work with the Steinberg SDK directly, but since the documentation is a little intimidating, I have been looking far and wide for a resource. I found one. Check out this book. I run on books. I thrive with them. It’s how I learn. And this book looks like exactly the kind of book I have been looking for: it covers the basics of DSP and some advanced DSP theory, it covers all the major SDKs and working with them directly, and it really seems to be the right match for me. So, that is where I’m going to start, and I look forward to it.

DSP 101 - Sample Rate & Bit Depth

As I mentioned above, I do know some basics, so I’ll gladly explain them here. What is DSP? It stands for Digital Signal Processing (or Processor). It refers both to the device that does the work and to the idea behind it: turning analog sound into bits a computer can understand, manipulating those bits, and turning them back into analog sound. If you’re listening to music on your laptop right now, that is what’s currently happening between your computer and the speakers.
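To make “manipulating those bits” concrete, here’s a minimal sketch in plain C++ (no plugin SDK involved, and the function name is just mine for illustration): a toy processing callback that receives a block of samples and halves their volume.

    #include <cstddef>

    // A toy "process" callback: the host hands us a block of samples
    // (floats between -1.0 and 1.0), and "processing" is just math on
    // those numbers. Here, we halve the volume of every sample.
    void processBlock(float* samples, std::size_t numSamples)
    {
        const float gain = 0.5f; // roughly a 6 dB cut
        for (std::size_t i = 0; i < numSamples; ++i)
            samples[i] *= gain;
    }

Every effect, from an EQ to a reverb, ultimately boils down to some version of this loop: numbers in, math, numbers out.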

There are a lot of discussions about sample rates out there, with every producer trying to outdo the next by producing at the highest sample rate they can, e.g. 192kHz. Most of them are doing so just because it’s a higher number, without truly understanding what a sample rate is. This particular topic has a lot of info out there, so by all means, Google the holy hell out of it, but I’ll lay out the basics here.

The sample rate is the number of times per second that a wave is sampled.

Okay. WTF does that mean? Picture a single cycle of a sine wave. Imagine the center line is zero, the top is 1, and the bottom is -1. Along the curve of that sine wave there are infinitely many values. Computers are not good with infinity, or with real numbers; they can’t do them. They’re good at floats, which are estimations of real numbers. Soooo, without going too far down that rabbit hole, in order for the computer to manage an estimation of that sine wave, it picks out a number of evenly spaced locations along that wave and samples them. The number of times it does that in 1 second is the sample rate. Therefore, a sample rate of 44.1kHz means that 44,100 samples of that wave are taken every second.

Now, considering an analog wave has infinitely many values along it, having only 44,100 isn’t nearly as good, right? Right. It’s an estimation. But, thanks to the Nyquist principle, it’s a pretty good estimation. I’ll explain the Nyquist principle in another post, but basically a really intelligent engineer found that you can accurately recreate a wave if your sample rate is at least double the bandwidth of the analog signal. In the case of sound, considering the upper bound of human hearing is about 20kHz, a sampling rate of at least 40kHz lets you accurately recreate any wave within the bounds of human hearing. That is why 44.1kHz is such a standard.
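Here’s that idea in code: a little sketch in plain C++ (the function name and the 440 Hz example are just my choices) that takes one second’s worth of evenly spaced samples of a sine wave.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Take one second's worth of samples of a sine wave.
    // At a 44.1kHz sample rate, that means 44,100 evenly spaced
    // snapshots, each a float between -1.0 and 1.0.
    std::vector<float> sampleSine(double freqHz, double sampleRate)
    {
        const double twoPi = 2.0 * 3.14159265358979323846;
        const std::size_t count = static_cast<std::size_t>(sampleRate);
        std::vector<float> samples(count);
        for (std::size_t n = 0; n < count; ++n)
        {
            const double t = n / sampleRate; // time of this sample, in seconds
            samples[n] = static_cast<float>(std::sin(twoPi * freqHz * t));
        }
        return samples;
    }

    // e.g. sampleSine(440.0, 44100.0) gives the A above middle C as 44,100 floats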

If we only need 40kHz, then why in the hell do we have the capability to go up to 192kHz? Well, we’ve found that the higher you go, the more the accuracy of the wave within the audible 20Hz-20kHz range actually improves. There are fewer artifacts, increased fidelity, etc. But it really depends on the situation, and when it comes to commercial music, it truly doesn’t matter enough to justify the investment required to produce at 192kHz. Just sayin’. Put your money into your creative tools and just produce at 44.1kHz.

Bit depth. This one is actually pretty easy, just a little computer-techy since it has to do with data sizes. Remember the samples taken of the wave? Well, each one of those samples (i.e. one of the 44,100 in the above example) has a hard limit on how big it can be. That limit, the size of each sample, is the bit depth. When we refer to a 16-bit bit depth, we’re saying the word length of each sample is 16 bits, which gives 65,536 (2^16) possible values. Now, the bigger this number, the more finely each sample can describe the wave, giving the computer more information and leading to a more accurate estimation of the original analog wave. But the more data the computer has to work through, the slower it’ll be. More data, more work. The highest common bit depth we see available now is 24 bits. That doesn’t sound like a big jump, but considering it’s a jump from 2^16 to 2^24, it’s exponential, so yeah, it’s a big jump. A 24-bit bit depth gives each sample 16,777,216 possible values, allowing way more detail to be packed into every sample.
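To see what that grid of values looks like in code, here’s a small sketch in plain C++ (the function name, the clamp, and the 32,767 scale factor are my choices; scaling by the largest positive 16-bit value is one common convention) that quantizes a float sample down to 16-bit audio.

    #include <algorithm>
    #include <cmath>
    #include <cstdint>

    // Quantize a float sample (-1.0 to 1.0) down to 16-bit audio.
    // The sample gets snapped to one of 65,536 (2^16) possible values;
    // at 24 bits the grid would have 16,777,216 (2^24) steps instead.
    std::int16_t quantize16(float sample)
    {
        const float clamped = std::min(1.0f, std::max(-1.0f, sample));
        return static_cast<std::int16_t>(std::lround(clamped * 32767.0f)); // 2^15 - 1
    }

Whatever falls between two of those steps gets rounded away, and that rounding is exactly the estimation error a higher bit depth shrinks.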

Whew, that was a lot. I think we’ll need some visual aids for this one, so yeah, YouTube video coming. I’ll set it up below once it’s recorded. But for now, there ya go, you can intelligently refer to sample rates and bit depths, and make the right call for you when selecting the options in your DAW or interface. Trust your ears, don’t push your PC to the point of crashing, and impress your producer friends at the bar by dropping some knowledge.
