Grow Your Knowledge
Audio, MIDI, Music Production, and Live Performance general knowledge, tips & tricks
A Beginner's Guide To How Digital Audio Works
It’s hard to remember sometimes that, not so long ago, before the digital era came along, our sources of entertainment were TV, radio, vinyl records, cassette tapes, books, and magazines.
Artists and musicians, back then, needed access to recording studios with expensive equipment if they wanted to have the slightest chance of people hearing their music. Record labels were king in the industry and artists sometimes spent their whole lives waiting for that magical breakthrough or the moment they’d be discovered.
Then came the revolution brought by CDs, domestic computers, and of course, the internet, and everything changed.
Especially in the 80s and 90s, musicians and music producers argued over the quality of analog recordings versus digital ones. Opinions clashed constantly, and some artists proclaimed their loyalty to analog technology by releasing limited editions of their albums on vinyl.
Today, digital audio is accepted as the standard in the music world. And with the constant development of new and powerful audio technologies, whatever disadvantages digital had compared to analog in past decades are now considered irrelevant by most people.
But how does digital audio actually work? And why did some people doubt the quality of digital audio at the beginning?
Understanding the Difference Between Analog and Digital Audio
An analog signal is a continuous signal that varies over time, without breaks or discrete steps.
When we talk about an analog recording, what it means is that the sound was recorded using a technology that captured a representation of that continuous signal.
For example, on vinyl records, the shape of the sound wave is carved into the record itself as a very thin groove. The needle then traces that groove, recreating the sound waves that are played back to us.
Digital signals, on the other hand, record information in the form of 1s and 0s by using a technology called sampling. We’ll take a look at what exactly sampling is in a moment, but for now, let’s just say that a digital signal is not a continuous signal but a series of “snapshots” of a continuous wave. Your soundcard acts as an Analog-to-Digital Converter (ADC) and uses sampling to translate the wave into 1s and 0s.
Here lies the origin of the “Analog vs Digital” battle. Back in the day, when digital audio was still a new technology, people believed that since it uses only a series of samples to recreate a sound, a lot of the original quality gets lost in the process (because you don’t have a full representation of the continuous sound wave).
At the time, this made sense. But nowadays, digital audio technology is incredibly advanced. With all the added advantages it provides, there’s no denying that it’s the best method we have so far for manipulating audio.
First, it greatly reduces the amount of machines and gear you need to record and produce music. The digital era made DAWs and plug-ins possible, allowing us to play a virtually unlimited number of instruments using only a computer. There’s no more need for expensive mixing boards, rooms full of keyboards, or different kinds of amps: everything can be emulated digitally.
Second, with digital technology, you can make as many copies of the initial recording as you like. With analog, every copy loses some quality compared to the original. That’s why artists who released music on vinyl records often sold them as limited editions: you can’t copy an analog recording over and over again without the quality being affected.
Digital Audio Basics
Imagine digital audio as a series of dots on a graph. The X-axis represents time, and the Y-axis represents the level (amplitude) of the signal, from very low to very high. These dots need to be precise enough that, when the computer links them together to recreate the sound wave, the shape it draws is as close as possible to the shape of the original wave.
There are different factors that determine the position of these dots on the graph.
Sampling and sample rate
Sampling is the action of recording the level of a sound wave at a specific moment in time.
The term sample rate refers to how many samples the Analog-to-Digital Converter (ADC) records per second. In other words, the sample rate represents how often the level of the sound wave is measured each second by your sound card. This determines the position of each dot on the X-axis.
According to the mathematical rule used to determine sample rates (the Nyquist–Shannon sampling theorem), the sample rate has to be at least twice the highest frequency you want to record. Since the human ear can perceive sounds up to roughly 20,000 Hz (20 kHz), the sample rate needs to be at least twice that if we want to cover the complete range of sounds we can potentially hear. That’s why the CD standard is a sample rate of 44.1 kHz: a little more than double the upper limit of human hearing.
Even though 44.1 kHz is the standard for audio files, other standards exist. For movies, for example, the standard sample rate is 48 kHz and nowadays, it’s not unusual to see people using a sample rate of 96 kHz.
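To make this more concrete, here is a minimal sketch in Python (with NumPy; the 440 Hz tone and the 10 ms duration are arbitrary choices for illustration) of what sampling boils down to: measuring the level of a continuous wave at evenly spaced moments in time.

```python
# A minimal sketch of sampling: evaluate a continuous 440 Hz sine wave
# at discrete, evenly spaced instants, the way an ADC would.
import numpy as np

SAMPLE_RATE = 44_100                        # samples per second (CD standard)
FREQUENCY = 440.0                           # an arbitrary 440 Hz test tone
DURATION = 0.01                             # capture 10 milliseconds

num_samples = round(SAMPLE_RATE * DURATION)         # 441 samples in 10 ms
t = np.arange(num_samples) / SAMPLE_RATE             # X-axis: the time of each sample
samples = np.sin(2 * np.pi * FREQUENCY * t)           # Y-axis: the level at each instant

print(f"{num_samples} samples, Nyquist limit = {SAMPLE_RATE / 2:.0f} Hz")
```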
Bit Depth and Encoding
We saw that the sample rate determines how the dots are spread along the X-axis, since it’s based on time. But what determines their position on the Y-axis?
That’s where bit depth comes in.
A bit is a single binary digit: a 1 or a 0. Bit depth is the number of levels we have access to on the vertical axis of our graph, and how many levels we have depends on how many bits we use to represent the data. That’s what we call encoding.
The more bits in the encoding, the more accurate our audio recording will be. Why? Because more bits means more levels available for storing each sample.
Standard CD quality uses a 16-bit encoding. That gives 65,536 (2 to the power of 16) possible levels on the Y-axis. Though this may seem like a lot, most sound engineers choose to record with a 24-bit encoding, which increases that number to 16,777,216 (2 to the power of 24) possible levels, to ensure the highest possible sound accuracy.
In other words, digital audio doesn’t record the continuous wave itself; it records the equivalent of a series of dots on an X and Y axis, translated into 1s and 0s. The higher the bit depth, the smaller the distance between the possible dots on the Y-axis. Similarly, the higher the sample rate, the smaller the distance between the dots on the X-axis.
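To get a feel for those numbers, here is a small, simplified Python sketch (the starting level is arbitrary, and real converters map to integer codes rather than rounded fractions) showing how many levels each bit depth offers and how much error is introduced when a level is snapped to the nearest one.

```python
# A simplified sketch of quantization: snap a level in the range [-1, 1]
# to the nearest step available at a given bit depth.

def quantize(level: float, bit_depth: int) -> float:
    half_steps = 2 ** (bit_depth - 1)       # half of the available levels
    return round(level * half_steps) / half_steps

original = 0.123456789                      # an arbitrary "analog" level
for bits in (8, 16, 24):
    stored = quantize(original, bits)
    print(f"{bits:>2}-bit: {2 ** bits:>10,} levels, "
          f"stored as {stored:.9f}, error {abs(stored - original):.2e}")
```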
Difference between Bit Depth and Bit Rate
The terms bit depth and bit rate (or bitrate) sound so similar that it’s easy to confuse the two, but they are very different concepts.
Bit rate is the number of bits of audio data delivered per second. Generally, the higher the bit rate, the better the quality.
Bit rate is especially important for streaming services. For example, on a desktop computer, a free Spotify user can stream music at a bit rate of 160 kbps, while Premium users can stream at a much higher bit rate of 320 kbps.
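For uncompressed audio, the bit rate follows directly from the numbers we’ve already met: sample rate × bit depth × number of channels. Here is a quick sketch of that arithmetic in Python, using the standard CD figures (streaming bit rates are much lower because the audio is compressed).

```python
# Bit rate of uncompressed audio = sample rate x bit depth x channels.

def uncompressed_bitrate_kbps(sample_rate: int, bit_depth: int, channels: int) -> float:
    return sample_rate * bit_depth * channels / 1000     # kilobits per second

cd = uncompressed_bitrate_kbps(44_100, 16, 2)
print(f"CD-quality stereo: {cd:,.1f} kbps")              # 1,411.2 kbps
print("Spotify Premium stream: 320 kbps (compressed)")
```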
A word on buffers, buffer size, and no-latency monitoring
A buffer is simply a place for the computer, or a program, to temporarily store information. If the buffer size is small, the information can be accessed more frequently and more quickly. The drawback of a small buffer, however, is that if the flow of information is large (and audio data tends to be very heavy), it might not be enough to hold all the information needed, which can cause glitches and dropouts in the sound inside your DAW.
On the other hand, a large buffer can hold more information but takes longer to pass it along. That’s where the risk of audio latency comes from.
Tweaking your soundcard’s software and your DAW to use an appropriate buffer size is a matter of trial and error. It’s good to experiment with different settings, starting from the default values recommended by the software and taking into account how powerful your system is.
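As a rule of thumb, the delay added by a single buffer is its size in samples divided by the sample rate. A quick Python sketch (the buffer sizes listed are common values, not recommendations):

```python
# Delay added by one buffer = buffer size (samples) / sample rate (samples per second).

SAMPLE_RATE = 44_100

for buffer_size in (64, 128, 256, 512, 1024):
    latency_ms = buffer_size / SAMPLE_RATE * 1000
    print(f"{buffer_size:>5} samples -> {latency_ms:5.1f} ms of added delay")
```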
If you’re having problems with latency, you can try bypassing the issue with no-latency monitoring, a feature offered by most modern audio interfaces. No-latency monitoring lets you listen to the sound as it is captured by the sound card, before it goes through the Analog-to-Digital conversion process and the round trip through your computer.
No-latency monitoring can definitely come in handy. Its main drawback is that you’ll have to listen to the signal completely dry, meaning before the sound is processed by your DAW (oh no, no reverb!).
Following the Journey of Sound Waves
So let’s recap everything we’ve discussed so far by following the journey a sound makes, from the moment it’s performed to the moment it’s played back through your monitors.
You’re recording a singer.
The microphone picks up the voice as an analog signal and sends it to your audio interface (soundcard). The soundcard contains a built-in Analog-to-Digital Converter (ADC) that converts that analog signal to a digital one according to a certain sample rate (usually 44.1 kHz) and bit depth (normally 24-bit for recording).
The digital signal then passes to the Input Transport Buffer on your computer. From there, it is sent to the audio driver’s input buffer. These drivers are usually either proprietary software that comes with your audio interface or the audio drivers built into your operating system, for example Core Audio if you’re on a Mac.
From there, the audio data is available to be processed and manipulated inside your DAW.
Once the manipulation in the DAW is complete, the digital signal is sent back to the audio driver output buffer, back to the Output Transport Buffer on your computer, and finally translated back to an analog signal with the help of the Digital-to-Analog Converter (DAC) inside your soundcard, ready to be played back by your speakers or headphones.
The complication is that each of these steps takes time, from a few milliseconds up to tens of milliseconds depending on your settings, and the accumulated delay is what we experience as audio latency. You can keep the problem under control by adjusting the buffer size in your DAW or by using no-latency monitoring.
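To picture how those milliseconds add up, here is a rough, hypothetical round-trip budget in Python. The converter figures and the 256-sample buffer are assumptions, not measurements; real numbers depend entirely on your interface, driver, and settings.

```python
# A hypothetical round-trip latency budget (all figures are assumptions).

SAMPLE_RATE = 44_100
BUFFER_SIZE = 256                                    # samples per driver buffer

buffer_ms = BUFFER_SIZE / SAMPLE_RATE * 1000         # about 5.8 ms per buffer

stages_ms = {
    "ADC conversion":       0.5,                     # assumed converter latency
    "driver input buffer":  buffer_ms,
    "DAW processing":       buffer_ms,               # the DAW works one buffer at a time
    "driver output buffer": buffer_ms,
    "DAC conversion":       0.5,                     # assumed converter latency
}

for stage, ms in stages_ms.items():
    print(f"{stage:<22} {ms:5.1f} ms")
print(f"{'round-trip total':<22} {sum(stages_ms.values()):5.1f} ms")
```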
So far we’ve followed an analog sound recorded with microphones (or pick-ups), but you can also record digitally using MIDI. When you record with a MIDI controller, the signal is processed differently, since MIDI signals don’t contain any audio data at all. MIDI is a communication protocol: a way for different computers, devices, and software and hardware instruments to exchange information with each other.
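To see just how little data MIDI carries, here is a tiny Python sketch comparing a standard MIDI Note On message (three bytes, as defined by the MIDI specification) with the amount of data in one second of CD-quality audio.

```python
# A MIDI "Note On" message is just three bytes: an instruction, not a recording.

NOTE_ON = 0x90             # status byte: Note On, MIDI channel 1
MIDDLE_C = 60              # note number (0-127); 60 is middle C
VELOCITY = 100             # how hard the key was struck (0-127)

message = bytes([NOTE_ON, MIDDLE_C, VELOCITY])
print(" ".join(f"{b:02x}" for b in message))         # 90 3c 64

# By contrast, one second of CD-quality stereo audio holds
# 44,100 samples x 2 bytes x 2 channels of actual sound data:
print(44_100 * 2 * 2, "bytes per second")            # 176400
```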
Most MIDI-controlled virtual instruments are based on samples (short recordings of real instruments), but there are now technologies that don’t use samples at all. One of them, called Physical Modeling, recreates the sound of an acoustic instrument in real time using digital data processed by complex algorithms. That’s the kind of technology we use at Audio Modeling to create our SWAM instruments.
Other articles in this category
- The Creation and Evolution of SWAM String Sections
- What is the difference between loading multiple Solo instruments and loading a Section in SWAM String Sections?
- What is a plug-in in music production?
- What do VST, AU, AAX, and AUv3 stand for?
- What is VST? What's the difference between VST, VST2, and VST3?
- What is a DAW and what is a host? Is there a difference between the two?
- What is MIDI? What is CC?
- What does USB Class Compliant mean?
- What is Core Audio? What is ASIO?
- What’s the Difference Between Reverb and a Spatializer?