What is 3D Music?

by Cameron Summers

TL;DR

3-dimensional (3D) music is recorded music that has been formatted to simulate how we experience sounds in a real-world environment. While it can be mimicked with a large speaker array surrounding us in all directions, the sound can also be encoded with special audio effects to simulate the experience in headphones. This can be a significantly improved listening experience over existing audio formats like surround sound.

If you’re interested in learning how to make your own 3D audio, you have many tools available now. These include special microphones that preserve the 3D properties of the original sound and audio processing software that can position mono audio in 3D space. You can listen (using headphones) to my music, where I play live and dynamically position sounds in real-time with my own custom software.

Primer on 3D Audio

Imagine someone calls from behind you. “Hey!” they shout. You keep walking, not sure you are the intended audience and unsure of who this person is. They run around to your side saying, “Hey! Excuse me?” And then they stand in front of you: “Did you just drop this wallet?” You examine it, then say appreciatively, “Oh, yes! Thank you!”

This is how we experience sound in the real world. We experience sounds, such as the voice of the person who gave us our wallet back, in three dimensions:

  • in front and behind (x-axis)
  • left and right (y-axis)
  • up and down (z-axis)

Combining each dimension, x, y, and z, we get a 3-dimensional (3D) experience of sound. What’s amazing is that our brain enables this. The brain tells us a sound is coming from behind, for example, because it has learned how a sound changes when it passes over the folds of our outer ear on the way to our eardrum. Similarly, when sound bounces off our shoulder from above or in front of us, our brain understands it as an indication of the direction of a sound source.
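
To make the three axes concrete, here is a minimal sketch that turns a position into the direction a listener would perceive. The function name `direction_of` and the sign convention (+x in front, +y to the left, +z up, listener at the origin) are my own assumptions for illustration; the article only names the axes.

```python
import math

def direction_of(x, y, z):
    """Convert a position relative to the listener into direction and distance.

    Assumed convention (not from the article): +x is in front, +y is to the
    left, +z is up, and the listener's head sits at the origin.
    Returns (azimuth_deg, elevation_deg, distance).
    """
    distance = math.sqrt(x * x + y * y + z * z)
    # Azimuth: 0 = straight ahead, 90 = left, -90 = right, 180 = behind.
    azimuth = math.degrees(math.atan2(y, x))
    # Elevation: 0 = ear level, 90 = directly overhead.
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation, distance

# The wallet story as three positions: behind, to the right, then in front.
behind = direction_of(-2.0, 0.0, 0.0)
beside = direction_of(0.0, -1.0, 0.0)
front = direction_of(1.0, 0.0, 0.0)
```

Describing a source by azimuth, elevation, and distance is often more convenient than raw x, y, z, because those are the quantities our hearing actually estimates.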

Where Are We Now?

So you might be saying, that is all well and good Cameron, but what does this mean for a musical experience? Well, we’ll get there, but it’s important to understand why 3D audio is even a thing. Like, you might even be wondering if it is a scam created by audio equipment companies to get us to buy more stuff. I know I would, since apparently I’m of a generation particularly distrustful of large companies. (Answer: No, it’s legit, not a scam). But we should have a basic understanding of the Why, which I’ll attempt to cover here.

Let’s first consider recorded music. Many of us hear recorded music today through the built-in speaker on our handheld devices, which tragically is a horrible way to listen to music. The tiny speaker on a phone isn’t loud enough, has a limited range of frequencies, and lacks any spatialization of the sounds emanating from it. We can do better by plugging in a pair of headphones (decent ones, hopefully). The headphones provide a more satisfying volume and frequency response. They also provide not one but two sound sources, one for each ear, in which to place individual sounds in the music, giving some spatial width to our listening experience. We call this type of audio transmission “stereo”. The bass can go into our left ear and the piano into the right ear - an improvement! But we can do better.
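
Placing a sound between the two ears in stereo is usually done by panning, that is, splitting one signal between the left and right channels. The sketch below uses a constant-power pan law, which is one common choice and my own assumption here; the function name `constant_power_pan` is hypothetical.

```python
import math

def constant_power_pan(sample, pan):
    """Split one mono sample into (left, right) stereo samples.

    pan: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    The gains follow a quarter circle so that left^2 + right^2 stays
    constant, which keeps perceived loudness steady as the sound moves.
    """
    angle = (pan + 1.0) * math.pi / 4.0   # maps pan to 0..pi/2
    return sample * math.cos(angle), sample * math.sin(angle)

bass_left, bass_right = constant_power_pan(1.0, -1.0)    # bass in the left ear
piano_left, piano_right = constant_power_pan(1.0, 1.0)   # piano in the right ear
center_left, center_right = constant_power_pan(1.0, 0.0)
```

Note that panning only changes how loud a sound is in each ear. It gives width, but no sense of front, back, up, or down.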

Even with stereo we’re still nowhere near recreating the experience of sounds in our natural world. The bass and piano example above doesn’t even do a good job of simulating a real-world situation where a bass player is playing to your left and a piano player is playing to your right. When that occurs in the real world, you hear the sounds differently because they reflect off the environment and reach both of your ears from many different places. So we still get information in front and behind, up and down, left and right. So where do we go from here? How can we improve the listening experience?

Getting Real: Simulating 3D Audio

We actually have two solutions to this problem. “Wow, two?” you say. Yes, but there are some caveats. The first and most straightforward solution is to physically recreate the spatialized sounds. What if we just put the audience in the middle of the stage and place the musicians around them? Brilliant! Wait. Although the experience of being surrounded by musicians while they play is one of my favorite parts of playing in a big orchestra, if we want the performers to be able to play together in most other genres, they generally need to be closer. Just 20 ft apart (about 18 ms for the sound to travel that distance) can make it a pain for musicians to play together in time. For the moment, we’ll ignore that this is exactly what some composers actually write into their music.
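
The 20 ft figure above can be checked with a quick calculation. This is a small sketch that assumes a speed of sound of 343 m/s (dry air at roughly 20 °C); the function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed: dry air at about 20 °C
METERS_PER_FOOT = 0.3048

def propagation_delay_ms(distance_ft):
    """Time in milliseconds for sound to travel distance_ft through air."""
    distance_m = distance_ft * METERS_PER_FOOT
    return distance_m / SPEED_OF_SOUND_M_S * 1000.0

delay = propagation_delay_ms(20)
print(f"{delay:.1f} ms")  # prints "17.8 ms"
```

The delay grows linearly with distance, so musicians spread 40 ft apart would hear each other roughly 36 ms late.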

If we forgo hearing our music being performed live, then perhaps we could record each instrument and play it on its own individual speaker? Yes! We can arrange the speakers, perhaps 5 of them - plus a subwoofer for bass frequencies, of course - and… “Wait,” you say. “Isn’t that just 5.1 surround sound? I have that at home when I watch movies!” Why even bother then with an idea like 3D music if we can already experience something like surround sound?

Well, surround sound still doesn’t really get us there. To really create a natural environment we would need much higher spatial resolution, i.e. many more than five speakers. Imagine standing in your living room. There are perhaps as many as 100 speakers arranged around you in the shape of a large sphere. Now we’re talking. And this is exactly what some specialized facilities like the Sonosphere provide. If you ever have a chance to go listen there, you should. Clearly, however, creating a setup like this at home is far too expensive for your average listener. “Um, what about that second solution?” you say while wondering how many $20 bills you wish to part with.

Our second solution involves some true human ingenuity. Using clever algorithms and headphones, we trick the brain into thinking sounds are coming from a particular direction. Just like we can create different types of audio filters that allow lower or higher frequencies to reach our ears, we can create audio filters that shape the frequencies the way our outer ear, head, and shoulders naturally do for a sound coming from a particular direction. This mimics the way sounds arrive at our ears from that direction, and our brain thinks that is where the sound came from. Amazing! This type of filter is called a Head-Related Transfer Function (HRTF), and audio of this kind is called binaural audio.
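
To give a feel for how such a trick works, here is a heavily simplified sketch. It is not a real HRTF: a real one is a pair of measured, direction-dependent filters applied to the audio. This sketch only imitates the two crudest cues, the later arrival and lower level at the far ear, using Woodworth's spherical-head formula for the delay. The head radius, the far-ear attenuation curve, and all function names are my own assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed: dry air at about 20 °C
HEAD_RADIUS_M = 0.0875       # assumed average head radius

def interaural_time_difference(azimuth_deg):
    """Woodworth's spherical-head estimate of the arrival-time gap in seconds.

    azimuth_deg: 0 = straight ahead, +90 = hard left, -90 = hard right.
    Only meaningful for sources in front of the listener (-90..90).
    """
    theta = math.radians(abs(azimuth_deg))
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

def crude_binaural(mono, azimuth_deg, sample_rate=44100):
    """Return (left, right) sample lists with a delay and level difference."""
    delay = round(interaural_time_difference(azimuth_deg) * sample_rate)
    # Arbitrary illustrative attenuation: the far ear is up to 50% quieter.
    far_gain = 1.0 - 0.5 * math.sin(math.radians(abs(azimuth_deg)))
    near = list(mono) + [0.0] * delay                   # arrives first, full level
    far = [0.0] * delay + [s * far_gain for s in mono]  # arrives later, quieter
    return (near, far) if azimuth_deg >= 0 else (far, near)

# A single click placed hard left: the right ear hears it later and quieter.
left, right = crude_binaural([1.0, 0.0, 0.0], 90)
```

Played over headphones, even this crude version pulls a sound to one side more convincingly than volume panning alone. What it cannot do is distinguish front from back or up from down; that is the job of the frequency shaping a measured HRTF provides.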

So finally, we have a reasonable way to recreate our natural 3D sound experience. The main caveat is that we need to wear headphones. This is a small price to pay (literally, compared to buying dozens of speakers) for an exciting audio experience. So why is all the music we currently listen to still in mono or stereo?

How to Make 3D Audio

Even though this technology has existed for a long time, only in the last few years have many tools become readily available to format audio in 3D. I’ll cover just the basics here and perhaps go into more depth in another article. There is a wide world of different formats and tools and much to talk about.

At a high level, there are a couple of different approaches to recording 3D audio. First, you can record audio live with a special microphone such as the Zoom H3-VR, which actually uses multiple microphone capsules to capture the sound and preserve its original spatialization. For example, someone might use a microphone like this to record a natural environment, preserving the sound of the birds above or the babbling brook to the right of the microphone. Similarly, a microphone like this could be placed in the center of a group of musicians to simulate the experience of sitting where the microphone is. Using a ready-made microphone is great because it’s similar to using a standard microphone. But keep in mind that the quality of the 3D effect tends to rise with the cost of the microphone. Microphones that can give high spatial resolution and a more lifelike experience, like the Zylia PRO, are quite pricey.

Secondly, instead of using a special microphone for a single live recording, we can take a mono recording and actually “position” the sound in 3D space using software. This again uses the clever algorithms and HRTF filters we discussed above. And while it’s a bit more work than just setting up a microphone and pressing record, there are many tools available to do this, such as the free Sennheiser AMBEO Orbit.

The ability to position sounds opens up many exciting possibilities for different ways to experience music and audio. One thing that is particularly cool about positioning sound sources in 3D space is that the position can evolve over time. Just like how we hear the changing sound of a plane that flies overhead, we can synthetically recreate that experience with a cello sound using these tools. In my music compositions, for instance, I actually play live AND use my custom tools to dynamically position sounds in real-time. This allows me to improvise compelling musical textures and create new ways to experience music. It’s amazing to think of sound location as a compositional device. I believe it can enrich the emotional experience of the music.
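
As a rough illustration of a position that evolves over time, the sketch below moves a source in a straight line over the listener's head, like the plane flyover, and reports the direction at each step. It is not my performance software, just a minimal example. The function name `flyover_path` and the axis convention (+x in front, +y to the left, +z up) are assumptions made for this sketch.

```python
import math

def flyover_path(start, end, steps):
    """Directions of a source moving in a straight line from start to end.

    start, end: (x, y, z) positions relative to the listener, assuming
    +x is in front, +y is to the left, and +z is up.
    steps: number of positions to sample (at least 2).
    Returns a list of (azimuth_deg, elevation_deg, distance) tuples.
    """
    path = []
    for i in range(steps):
        t = i / (steps - 1)   # 0.0 at the start, 1.0 at the end
        x, y, z = (a + (b - a) * t for a, b in zip(start, end))
        azimuth = math.degrees(math.atan2(y, x))
        elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
        distance = math.sqrt(x * x + y * y + z * z)
        path.append((azimuth, elevation, distance))
    return path

# A source passing 5 units overhead, from 10 in front to 10 behind.
path = flyover_path((10.0, 0.0, 5.0), (-10.0, 0.0, 5.0), 5)
for azimuth, elevation, distance in path:
    print(f"azimuth {azimuth:6.1f}  elevation {elevation:5.1f}  distance {distance:5.2f}")
```

A positioning tool would re-render the sound with the filter for each new direction as the source moves. Feeding it a path like this one, or one driven by a live controller, is what turns location into something you can perform.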

Conclusion

I hope you have a better understanding of why 3D audio holds a lot of promise for next generation music. Please contact me if you have thoughts or questions. Thanks for reading.