The problem is that virtual audio via headphones remains very difficult to achieve. What makes today different from the past is that researchers now understand the challenges fairly well, and that current DSP technologies have the computing power to do the job. The remaining task is to dig into all the bits and pieces of the problem, efficiently come up with practical solutions, then figure out how to get it all to work together. That last bit may be the hardest, because there are a lot of variables in this equation:
Individual HRTFs must be generated for each listener. The resulting file must be broadly compatible with equipment made by many manufacturers.
Headphones will have to be very well behaved in the time domain. Localization cues generated by DSP must be reproduced with no added garbage. Impulse responses must be very clean, or we won't be able to clearly hear the cues. (Actual frequency response is less problematic, as tonal-correction curves can be built into the headphones.)
Much of the intended use of these types of headphones is with smartphones and tablets. Decisions must be made about where the various computations are performed. The virtual audio reality may be rendered by the portable device, but it's likely that all the changing HRTF cues occasioned by head movements will be rendered in the headphones themselves, which will have significant computing power.
New types of content and formats will have to be standardized so that they can be interpreted by hardware from a variety of manufacturers.
Perhaps most difficult of all, headphones will have to be transparent to sound in the real environment around the listener. One of the major goals here is a mixed reality, in which you can interact with your normal environment as usual, while hearing artificial sounds superimposed on the real sounds. Imagine, if you will, the kids hearing a Pokémon giggling in the bushes as they search for it.
So, along with being a difficult technical problem, there is also a significant convergence problem. Numerous industry standards must be developed before creators can produce content with complex formats that can be transported to and rendered for individual consumers using a variety of devices.
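To give a feel for the head-movement item above, here is a minimal sketch of the smallest piece of that job: keeping a virtual source fixed in the room as the head turns, and recomputing one localization cue, the interaural time difference (ITD). It assumes Woodworth's textbook spherical-head approximation and an average head radius; a real system would use the listener's measured HRTFs instead. The function names, the constants, and the sign convention (positive azimuth to the right) are my own illustrative choices, not anything presented at the conference.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average head radius, in meters
SPEED_OF_SOUND = 343.0    # meters per second in air at about 20 degrees C


def relative_azimuth(source_az_deg, head_yaw_deg):
    """Azimuth of a room-fixed source relative to the nose, in (-180, 180].

    Positive angles are to the listener's right (assumed convention).
    """
    rel = (source_az_deg - head_yaw_deg) % 360.0
    return rel - 360.0 if rel > 180.0 else rel


def itd_seconds(rel_az_deg):
    """Woodworth spherical-head ITD; positive means the right ear leads."""
    az = math.radians(rel_az_deg)
    # A sphere cannot tell front from back, so fold rear angles onto the
    # front hemisphere: the lateral angle lies in [-pi/2, pi/2].
    lateral = math.asin(math.sin(az))
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (lateral + math.sin(lateral))


# A source straight ahead in the room; the listener turns 90 degrees right.
rel = relative_azimuth(0.0, 90.0)   # the source now sits hard left
itd = itd_seconds(rel)              # negative: the left ear leads
```

With these assumed constants, a source directly to one side yields an ITD of roughly 0.66 ms. Cues on that time scale are why the time-domain behavior of the headphones matters so much.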
*****
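I had always known that three-dimensional sound through headphones was tempting, but for a long time I thought the problem was just too complex and that it wouldn't happen any time soon. I hadn't yet put together in my head all the pieces I've described, and I doubted manufacturers would have the will to overcome the difficulty. After attending the AES conference, and digesting and reporting on many of the papers presented, I suddenly found myself believing that it would happen. There's just too much at stake: too many cool things to come from this technology, and too much money to be made by those who figure it out. As I sat in a room with 100 highly paid researchers, each on a mission to develop ways in which people will hear sound through headphones in the future, I could feel the industry's intense will to get this job done. Here's a description of the Headphone Conference, from AES's webpage:
"More than 300 million pairs of headphones were sold in 2015, and people are using headphones everywhere. The popularity of 'smart and wearable' devices has driven developments in low-power processors and sensors that are enabling the augmentation of headphones with features more typically associated with hearing aids or smartphones. Therefore, this conference will focus on technologies for headphones with a special emphasis on the emerging fields of Mobile Spatial Audio, Personal Assistive Listening, and Augmented Reality. This conference will assemble scientists, developers, and practitioners who are involved in any head-worn hearing technology, be it in theory, technical design, application or evaluation. The conference will enable an interdisciplinary dialogue across the headphone and hearing aid industries."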
Gaming and Pokémon stuff will certainly be profitable, but the real money is in developing something as ubiquitous as the smartphone. I draw your attention to the last sentence of the above description: "The conference will enable an interdisciplinary dialogue across the headphone and hearing aid industries."
One of the ideas discussed at the conference was that of personal assisted listening, in which the sounds around you are modified in some way to improve your sense of hearing. Here is the crossover with the hearing-aid industry: There are many cases in which those who enjoy normal hearing might find it nice to hear even more clearly. One technique described was the canceling of incoherent, diffuse noise and the augmentation of coherent sources—sound sources that are spatially well defined. Imagine sitting in a loud, crowded restaurant, talking with your friends: their voices will be nearby and spatially coherent; the din of the crowd will be diffuse. It's possible to suppress the background noise and augment the coherent sound of the friends sitting at your table, to allow you to clearly hear them even in such noisy environments.
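The coherent-versus-diffuse idea can be illustrated with a deliberately simplified sketch. It is not the method described at the conference, which would more plausibly work per frequency band. It only shows the principle: a nearby voice reaches both ears as nearly the same waveform with a small delay, while a diffuse din is largely uncorrelated between the ears. The function names, frame length, lag range, and gain floor are all assumptions of mine.

```python
import math
import random


def frame_coherence(left, right, max_lag=8):
    """Peak normalized cross-correlation of two equal-length frames, in [0, 1]."""
    energy = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if energy == 0.0:
        return 0.0
    n = len(left)
    best = 0.0
    # Search a few samples either way to allow for the interaural delay.
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        best = max(best, abs(acc) / energy)
    return min(best, 1.0)


def suppress_diffuse(left, right, frame_len=256, floor=0.1):
    """Scale each frame of both channels by its coherence, never below floor."""
    out_l, out_r = [], []
    for start in range(0, len(left), frame_len):
        l = left[start:start + frame_len]
        r = right[start:start + frame_len]
        gain = max(floor, frame_coherence(l, r))
        out_l.extend(gain * x for x in l)
        out_r.extend(gain * y for y in r)
    return out_l, out_r


# Toy signals: a tone that reaches one ear 3 samples late (the "friend"),
# and noise that is independent at each ear (the "crowd").
rng = random.Random(0)
tone = [math.sin(2 * math.pi * i / 32) for i in range(259)]
voice_l, voice_r = tone[3:], tone[:256]
noise_l = [rng.gauss(0, 1) for _ in range(256)]
noise_r = [rng.gauss(0, 1) for _ in range(256)]

voice_score = frame_coherence(voice_l, voice_r)   # close to 1
noise_score = frame_coherence(noise_l, noise_r)   # much lower
```

Frames dominated by the coherent source pass almost untouched, while frames dominated by the diffuse field are turned down toward the floor gain.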
Or imagine firefighters who could don special headsets that suppress the roar of the flames but augment the sounds of human voices, allowing them to more easily find survivors. Rescue workers might be able to use such "bionic" hearing to help them locate the muffled voices of people buried in the rubble of collapsed buildings. And, of course, there are military applications.
Taking it a step further: Those traveling in foreign countries could use smart headphones to hear on-the-fly English translations of what people are saying. Or, a step beyond that, cameras and autonomous-driving technologies could be combined in a headset that would allow the blind to follow a trail of sonic breadcrumbs as the headset listens and watches for obstacles and traffic lights.
In short, we'll stop thinking of headphones as a way to make phone calls and listen to movies and music, and start thinking of them much more as we do smartphones—as personal assistants, fitness-training aids, and reality enhancers.
Some of these devices will be full-size headsets with dropdown visual displays similar to that of Microsoft's HoloLens, which superimposes virtual visual objects on the real environment. Such devices will permit the mixing of aural and visual realities. But many such devices designed for everyday use will be more discreet, and mounted in the ear: mashups of hearing aids and in-ear monitors. They'll also likely have interchangeable cover plates of different colors and designs, to suit the user's and the day's fashion requirements.
What does all this have to do with traditional two-channel audiophiles? Not all that much. I'm talking about something that will happen five or ten years from now, and two-channel recordings aren't going to magically go away. But there are a few things that may affect the audio avocation we so love.
Just as, 30 years ago, the best-sounding headphones came out of the pro-audio market, it's likely there will be an early drive for professional virtual-audio systems needed in content creation. A couple of generations on, these systems might sound very good. We all know that room acoustics play an important role in the sound of a good stereo system; future headphone systems will be able to synthesize any number of room acoustics. While they may not replace a big, serious hi-fi rig, they may make a high-quality listening experience portable and at considerably lower cost, thus making high-fidelity sound available to more people, more of the time.
One of the profitable areas left to music producers today is live concerts. The same virtual audio/video headset hardware used for future gaming could also be used as a way to distribute pay-per-view "you are there" concert experiences. But the original content for these concerts (and music videos, computer games, and movies) will use an object-oriented encoding system similar to Dolby Atmos. In other words, sound won't be assigned to a number of audio channels to be reproduced by a matching array of loudspeakers, but will instead be represented as various movable aural objects emitting sound from positions in space. Additionally, such technology will synthesize the acoustic response of the room or space in which the sounds are made. For audiophiles, this means that there will be ever-increasing pressure for content created with or recorded in new spatial-audio formats, which would then have to be downmixed for replay through two-channel or surround systems.
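To make the channel-versus-object distinction concrete, here is a minimal sketch of the object idea and of a two-channel downmix. It is not Dolby's renderer, whose internals I am not describing here. It simply stores each sound with a position and decides only at playback how much of it goes to each speaker, using an ordinary constant-power pan law. The class name, field names, and azimuth convention are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AudioObject:
    samples: list        # mono audio belonging to this object
    azimuth_deg: float   # -90 is hard left, +90 is hard right (assumed convention)


def pan_gains(azimuth_deg):
    """Constant-power stereo gains, so left**2 + right**2 equals 1."""
    az = max(-90.0, min(90.0, azimuth_deg))
    angle = (az + 90.0) / 180.0 * (math.pi / 2)   # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)


def render_stereo(objects):
    """Mix every object into left and right buffers at playback time."""
    length = max((len(o.samples) for o in objects), default=0)
    left, right = [0.0] * length, [0.0] * length
    for obj in objects:
        g_l, g_r = pan_gains(obj.azimuth_deg)
        for i, s in enumerate(obj.samples):
            left[i] += g_l * s
            right[i] += g_r * s
    return left, right


# Two objects: a violin hard left and a cello dead center.
violin = AudioObject(samples=[1.0, 1.0], azimuth_deg=-90.0)
cello = AudioObject(samples=[1.0], azimuth_deg=0.0)
mix_left, mix_right = render_stereo([violin, cello])
```

Because the positions travel with the content, the same objects could be handed to a binaural headphone renderer or a surround array instead. The two-channel result is just one rendering among several, which is precisely the audiophile's worry.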
My inner audiophile cringes.
On the other hand, I could easily see world-class symphony orchestras making recordings with special soundfield microphones, recordings that would produce a very convincing immersive listening experience through a professional-quality virtual headphone system. I would argue that, given enough time for complete development, these systems might deliver a listening experience superior to two-channel or surround sound, as they can produce the illusion of sound coming from any direction, seemingly enveloping you. While high-end virtual audio systems may never sound as refined as the best two-channel systems, they may offer a heightened sense of immersion in a soundfield. About that, my inner headphone geek gets enthused.
And as long as I'm making predictions: There's currently only one company that controls a significant swath of the personal-audio market, from content sales through delivery to the hardware it's played through and the software to control it: Apple Inc. and its subsidiary Beats Electronics. Given the Apple ecosystem of content, software, and hardware, and Beats' extraordinary dominance in headphone sales, Apple is poised to develop and deliver a mixed-reality experience without the need for any industry standards other than those used in content creation. I suspect that this gives Apple a built-in lead of three years on everyone else. I don't expect that, in the long run, they'll end up being the best at it; I do expect that they'll do the job well enough for the average consumer, and that they'll do it first. It'll be the iPod/Beats by Dre phenomenon all over again.