Many of us first think about the visual component of the metaverse, but sound will have just as important a part to play in creating convincing virtual worlds.
As part of our digital Into the Metaverse event, I moderated a panel with Philip Rosedale of High Fidelity and Andy Vaughan of Dolby. I was able to ask them about the part that sound technology will play in the creation of the metaverse.
Creating good sound in the metaverse will be a challenge. Every user can be in the same virtual world but use different microphones and headphones. They will also have different internet connections, which will create varying degrees of sound latency.
We will need good sound quality from everyone. Rosedale notes that sound conveys emotion, citing a Yale study that showed people can identify emotions better with audio alone than with voice and face cams together. But he says that you will need high-quality sound for that emotion to come through.
“You’ve got to start with audio,” Rosedale says. “You’ve got to be able to hear the nuance in my voice.”
Mono problems
Vaughan added that we have seen something of a regression down the sound quality scale. Many mics still record only in mono. Bluetooth, for example, doesn’t work in stereo if you’re using a microphone. Those devices may be good for listening to music, but they aren’t great for voice chat. Consumer choice is good, but so many people using different devices of varying quality makes live sound engineering difficult.
“It would be interesting if we could get everybody on the exact same hardware all the time,” said Vaughan. But even at Dolby, where many employees have the same work laptops, personalization leads to everyone sounding different.
“It feels like we’re pushing rope a lot of the time,” Vaughan said. “We build these technologies, and we put them out there, but it takes that OEM (original equipment manufacturer) pull to make this stuff really shine out in the market.”
In other words, the technologies for better sound recording often do exist thanks to companies like Dolby and High Fidelity, but it is still down to hardware creators like Apple to invest in them and make them part of popular consumer products.
When we’re in the metaverse, it won’t be good enough to just hear a flat voice from a person’s avatar. We’ll need 3D audio and more to make those voices sound realistic and convey the full range of possible emotion. Without it, conversations in the metaverse will sound no more realistic than a Zoom conference.