Google today open-sourced Lyra in beta, an audio codec that uses machine learning to produce high-quality voice calls. The code and a demo are available on GitHub; the codec compresses raw audio down to 3 kilobits per second for “quality that compares favorably to other codecs,” Google says.
While mobile connectivity has steadily increased over the past decade, the explosive growth of on-device compute power has outstripped access to reliable, fast internet. Even in areas with reliable connections, the rise of work-from-anywhere and telecommuting has stretched data limits. For example, early in the pandemic, nearly 90 of the top 200 U.S. cities saw internet speeds decline as bandwidth became strained, according to BroadbandNow.
Google asserts that Lyra can make a difference in these scenarios.
Lyra’s architecture is split into two pieces: an encoder and a decoder. When someone talks into their phone, the encoder captures distinctive attributes, called features, from their speech. Lyra extracts these features in 40-millisecond chunks, then compresses them and sends them over the network. The decoder’s job is to convert the features back into an audio waveform that can be played back on the listener’s phone.
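To picture the flow Google describes, think of a per-frame loop: chop the captured audio into 40-millisecond chunks, turn each chunk into a small feature packet, send the packet, and have the receiver synthesize a waveform from it. The sketch below is illustrative only; the sample rate, packet size, and every class and function name (FeaturePacket, ExtractFeatures, SynthesizeFrame) are assumptions for explanation, not Lyra’s actual API.

```cpp
// Illustrative frame-based codec pipeline, as described above.
// All names here are hypothetical placeholders, not Lyra's real API.
#include <cstdint>
#include <iostream>
#include <vector>

constexpr int kSampleRateHz = 16000;  // assumed sample rate
constexpr int kFrameMs = 40;          // 40 ms chunks, per the article
constexpr int kSamplesPerFrame = kSampleRateHz * kFrameMs / 1000;  // 640 samples

// A compressed packet of speech features for one 40 ms frame.
struct FeaturePacket {
  std::vector<uint8_t> payload;  // roughly 15 bytes per frame at 3 kbps
};

// Encoder side: extract and compress distinctive attributes ("features").
FeaturePacket ExtractFeatures(const std::vector<int16_t>& frame) {
  // Placeholder: a real encoder would compute and quantize speech features.
  return FeaturePacket{std::vector<uint8_t>(15, 0)};
}

// Decoder side: reconstruct a playable waveform from the received features.
std::vector<int16_t> SynthesizeFrame(const FeaturePacket& packet) {
  // Placeholder: a real decoder would generate audio from the features.
  return std::vector<int16_t>(kSamplesPerFrame, 0);
}

int main() {
  // Pretend this is one second of captured microphone audio.
  std::vector<int16_t> captured(kSampleRateHz, 0);

  std::vector<int16_t> playback;
  for (size_t start = 0; start + kSamplesPerFrame <= captured.size();
       start += kSamplesPerFrame) {
    std::vector<int16_t> frame(captured.begin() + start,
                               captured.begin() + start + kSamplesPerFrame);
    FeaturePacket packet = ExtractFeatures(frame);        // sender: encode
    // ... packet travels over the network ...
    std::vector<int16_t> out = SynthesizeFrame(packet);   // receiver: decode
    playback.insert(playback.end(), out.begin(), out.end());
  }
  std::cout << "Decoded " << playback.size() << " samples\n";
}
```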
According to Google, Lyra’s overall architecture is similar to that of traditional audio codecs, which form the backbone of internet communication. But while those codecs rely on digital signal processing techniques, Lyra’s key advantage is its decoder’s ability to use machine learning to reconstruct a high-quality signal from the compressed features.
Google believes there are a number of applications Lyra might be uniquely suited to, from archiving large amounts of speech and saving battery to alleviating network congestion in emergency situations.
“We are excited to see the creativity the open source community is known for applied to Lyra in order to come up with even more unique and impactful applications,” Google Chrome engineers Andrew Storus and Michael Chinen wrote in a blog post. “We [want] to enable developers and get feedback as soon as possible.”
The Lyra code is written in C++ and built with Bazel. The core API provides interfaces for encoding and decoding at the file and packet levels, and the complete signal-processing toolchain, including filters and transforms, is also provided. Google’s example code integrates with the Android NDK to show how Lyra can work with Java-based Android apps, and Google has also provided the weights and vector quantizers needed to run Lyra.
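As a rough illustration of the Android NDK integration mentioned above, a native C++ codec is typically exposed to a Java-based app through a thin JNI shim like the one sketched here. The package, class, and EncodeFrameToPacket helper names are assumptions for illustration, not the structure of Google’s example code.

```cpp
// Illustrative JNI shim showing how a native C++ codec can be called from a
// Java-based Android app via the NDK. The Java class/package names and the
// EncodeFrameToPacket helper are hypothetical, not Lyra's actual example code.
#include <jni.h>

#include <cstdint>
#include <vector>

// Hypothetical native helper standing in for the codec's packet-level encode.
static std::vector<uint8_t> EncodeFrameToPacket(const int16_t* samples,
                                                size_t num_samples) {
  // Placeholder: a real implementation would call into the codec library.
  return std::vector<uint8_t>(15, 0);
}

extern "C" JNIEXPORT jbyteArray JNICALL
Java_com_example_lyrademo_LyraBridge_encodeFrame(JNIEnv* env, jobject /*this*/,
                                                 jshortArray java_samples) {
  // Copy the 16-bit PCM samples out of the Java array.
  jsize num_samples = env->GetArrayLength(java_samples);
  std::vector<int16_t> samples(num_samples);
  env->GetShortArrayRegion(java_samples, 0, num_samples,
                           reinterpret_cast<jshort*>(samples.data()));

  // Encode one frame and hand the compressed packet back to Java.
  std::vector<uint8_t> packet = EncodeFrameToPacket(samples.data(), samples.size());
  jbyteArray result = env->NewByteArray(static_cast<jsize>(packet.size()));
  env->SetByteArrayRegion(result, 0, static_cast<jsize>(packet.size()),
                          reinterpret_cast<const jbyte*>(packet.data()));
  return result;
}
```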
“This release provides the tools needed for developers to encode and decode audio with Lyra, optimized for the 64-bit ARM android platform, with development on Linux,” Storus and Chinen continued. “We hope to expand this codebase and develop improvements and support for additional platforms in tandem with the community.”