Expanding HD Voice to Audio Quality
"It is quite surprising that we still have to endure phone calls today at a level of quality that is rooted in the technical limitations of the early last century."
I have used HD voice (150 Hz to 7000 Hz) and I like it. I have also used full-sound-range audio (50 Hz to 20,000 Hz). It's even better. Why can't we have audio phone calls instead of HD voice calls?
I spoke with H.P. Baumeister, Director, Mobile and Communications Markets, Fraunhofer USA Digital Media Technologies, to discuss the future of HD voice. He refers to audio quality voice as Full HD voice. Fraunhofer is Europe's largest application-oriented research organization. Their research efforts are focused on issues around health, security, communications, energy, and the environment.
1. We have had POTS voice for the last 100 years. Why expand the analog voice bandwidth?
All our audio experiences today are digital, full bandwidth, "CD-quality", from broadcast sources (ATSC, cable, IPTV, FiOS, FTA/DVB, DAB or ISDB outside the U.S., even AM radio's "HD Radio/IBOC"), physical media (CD, DVD, BD), or when delivered over the Internet. No one would even consider anything but full audio bandwidth when delivering a rich media service, nor would the market accept anything less than that. That even applies to smartphones with increasingly capable camcorders.
The only glaring exception to this is classical telephony. It is quite surprising that we still have to endure phone calls today at a level of quality that is rooted in the technical limitations of the early last century. Add to that the observation that the quality did not get any better with the introduction of the mobile phone.
2. How does expanded bandwidth help in the conversations? (understanding, accent reduction, productivity, error reduction, customer retention)
If we had full audio bandwidth in telephony, not only would a phone call be much easier on the ears, it would also be much less stressful to follow a conversation, especially with higher-pitched voices, speech accents, and foreign languages in the mix. We would not need to spell words and, for that matter, would not be limited to speech in the first place. We could play music, sing, or whistle to illustrate something. Would it not be nice to not only talk, but also communicate an ambience, for example crashing waves when calling from the beach, or church bells in the background when visiting a Bavarian village, all this in Full HD quality?
Many studies discuss a new realism, a new "closeness" when making calls with higher audio bandwidth, resulting in longer calls, or at least in staying with the mobile call. No more "can I call you back on the landline?"
3. Your website references three voice bandwidths, POTS, HD voice and Full HD voice. What are the differences?
"POTS" is limited to 300 Hz to 3.4 kHz, so-called "HD Voice" covers 150 Hz to 7 kHz, and Full HD Voice has an audio bandwidth of at least 50 Hz to 14 kHz. For comparison, AM radio stations in the U.S. modulate up to about 10 kHz audio bandwidth.
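To put those three ranges side by side, here is a minimal Python sketch that computes the width of each band and the lowest sampling rate that could capture it. The band edges are the ones quoted above; the sampling-rate figure is simply twice the upper edge (the Nyquist criterion) and is an illustrative lower bound, not a statement about the rates any particular codec or network actually uses. The names `BANDS`, `band_summary`, and `SUMMARY` are made up for this example.

```python
# Audio bands quoted in the interview, in Hz: (lower edge, upper edge).
BANDS = {
    "POTS": (300, 3_400),
    "HD Voice": (150, 7_000),
    "Full HD Voice": (50, 14_000),
}


def band_summary(low_hz, high_hz):
    """Return (band width in Hz, minimum sampling rate in Hz).

    The minimum sampling rate is twice the upper band edge (Nyquist).
    Real systems sample somewhat faster than this theoretical floor.
    """
    width_hz = high_hz - low_hz
    nyquist_min_hz = 2 * high_hz
    return width_hz, nyquist_min_hz


SUMMARY = {name: band_summary(*edges) for name, edges in BANDS.items()}

for name, (width_hz, nyquist_hz) in SUMMARY.items():
    print(f"{name}: {width_hz} Hz wide, needs a sampling rate of at least {nyquist_hz} Hz")
```

By this rough measure, Full HD Voice carries about four and a half times the audio band of POTS.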
4. How is the Full HD voice codec technologically different than the other narrower-band voice codecs?
Most narrowband codecs are voice codecs today, which means they are optimized for "mainstream" voice. They are based on modeling a single human vocal tract. Everything but the voice is being distorted, often to the point of not even being able to recognize the original sound. Voice codecs inherently have problems with multiple voices, music, and ambient sounds. Noise cancellation with all its drawbacks is an absolute must.
Full HD Voice implies that audio codecs are used. Audio codecs are not limited to voice signals and, at least fundamentally, don't need noise cancellation. Audio codecs are the only codecs used in today's digital rich media world. The leading, most popular examples are MP3 and AAC. Full HD voice can run at bit rates from as low as 24 kbps up to 64 kbps.
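For a feel of what those bit rates mean in data terms, the arithmetic can be sketched in a few lines of Python. This counts only the encoded audio itself; packet headers and any network overhead are deliberately left out, and the function name `kilobytes_per_minute` is invented for this illustration.

```python
def kilobytes_per_minute(bitrate_kbps):
    """Encoded audio produced in one minute, in kilobytes (1 kB = 1000 bytes).

    Covers the codec payload only; transport overhead is not included.
    """
    bits_per_minute = bitrate_kbps * 1000 * 60
    return bits_per_minute / 8 / 1000


# The two ends of the bit-rate range quoted in the interview.
for rate_kbps in (24, 64):
    print(f"{rate_kbps} kbps is about {kilobytes_per_minute(rate_kbps):.0f} kB per minute of audio")
```

At the low end that is roughly 180 kB for a minute of call audio, and roughly 480 kB at the high end.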
5. Is an audio codec different than a Full HD voice codec?
The only major difference is the requirement for low latency in real-time communications applications. AAC and MP3 are not tuned for low latency and are not really suitable. We have developed special low-latency communications versions of AAC, called AAC-LD and AAC-ELD.