No Jitter is part of the Informa Tech Division of Informa PLC


Speech Tech Update: The New Voice Technologies


As Enterprise Connect 2021 approaches, I look forward to preparing my annual update on the state of speech technology in the enterprise. AI continues to drive most of the innovation around speech tech, and while the hot spots have always been with consumers and contact center applications, the enterprise continues to offer its share of interesting use cases.
Last year, I laid the foundation for what I’ve been calling the “new voice.” Many of these applications have already become mainstream, so this year I’m going to take an even more forward-looking view of this space. We all know how much faster technology changes now, and a lot has happened over the last year.
The pandemic gave rise to touchless experiences, and led to a dramatic acceleration of cloud migration plans — just two of the factors behind my current mantra that voice is “bigger than ever.” This year at Enterprise Connect, I’ll address why voice is so relevant for enterprises, and explore three core themes.
Theme 1 – Why Voice Is Growing
Voice remains the killer app in the enterprise, mainly because it’s so central to how we communicate and collaborate. This factor is easy to underestimate when filtered through the lens of telephony, where desk phones are declining in utility. To whatever extent legacy PBX thinking persists in IT circles, telephony is only one use case for voice, and today’s state of AI-driven speech technology has opened up new avenues that have nothing to do with dial tone, PRIs, and minutes.
Consider Microsoft’s move to acquire conversational AI provider Nuance Communications, a deal announced in April that has far-reaching implications. With this $19.7 billion acquisition, Microsoft is sending a loud signal about how important voice technology is to its success. I’ll be unpacking this during my session, but for purposes of this post, it’s an important anchor for the idea that voice is bigger than ever.
Since the pandemic began, voice has only grown stronger, not just as UCaaS adoption proliferates across all sizes of business, but also due to the ascendancy of video — which in some ways has replaced the phone call. We may think of video as a visual channel, but it’s voice-based as well, and when it comes to UCaaS, voice is the stickiest application, without which these platforms will have limited utility.
To clarify, these use cases for voice aren’t really driven by AI and speech tech, but in time, they will be. All the major UCaaS providers are on this path now, and last year’s cutting-edge capabilities (noise suppression, real-time transcription, and translation) have become standard collaboration features.
Other developments, however, are taking speech tech in new directions that are only just starting to emerge for enterprise use. Some will enrich the UCaaS value proposition and make hybrid work sustainable, but others will be outside the collaboration realm and will drive business value in other ways. Examples I’ll be addressing include 5G’s impact, smart speakers and wearable tech, conversational AI, and immersive technologies related to virtual/augmented/mixed reality.
Theme 2 – How Voice Is Making Collaboration Better
During 2020, the pandemic gave rise to many speech tech applications that make hybrid work possible, and I outlined them in detail last year. The underlying idea is that we can speak faster than we can write, and that voice conveys clarity and nuance in ways that text cannot.
Building on these factors, AI-driven speech tech enables us to communicate more efficiently, and with greater impact, especially with larger groups. By removing friction from communication and processes, voice-centric modes of working are showing their value. The most familiar variations would include speech-to-text, text-to-speech, automatic speech recognition, and voice biometrics.
With a year or two of experience to build upon, these applications will only grow in value for the enterprise. Not only do they perform at a higher level as machine learning algorithms make them better over time, but they scale easily as more workers adopt them. The main idea here is that AI continually improves as it learns the habits, patterns, preferences, idiosyncrasies, etc. of each and every worker.
These applications not only help improve personal productivity and team effectiveness, but also make collaboration solutions like UCaaS accessible to a wider audience. During 2021, inclusivity has been a prominent messaging theme from the collaboration vendors, as they position themselves to help employers be more diverse.
With workplace demographics trending younger, enterprises must cater to what digital natives value, and inclusiveness is definitely part of that. AI-driven speech brings new value, not just by breaking down language barriers in real time to support global teams, but also by supporting workers who are visually or hearing impaired, or who have a speech impediment, for example. At Enterprise Connect, I’ll provide current examples of this, along with what the next wave of speech tech is bringing to collaboration, namely conversational AI.
Theme 3 – AI Reality Check
A common theme of my updates on speech tech is the need for enterprise decision-makers to think carefully about the AI component. Speech tech has been with us for decades, and until cloud and AI came along in recent times, this space was fairly mature. These new technologies have taken speech tech to new levels, and the pace of innovation over the past few years has been impressive.
At face value, this evolution is very promising, but as I always discuss, there are many caveats to consider on the AI side. The hype factor with AI is ever-present, and aside from offerings with well-defined use cases, it’s easy for new forms of speech tech to be solutions looking for problems. I’ll be providing current context around that, as IT leaders need realistic expectations around what AI can and cannot do with speech tech.
Furthermore, I’ll be explaining why the connection between AI and speech tech is complicated, especially for enterprise applications. Collaboration may be the use case Enterprise Connect attendees expect to hear about, but there are others. Some may sound futuristic for the enterprise, yet after attending my update, you may find them nearer than you think. In that regard, my intention is for attendees to see the bigger picture for how transformative AI-driven speech can be, and if you like what it’s doing for UCaaS, I think you’ll love what it’s going to do for the future of work in the enterprise.


Join Jon for his session, “The New Voice Technologies: How Speech, AI Are Creating New Value for Enterprises,” on Monday, Sept. 27, at 11:00 to 11:45 a.m. Interested, but not yet registered? Sign up now using the No Jitter promo code NJAL200 to save $200 off the current rate!