No Jitter is part of the Informa Tech Division of Informa PLC


Speech Tech in the Enterprise: 3 Themes to Explore

Enterprise Connect 2019 is just a few weeks away, and during the conference I’ll be giving an update talk on the state of speech technology. Last year, I gave a speech tech 101 presentation, and this year I’m reviewing how the space has evolved since then.

I certainly have a lot to talk about, and this post serves as a preview of what to expect. If you’re trying to assess where and how speech technology can bring new business value to your workplace, you’ll want to hear about three themes I’ll be addressing.


Theme 1: Speech Is Riding the AI Wave
Artificial Intelligence (AI) is arguably the most important trend in technology, and it builds nicely on the uber-trend preceding it -- cloud. The two very much go hand in hand, with both now being ready for prime time in the enterprise. To be sure, AI remains surrounded by a cloud of hype, but things have progressed far enough that enterprises can no longer dismiss it as being more in the realm of science fiction than in the business sphere.


In terms of speech technology, AI-based initiatives are much further along in the contact center, largely because that’s where the utility and urgency are greatest. This strand of AI will get plenty of attention during the Contact Center & Customer Experience track, as well as on the mainstage, so I’ll stick to the enterprise, where use cases are different and still fairly nascent.


Over the past year, we’ve been primed by the likes of Amazon Alexa and Google Home for a new category that’s based on AI-driven speech. This “category” is a mix of three things, all of which are poised to be adopted in the workplace almost as quickly as they have been in our homes. First, it is device-based: most interactions occur with a new type of endpoint, such as Amazon Echo. In addition, existing endpoint vendors are integrating speech applications in their products -- and these purpose-built devices are part of what makes this a new category.


Second, these speech applications are tied to a specific service or offering -- either as part of a full-featured collaboration platform like Cisco Webex Teams, or as a standalone application such as transcription or translation. Third, speech is now being put to use for real-time communications, and in some cases, it’s conversational. With AI in the picture, this now becomes person-to-machine interaction, representing a new type of experience using speech.


Out of all this, a new shorthand is emerging under the moniker of digital assistant or intelligent assistant. The terminology will likely remain muddled until the space matures, and it’s not much different from all the labels we used to describe Slack when it hit the market. For now, let’s focus on the business value of speech, and in time, the language to describe it will evolve.


Theme 2: The Use Cases Are There
AI is as shiny as shiny balls come, and it’s still early days, but speech in the enterprise is being driven by real use cases. This is probably the most important development since last year, when most of the attention was on how good speech recognition has become. Earlier generations of technology were only effective to a point, but AI has taken speech to another level. Whereas previously the goal was to achieve relatively good accuracy, cloud and AI have demonstrated scary-good results -- but we’re still far from perfect.


More importantly, these technologies have added intelligence to the mix, where the capabilities are now about context, intent, and predictive analytics. With AI, speech accuracy is the starting point, not the endgame. Building on that, we can now use speech to automate tasks, streamline processes, and manage data more effectively, both for personal productivity and working in teams.


At Enterprise Connect, I’ll focus on three core use cases, namely speech-to-text, text-to-speech, and automatic speech recognition. The possibilities for each are limitless, and there are plenty of examples now in enterprise settings. Most are fairly basic, but as the track record builds, the applications will expand. As such, it would be a mistake to write off enterprise speech just because the applications seem simple.


First of all, AI adoption is all about trust, and that has to be built from the bottom up, one application at a time. When basic applications are done well, it’s much easier to move on to the next level, and as with all new technologies, end users will play a big role in identifying the new use cases. To illustrate, I’ll present various examples of enterprise speech applications, coming both from household names and from new companies you may not yet know.


Theme 3: Aside from Answers, There Are Important Questions
On its own, AI raises all kinds of questions about security, privacy, job displacement, ethics, Big Brother, etc. I touched on this last year, but the issues are even more acute for speech applications. This should be self-evident in the contact center realm, where the AI-driven customer journey can get pretty creepy. The first take on Google Duplex really drove this home, and even if you take the don’t-be-evil mantra at face value, bad actors and technology still have plenty of room to run amok.


My intention is to provide an update about the good things happening now with speech tech in the enterprise. That said, analysts bring value by providing balance, so the darker side must be considered, even if just for a few minutes. Putting AI’s current limitations aside, when making decisions about enterprise technology, the bigger picture matters. AI-driven speech can certainly help improve productivity, enable better and faster decisions, and drive out some costs via automation.


All of these outcomes have valid business value, but what might be lost along the way? GDPR is just the first step to protect our digital privacy, but more thought is needed to ensure the machines don’t “win” as AI becomes mainstream. I’ll touch on some of those questions at Enterprise Connect, during my speech tech for the enterprise update on Tuesday, March 19, at 8:00 a.m., and my hope is that they will lead you to ask many more.
Till then, tune in to the latest episode of No Jitter on Air for more of my guidance on speech tech in the enterprise.



If you haven’t registered for Enterprise Connect yet, no worries. Register now using the code NJPOSTS, and as a No Jitter reader you can save $200 at checkout.