No Jitter is part of the Informa Tech Division of Informa PLC


Speech Tech for Enterprise... With a Caveat or Two

An Evolving Market: Some Players
Harris cautioned, however, that enterprise expectations might be ill-aligned with reality. "I think the more interested and optimistic a customer is, the more disappointed it often ends up being -- because it's really difficult stuff. There's no silver bullet that's going to make it all come together."

For example, ambiguous words still trip up natural language processors, and most voice-based assistants still can't handle compound commands ("Alexa, start my meeting and open my presentation deck"). "No one has demonstrated a product that you can have a conversation with about anything," Harris said.

Speech tech holds its true value for tactical and logistical tasks, he added. "When you try to create more of a human element with speech tech, that's when it leads to frustration."



A slew of other companies are carving out niches in enterprise speech with an eye on improving the overall experience. In speech recognition, for example, Fluent.AI offers what it calls "acoustic speech recognition." By this, the company means it bypasses speech-to-text transcription and "goes directly from speech to intent," company CEO Niraj Bhargava told me. This differs from other speech recognition services, which require speech to be converted to text first before the AI system can learn intents and understand what's being said, he said.
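As a rough illustration of the distinction Bhargava draws, the two architectures differ in where intent classification happens. This toy sketch is purely illustrative and is not Fluent.AI's actual implementation; real systems use trained acoustic and language models, not the keyword matching stand-ins below.

```python
# Toy contrast: speech -> text -> intent vs. direct speech -> intent.
# "audio_features" here is just a list of words standing in for raw audio.

def transcribe(audio_features):
    # Stage 1 of the conventional pipeline: speech -> text.
    # Stand-in for a real speech-to-text model.
    return " ".join(audio_features)

def text_to_intent(text):
    # Stage 2 of the conventional pipeline: text -> intent.
    # Stand-in for a real natural language understanding model.
    if "meeting" in text:
        return "start_meeting"
    if "support" in text:
        return "route_to_support"
    return "unknown"

def conventional_pipeline(audio_features):
    # Two stages chained together; any transcription error in stage 1
    # propagates into the intent decision in stage 2.
    return text_to_intent(transcribe(audio_features))

def direct_speech_to_intent(audio_features):
    # One stage: classify the acoustic input directly to an intent,
    # never producing an intermediate transcript at all.
    intent_cues = {"meeting": "start_meeting", "support": "route_to_support"}
    for feature in audio_features:
        if feature in intent_cues:
            return intent_cues[feature]
    return "unknown"

audio = ["start", "my", "meeting"]  # stand-in for captured audio
print(conventional_pipeline(audio))    # start_meeting
print(direct_speech_to_intent(audio))  # start_meeting
```

The practical argument for the direct approach is that it removes the transcription step as a source of compounding error; the trade-off is that the model must learn intents from acoustic data rather than reusing off-the-shelf text NLU.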


Voice analytics players like VoiceBase and TalkIQ, which do things like real-time transcription and sentiment analysis, also aim to deliver insight from voice – often in conjunction with other products.

Indeed, most enterprise speech offerings take the form of add-ons to existing solutions or are platforms purpose-built for other applications, Arnold pointed out. AISense, for example, provides the speech-to-text transcription capability that cloud video provider Zoom has incorporated in its platform for meeting recordings. And Gong uses VoiceBase transcription technology to deliver conversational intelligence to sales teams.

One beneficiary of Gong's speech integration is Allbound, which provides software aimed at helping businesses build successful partner and referral programs. Allbound's sales team uses Gong to improve sales conversions and guide sales members with intelligence on conversations they are having with prospects, Greg Reffner, VP of sales at Allbound, told me in a briefing last fall.



With the introduction of Gong's speech technology, Allbound has brought consistency, process, and transparency to its sales operations, shortening its sales cycle by 60 days, Reffner said. Not only that, but Allbound has quadrupled its sales rate. "These changes happened when we started having visibility, and we didn't get that visibility until Gong," he said.


Along these same lines, Genesys has integrated its PureCloud contact center solution with Amazon Lex, a service for building conversational interfaces. Genesys is using Lex to create a more conversational and intelligent IVR so that customers can speak more naturally when navigating support options. And real-time cloud communications platform Voximplant leverages the Google Cloud Speech API to help other businesses build voice and video applications.

"Before the Speech API, and speech to text, you could only get digit responses from callers [press 1 for X, press 2 for Y...]," Voximplant CEO Alexey Aylarov told me in a briefing last fall. "Now people can talk to a bot going through a scenario, or script -- voice bots are really valuable. It's a new experience, so some people don't even understand they are talking to a robot."

But it's still early days for Voximplant and its use of speech tech. "We started building an analytics platform around it as well, but right now it's not a big part of our business. But I have a strong feeling that this [speech tech] part will be growing much faster than other parts," Aylarov said. "It's not a free service, so we need to be able to sell this to our customers; eventually, I believe usage will be growing exponentially."

Indeed, the skepticism that abounded five years ago about speech technology in the enterprise has dissipated, MindMeld's Tuttle said, recounting how at that time "accuracy was very hit or miss, and people didn't feel comfortable or were creeped out by having a machine listening."

With people using speech technology on a regular basis in their personal lives, "we are rapidly moving towards a world where most users will expect to be able to talk to a device if they prefer that mode of interaction," he added. "Those same behaviors will come into the workplace."

Come explore speech tech with us at Enterprise Connect 2018, taking place March 12-15 in Orlando, Fla. In addition to Robert Harris's session, "Are Speech Technologies Ready for the Enterprise?," and Jon Arnold's session, "Tech Tutorial: Speech Technologies for the Enterprise," we have a whole track on Speech Technologies for you to explore, featuring a deep-dive tutorial, enterprise end user panel, and our annual Innovation Showcase. Register now using the code NOJITTER to save an additional $200 off the Early Bird Pricing or get a free Expo Plus pass.

Follow Michelle Burbick and No Jitter on Twitter!