Words mean more than what is set down on paper. It takes the human voice to infuse them with deeper meaning.
-- Maya Angelou
It should come as no surprise that we’ve entered the age of bots. These intelligent virtual assistants help us order food, request rides, fall asleep, and stay on top of the latest news stories. I’ve created prototype bots for healthcare, transportation, and parks and recreation. With recent advancements in speech recognition, natural language processing, machine learning, and speech synthesis, it’s becoming nearly impossible to know whether you’re conversing with a person or a bot.
Reasons for the proliferation of bots are numerous, but high on the list is the desire for businesses to lower their costs by moving predictable, repetitive tasks away from live agents to lower-cost, always-on machines. While the average call center worker will quickly tire of answering the same questions over and over again, a bot’s cheerful demeanor never wilts or fades. Adding to that is the bot’s willingness to work nights, weekends, and holidays. A bot never calls in sick or comes to work with a bad attitude.
Regardless of how sophisticated bots get, there’ll always be a need for a living, breathing person to tackle complex problems. It’s easy to recite a hospital’s pharmacy hours, but it’s much harder to triage a potentially serious health problem.
Despite speculation about the death of voice calls, 39% of the 5,000 consumers Microsoft surveyed globally for its 2018 State of Global Customer Service report rank the telephone as their number one communications channel for customer service. And yet, the majority of bots implemented today are strictly text-based. This presents a disconnect. Either customers are left to use a communication method (text) that’s not as comfortable to them as a phone call, or contact center agents get tied up answering calls about queries that are best suited for bots.
The AudioCodes Voice.AI Gateway
AudioCodes recognized this conundrum and concluded that there’s no good reason why old-fashioned telephone calls can’t reap the many benefits that bots provide while still enabling an agent to step in when a bot’s capability has been exceeded.
With AudioCodes’ Voice.AI Gateway, an enterprise can apply the same technologies that it has deployed for text bots (SMS, Webchat, Facebook Messenger, etc.) to incoming and outgoing telephone calls. This means that a bot developed using Google, Amazon, or Microsoft tools can be voice-enabled and called from any telephone, unified communications system, or WebRTC endpoint in the world.
The architecture of a Voice.AI Gateway is fairly straightforward. It sits between the voice, bot, and cognitive services worlds. It uses standard SIP to communicate with carriers, contact centers, and enterprise UC systems, and Web services to connect bots to Web platforms. As is the nature of any gateway, it forms the bridge between two disparate technologies that are otherwise incompatible.
Administrators can choose among different text-to-speech, speech-to-text, and bot frameworks as they see fit for their use case. For instance, one speech synthesis platform may be better than another for particular languages or dialects. The Voice.AI Gateway allows for best-of-breed mixing and matching.
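To make the mixing and matching concrete, here is a minimal sketch of how a per-language selection might be modeled. It is not the Voice.AI Gateway’s actual configuration format, which this article does not describe. The `ServiceSelection` class, the `select_services` function, and the provider labels are all hypothetical names chosen for illustration, and the pairings are arbitrary rather than recommendations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSelection:
    """Hypothetical record of which provider handles each stage of a call."""
    speech_to_text: str
    bot_framework: str
    text_to_speech: str


# Assumed default: every stage handled by the same provider label.
DEFAULT_SERVICES = ServiceSelection(
    speech_to_text="azure",
    bot_framework="azure",
    text_to_speech="azure",
)

# Assumed overrides: a different speech synthesis provider for one language.
# The pairing is arbitrary and only illustrates the idea.
SERVICES_BY_LANGUAGE = {
    "fr-FR": ServiceSelection(
        speech_to_text="azure",
        bot_framework="azure",
        text_to_speech="aws",
    ),
}


def select_services(language: str) -> ServiceSelection:
    """Return the override for a language, or the default if none exists."""
    return SERVICES_BY_LANGUAGE.get(language, DEFAULT_SERVICES)
```

Calling `select_services("fr-FR")` returns the override with a different text-to-speech provider, while any language without an override falls back to `DEFAULT_SERVICES`. The bot framework stays the same in both cases, which is the point of swapping only the speech stages.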
When I first looked at the Voice.AI platform, it reminded me of a session border controller (SBC). AudioCodes told me that’s because its SBC technology is at the core of the gateway. That shouldn’t come as a surprise since an SBC’s job is to connect IP calls with IP services. Having SIP calls “talk” to bot services in the cloud — Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP) — is simply the next logical step in unifying communications.
The benefits of this unified approach are many:
- Bring the most intuitive form of human communications (voice) into a bot service
- Maintain existing bot user flows and scripts
- Easily migrate bots onto voice engagement channels using a single solution
- Avoid complex integrations with voice networks by utilizing the voice communications capabilities embedded in the Voice.AI Gateway
- Connect to any third-party bot, speech-to-text, or text-to-speech service (Azure, AWS, GCP, etc.)
- Support best-of-breed selection of bot frameworks and cognitive voice services
- Use advanced call management (disconnect, transfer to agent, call recording, etc.)
A step-by-step voice-enabled bot flow looks like this:
- The customer poses a question.
- The Voice.AI Gateway streams the customer’s voice to a speech recognition service.
- The speech recognition service returns a text transcription of the customer’s question.
- The gateway sends the text to the bot platform (Google, Amazon, Microsoft, etc.).
- The bot replies with an answer.
- The gateway sends the textual reply to the text-to-speech engine for conversion to speech.
- The engine returns the speech to the gateway.
- The customer hears the bot’s answer.
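The steps above can be sketched as a small pipeline with three pluggable stages. This is an illustration of the flow only, not AudioCodes code: `VoiceTurnPipeline`, `handle_turn`, and the three `fake_*` functions are hypothetical stand-ins, the “audio” is just UTF-8 bytes so the example runs without any speech service, and the pharmacy hours are made-up toy data.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; real services would be network calls.
SpeechToText = Callable[[bytes], str]
Bot = Callable[[str], str]
TextToSpeech = Callable[[str], bytes]


@dataclass
class VoiceTurnPipeline:
    """One question-and-answer turn: audio in, audio out."""
    speech_to_text: SpeechToText
    bot: Bot
    text_to_speech: TextToSpeech

    def handle_turn(self, caller_audio: bytes) -> bytes:
        # Caller audio goes to the speech recognition stage.
        question_text = self.speech_to_text(caller_audio)
        # The recognized text goes to the bot, which replies in text.
        answer_text = self.bot(question_text)
        # The textual reply is converted to speech for the caller.
        return self.text_to_speech(answer_text)


def fake_stt(audio: bytes) -> str:
    """Toy recognizer: treats the 'audio' bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_bot(text: str) -> str:
    """Toy bot with a single intent and invented opening hours."""
    if "pharmacy hours" in text.lower():
        return "The pharmacy is open from 9 to 5."
    return "Sorry, I did not understand that."


def fake_tts(text: str) -> bytes:
    """Toy synthesizer: returns the reply text as UTF-8 bytes."""
    return text.encode("utf-8")


pipeline = VoiceTurnPipeline(fake_stt, fake_bot, fake_tts)
reply_audio = pipeline.handle_turn(b"What are the pharmacy hours?")
```

Note that `fake_bot` only ever sees and returns text. Swapping the speech stages, or removing them entirely for a text channel, requires no change to the bot.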
From the bot’s standpoint, it has no idea if it’s communicating with a landline, a cell phone, a WebRTC-enabled browser, or Facebook Messenger. In AI terminology, the bot’s entities, dialogs, and intents are the same. This allows bot developers to “write once and deploy many.”
As previously stated, a bot has its sweet spot, and there are times when a call needs to be escalated to a live agent. Since at its core the Voice.AI Gateway is an SBC, transferring to an agent is as simple as redirecting the call to an enterprise’s contact center. This allows for a seamless transition from bot-assisted to human-assisted customer support. While not available in the first release of the Voice.AI Gateway, attaching a transcript of the customer-to-bot conversation to the call is on the roadmap.
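As a rough sketch of the escalation decision, the logic might look something like the following. The article does not say how the Voice.AI Gateway decides when to transfer, so everything here is an assumption: the `BotReply` and `CallAction` types, the `needs_agent` flag, and the failed-turn threshold are hypothetical and serve only to show the shape of a bot-to-agent handoff rule.

```python
from dataclasses import dataclass
from enum import Enum


class CallAction(Enum):
    """Hypothetical actions a gateway could take after a bot reply."""
    PLAY_REPLY = "play_reply"
    TRANSFER_TO_AGENT = "transfer_to_agent"


@dataclass
class BotReply:
    """Hypothetical bot response; needs_agent marks an explicit handoff."""
    text: str
    needs_agent: bool = False


def next_call_action(
    reply: BotReply,
    failed_turns: int,
    max_failed_turns: int = 2,
) -> CallAction:
    """Transfer when the bot asks for it or has failed too many turns."""
    if reply.needs_agent or failed_turns >= max_failed_turns:
        return CallAction.TRANSFER_TO_AGENT
    return CallAction.PLAY_REPLY
```

For example, `next_call_action(BotReply("Let me connect you to a nurse.", needs_agent=True), failed_turns=0)` yields `CallAction.TRANSFER_TO_AGENT`, at which point the call would be redirected to the contact center.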
Mischief Managed
As a proponent and evangelist of digital transformation, I’m all in on artificial intelligence and bots. As a pragmatic geek, I understand that telephones and the voice network aren’t going away any time soon. While there are times when my desired mode of communication is a keyboard (physical or virtual), there are just as many times when I need to speak my piece out loud. The AudioCodes Voice.AI Gateway allows both technologies to coexist in perfect harmony.