How AI-Driven Innovation Will Change Speech Technology
Hybrid work has emerged as the business model of choice coming out of the pandemic, but in 2023, most enterprises are still struggling to get it right. While technology is just one factor in making the hybrid model succeed, communications applications are especially important for maintaining employee engagement and workplace productivity. For the longest time, communications was largely the domain of telephony, but today, communication applications are independent of location and transcend telephony.
Voice remains the most natural, and often preferred, mode of communication, but with in-person interaction less common now, text-based modes like email, web chat and messaging have become just as important. When people are not working face-to-face, voice communication occurs via either telephony or video. These modes have great utility, but AI-driven innovation has created a much wider palette of possibilities for voice in the enterprise.
The bigger picture with voice
The evolution of voice in the context of speech technology has happened quickly with advances in AI, and I have been presenting a state of the market update at Enterprise Connect for five years now. Use cases for speech tech have long been prevalent in the contact center, as well as the consumer world, but my focus has been on enterprise applications. In the collaboration space, several use cases have now become mainstream; applications such as speech-to-text and text-to-speech are standard UCaaS features.
For Enterprise Connect 2023, the focus will be on the current state of the market, as well as what’s coming. The big change to be aware of is the progress in conversational AI (CAI), which has elevated chatbots to a level where they can be trusted for basic forms of communication and collaboration, like dictating an email message, updating a calendar entry, or sending out invites to the team for a meeting. With 2023 being touted as the “year of AI”, CAI is one of the brightest proof points, although we still need to view all forms of AI with healthy skepticism.
Enterprise use cases for CAI will take time to emerge, but it shouldn’t be hard to see how this makes the concept of a digital personal assistant (DPA) much more compelling. The concept is already in use in the contact center, where it provides real-time support and coaching to agents. For the workplace, think of a DPA as a personal virtual secretary for each worker.
To properly recognize the possibilities for speech tech in the enterprise, you might need to think differently about the role of voice. While it’s understandable that many of us associate voice with telephony, much like the way it’s used with UCaaS, it’s time to think bigger than person-to-person voice communications. With AI, voice becomes the enabler for other things, so the focus shifts from person-to-person communication to person-to-machine interaction.
Freedom to work hands-free
On one level, AI brings a degree of accuracy to speech recognition that allows workers to remain productive without being tethered to a keyboard – much the way mobile phones freed workers from being deskbound in the office. During the pandemic, this gained currency by providing more options for touchless working, and as accuracy keeps improving, the use cases for voice recognition keep broadening.
There’s another important level to consider here, namely speaker recognition. This is a different type of intelligence – biometrics, really – where AI can accurately identify the speaker, even when others are talking, such as in a meeting room. Not only is this useful for purposes of authentication, but with machine learning, speech applications keep improving over time.
Now think about how this makes a DPA more “intelligent”, as it learns your speech patterns, lexicon, acronyms, etc. Then tie that to its “knowledge” of your team members, your schedule, your projects, etc. With all these capabilities, DPAs can manage workflows and automate tasks, well, just like a real secretary, only using your voice. With the hybrid model creating such a fragmented workplace environment, this form of speech tech can help level the playing field between home and office-based workers.
Coming back to CAI, you’re probably wondering where ChatGPT figures into the equation. This is certainly the trend du jour, and it’s very much part of where speech tech is going for 2023. If you’ve attended my earlier speech tech updates, you’ve already seen a glimpse of what the future holds for the enterprise, but this time around, you’ll be hearing from others as well as from me.
We’d love you to join our session!
We’re changing up the format to a roundtable discussion, and I’ve deliberately chosen a mix of companies from across the speech tech ecosystem. I’ll be providing a high-level update on trends to set the table, and will then lead a moderated discussion with speakers from three different types of players, all of whom are doing leading-edge things with speech tech. Joining me will be Dan O’Connell from Dialpad, Mahesh Ram from Zoom, and Edward Miller from LumenVox. To get a better sense of what we’ll be talking about, here are the key takeaways you should expect:
- What are the leading applications using speech technology and AI in the enterprise?
- How accurate is current transcription software, and how accurate does it need to be in order to enable more advanced applications like intelligent meeting summaries?
- What expertise does your IT team need to support applications enabled by AI and advanced speech tech?
- Is there an ROI case for adopting speech technology-enabled applications, and if so, how do you make it?
Our session runs on Wednesday, March 29 at 4pm, and here’s the link for more detail, and to add it to your event scheduler. Hope to see you there.
Enterprise Connect 2023 will be held from March 27-30 at the Gaylord Palms in Orlando, FL. You can check out the attendance options here and dive into our line-up of sessions and keynotes here.