No Jitter is part of the Informa Tech Division of Informa PLC

AI-Generated Transcription Still Requires Human Oversight

To streamline meeting experiences, generative AI-powered solutions are emerging as embedded applications within meeting and conference call solutions, with the aim of bolstering user productivity.

The incorporation of GenAI into meeting solutions introduces various functionalities aimed at optimizing the meeting process, such as seamless notetaking and transcription of minutes using integrated tools.

Third-party assistants can perform advanced analysis of meeting conversations with the creation of summaries and key highlights derived from transcripts.

In addition, real-time prompts for action items during meetings can enhance productivity, and whiteboard content can be generated instantly from text prompts.

While organizations show strong interest in these capabilities, they are taking a prudent approach to implementation.

Particularly in the case of third-party tools, companies are exercising caution due to governance and security considerations, and pilot projects are underway to determine which use cases deliver the best outcomes.

Defining Real-Time Transcription

From the perspective of Gartner vice president analyst Bern Elliot, these features still need human oversight, as they can fail to understand the context of a speaker’s words and a person’s exact manner of speaking.

He notes the level of "real-time" in automated real-time transcription varies.

"An important issue is that often high accuracy requires a full sentence, but many don’t want to wait that long," he says. "So, a compromise is the automated simultaneous transcription, as you might see on the TV at an airport lounge."

Look closely at those captions and you will notice that the text on the screen is sometimes revised before the sentence is completed, because later parts of the sentence or paragraph clarify the intent or meaning of earlier words.

"I make a distinction between immediate real-time (a time period of less than 200 milliseconds is imperceptible to humans) and delayed real-time, which can run up to several seconds," Elliot explains.
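
To make the two ideas above concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's captioning API: the names `classify_latency` and `CaptionLine` and the sample hypotheses are hypothetical, and the 200-millisecond cutoff is simply the figure Elliot quotes, treated here as a rule of thumb.

```python
from dataclasses import dataclass

# Assumption: the cutoff is the 200 ms figure quoted in the article.
IMMEDIATE_THRESHOLD_MS = 200


def classify_latency(delay_ms: float) -> str:
    """Label a caption delay as 'immediate' or 'delayed' real-time."""
    if delay_ms < 0:
        raise ValueError("delay_ms must be non-negative")
    return "immediate" if delay_ms < IMMEDIATE_THRESHOLD_MS else "delayed"


@dataclass
class CaptionLine:
    """One on-screen caption line that is overwritten as new hypotheses arrive."""

    text: str = ""
    revisions: int = 0

    def update(self, hypothesis: str) -> None:
        # If the new hypothesis does not simply extend the displayed text,
        # words already shown to the viewer are being rewritten.
        if self.text and not hypothesis.startswith(self.text):
            self.revisions += 1
        self.text = hypothesis


# Hypothetical interim hypotheses for a single spoken sentence.
line = CaptionLine()
for hyp in ["I scream", "I scream for", "ice cream for dessert"]:
    line.update(hyp)

print(line.text)       # final caption text
print(line.revisions)  # number of times displayed words were rewritten
```

In this sketch the third hypothesis rewrites words that were already on screen, which is the kind of mid-sentence correction visible in live captions. A system that waits for the full sentence would avoid the rewrite at the cost of a longer delay.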

Will Patterson, head of new product integration at Clari, says from a value perspective, there is a "huge amount" of knowledge worker productivity lost to ineffective meetings or individuals attending meetings with a very limited role to play.

"From a technology perspective, we have seen over the last year that LLMs are really well suited to summarization use cases, hence the explosion of new features from players across the ecosystem," he says. "I expect that to continue next year but go beyond basic summarization to allow users to define more specific topics and prompts that they want the bot to analyze."

He points out that any conversation recording use case raises the issue of recording consent. When multiple bots can join on behalf of multiple parties, this becomes much more complex.

"As a meeting participant, how do I know where my data is going and who will have access to it?" Patterson asks.

He explains web conferencing providers can address this through their infrastructure for managing recording consent.

Meanwhile, creators of meeting bots will need to consider how they disclose the owner/intentions of every bot and give meeting participants visibility and control over how their data is used before, during and after recording.

"End users will have much more control over the summarization that bots perform on their behalf," he says. "Right now, the focus is on useful but generic summaries."

Productivity Possibilities Excite Employees

Nat Natarajan, chief product and strategy officer at G-P, says AI has had a profound impact on the global business landscape.

The company's 2023 Global Growth Report found that more than nine in 10 employees (93%) are excited about potential uses of AI at work, particularly automating tasks (46%) and summarizing information (39%).

Natarajan points out more than a third (36%) of survey respondents said they were hesitant to join companies with global hiring models due to the complexities of cross-time zone collaboration and language issues.

"AI-powered transcription and summarization tools offer a remedy by facilitating seamless communication and collaboration," he says. "This increased interest in AI-powered tools has been particularly advantageous for organizations building and managing global teams."

Evolving Tech Broadens Capabilities

Overall, generative AI is having a significant effect on what can be done with text, rather than on speech-to-text itself (also known as speech recognition or ASR).

"So many of the enhancements we are seeing lately are related to the value added by GenAI and LLMs rather than to the speech transcription process itself," Elliot says. "There are some areas where GenAI is used on immediate real-time speech, but these are narrower use cases."

He points to some interesting solutions emerging for handling dialects and accents.

"You might look at as an example of a company looking at real-time accent localization in real-time calls," he explains. "In other cases, LLMs can be tuned for specific dialects."

However, that tuning is not real-time and may introduce inaccuracies, and he adds that a distinction should be made between accent localization and translation.

"The accent localization can be done in some cases in actual real time, while translation still has latency," he notes.

He predicts that immediate real-time language translation will happen, but notes that even accent localization at that speed is still technically challenging.

"Better summarizations of meetings will be making their way into meetings," he says. "I expect over time, these will become integrated into what we call 'composite AI' and perhaps combined with decision support and agent assistant technologies."

Patterson adds there’s a whole range of use cases that are unlocked when users and admins can specify which tasks they want the bot to perform on any given meeting.

"If a time-constrained leader can’t attend a specific meeting, they can write a quick set of prompts specifying the topics they are interested in and quickly get up to speed on the things they care about, rather than watching a long call recording, and even get recommendations for items they need to follow up on," he explains.