Artificial intelligence (AI)-powered transcription company Otter.ai this week announced it has added a feature that allows a business to teach Otter words specific to the company, be those industry jargon, product names, or anything else.
Use of the feature, called Custom Vocabulary, will increase the accuracy of Otter Voice Notes, which is Otter.ai’s engine for converting speech to text in real time (see related No Jitter article, “Otter.ai Has a Meeting Assistant for Your Team”).
Otter Voice Notes is useful in interviews, meetings, keynotes, or other instances in which capturing the conversation for follow-up review is either required or desired. Without a tool like this, the norm is either to take notes busily while someone is speaking or to record the conversation. The challenge with note-taking is that it’s often hard to write fast enough to catch what everyone is saying. Even in a one-on-one interview, keeping up with the pace of speech while still paying attention is hard. Recordings can be useful, but converting an audio recording to text can be long and tedious.
In keeping with Otter’s mission to make things easy, the program automatically figures out how to pronounce the user-entered words and phrases. Otter is considering adding advanced options to allow the users to record the pronunciations of their custom vocabulary. Currently, Otter applies Custom Vocabulary at the individual account level, if the user has an Otter Premium account. Likewise, it’s applied at the team level for Team accounts.
Otter Voice Notes is an interactive transcript player; the interface provides simultaneous access to the audio recording and the transcript. This lets the user read the transcript while listening to the recording, making it easy to pull out relevant information. Otter conversations don’t necessarily have to be done in real time, either; users can convert uploaded audio recordings to text.
I’ve been using Otter.ai since March, when I first ran into the company at Enterprise Connect 2019 (the company provided real-time transcription for sessions on the Expo floor). I find it to be the fastest and most accurate transcription tool that I’ve used to date. The one issue that I’ve had with it is in the transcription of words that aren’t part of everyday vernacular.
For example, in my transcription of the EC19 keynote by Amy Chang, SVP of the Collaboration Technology Group at Cisco, the term “Room Kit Mini” came out as “room kidney.” It’s a sensible choice, as there’s no way Otter would know what a Room Kit Mini is. The Custom Vocabulary functionality would fix this issue by training the engine to recognize Cisco product names like Room Kit Mini, Webex, Nexus, and others.
Even with gobbledygook like room kidney, the fact that I could use Otter to record an Enterprise Connect keynote shows how good the application is. It’s not perfect, but it certainly made it easier for me to write a No Jitter post about the keynote, as I didn’t have to spend hours playing a recording back and taking notes.
The Custom Vocabulary feature can greatly expand the use of transcription capabilities to businesses that deal with a large amount of industry-specific verbiage. For example, the healthcare industry uses human transcriptionists to convert conversations to text; that’s been necessary because of the many terms and words in healthcare that aren’t part of everyday language. But human transcription is slow work, and getting a transcription can take days, sometimes weeks, depending on the length of the queue. Otter can do the same thing faster and, with Custom Vocabulary, should now be able to get close to human accuracy.
That said, in healthcare and other industries for which 100% accuracy matters, Otter.ai likely isn’t ready to replace people, but it can augment them. Transcriptionists can start with the Otter-created text and audio file and edit the transcript while listening to the audio recording. This would be faster than having to transcribe the entire recording, and so would increase staff productivity. This is another case in which AI works better as an augmentative tool than as a technology that replaces people.
Another interesting feature of Custom Vocabulary is the ability to import the employee directory to ensure people’s names are spelled correctly and recognized when they talk in meetings. In Otter, once a user has been “voice printed,” the app recognizes the voice and automatically marks the transcript when that person starts to talk. The directory information further improves the accuracy of tying what’s said to who said it.
AI mania has hit the communications industry in a big way, with many vendors providing a “pie in the sky” vision of what it might do. Otter is an actual AI tool that works today. As I said before, it’s not 100% perfect, but it’s good enough for anyone who, like me, attends a steady stream of meetings, conferences, and interviews.