ABOUT THE AUTHOR


Mike Bergelson
Mike Bergelson is responsible for developing new product and business model strategies for Cisco's Unified Communications portfolio. Prior to this...



Mike Bergelson | July 11, 2010

 
   

The State of Transcription: Part 1

Should transcription functionality be fully automated, fully human-performed, or a hybrid of these approaches?

In an earlier No Jitter blog post, I discussed the potential benefits that transcription can provide for enterprise communications, such as helping us consume content more quickly (without compromising retention) and making that content easier to access and analyze.

In this post and the two that follow, I discuss the state of transcription today. In the final post on this subject, I'll address where I believe the market is going and some key areas of innovation that can help us derive more benefit from recorded audio and video content.

First, let's consider the common use cases for transcription today: voicemail transcription, meeting transcription, medical transcription and closed captioning. The latter two don't directly relate to UC, but I'll touch on each briefly because I think there are some interesting lessons to be learned.

This list isn't exhaustive; for example, I don't address uses of transcription in surveillance applications.

Voicemail Transcription
A few years ago, voicemail transcription seemed to garner a fair amount of attention in the media (in UC industry press, anyway). Users touted the benefits of scanning message transcriptions received via SMS or email to get the gist of the original voicemail and determine the priority for required action.

As with speech recognition tasks in the contact center, there is a spectrum of automation applied to this task, enabling vendors to balance the trade-offs between the four key considerations in transcription: cost, accuracy, turnaround time and privacy.

* Fully automated: some providers, including, most notably, Google, use speech recognition and natural language processing algorithms to interpret voicemails. In Google's case, words are color-coded to indicate transcription confidence where lighter text indicates less confidence. While fast and inexpensive, at least on a marginal basis, this approach (somewhat famously) lacks accuracy. According to a recent study by industry analyst Bill Meisel, these fully-automated transcription engines tend to achieve accuracy percentages in the mid-to-high eighties (although these scores might be a bit pessimistic because of the way errors are counted).

* 100% human: another approach relies completely on humans to transcribe messages. Some vendors stream voicemails to transcriptionists as the messages are being left by the caller, allowing for near-real-time conversion. While this ostensibly solves for turnaround time and accuracy, the downside is the cost associated with so much human involvement.

* Semi-automated: some vendors (e.g., Nuance through their Jott and SpinVox acquisitions last year) offer a partially automated approach, akin to the agent-assisted IVR processes pioneered over the past few years by start-ups such as Unveil and Spoken. With this approach, humans edit system-generated transcriptions, and their corrections are fed back into the speech recognition engine to improve its accuracy over time. This approach seems to provide the best balance of cost, accuracy and turnaround time for many enterprise use cases.
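
To make the point about how errors are counted more concrete, here is a minimal sketch of one common way to score a transcript: word error rate, i.e., the word-level edit distance between a human reference transcript and the machine output, divided by the length of the reference. This is an illustration only; the study cited above may have counted errors differently, and the function names and sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: simple lowercase/whitespace tokenization. Real scoring
    tools usually also normalize punctuation, numbers and spelling.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word dropped (deletion)
                curr[j - 1] + 1,     # extra word in the output (insertion)
                prev[j - 1] + cost,  # wrong word (substitution) or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 - WER. Can go negative if the output has many extra words."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Hypothetical voicemail: one wrong word out of seven.
reference = "please call me back before noon tomorrow"
hypothesis = "please call me back before new tomorrow"
print(f"{word_accuracy(reference, hypothesis):.1%}")  # prints 85.7%
```

Every substituted, dropped or extra word counts as a full error under this scheme, whether or not it changes the gist of the message. That is one way a transcript that is perfectly usable to the recipient can still land in the mid-eighties.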

Although overall voicemail volumes are decreasing in some key enterprise segments, the interest in transcription of these messages is clearly on the rise. Over time, semi-automated transcription is most likely to win out for enterprise communications, where a few dollars a month per user can be easily justified for the productivity and response-time improvements. For consumer applications, the fully automated solutions such as Google Voice voicemail transcription will dominate, as these can scale at the price point that most consumers are comfortable with.

As analyst Dan Miller points out, to drive consumer adoption those marketing these services should attempt to redirect consumer focus from accuracy to usefulness. Consumers will come to accept that good enough is just that, rather than hoping for free or very low cost transcriptions that are 100% accurate.

Since fully-automated voicemail transcriptions won't be very accurate any time soon, marketers must reposition the services. Perhaps it makes more sense to describe these consumer-oriented services under a brand name that speaks to a message's gist or essence rather than a transcription, which implies a certain level of accuracy.

The evolution of the voicemail transcription market teaches us (once again) that managing user perceptions can be as important as the effectiveness of the core technology itself.

(to be continued next week)


