No Jitter is part of the Informa Tech Division of Informa PLC

Designing a Voice User Interface

A visual display is built around a Graphical User Interface (GUI) created by visual designers. When speech is involved, you need a Voice User Interface (VUI). The VUI is becoming more important, especially with the development of speech-driven digital assistants.

I met Crispin Reedy, president of the Association for Voice Interaction Design (AVIxD), at the SpeechTEK conference at the end of April. AVIxD is a professional organization for VUI designers. Its mission is to support design practitioners with community, networking, and knowledge sharing. Reedy sat down with me for an interview to share her knowledge about VUI.

What is a Voice User Interface?

The Voice User Interface (VUI) is an interface to any computer that uses human speech for input, output, or both. The term can refer to a phone-based user interface supporting a call center (IVR), using either touch-tone entry or speech recognition-based interactions. It can also refer to personal assistants such as Siri, Alexa, and Google Home, as well as voice implementations on specific devices, such as the car or the new Samsung voice-controlled refrigerator.

There's also a new term: CUI, for Conversational User Interface. This term is a bit broader, since it can also refer to text-based conversational user interfaces, such as chatbots.

Help Me Understand What a VUI Designer Does

A VUI designer designs and documents the flow of the conversation between the person and the computer. The speech output from the computer guides the user along the path of the conversation; VUI designers write prompts that give the user an idea of what to say.

For example, say we needed to collect the user's favorite ice-cream flavor. The VUI designer might write the question "What ice-cream flavor do you want?" This is very open-ended, and the caller will expect to be able to say anything, up to and including "Cherry Garcia." The designer documents that expectation, possibly in a sample grammar that represents the spectrum of what the user might say. If the choices are more constrained, the prompt needs to reflect those constraints. If only three choices are allowed, the designer might write a prompt such as: "What ice-cream flavor do you want? You can choose: chocolate, strawberry, or vanilla."
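The link between a prompt and its grammar can be sketched in a few lines of Python. This is only an illustration, not how any particular speech platform works: the function names `build_prompt` and `match_choice` are hypothetical, the "grammar" is simply a list of allowed phrases, and the input is assumed to be text that a speech recognizer has already produced.

```python
import re

def build_prompt(question, choices=None):
    """Return the open-ended question, or append the allowed choices to it."""
    if not choices:
        return question
    if len(choices) == 1:
        listed = choices[0]
    elif len(choices) == 2:
        listed = f"{choices[0]} or {choices[1]}"
    else:
        listed = ", ".join(choices[:-1]) + f", or {choices[-1]}"
    return f"{question} You can choose: {listed}."

def match_choice(utterance, choices):
    """Return the first choice (in list order) found as a whole phrase, else None."""
    text = utterance.lower()
    for choice in choices:
        if re.search(r"\b" + re.escape(choice.lower()) + r"\b", text):
            return choice
    return None

# Hypothetical constrained grammar for the ice-cream example.
FLAVORS = ["chocolate", "strawberry", "vanilla"]

print(build_prompt("What ice-cream flavor do you want?", FLAVORS))
print(match_choice("I'd like vanilla, please", FLAVORS))  # in the grammar
print(match_choice("Cherry Garcia", FLAVORS))             # out of grammar
```

With the constrained list, "Cherry Garcia" falls outside the grammar, which is why the prompt has to tell the caller up front what the choices are.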

A VUI designer's job may include any (or all) of the following (or more!). The designer:

  • Understands the technologies involved, and how that impacts the design
  • Knows user-centered design techniques and how to apply them
  • Advocates for the user in design meetings
  • Helps the enterprise make the best design decisions for its business and for its users
  • Writes prompts, documents business logic and data transactions in the detailed design document
  • Coaches the voice talent to get the desired performance
  • Assists in evaluation of the design, such as usability tests
  • No, the designer is not the voice talent

Are VUI Developers Different than Designers?

A VUI developer is the person responsible for coding the VUI design, possibly in VoiceXML or in some other language or tool. There are many tools available today, such as the new speech APIs and natural language understanding tools, that make development faster and easier. In large, complex, enterprise-grade deployments, however, there is still a need for software architecture and development skills.

What Human Factors Enter into VUI Design?

Human beings have conversations every day, so we tend to think that we understand them. However, there's a big difference between being able to have a conversation and being able to create one. To return to our ice-cream example, one common pitfall is to write something like: "Do you want chocolate or strawberry?" What if the user interrupts the prompt? And the question itself could legitimately be answered with "Yes," or perhaps even "Both." VUI designers must consider linguistic and psychological principles such as Grice's Conversational Maxims, pauses and turn-taking, cognitive load, discourse markers, apologies, and the structure of a conversation. All of that goes into a conversation that is effortless and natural instead of one that is stilted, frustrating, or confusing.
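To make the "chocolate or strawberry" pitfall concrete, here is a rough Python sketch of the replies a design would have to anticipate. It assumes the reply arrives as already-recognized text and that each option is a single word; the function name `interpret_reply` and its category labels are hypothetical, invented for this illustration.

```python
import re

def interpret_reply(utterance, options):
    """Classify a reply to an either/or question into a (category, value) pair."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    named = [option for option in options if option.lower() in words]
    if len(named) == 1:
        return ("choice", named[0])       # caller picked exactly one option
    if len(named) > 1 or "both" in words:
        return ("both", None)             # caller wants more than one
    if words & {"yes", "yeah", "yep", "sure"}:
        return ("ambiguous_yes", None)    # a valid answer, but to which option?
    return ("no_match", None)             # nothing the design anticipated

OPTIONS = ["chocolate", "strawberry"]
for reply in ["Yes.", "Both", "Strawberry, please", "Um"]:
    print(reply, "->", interpret_reply(reply, OPTIONS))
```

Every category other than "choice" needs its own follow-up prompt in the design, which is the extra work the either/or wording creates.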

Are There Personality Traits of a Good VUI Designer?

Above and beyond the ability to write, a VUI designer needs to be organized, structured, and curious about how language works. Since these projects are often done in a team environment, advocating for your design decisions is a constant part of the job. A VUI designer should be in the habit of tying specific design decisions back to principles, best practices, and research.

Is This About Alexa and Google Home or More?

Certainly Alexa and Google Home are interesting, high-profile examples of VUIs, and they are signs of the increasingly wide adoption of this technology. But conversational speech can be useful in a wide variety of areas. Hands-free industrial productivity applications can make workers more efficient, or safer. Text-to-speech output and voice control have made many devices more accessible to the vision-impaired.

Traditional IVR is taking new forms but doesn't appear to be going away. And as represented by the continuing popularity of the talking AI computer throughout science fiction, there's something intrinsically appealing about the idea of a smart, conversational, wizard-like computer assistant. It's still an aspirational goal: in order to create a "Jarvis," there are problems to be solved that go far beyond speech. But it's a very tantalizing goal, and many people are still pursuing it.