Hey Siri: Call Me An Ambulance
Siri is big. Big for the iPhone today, and big for Apple in general. Siri also represents another shift toward the cloud. Just don't think of Siri as Artificial Intelligence.
Siri is a breakthrough currently found on the iPhone 4S. Well, not actually in the iPhone--she (or he in some countries) lives in the cloud. Siri represents revolutionary voice recognition technology. It is the closest thing we have seen to 2001's HAL. In many ways, Siri is ahead of HAL, but after all, HAL did come a full ten years earlier.
Not everyone agrees that Siri is significant. Google's Andy Rubin played down the notion of an assistant on the phone, and Microsoft's Craig Mundie told Forbes that Microsoft has offered similar functionality on its Windows Phone for more than a year. TechAU compared Apple's Siri to Microsoft’s Tellme in this video.
Nevertheless, Siri represents a major milestone. It has phenomenal speech recognition and it has the notion of context. That is, Siri is better at understanding what you mean, rather than just what you say (similar to IBM's Watson). Better than Watson, Siri even offers a little personality, or at least sass. Still, describing Siri as artificial intelligence is an exaggeration.
Siri is a huge step forward in voice recognition. It is the most compelling feature of the new iPhone and represents a new area for iPhone imitators to emulate. Apple obtained Siri when it acquired the company of the same name in April 2010. The acquisition was too insignificant to require any type of federal approval. Apple paid an estimated $200 million for Siri, which was an iPhone app at the time.
Voice recognition technology has taken several major steps in the past few years. Microsoft, Google, and Nuance are all making strides toward accurate detection of conversational speech. Nor are phone-based personal assistants new--there have been many before, including Wildfire, Webly, and Atlas. What makes Siri different is that it is a centralized service with broad capabilities. It sends voice from the iPhone to its cloud-based servers, which interface with numerous technologies and APIs. At the time of Apple’s acquisition, Siri had 30 different APIs to services such as OpenTable, TaxiMagic, and Bing.
Unlike Google Voice Commands, which look for certain key phrases such as "navigate to," Siri deciphers the audio and programmatically interprets the request. It builds a dossier on each user so a command such as "send a text to my wife" is possible, once it learns which contact is your wife.
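The difference can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function, the UserProfile class, and the contact name are all hypothetical, and nothing here reflects how Google or Apple actually implement their services. The first function acts only on a fixed key phrase; the second keeps a small per-user record so that "my wife" can be resolved once the relationship has been learned.

```python
# Hypothetical sketch: none of these names come from Siri or Google Voice Commands.

def keyphrase_command(utterance):
    """Key-phrase style: act only if the utterance starts with a known phrase."""
    prefix = "navigate to "
    text = utterance.lower().strip()
    if text.startswith(prefix):
        return ("navigate", text[len(prefix):])
    return None  # anything outside the fixed phrase list is ignored


class UserProfile:
    """A tiny stand-in for the per-user 'dossier' of learned relationships."""

    def __init__(self):
        self.relationships = {}  # e.g. {"wife": "Jane Doe"}

    def learn(self, relation, contact):
        self.relationships[relation] = contact

    def resolve(self, utterance):
        # The parsing is deliberately simplistic; the point is the lookup.
        prefix = "send a text to my "
        text = utterance.lower().strip()
        if not text.startswith(prefix):
            return None
        relation = text[len(prefix):]
        contact = self.relationships.get(relation)
        if contact is None:
            return ("ask_user", relation)  # not learned yet, so ask who it is
        return ("send_text", contact)
```

In this sketch the first request for "my wife" produces a question back to the user, and the same request succeeds after learn("wife", ...) has been called once.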
There is speculation that Apple has some AI on deck, but so far Siri is only demonstrating clever programming, not reasoning. It is programming that gives Siri context. "What is the weather outside?" is translated into text and correlated against location information. "How about tomorrow?" is not treated as a new question, but associated with the previous one. It is very impressive programming and worthy of respect.
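To show why this is programming rather than reasoning, here is a minimal Python sketch of a follow-up question inheriting context. The WeatherDialog class, its fields, and the sample location are assumptions made up for illustration; Apple has not published how Siri does this.

```python
# Hypothetical sketch of context carry-over; not Siri's actual design.

class WeatherDialog:
    def __init__(self, location):
        self.location = location  # assumed to come from the phone's location data
        self.last_topic = None    # remembers what the previous question was about

    def interpret(self, utterance):
        text = utterance.lower().strip(" ?")
        if "weather" in text:
            self.last_topic = "weather"
            return {"topic": "weather", "location": self.location, "day": "today"}
        if text.startswith("how about ") and self.last_topic is not None:
            # Not a new question: reuse the previous topic, change only the day.
            day = text[len("how about "):]
            return {"topic": self.last_topic, "location": self.location, "day": day}
        return None  # no rule matched, and nothing is inferred


dialog = WeatherDialog("Boulder")
first = dialog.interpret("What is the weather outside?")
follow_up = dialog.interpret("How about tomorrow?")
```

Here follow_up carries the weather topic and the same location with the day changed to "tomorrow". Asked cold, with no earlier question, the same follow-up returns nothing, because the context comes from a stored value and a rule rather than from understanding.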
Google does similar searches with its text interface. Search on Google for "Chinese restaurants" and Google will display nearby restaurants if it can ascertain your location. You can turn this off with Google’s new "verbatim" mode and it will display popular Chinese restaurant web destinations without considering location.
Getting computers to understand exactly what you mean can be difficult. Abbott and Costello's "Who’s on First?" routine is a famous example of how complex language can be. Siri has similar challenges; it can be a bit like searching on Google with the "I'm Feeling Lucky" option, as it acts on one interpretation. By the way, Siri's designers anticipated the "Who’s on First?" question and programmed Siri to respond with "Right. That's the man's name." Funny, but it’s impossible to program an accurate response to every question.
Programming responses is not the same as reasoning or intelligence. Insurance claims agents today use iPhones for things like pictures and forms, but could an iPhone (Siri) replace an agent? Not in this decade! Siri has no ability to detect fraud and assess fault. Siri is an interface to existing and programmed algorithms.
Of course, what Siri does today is not what Siri will do tomorrow or in five years. But while voice recognition is making huge leaps, AI is not. Even so, Siri will be noted as the beginning of a new human interface for computing. Computer interfaces started with punch cards; then came command lines, GUIs, and most recently touch--the next wave will be voice, and this has significant implications. Each new era brings computing to more people by lowering the barrier to use and access.
Ordinary English opens up computing to the masses. Siri's dictation of a simple email is surprisingly accurate. As this technology matures, it will effectively eliminate the middle step of learning a word processing program to complete the task. It also opens up computing to a far larger user base, including those unable to operate a keyboard. It could mean a new generation of apps that are designed for voice rather than touch (and touch apps were themselves redesigned from desktop versions).
Siri is big. Big for the iPhone today, and big for Apple in general, as Siri won’t stay confined to the iPhone. In addition to voice recognition, Siri also represents another shift toward the cloud.
Just don't think of Siri as AI. If you tell Siri, "Call me an ambulance," it will respond with "OK, from now on I will call you 'an ambulance'."
Dave Michels is a frequent contributor and independent analyst at TalkingPointz.