No Jitter is part of the Informa Tech Division of Informa PLC


Working with DTMF Transmission at the Packet Level

It's easy to get lost in the technological babble of our industry. We are awash in acronyms, product names, standards (real and pseudo), protocols, and vendor-specific terminology. I can't tell you how many times I thought I was hearing something new only to realize it was just another way of expressing an old idea. It's almost as if some people use their mastery of jargon as a game, and the one who can befuddle the most wins.

However, there is some mumbo-jumbo that is crucial to understand if you want to be an effective and well-rounded communications professional. Sadly, most online resources have difficulty explaining many of these concepts in ways that non-engineers can understand. Today, I would like to take one of those topics and add a tad more oomph to your arsenal of knowledge.

How many of you are old enough to remember rotary telephones? For better or worse, I am. In fact, it was all I knew for the first ten or so years of my life. Heck, we even had a party line for most of my childhood.

Rotary phones used something called pulse dialing. You put your finger in a numbered hole in a "finger wheel," pulled that wheel back to the "finger stop," and let go. During the return rotation, the electrical current of the telephone line would be interrupted in accordance with the number you dialed. The number one would interrupt the circuit one time and the number zero would interrupt it ten times. The central office would then translate those current interruptions into the dialed telephone number.
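To make the pulse counting concrete, here is a minimal Python sketch of the digit-to-pulse mapping described above. The function names are my own, chosen for illustration; this models only the counting, not the timing of a real dial.

```python
def pulses_for_digit(digit: int) -> int:
    """Return how many line interruptions a rotary dial sends for one digit."""
    if not 0 <= digit <= 9:
        raise ValueError("rotary dials only carry the digits 0-9")
    # Zero is the longest pull on the wheel, so it produces ten pulses.
    return 10 if digit == 0 else digit


def pulses_for_number(number: str) -> list[int]:
    """Translate a dialed number into the pulse count for each digit.

    Anything that is not a digit (spaces, dashes) is skipped.
    """
    return [pulses_for_digit(int(ch)) for ch in number if ch in "0123456789"]


if __name__ == "__main__":
    print(pulses_for_number("555-0100"))
```

Dialing a number full of zeros took noticeably longer than dialing one full of ones, which is exactly what the pulse counts show.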

DTMF (Dual-Tone Multi-Frequency) was introduced by AT&T in 1963 as a way to replace pulse dialing and rotary telephones. Now, instead of interrupting the electrical current to dial a number, the telephone produces a tone to represent the dialed number. Actually, it is two tones blended together -- thus the "Dual Tone" part of DTMF.
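For the curious, the two tones come from a simple grid: each keypad row has a low frequency and each column has a high one. The sketch below uses the standard DTMF frequency assignments and blends the pair into audio samples at 8000 Hz. The function names, the 0.1 second default, and the equal 0.5 amplitudes are my own illustrative choices, not anything mandated by a standard.

```python
import math

# Standard DTMF grid: one low (row) tone and one high (column) tone per key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple[int, int]:
    """Return the (low, high) frequency pair for a single keypad character."""
    if len(key) != 1:
        raise ValueError("expected exactly one keypad character")
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"{key!r} is not a DTMF key")


def dtmf_samples(key: str, duration_s: float = 0.1, sample_rate: int = 8000) -> list[float]:
    """Blend the two tones into floating-point samples in the range -1.0 to 1.0."""
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    return [
        0.5 * math.sin(2 * math.pi * low * n / sample_rate)
        + 0.5 * math.sin(2 * math.pi * high * n / sample_rate)
        for n in range(count)
    ]


if __name__ == "__main__":
    print(dtmf_frequencies("6"))
    print(len(dtmf_samples("6")))
```

The A through D keys never made it onto consumer phones, but they are part of the grid and show up again in the event codes later in this article.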

Over the years, DTMF has been extended for purposes beyond simply dialing telephone numbers. Interactive Voice Response (IVR) systems prompt us with all sorts of questions that we answer with button presses. We log into our voice mail systems and retrieve our messages with DTMF. If so inclined, you can even play "Mary Had a Little Lamb" using DTMF.

DTMF isn't a problem with digital and analog telephone systems because they both use a toll-quality (64 kbps, 8000 Hz) audio connection. Tones and speech easily mix with one another, and tone detection hardware is able to separate DTMF out for the applications that require it.

However, with VoIP and bandwidth concerns came voice compression and different techniques to send an intelligible voice stream using as few bits as possible. These compression and voice encoding techniques wreak havoc on DTMF, and render the tones indecipherable to the components that need to detect and act upon them.

Enter RFC 2833/4733. With RFC 2833/4733, you don't send DTMF tones as audio inside the compressed voice stream. Instead, you send them out-of-band, as named events in RTP packets with their own payload type. This allows you to compress the heck out of the voice stream without altering the DTMF signals.

(Note: RFC 2833 has been replaced by RFC 4733, but people still want to call it 2833, so I do, too. For the purposes of this article, they are essentially the same.)

Depending upon the origin of the DTMF, it can start out in a separate stream, or a new stream can be created by stripping the tones out of an audio conversation. An example of the latter would be a gateway that converts analog signaling and media to SIP and RTP (Real-time Transport Protocol).

Problems can arise from this stripping. The converter must "hear" a tone before stripping it out, and leakage can cause the very beginning of a tone to make its way through. This results in a voice mail system hearing two tones instead of one -- one from the RFC 2833 stream and one in the voice stream. Fortunately, conversion hardware is getting better, and these problems have become less common (albeit a bear to debug when they occur).

So, in terms of SIP, how is this RFC 2833 stream created and managed? With Session Description Protocol (SDP), of course. SDP is already used to describe a voice stream (e.g. G.729), and it's also used to inform the recipient that RFC 2833 is available. Specifically, it uses something called telephone-event.

Here is an example of an SDP media description that you might see in the body of an INVITE message. Note the format of "0-16." This represents the ten digits plus *, #, A, B, C, D, and Flash.
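If you want to pick telephone-event out of an SDP body yourself, the following Python sketch shows one way to do it. The SDP text is an illustrative example I put together, not the one from my original capture: the port number and the payload type of 101 are assumptions (101 is a common choice, but the number is negotiated dynamically), and find_telephone_event is a hypothetical helper, not part of any SIP library.

```python
# Illustrative media description: G.729 voice plus RFC 2833 events.
# Port 49170 and payload type 101 are assumed values for the example.
EXAMPLE_SDP = """\
m=audio 49170 RTP/AVP 18 101
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
"""


def find_telephone_event(sdp: str):
    """Return (payload_type, supported_events) or None if not offered."""
    rtpmap, fmtp = {}, {}
    for line in sdp.splitlines():
        line = line.strip()
        for prefix, table in (("a=rtpmap:", rtpmap), ("a=fmtp:", fmtp)):
            if line.startswith(prefix):
                # "101 telephone-event/8000" -> payload type, then the rest.
                payload_type, _, value = line[len(prefix):].partition(" ")
                table[int(payload_type)] = value
    for payload_type, encoding in rtpmap.items():
        if encoding.lower().startswith("telephone-event/"):
            # The fmtp line, when present, lists the supported event codes.
            return payload_type, fmtp.get(payload_type)
    return None


if __name__ == "__main__":
    print(find_telephone_event(EXAMPLE_SDP))
```

The rtpmap line ties the payload type number to the name telephone-event, and the matching fmtp line carries the "0-16" list of supported events.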

If you started reading this article to simply gain enough knowledge to understand what RFC 2833 is and how it's applied, you can skip this part and scroll down to the conclusion. However, if you really want to call yourself a DTMF professional, read on.

Using my favorite packet tracing tool, Wireshark, I captured and displayed an actual RFC 2833 message.

I want you to pay attention to a few things. First, notice all the different protocols involved -- Ethernet, IP, UDP, and RTP. Of interest to us, of course, is the RTP portion. That's where the good stuff lies.

Second, notice how RFC 2833 is expressed within the RTP payload. The parameter Payload type has been set to telephone-event, which indicates that this packet carries a DTMF event rather than audio.

Finally, the actual DTMF data has been encapsulated into an RTP Event. In this example, Event ID indicates that the number six has been sent with a Volume of 7 and a Duration of 480 timestamp units.

Measuring duration in timestamp units means that the event began at the instant identified by the RTP timestamp and has so far lasted as long as indicated by the duration parameter. Because the duration field is 16 bits wide, at a sampling rate of 8000 Hz it can express an event of up to roughly eight seconds.
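The arithmetic behind those numbers is short enough to sketch in Python. The 16-bit width of the duration field comes from the RFC; the function and constant names are my own.

```python
CLOCK_RATE_HZ = 8000        # timestamp units per second for telephone-event/8000
DURATION_FIELD_BITS = 16    # width of the duration field in the event payload


def duration_to_ms(units: int, clock_rate: int = CLOCK_RATE_HZ) -> float:
    """Convert a duration in RTP timestamp units to milliseconds."""
    return units * 1000 / clock_rate


# Largest value the duration field can hold.
MAX_DURATION_UNITS = 2 ** DURATION_FIELD_BITS - 1

if __name__ == "__main__":
    print(duration_to_ms(480))                 # the packet in the capture
    print(duration_to_ms(MAX_DURATION_UNITS))  # the ceiling at 8000 Hz
```

So the Duration of 480 in the captured packet works out to 60 milliseconds of tone so far, and the ceiling of 65,535 units is a little over eight seconds.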

You should also see the End of Event flag. Here, it is set to False (0). A tone is normally reported in a series of packets, each carrying an updated duration, so the sender doesn't need to know in advance how long the button will be held. A value of False tells the recipient that more packets are coming.

The final packet sets End of Event to True (1) and looks like this:

At this point, the tone has stopped playing and the application can process it as it sees fit.
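To tie the fields together, here is a minimal Python sketch that decodes the four-byte telephone-event payload: one byte of event code, one byte holding the End of Event bit, a reserved bit, and a six-bit volume, then a 16-bit duration. The byte values in the example are ones I constructed to match the fields described above (event 6, volume 7, duration 480), not bytes copied from my capture, and the final packet's duration of 800 is an assumption for illustration. The names parse_telephone_event and TelephoneEvent are hypothetical.

```python
import struct
from typing import NamedTuple

# Event codes 0-16: the ten digits, *, #, A-D, and Flash.
EVENT_NAMES = list("0123456789*#ABCD") + ["Flash"]


class TelephoneEvent(NamedTuple):
    event: int
    name: str
    end_of_event: bool
    volume: int      # tone level in dBm0 with the sign dropped (7 means -7 dBm0)
    duration: int    # in RTP timestamp units


def parse_telephone_event(payload: bytes) -> TelephoneEvent:
    """Decode the 4-byte event payload that follows the RTP header."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    # Network byte order: event (8 bits), flags+volume (8 bits), duration (16 bits).
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    end_of_event = bool(flags & 0x80)   # top bit is the E flag
    volume = flags & 0x3F               # low six bits; bit 0x40 is reserved
    name = EVENT_NAMES[event] if event < len(EVENT_NAMES) else f"event {event}"
    return TelephoneEvent(event, name, end_of_event, volume, duration)


if __name__ == "__main__":
    in_progress = parse_telephone_event(bytes([0x06, 0x07, 0x01, 0xE0]))
    final = parse_telephone_event(bytes([0x06, 0x87, 0x03, 0x20]))
    print(in_progress)
    print(final)
```

A receiver would typically keep reading packets for the same event until it sees one with end_of_event set, then hand the digit to the application.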

You may never have to work with DTMF transmission at the packet level, but you will encounter RFC 2833 as you shop for an SBC or configure a SIP trunk. For example, if you are an Avaya Communication Manager administrator, you may have seen the parameter DTMF over IP in a SIP Signaling Group. Guess what? This turns support for RFC 2833 on or off for that trunk.

Let me know if you find this kind of article useful and easy to comprehend. My head is filled with similar mumbo-jumbo that I would be more than happy to share.

Andrew Prokop writes about all things unified communications on his popular blog, SIP Adventures.
