(Hey, I am at Interop in NYC this week Wednesday and Thursday. Look me up if you are in the Big Apple too.)
Video isn't like those other applications. I know, I said that before, when the point was that video uses real-time traffic streams. This time it's about how connections get opened and closed rather than the real-time problem.
Video and voice have two characteristics that make them different from most other applications. First, they often require connecting from outside the enterprise to inside the enterprise, and second, they use ephemeral UDP ports.
When we use data applications like HTTP, the initiating computer is the client that is inside the firewall (on the trusted side of the firewall). The firewall allows the HTTP connection out through the firewall, and any responses that come back on that same connection. A voice or video call from inside the company has these same characteristics. But calls initiated from outside the organization are blocked by the firewall. Allowing only calls that start on the inside won't work for voice and video, especially if both parties are behind similar firewalls: each firewall blocks the other side's incoming call, so no calls get through at all!
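Here is a minimal sketch of that inside-out rule, just to make the deadlock concrete. It is a toy model, not any vendor's implementation: the class name, the method names, and the addresses (taken from the documentation ranges) are all made up for illustration, and NAT is ignored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    inside: tuple   # (ip, port) of the host on the trusted side
    outside: tuple  # (ip, port) of the remote host


@dataclass
class InsideOutFirewall:
    """Toy model: only flows first seen leaving the trusted side may come back in."""
    flows: set = field(default_factory=set)

    def outbound(self, src, dst):
        # Traffic leaving the trusted side is allowed and remembered.
        self.flows.add(Flow(inside=src, outside=dst))
        return True

    def inbound(self, src, dst):
        # Traffic from outside passes only if it answers a remembered flow.
        return Flow(inside=dst, outside=src) in self.flows


# Hypothetical endpoints at two sites, each behind its own firewall.
site_a, site_b = InsideOutFirewall(), InsideOutFirewall()
caller = ("192.0.2.10", 40000)
callee = ("198.51.100.20", 1720)

# The caller's own firewall lets the call setup out...
left_site_a = site_a.outbound(caller, callee)
# ...but the callee's firewall has never seen this flow, so it drops it.
reached_callee = site_b.inbound(caller, callee)
# A reply to a flow that started inside would have been fine.
reply_allowed = site_a.inbound(callee, caller)
```

The call leaves site A but never reaches the endpoint at site B, and the same thing happens in the other direction, which is why an inside-out policy alone leaves two firewalled sites unable to call each other.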
Now let's look at the ephemeral ports problem. When a voice or video call is set up, the UDP ports for the voice and video streams are dynamically negotiated. So the firewall has no way of predicting which ports will be used. Older applications like SMTP email have known ports that can be enabled or blocked. But voice and video use a wide range of port values that are decided at the last minute. Leaving all those ports open is a serious security risk, so this is seldom done.
To solve this issue, firewalls that are video- and voice-aware listen in on the protocol conversation between the endpoints to see which ports are being opened for this call. The firewall can then open those ports for access, monitor the activity of the call, and close the ports again when the call is complete. This requires a stateful inspection firewall, one that keeps track of a voice or video call state and manages the ports according to the state of the call.
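To picture what that bookkeeping looks like, here is a rough sketch of a firewall that opens and closes media "pinholes" based on the signaling it overhears. The message format (a dict with "type", "call_id", "ip" and "port") is an assumption made up for this example. A real firewall would parse the actual call-control protocol instead, and the class and method names are hypothetical.

```python
from collections import defaultdict


class PinholeFirewall:
    """Toy stateful inspection: UDP ports are open only while a call is up."""

    def __init__(self):
        # call_id -> set of (ip, udp_port) pairs negotiated for that call
        self._pinholes = defaultdict(set)

    def inspect_signaling(self, msg):
        # Hypothetical, simplified signaling events seen on the control channel.
        if msg["type"] == "media_open":
            self._pinholes[msg["call_id"]].add((msg["ip"], msg["port"]))
        elif msg["type"] == "call_end":
            # Call is over: close every port that was opened for it.
            self._pinholes.pop(msg["call_id"], None)

    def allow_udp(self, dst_ip, dst_port):
        # A media packet passes only if some active call negotiated this port.
        return any((dst_ip, dst_port) in holes
                   for holes in self._pinholes.values())


fw = PinholeFirewall()
media = ("10.1.1.20", 49170)  # made-up internal endpoint and ephemeral port

before = fw.allow_udp(*media)
fw.inspect_signaling({"type": "media_open", "call_id": "c1",
                      "ip": media[0], "port": media[1]})
during = fw.allow_udp(*media)
fw.inspect_signaling({"type": "call_end", "call_id": "c1"})
after = fw.allow_udp(*media)
```

The port is closed before the call, open while the call is active, and closed again once the firewall sees the call end, so nothing has to be left open permanently.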
In the past, firewalls have had a checkered record of being able to manage this task reliably. Voice leaving the enterprise has often been handled by dedicated trunk lines or by direct PSTN connections, solving this problem without the firewall. Video has in the past not been a big player in the enterprise, and so the firewall vendors did not put a full-court press on solving the issues. And for a while the H.323 protocol was a moving target as data sharing (H.239) was introduced and standardized. But now the major firewall players seem to have their stateful inspection services for voice and video working well.
But stateful inspection still doesn't do everything we want! Dialing in from the outside is a problem because we don't know if we can trust that outside party, and the calling party does not know the internal address of the endpoint when NAT is implemented at the enterprise boundary.
Small deployments resolve this issue by creating dedicated external IP addresses for the video conferencing systems and asking the firewall to direct traffic for that IP address to a specified internal IP address. This in effect creates internal video endpoints that have dedicated external addresses. This works well in a home office, for example, where there is only one video endpoint.
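As a rough illustration, that one-to-one mapping amounts to a small lookup table at the firewall. The addresses and function names below are invented for the example, and a real device would express this in its own configuration syntax rather than in code.

```python
# Hypothetical static mapping: one dedicated external address per video endpoint.
STATIC_NAT = {
    "203.0.113.10": "192.168.1.50",  # external address -> internal video endpoint
}
# Reverse table for traffic leaving the endpoint.
REVERSE_NAT = {internal: external for external, internal in STATIC_NAT.items()}


def translate_inbound(dst_ip, dst_port):
    """Rewrite the destination of a packet arriving from outside, or drop it."""
    internal = STATIC_NAT.get(dst_ip)
    if internal is None:
        return None  # no dedicated mapping, so the packet is not forwarded
    return (internal, dst_port)


def translate_outbound(src_ip, src_port):
    """Rewrite the source of a packet leaving the endpoint, if it is mapped."""
    external = REVERSE_NAT.get(src_ip)
    if external is None:
        return None
    return (external, src_port)


inbound_call = translate_inbound("203.0.113.10", 1720)
unmapped = translate_inbound("203.0.113.99", 1720)
outbound_media = translate_outbound("192.168.1.50", 49170)
```

Every endpoint needs its own row and its own external address, which is manageable for one system in a home office and quickly becomes awkward as the number of endpoints grows.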
But the enterprise needs more flexibility than this for inward dialing, and needs to manage its security at the same time. Next column I'll discuss other methods of crossing the firewall with video, and the choices they provide.