Scalable Video Coding Solves Video Conferencing Scaling Issues

Scalable Video Coding (SVC) is a different approach to compressing video that brings a number of key advantages when operating in the real world. Two vendors, Vidyo and Radvision, already offer this technology, and a number of others have licensed it. I expect we will see much more of it in the next few years.

How is it different? SVC is specifically designed to address three key issues facing wide deployment of video conferencing today:

1) Networks do not all deliver the low loss and jitter required for high quality video

2) Networks do not have consistent bandwidth available to all users

3) Scaling the multipoint conferencing unit (MCU or bridge) used today is difficult and very expensive because of the high bandwidth concentration and high CPU power requirement of the MCU

That's a big claim. First let's take a look at how it works. I am an engineer, so I always want to know how it works, and then understand the benefits. If you are in marketing, you will have to bear with me. We will get to the benefits in good time.

What is Scalable Video Coding? SVC takes a different approach to packaging a series of video frames for transport over the IP network. Today's non-scalable codecs negotiate the bandwidth at which the connection will be made, and then the codec does the best job it can of making an intelligent tradeoff between high resolution and high frame rate given the available bandwidth. The higher the bandwidth, the better a job the codec can do at delivering high-resolution motion images.

If the bandwidth is high enough that the codec can deliver full HD video with 30 or even 60 frames a second, then there are no hard decisions. But at lower bandwidths the codec must decide to either optimize resolution or to optimize frame rate based on the amount of motion in the video image and based on settings on the codec.

Figure 1--Traditional codecs encode entire video in a single packet stream

Most codecs today also have dynamic bandwidth algorithms that are designed to reduce the amount of bandwidth being used if the network is losing some packets. These algorithms assume that if there is packet loss then network congestion is the cause, and that reducing the bandwidth will help reduce the congestion and allow a return to a higher quality connection.
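To make the idea concrete, here is a minimal sketch of a loss-driven bandwidth adjustment of the kind described above. It is not any vendor's actual algorithm: the function name, the 2% loss threshold, the step sizes and the bandwidth limits are all assumptions chosen for illustration.

```python
def adapt_bitrate(current_kbps, loss_fraction, min_kbps=128, max_kbps=4000):
    """Return the next target bitrate (kbps) given the observed packet loss.

    All thresholds and step sizes are illustrative assumptions.
    """
    if loss_fraction > 0.02:
        # Loss is assumed to mean congestion, so back off by 25%.
        next_kbps = current_kbps * 3 // 4
    else:
        # Little or no loss: cautiously probe upward by 5%.
        next_kbps = current_kbps * 21 // 20
    # Keep the result inside the (assumed) negotiated range.
    return max(min_kbps, min(max_kbps, next_kbps))
```

Note the built-in assumption: every lost packet is read as a sign of congestion, so the codec lowers quality even when the loss has some other cause.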

SVC works differently. SVC compresses the video images in such a way as to provide multiple different streams, each containing different components of the high quality video image. The first video stream is a low resolution image that can operate at a modest bandwidth. Additional streams are then encoded that contain the information for higher resolution, higher frame rates and higher quality levels. But these additional streams do not duplicate the information contained in the first stream; they complement it.

Figure 2--SVC codecs encode a base layer and multiple layers that enhance the base information

Consider a receiving codec that is able to get all the streams of this video conferencing session. This decoder can piece all the streams back together to create a full motion, full resolution image. However, if a second codec is on a slow link with only enough bandwidth to receive a low resolution image, that codec can still view the same source created by the same endpoint by receiving only the base layer stream. A codec might be limited by bandwidth or by the CPU power available for decoding, or it might be displaying the image in a small window on a laptop and not need the additional resolution.
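The receiver-side choice can be sketched in a few lines. This is a simplified model, not how a real SVC decoder is written: it assumes the layers form a simple stack in which each enhancement layer depends on all the layers below it, and the layer names and bandwidth figures are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    kbps: int  # extra bandwidth this layer needs on top of the lower layers


# Hypothetical layer stack, base layer first; the numbers are illustrative only.
LAYERS = [
    Layer("base (low resolution)", 200),
    Layer("enhancement 1 (higher frame rate)", 300),
    Layer("enhancement 2 (higher resolution)", 700),
]


def select_layers(available_kbps, layers=LAYERS):
    """Return the base layer plus as many consecutive enhancement layers as fit."""
    chosen = []
    used = 0
    for layer in layers:
        if used + layer.kbps > available_kbps:
            # Higher layers build on lower ones, so stop at the first misfit.
            break
        chosen.append(layer)
        used += layer.kbps
    return chosen
```

With these assumed figures, a receiver with 250 kbps available would take only the base layer, while one with 2000 kbps would take all three. The same selection could be driven by CPU budget or window size instead of bandwidth.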

Some of you are saying, "we do that today with our multipoint bridge (MCU)." Yes, that is true: the MCU can transcode an HD image from one endpoint into a standard resolution image for another endpoint. But to do this, the MCU requires substantial CPU cycles or dedicated signal processors. The MCU in an SVC environment merely has to decide which components of the SVC stream to forward, and which not. If an endpoint only needs the base layer, only that component of the stream is forwarded. If the endpoint wants full resolution, all streams are forwarded. Intermediate needs are met with combinations of the SVC streams. Because this task is much easier from a computational point of view, we can expect SVC bridges to be much less expensive or be able to handle many more simultaneous endpoints. But that gets to scaling video deployments--we'll come back to that in a later posting.
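A rough sketch shows why the forwarding decision is so cheap. It assumes each packet has already been tagged with a layer index (0 for the base layer) and that each endpoint has told the bridge the highest layer it wants; the function name and data shapes are hypothetical, not taken from any real MCU.

```python
def forward_packets(packets, max_layer_by_endpoint):
    """Map each endpoint to the packets it should receive.

    packets: list of (layer_index, payload) tuples, layer 0 being the base layer.
    max_layer_by_endpoint: dict of endpoint name -> highest layer index wanted.
    """
    return {
        # No decoding or re-encoding: the bridge only filters by layer index.
        endpoint: [p for p in packets if p[0] <= max_layer]
        for endpoint, max_layer in max_layer_by_endpoint.items()
    }
```

For example, with packets tagged 0, 1, 2 and 0, an endpoint that asked for layer 0 would get only the two base layer packets, and an endpoint that asked for layer 2 would get all four. The work per packet is a single comparison, which is the contrast with transcoding.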

In my next post I will look at why SVC codecs handle packet loss better than traditional codecs.