Scalable Video Coding (SVC) is a standard, specified by the ITU (H.264 Annex G). But as recently discussed at VoiceCon in Orlando, having a standard and having interoperability are two quite different things. My last four posts talked extensively about the reasons SVC is a great new technology, but interoperability is not with us yet.
Standards today often specify only a small part of a much larger problem. It often takes a group of standards to provide interoperability. In addition, extensive interoperability testing must be done to ensure that all parties have interpreted the specifications in the same way. All this is true for scalable video coding in video conferencing. Let's take a look at how it breaks down.
Video Encode & Decode: This is the piece that is standardized by H.264 Annex G. If two vendors both follow this standard, they should be able to decode each other's video streams and reconstruct the original image.
Video Transport: This is well standardized. Everyone uses RTP (IETF RFC 3550) as the protocol over IP for the transport of video media.
Video FEC (Forward Error Correction) Protection: The protection of the base layer of an SVC video session is not part of the H.264 SVC standard, so vendors are implementing proprietary protection approaches. This means one vendor's SVC protection will likely not work with another vendor's.
Audio Encode & Decode: Audio codecs are well standardized, and vendors usually choose to implement at least one of the standard audio codecs to ensure compatibility.
Audio Transport: As with video transport, audio uses RTP and is well standardized.
Signaling/Call setup: There are two standards actively being used in video conferencing today, H.323 and SIP. Vendors who already interconnect using one or both of these standards are likely to continue to use standards and implement SVC within this framework. Vendors using SVC as their initial deployment may or may not be interoperable, depending on how many extensions they have implemented to take advantage of their specific design.
Signaling/Dynamic call bandwidth management: Many of the advantages of SVC encoding come from the ability of the MCU (or video switch) to dynamically change which layers of the SVC encoding are forwarded to each endpoint. Signaling is required between the endpoints and the video switch to enable this functionality, and no standard exists for this communication at this time.
Data Sharing, IM, presence, etc.: The additional collaboration functions we expect from a Skype-like service today also need to use compatible protocols. Standards exist, but vendors may or may not have chosen to use them. Video services (e.g. Visimeet, ooVoo, Skype) also have clients that verify a subscription with credentials and manage the services available for that subscription.
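To make the dynamic bandwidth management item above more concrete, here is a minimal Python sketch of the decision a video switch has to make for each endpoint: which SVC layers fit in the bandwidth that endpoint currently has. The layer names, the bitrates, and the function name are all hypothetical, chosen only for illustration; they come from neither H.264 Annex G nor any vendor. How an endpoint reports its available bandwidth to the switch is exactly the signaling that has no standard today, so the sketch simply takes that figure as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str   # hypothetical label, not a term from the standard
    kbps: int   # bitrate this layer adds on top of the layers below it


# Illustrative numbers only; real layer structures vary by vendor and call.
LAYERS = [
    Layer("base", 128),
    Layer("enhancement-1", 256),
    Layer("enhancement-2", 512),
]


def layers_to_forward(available_kbps, layers=LAYERS):
    """Return the names of the layers that fit in the endpoint's bandwidth.

    An enhancement layer is only useful together with every layer below it,
    so selection stops at the first layer that does not fit.
    """
    chosen = []
    used = 0
    for layer in layers:
        if used + layer.kbps > available_kbps:
            break
        chosen.append(layer.name)
        used += layer.kbps
    return chosen


if __name__ == "__main__":
    for budget in (1000, 400, 100):
        print(budget, layers_to_forward(budget))
```

Under these assumed numbers, an endpoint with 400 kbps available would receive the base layer and the first enhancement layer, while one with 100 kbps would receive nothing at all. That last case is one reason protecting the base layer matters so much.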
The Gateway Solution
Most of the vendors will claim interoperability today between their SVC endpoints and traditional (H.323 or SIP-based H.264 AVC) endpoints using a gateway. A gateway takes in SVC streams and signaling on one side, and puts out H.323 or SIP and H.264 AVC streams on the other side. An MCU can certainly serve this function if deployed in the infrastructure, by adding the SVC codec to the MCU.
A gateway solution works when one of the technologies is a relatively small percentage of the overall deployment, and interoperability is required. But the gateway itself, whether it be a stand-alone solution or incorporated into an MCU, is an expensive component that does not scale well. Remember that any video stream flowing from an SVC endpoint to a non-SVC endpoint (or vice versa) must flow through the gateway device. This means the gateway device has a significant bandwidth and CPU-cycle requirement, and must be replicated as the amount of traffic grows.
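The scaling concern can be put into rough numbers. The Python sketch below is back-of-the-envelope arithmetic under stated assumptions, not a sizing tool: the per-stream bitrate and the number of streams one gateway can transcode are made-up defaults, and it assumes the stream leaving the gateway has about the same bitrate as the one entering it.

```python
import math


def gateway_load(mixed_calls, kbps_per_stream=768, transcodes_per_gateway=20):
    """Estimate gateway load for point-to-point calls between an SVC
    endpoint and a non-SVC endpoint.

    All default values are hypothetical and only serve the illustration.
    Returns (streams, io_kbps, gateways).
    """
    # Each mixed call carries one video stream in each direction,
    # and both directions pass through the gateway.
    streams = 2 * mixed_calls
    # Every stream is received and then sent again, so the gateway's
    # network I/O is twice the stream bitrate (assuming equal in/out rates).
    io_kbps = 2 * streams * kbps_per_stream
    # CPU is the other limit: once one box is full, another must be added.
    gateways = math.ceil(streams / transcodes_per_gateway)
    return streams, io_kbps, gateways


if __name__ == "__main__":
    for calls in (10, 25, 100):
        print(calls, gateway_load(calls))
```

With these assumed figures, 10 mixed calls fit on a single gateway, 25 calls already need three, and the bandwidth through the gateways grows linearly with every mixed call added.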
Conclusions
So where are we? Today the only "compatible" solution uses a gateway device like the RADVISION MCU (see post by Tsahi Levent-Levi). Polycom has announced that they will implement SVC in their endpoints, which will allow direct Polycom to Polycom connections using SVC technology, and may allow Polycom to RADVISION connections as well, given that Polycom and RADVISION have a long history of interoperable signaling. But these direct connections will not have the multipoint flexibility provided by the video switch as described in earlier posts and implemented by Vidyo.
So expect to see claims of interoperability, and of the value of SVC, but look closely at what actually works in a multi-vendor environment before you build a strategy around the value of SVC. I believe that in the long term SVC will provide significant value for video conferencing, especially for desktop-based deployments. But there is a long way to go before the full value of SVC is available in an interoperable way between vendors.