Achieving QoS for Cloud UC Services
How do we ensure satisfactory performance of cloud-based UC services?
Many organizations are moving (or considering the move) to cloud-based UC services as a way to reduce costs and increase availability. The cloud-based UC provider can achieve efficiencies of scale and streamline operational functions. The result is the ability to offer UC services at a price that's very competitive with an organization's ability to provide the services internally.
But using a cloud UC implementation doesn't mean you can overlook the need for QoS mechanisms.
An organization's bandwidth requirements can vary significantly, from LAN speeds in campus environments to WAN speeds that service remote offices. Mobile staff may be limited to cellular data connections or to Internet access at whatever hotspots are available (e.g., at a coffee house or hotel), where QoS is ignored. This mix of environments suggests classifying and marking traffic when it enters the enterprise network. Outside the organizational network boundaries, the QoS policy is likely to be: Hope for the best.
The typical UC applications are voice, interactive video, streaming video, screen sharing, and interactive chat. I've listed them in the order in which most QoS policies will prioritize them. The reasoning is that voice is the most important service and has the tightest constraints with respect to timely packet delivery. It uses UDP, with specific port ranges that allow the definition of QoS policies for prioritizing voice over other flows. Interactive video uses UDP, too, for both the video and audio flows -- although it's more important for the audio stream to reach the recipient than it is for an unblemished video stream to reach the recipient. The importance of the audio stream requires that QoS policies favor the audio stream over the video stream.
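The classification described above can be sketched as a small lookup from flow to DSCP marking. This is only an illustration under stated assumptions: the port ranges and TCP ports are hypothetical placeholders (use whatever your UC platform documents), and the DSCP values loosely follow the service classes in RFC 4594 for voice and video, with the screen sharing and chat values being my own assumption.

```python
# Illustrative DSCP values per UC traffic class, highest priority first.
DSCP = {
    "voice": 46,              # EF
    "interactive_video": 34,  # AF41
    "streaming_video": 26,    # AF31
    "screen_sharing": 18,     # AF21 (assumption)
    "chat": 0,                # best effort (assumption)
    "default": 0,             # anything unclassified
}

# Hypothetical classification rules: (protocol, port range, traffic class).
# Real deployments take these ranges from the UC vendor's documentation.
RULES = [
    ("udp", range(16384, 24576), "voice"),
    ("udp", range(24576, 32768), "interactive_video"),
    ("tcp", range(8554, 8555), "streaming_video"),
    ("tcp", range(8443, 8444), "screen_sharing"),
    ("tcp", range(5222, 5223), "chat"),
]


def classify(protocol: str, dst_port: int) -> str:
    """Return the traffic class for a flow, or 'default' if no rule matches."""
    for rule_proto, ports, traffic_class in RULES:
        if protocol.lower() == rule_proto and dst_port in ports:
            return traffic_class
    return "default"


def dscp_for(protocol: str, dst_port: int) -> int:
    """Return the DSCP value to mark on a flow."""
    return DSCP[classify(protocol, dst_port)]


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IP TOS/Traffic Class byte."""
    return dscp << 2
```

A voice flow on one of the assumed ports, for example `dscp_for("udp", 16400)`, maps to EF, while an unmatched flow falls through to best effort. In practice the marking is usually done on the access switch or router at the network edge rather than in application code.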
Streaming video and screen sharing, which are next in priority, use TCP data flows. TCP provides reliable delivery, while deep buffering at the receiver smooths over variations in packet arrival so that playback isn't interrupted. Packet loss in a streaming video session can still cause delays as TCP retransmits data. These delays appear to the user as "Buffering..." pauses when the receiver's codec runs out of buffered data and waits for more to display.
Conference calls present another scenario. For a regular point-to-point call, the call controller sets up the call between the two endpoints, and the audio (or audio + video for an interactive video call) data flows directly between them. Conference calls are an exception. The call controller instructs each endpoint to connect to a multipoint control unit (MCU) that acts as a central replication point for all the audio/video data streams. Conference calls that rely on a cloud-based MCU may present more problems than those using an MCU that is centrally located within the organization, simply due to the lack of QoS on the path to the cloud.
For example, at NetCraftsmen we've been working with a UC application hosted on hardware at a colocation facility. Clients are distributed and must connect to the UC application over the Internet. Call quality dips at the local busy times: during lunch, after school lets out, and in the evening after dinner. The user community has started to switch to conference calling with cell phones to circumvent the call-quality problems inherent in Internet-based voice.
QoS with the Cloud
To address the problems mentioned above, we can start by applying QoS within the organization. This makes sure that the network paths within the organization aren't a problem.
Because point-to-point calls flow directly between endpoints, some calls never cross a congested link, and for those calls it's possible to avoid the need for QoS. Consider this example I heard from NetCraftsmen colleague Steve Meyer, who described a Skype for Business calling scenario between two laptops in the same remote office. A look into how the Microsoft call controller makes its decisions revealed that the Skype for Business controller was able to recognize that both endpoints were in the same subnet. The controller configured the call data flow directly between the endpoints instead of routing the flow through the organization's data center.
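The same-subnet decision in that example can be illustrated with a short sketch. This is not how Skype for Business actually implements it; the function names, the /24 prefix length, and the "via_data_center" fallback are assumptions made purely for illustration.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """True if both addresses fall in the same network for the given prefix."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


def media_path(ip_a: str, ip_b: str, prefix_len: int = 24) -> str:
    """Hypothetical controller decision: keep media local when possible."""
    if same_subnet(ip_a, ip_b, prefix_len):
        return "direct"
    return "via_data_center"
```

Two laptops at 192.168.10.5 and 192.168.10.77 would get a direct media path under this sketch, so the call never touches the WAN link where QoS would otherwise matter.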
As noted above, a cloud-based MCU could result in quality problems that are only apparent with conference calls. The telltale sign is good quality on point-to-point calls alongside poorer quality on conference calls. Simply moving the MCU from the cloud onto the internal network, but leaving the call controller in the cloud, could solve a nagging conference call-quality problem.
If requirements dictate using a cloud UC provider, you may want to consider the use of a few dedicated links or MPLS connections to the provider's facility. Make sure that QoS will work across these links before you install them. Similarly, you might get dedicated links to a colocation facility that has cross-connect agreements with various cloud providers. This can provide flexibility for selection of a cloud provider while minimizing the cost of dedicated links.
Since the Internet doesn't support QoS, another alternative is the use of software-defined WAN (SD-WAN). It works well when multiple independent paths run between endpoints. The SD-WAN system measures the characteristics of each path and selects the path that matches administrator-defined packet flow policies. Note that SD-WAN isn't really QoS. It's simply measuring path characteristics and choosing to forward UC flows over the path that best matches the desired policies. The success of SD-WAN depends on having multiple diversely routed paths between endpoints.
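The measure-and-select behavior described above can be sketched in a few lines. This is a simplified illustration, not any vendor's algorithm: the `PathStats` and `Policy` types, the path names, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathStats:
    name: str
    latency_ms: float
    jitter_ms: float
    loss_pct: float


@dataclass
class Policy:
    max_latency_ms: float
    max_jitter_ms: float
    max_loss_pct: float


def meets_policy(path: PathStats, policy: Policy) -> bool:
    """True if every measured characteristic is within the policy limits."""
    return (
        path.latency_ms <= policy.max_latency_ms
        and path.jitter_ms <= policy.max_jitter_ms
        and path.loss_pct <= policy.max_loss_pct
    )


def select_path(paths: List[PathStats], policy: Policy) -> Optional[PathStats]:
    """Pick the compliant path with the lowest loss, then jitter, then latency.

    Returns None when no path satisfies the policy.
    """
    compliant = [p for p in paths if meets_policy(p, policy)]
    if not compliant:
        return None
    return min(compliant, key=lambda p: (p.loss_pct, p.jitter_ms, p.latency_ms))


# Example measurements for three independent paths (made-up numbers).
paths = [
    PathStats("mpls", latency_ms=40, jitter_ms=5, loss_pct=0.1),
    PathStats("broadband", latency_ms=25, jitter_ms=35, loss_pct=0.5),
    PathStats("lte", latency_ms=80, jitter_ms=12, loss_pct=1.5),
]
voice_policy = Policy(max_latency_ms=150, max_jitter_ms=30, max_loss_pct=1.0)
best = select_path(paths, voice_policy)
```

With these numbers only the "mpls" path satisfies the voice policy, so it is selected. Note what happens when every path is degraded at once: the function returns None, which is the code-level reason SD-WAN needs diversely routed paths to be useful.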
How to Proceed
Using a cloud UC implementation doesn't avoid the need for QoS within your organization's network. You'll still need to create a QoS design, possibly in conjunction with an SD-WAN policy definition. I also recommend that you work with your cloud UC provider to identify potential problems before you get to the implementation (or later) phase of a deployment. The use of active path testing tools, for example, can alert you to problems before the user community begins to complain and call.
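As a rough idea of what an active path test reports, the sketch below summarizes a series of probe round-trip times and flags threshold violations. The probe data, the function names, and the thresholds are assumptions for illustration, and the jitter figure here is a simple mean of consecutive differences rather than the smoothed estimator RTP uses.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize probe results; None entries represent lost probes."""
    if not rtts_ms:
        raise ValueError("need at least one probe result")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    # Jitter approximated as the mean absolute change between consecutive replies.
    deltas = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "avg_rtt_ms": mean(received) if received else None,
        "jitter_ms": mean(deltas) if deltas else 0.0,
        "loss_pct": loss_pct,
    }


def alerts(summary: Dict[str, Optional[float]],
           max_rtt_ms: float = 300.0,
           max_jitter_ms: float = 30.0,
           max_loss_pct: float = 1.0) -> List[str]:
    """Return the names of metrics that exceed the (assumed) thresholds."""
    problems = []
    if summary["avg_rtt_ms"] is None or summary["avg_rtt_ms"] > max_rtt_ms:
        problems.append("rtt")
    if summary["jitter_ms"] > max_jitter_ms:
        problems.append("jitter")
    if summary["loss_pct"] > max_loss_pct:
        problems.append("loss")
    return problems


# Five probes toward a hypothetical cloud UC provider; the third was lost.
summary = summarize([20.0, 22.0, None, 30.0, 28.0])
problems = alerts(summary)
```

Running a check like this on a schedule against the provider's media and signaling addresses, and trending the results, is what lets you spot the lunchtime or after-school dips before the help desk phone rings.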