Andrew Prokop | October 26, 2015

Peeling Back the SIP Resiliency Layers

Take a layered approach to SIP resiliency by eliminating all single points of failure.

Like most people, I like it when my machines work. If I get into my car and turn the key, I expect the engine to start and the tires to roll. Of course, I play an important role in keeping my Prius on the road. I follow the manufacturer's recommended maintenance schedule. I keep the tires inflated and regularly check the tread for excessive wear. In the case of an unforeseen breakdown, I travel with a few essential tools and know who to call if a problem surpasses my ability to fix it.

Just like my car, SIP resiliency needs to be a layered approach. While it's impossible to build a system that is completely unbreakable, it's not that difficult to eliminate all single points of failure and design something that can handle a myriad of software, hardware, and signaling failures.

There are a number of different ways to break down a SIP infrastructure into its components, but for the sake of simplicity, I will focus on three major systems: the SIP carrier and the interfaces it delivers to an enterprise; the border elements that sit between the carrier and the communications system; and the communications system itself.

The SIP Carrier

In nearly all cases, carriers deliver SIP trunks to an enterprise by way of an MPLS network. That MPLS network is commonly managed by the same SIP carrier, but this isn't a requirement. For example, it is possible to use Verizon for your MPLS network and Level3 for your SIP trunks.

The MPLS network terminates at an enterprise's demarcation point in the form of a label edge router (LER). You can think of the router as the on and off ramp for all data traffic on the MPLS circuit. It has been my experience that the router is owned and maintained by the carrier, but it's certainly possible for an enterprise to take on that responsibility.

The first level of resiliency is to use LERs with redundant components such as hot-swappable power supplies and fans. This allows the router to continue to function when one of its components fails.

The second technique is to deploy a highly available (HA) LER. This configuration uses an active router paired with a standby router. A failure of the active LER causes the standby LER to seamlessly take control of all data traffic.
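To get a feel for why pairing routers pays off, here is a back-of-the-envelope sketch. It assumes the two units fail independently and that failover itself never fails, neither of which is guaranteed in practice. The function name and the 99% figure are illustrative assumptions, not numbers from any vendor.

```python
def pair_availability(single_unit_availability, units=2):
    """Availability of a redundant group, assuming independent failures
    and perfect failover (both are simplifying assumptions)."""
    # The group is down only when every unit is down at the same time.
    return 1 - (1 - single_unit_availability) ** units


# Example with an assumed 99% availability per router.
print(pair_availability(0.99))
```

Under those assumptions, two routers that are each available 99% of the time are only both down 0.01% of the time.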

The third resiliency technique is to use multiple data circuits. Traffic can be shared on these links or one circuit can be the backup for the other.

Lastly, I want to see geo-redundant data circuits in two or more data centers. As with the duplicated links, these circuits can be active/active or one can be designated as the failover link.
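The difference between sharing traffic across circuits and holding one in reserve can be sketched in a few lines of Python. This is only an illustration: the `Circuit` class, the circuit names, and `pick_circuit` are hypothetical and do not correspond to any carrier or router API, and the `up` flag stands in for whatever health probe is actually in use.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Circuit:
    name: str        # hypothetical label for a data circuit
    up: bool = True  # health as reported by some monitoring probe (assumed)


_round_robin = count()  # shared counter for active/active load sharing


def pick_circuit(circuits, mode="active/standby"):
    """Return the circuit that should carry the next call, or None if all are down."""
    healthy = [c for c in circuits if c.up]
    if not healthy:
        return None
    if mode == "active/active":
        # Share traffic across every healthy circuit in turn.
        return healthy[next(_round_robin) % len(healthy)]
    # Active/standby: always prefer the first healthy circuit in list order.
    return healthy[0]
```

In either mode, a circuit that is marked down simply drops out of the candidate list, so the surviving circuit picks up the traffic.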

The Session Border Controller

If you have been following my No Jitter articles, you likely know my thoughts on session border controllers (SBCs) by now. Not only are they necessary for security, but they are also used for remote endpoints, call admission control, call recording, routing, and SIP adaptation. I would never open up a network to external SIP traffic without first having it pass through an SBC.

Therefore, resiliency needs to be a critical aspect of every SBC configuration. This critical component of SIP communications must be as rock solid as possible, lest you risk a break in your SIP chain.

As with the LER, I am a big fan of SBCs with redundant components. At a minimum, power supplies and fans should be duplicated and hot-swappable.

I am nearly always insistent that an enterprise deploy SBCs as an HA pair. Like an LER, an HA configuration consists of an active SBC paired with a standby SBC. On a sunny day, the active SBC handles all SIP traffic, and the standby only kicks in when the active fails. A link exists between the two that lets the standby be fully aware of all active calls. This facilitates a seamless failover with no lost calls.
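A toy model makes the role of that link easier to see. The sketch below is not how any particular SBC is implemented. The `Sbc` class, `make_ha_pair`, and `fail_over` are hypothetical names, and copying a dictionary stands in for the real state replication that runs over the HA link.

```python
class Sbc:
    """Toy model of one SBC in an HA pair (hypothetical, not a vendor API)."""

    def __init__(self, name):
        self.name = name
        self.active = False
        self.peer = None
        self.calls = {}  # call_id -> call state

    def start_call(self, call_id):
        if not self.active:
            raise RuntimeError(f"{self.name} is standby and cannot take calls")
        self.calls[call_id] = "connected"
        self._replicate()

    def end_call(self, call_id):
        self.calls.pop(call_id, None)
        self._replicate()

    def _replicate(self):
        # Stand-in for the HA link: copy the full call table to the peer.
        if self.peer is not None:
            self.peer.calls = dict(self.calls)


def make_ha_pair(active, standby):
    """Link two SBCs and mark the first one as active."""
    active.peer, standby.peer = standby, active
    active.active, standby.active = True, False


def fail_over(failed):
    """Promote the peer of a failed active SBC and return it."""
    standby = failed.peer
    failed.active = False
    standby.active = True
    return standby
```

Because every change to the call table is mirrored as it happens, the standby already knows about each call in progress at the moment it is promoted.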

It's important to know that the two SBCs in an HA pair must be connected at Layer 2, which means they must sit on the same subnet. Since most geo-separated data centers are Layer 3 connected, you cannot split an HA SBC pair across data centers. In this case, I commonly recommend an HA pair in the prime site and another HA pair in the disaster recovery data center.
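A quick sanity check on a planned HA pair is to confirm that both addresses fall in the same subnet. The sketch below uses Python's standard `ipaddress` module. The function name, the addresses, and the prefix length are made-up examples. Passing this check is necessary but does not by itself prove the two devices share a Layer 2 segment.

```python
import ipaddress


def same_subnet(ip_a, ip_b, prefix_len):
    """Return True if both addresses fall in the same subnet of the given prefix length."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b


# Example addresses only: two SBCs in one site, and one in a second site.
print(same_subnet("10.1.1.10", "10.1.1.11", 24))
print(same_subnet("10.1.1.10", "10.2.1.10", 24))
```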

The Communications System

This brings me to the SIP-based IP PBX. Here, too, I avoid as many single points of failure as possible. This means duplicated call processors, enterprise survivable servers, and, when appropriate, survivable branch servers. While some failures may cause calls to drop and call processing to become temporarily suspended, the goal is to minimize the disruption and return service as quickly as possible.

In addition to call processing servers, there will most likely be some form of session management between the SBCs and the call servers. This, too, needs to be made resilient. In my mind, that means N + 1. Determine how many session manager servers you need and add one. Do you need one server? Deploy two. Do you need three? Deploy four.
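The N + 1 rule is simple enough to write down as arithmetic. In this sketch, `peak_sessions` and `sessions_per_server` are hypothetical sizing inputs, and real capacity figures would have to come from the vendor's documentation.

```python
import math


def session_managers_to_deploy(peak_sessions, sessions_per_server):
    """Apply N + 1: the servers needed to carry the peak load, plus one spare."""
    needed = max(1, math.ceil(peak_sessions / sessions_per_server))
    return needed + 1


# Illustrative numbers only.
print(session_managers_to_deploy(1000, 1000))  # one needed, so deploy two
print(session_managers_to_deploy(2500, 1000))  # three needed, so deploy four
```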

I do a great deal of work with Avaya Aura, and its session managers support HA as active/active and not active/standby. This means that all session managers process calls at all times. Additionally, a failover from one session manager to another session manager is seamless with no lost calls.

Mischief Managed

While there are certainly more points of possible failure (endpoints, networks, power sources, etc.), these three go a long way toward keeping the bulk of an enterprise's communications system up and running even when disaster strikes. Believe me, servers crash, links die, cooling systems fail, and electricity suddenly disappears. The loss of communications can result in angry customers, lost revenue, and, in verticals such as healthcare, even death.

Resiliency and redundancy do not come for free, but careful planning coupled with comprehensive risk management will help determine what needs reinforcement and what does not. Failure to plan is planning to fail, and you don't want to be the one responsible for dead air and dropped calls.

Andrew Prokop writes about all things unified communications on his popular blog, SIP Adventures.
