Matt Coulter, senior portfolio architect at Liberty Mutual Insurance, enjoys telling this joke: 10 years ago, the insurance company’s Amazon Web Services bill was so small that it could fit on an employee’s credit card without the employee noticing. Over the following decade, however, the company set out on a wide-scale digital transformation and wanted a development environment that let its engineers create and launch new products rapidly and without accumulating much technical debt.
In 2019, the company began investigating the AWS Cloud Development Kit (AWS CDK) as a way to carry out that transformation across a very challenging technical environment.
“Liberty Mutual is a 110-year-old Fortune 100 company that provides property and casualty insurance across the globe, which means that we have a lot of different systems running in different languages and frameworks,” Coulter said. “We needed to make sure that we brought everyone along on the journey, and we knew that we didn’t want to just ‘lift and shift’ what we had to the cloud.”
The enterprise-wide move to AWS also improved Liberty Mutual’s customer-facing contact centers by incorporating virtual agent technology into the customer service process. Customers can now self-serve, and they can still pass through and speak to an agent at any time. “We find the vast majority of customers love the self-service option, which alleviates the instant pressure on the call center, reducing wait times dramatically.”
Bridging a Skills Chasm With AWS CDK
The approach required Liberty Mutual engineers to learn how to deploy to the cloud at a time when they hadn’t yet defined what good looked like and many of the easy-to-use cloud capabilities that exist today weren’t yet available. Developers who had worked predominantly in Java, and who had never really focused on infrastructure before, had to start defining that infrastructure in long, detailed YAML files. Complicating matters, the files were slow to process and unforgiving about syntax: if a developer put one space in the wrong place, the deployment would fail, potentially after a wait of several minutes. On top of that, most of the best practices for serverless application development required TypeScript or Python, not Java.
“The root problem was that even with all our guardrails, it was too big of a shift from the skills they had spent years refining, so the skills chasm across the enterprise widened,” Coulter said.
AWS CDK allowed developers to define AWS infrastructure in the languages they already knew and loved, and at deploy time, that code was turned into CloudFormation templates.
“Keeping it as CloudFormation at deploy time meant it broke none of our previous rules or guardrails. This plan seemed like a fantastic compromise to meet the developers where they are and bridge the gap while not stepping away from any solutions that we had previously invested in,” Coulter said.
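To make the idea of typed code turning into a CloudFormation template more concrete, here is a minimal sketch. It is a toy model and does not use the real AWS CDK library: the names `BucketProps`, `bucketResource`, and `synthesize` are hypothetical and exist only for illustration. Only the resource type `AWS::S3::Bucket` and its `VersioningConfiguration` property come from CloudFormation itself.

```typescript
// Toy model of "infrastructure as typed code". This is NOT the AWS CDK API;
// every function and interface name below is a hypothetical illustration.

interface BucketProps {
  versioned: boolean;
}

// Shape of a single resource entry in a CloudFormation template.
interface CfnResource {
  Type: string;
  Properties: Record<string, unknown>;
}

// Shape of a (very reduced) CloudFormation template.
interface CfnTemplate {
  Resources: Record<string, CfnResource>;
}

// Translate typed props into a CloudFormation-shaped S3 bucket resource.
function bucketResource(props: BucketProps): CfnResource {
  return {
    Type: "AWS::S3::Bucket",
    Properties: {
      VersioningConfiguration: {
        Status: props.versioned ? "Enabled" : "Suspended",
      },
    },
  };
}

// Collect named resources into a template object.
function synthesize(resources: Record<string, CfnResource>): CfnTemplate {
  return { Resources: resources };
}

const template = synthesize({
  DocumentsBucket: bucketResource({ versioned: true }),
});

// Print the template as JSON, the form a deployment tool would consume.
console.log(JSON.stringify(template, null, 2));
```

Because the input is a typed object rather than hand-indented YAML, a misspelled property name in `bucketResource({ ... })` is rejected by the compiler rather than surfacing minutes into a deployment.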
AWS Cloud Development Kit: Implementation
Coulter described how he used AWS CDK to build a construct, a reusable infrastructure component, that reduced the infrastructure code required to deploy one of Liberty Mutual’s most common patterns from thousands of lines of code down to about 15 lines of TypeScript.
Not only did this involve less code, but Coulter had also baked in AWS Well-Architected practices, so developers didn’t need to start learning from scratch every time. “This was the moment I knew we could create our environment where our engineers could rapidly deliver customer solutions by reusing well-architected patterns,” he said. At the same time, Coulter added, Liberty Mutual’s developers were building the company’s catalog website for reusable serverless patterns, the Software Accelerator.
Coulter implemented what he called a “Skeleton Pattern,” i.e., the minimum amount of code needed to have a working production pipeline to AWS. With these pieces in place, Coulter and his team ran a workshop globally so that everyone knew about the technology. “We always finished with everyone using the skeleton pattern so that we could say, ‘after you leave today, you have everything you need to build your next production solution.’”
Coulter said that in the following months, developers used these serverless patterns thousands of times, and dozens of teams used them as the basis for contributing new patterns back to the catalog. “We truly had a cultural flywheel of innovation based on developer sharing.”
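As a rough illustration of how a reusable pattern can bake in good defaults, the sketch below wraps a Lambda function definition so that callers supply only what is specific to them. This is not Liberty Mutual’s construct and does not use the AWS CDK library: `wellArchitectedFunction`, `ApiFunctionProps`, and the default values are all hypothetical. The output mimics the CloudFormation `AWS::Lambda::Function` shape but omits required properties such as `Code` and `Role` to stay short.

```typescript
// Hypothetical reusable pattern; not the AWS CDK API and not Liberty Mutual's code.
// The default values are illustrative assumptions, not recommendations.

interface ApiFunctionProps {
  handler: string;          // the only thing a caller must provide
  memorySize?: number;      // optional override, in MB
  timeoutSeconds?: number;  // optional override
}

interface LambdaResource {
  Type: string;
  Properties: Record<string, unknown>;
}

// Build a CloudFormation-shaped Lambda resource with shared defaults applied.
function wellArchitectedFunction(props: ApiFunctionProps): LambdaResource {
  return {
    Type: "AWS::Lambda::Function",
    Properties: {
      Handler: props.handler,
      Runtime: "nodejs20.x",
      MemorySize: props.memorySize ?? 256,
      Timeout: props.timeoutSeconds ?? 10,
      // Tracing is switched on for every function that uses the pattern.
      TracingConfig: { Mode: "Active" },
    },
  };
}

// One team accepts every default; another overrides a single setting.
const quoteFn = wellArchitectedFunction({ handler: "quote.handler" });
const claimsFn = wellArchitectedFunction({
  handler: "claims.handler",
  memorySize: 512,
});

console.log(JSON.stringify({ quoteFn, claimsFn }, null, 2));
```

The design choice worth noting is that defaults live in one place: improving the pattern improves every team that uses it, while individual teams can still override a setting when they have a reason to.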
Fast forward to today, and Liberty Mutual has over 85,000 Lambda functions deployed behind over 5,000 API Gateways. With AWS Lambda, developers can upload code and run it without thinking about servers. At AWS re:Invent 2020, Amazon Web Services announced Lambda container image support. Liberty Mutual had a Well-Architected pattern for it in its reusable serverless pattern library after the announcement went live. By the following Monday (six days after the announcement), the pattern from the catalog was already being used in production.
Coulter explained that it’s not uncommon for people using a piece of the cloud that’s new to them to be in production within a day or two. “That’s the power of reusable patterns: empowering our engineers to focus on solving our biggest problems for our customers even faster.”
Lessons Learned: What Liberty Mutual Wants You to Know
Coulter emphasized that he has three top takeaways. First, building from industry standards rather than company-specific refinements is far more scalable. “Every external talk we [have] about our solutions is instantly relatable to people outside of Liberty Mutual,” Coulter said. He added that when he moves between teams to discuss what they’re building, he already has the blueprint for how the conversation will go. “Micro-optimizations per team, or even at the organization level, create a bubble of knowledge lock-in that doesn’t gel well with individual team member movement between teams or in the organization.”
Coulter’s next takeaway is that enablement needs enabling constraints and automated telemetry and metrics in order to succeed. “You need to give your engineering teams a direction that you want them to travel and the ability to self-assess how close they are,” he explained. “Otherwise, how do you know you have transformed, and how do they know they aren’t burning out learning the wrong solutions?”
When choosing solutions, Coulter said that Liberty Mutual selects the best technology per cloud for deployment rather than one tool across every cloud. “If you want to have the same experience as I mentioned with AWS CDK on a cloud other than AWS, then you can use CDK for Terraform or Pulumi,” he explained. “Both tools allow you to use the AWS CDK constructs when deploying to AWS but also build patterns for and deploy to the other cloud providers.”
Lastly, regarding specific learnings around AWS CDK, Coulter explained that there’s a skill to creating the right level of abstraction.
“Engineers, when presented with the ability to create abstractions, will always prefer to create their own that’s hyper-specific for their own purpose, even if there’s one that’s 99.8% perfect for what they need.” Architects must be close enough to developer teams to have the conversation and drive a reuse model in which teams contribute back the 0.2% rather than diverge. This doesn’t need to be a blocking model, as teams can diverge for a while and then converge through refactoring later. “You need to have that picture of your solutions to ensure high quality,” he said.