What are service level objectives?
What’s an SLO?
A service level objective (SLO) is a goal defined against the data you collect from monitoring. It gauges how well the product is doing and helps surface trends: for example, you might notice consistent downtime whenever code changes are deployed, or become able to forecast your service’s availability more accurately.
A good SLO empowers you to understand what target level of reliability is optimal for your service’s customers. It sets a threshold above which most customers should be happy and below which you should consider investing more in improving reliability.
SLOs are key to the value proposition of any business. They give you the visibility to ensure that your products and services deliver the experience you expect your customers to have. SLOs also give you a framework for accepting less than 100% reliability deliberately, in a planned way.
The great thing about an SLO is that it can be as generic or as specific as you like, as long as you’re recording the data needed to measure it. And SLOs are only for internal purposes, so don’t be afraid to change them!
Why SLOs?
SLOs are essentially a target set for a given system’s desired consistency of behavior over time. Tracking SLO adherence depends on service level indicators (SLIs), which measure whether an SLO is being met. SLIs are the actual numbers or queries you pull from your monitoring stack. You apply a little bit of math to them, and the result tells you how well you’re meeting your goals.
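As a rough illustration of the "little bit of math" involved, the sketch below computes a ratio-style SLI from two counters and compares it with an SLO target. The function names, the event counts, and the 99.9% target are all hypothetical; in practice the counts would come from queries against your own monitoring stack.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that were good, between 0.0 and 1.0."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def meets_slo(sli: float, target: float) -> bool:
    """An SLO is met when the measured SLI is at or above the target."""
    return sli >= target


# Hypothetical counts, e.g. successful vs. all HTTP requests in a window.
sli = availability_sli(good_events=99_950, total_events=100_000)
print(f"SLI: {sli:.4%}, meets 99.9% SLO: {meets_slo(sli, 0.999)}")
```

A ratio of good events to total events is only one common way to express an SLI; latency thresholds or other measurements can be reduced to the same good/total shape.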
If a system or service is not meeting its SLO, there are consequences such as performance degradation, outages, or impact on customers.
As an SLO approaches (but never reaches) 100%, the cost of reliability rises steeply. Your software releases slow down as you build expensive infrastructure far beyond user expectations. Not all downtime is equal, so fine-tuning SLOs that correlate with business KPIs will give you a clearer picture of where to spend your time and energy: delivering new features or paying down technical debt. Don’t chase nines for nines’ sake; find the right nines for the job.
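To make the "nines" concrete, here is a minimal sketch that converts an availability target into the downtime it permits. The 30-day window and the function name are assumptions chosen for illustration, not part of any particular tool.

```python
def allowed_downtime_minutes(target: float, window_days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over a window."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - target) * window_minutes


# Compare two, three, four, and five nines over an assumed 30-day window.
for target in (0.99, 0.999, 0.9999, 0.99999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target:.3%} allows about {minutes:.2f} minutes of downtime")
```

Each additional nine cuts the permitted downtime by a factor of ten, from roughly 432 minutes at 99% down to under half a minute at 99.999%, which is why each extra nine is so much harder to deliver.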
SLOs weigh the negative repercussions of unavailability against the cost of preventing it in order to determine a reasonable uptime goal for a service. The downtime that the SLO permits is the error budget.
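One way to track an error budget is as the share of permitted failures that has not yet been used. The sketch below assumes an event-based SLO; the function name and the numbers are hypothetical.

```python
def error_budget_remaining(target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent.

    1.0 means the budget is untouched, 0.0 means it is exactly used up,
    and a negative value means the SLO has been breached.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1.0 - target) * total_events  # failures the SLO tolerates
    return 1.0 - bad_events / allowed_bad


# Hypothetical window: 1,000,000 requests, a 99.9% target, 250 failures.
remaining = error_budget_remaining(target=0.999, total_events=1_000_000, bad_events=250)
print(f"Error budget remaining: {remaining:.1%}")
```

With these numbers the SLO tolerates about 1,000 failed requests, so 250 failures leave roughly three quarters of the budget. A team might use that headroom to ship riskier changes, and slow down releases when the figure approaches zero.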
Join the SLO development lifecycle (or SLODLC) community that develops the methodology for defining reliability and performance goals for software services across an enterprise.
The SLODLC community is an open-source project spearheaded by Nobl9 that brings together people from across the SRE ecosystem to help overcome the challenges of SLO adoption by providing:
- A repeatable methodology for SLO adoption.
- A playbook for consultation and service providers.
- An agile model that fits your existing practices.
Useful links
SLOs are a Win / Win for Your Products and Services
What Every CEO Needs to Know about SLOs
An Easy Way to Explain SLOs and SLAs to Business Executives
Optimizing Cloud Costs through Service Level Objectives
SRE Blueprint: Creating and Fulfilling SLOs for Optimized Business Outcomes