
SLO oversight


SLO oversight provides visibility into the performance and quality of your Service Level Objectives (SLOs). It helps organizations track, analyze, and continuously improve their reliability initiatives.

SLO oversight enables teams to:

  • Proactively identify issues to prevent incorrect SLO reporting.
  • Standardize reliability practices across teams and services.
  • Make data-driven decisions to improve the SLO program.
  • Fine-tune reliability reporting for stakeholders.
  • Validate measurement accuracy by detecting data anomalies.
  • Track SLO review processes to ensure ongoing relevance.

Core concepts of SLO oversight

SLO oversight ensures your organization effectively uses SLOs to monitor system reliability, providing the central oversight needed for your reliability program. It covers the following core principles:

  • Ensuring SLOs reflect actual system reliability
    Make certain your SLO configuration balances the desired service level with the real capabilities of your system.
  • Calibrating health category thresholds
    Review Service Health thresholds to see how services and SLOs are distributed across health categories.
  • Keeping SLOs relevant
    Review SLOs regularly so they stay aligned with current business needs.
  • Maintaining robust data integrity
    Monitor the data anomalies detected in your SLOs.

By validating these principles, you create a reliable foundation for your SLOs. The ultimate goal is SLOs that report with high fidelity: the reliability values they produce reflect the real-world performance of your service, giving you accurate data for informed decision-making.

The SLO oversight dashboard helps you turn these principles into actionable insights. It puts SLO oversight into practice by breaking it down into discrete tasks.

SLO oversight in practice

The SLO oversight dashboard gives you a clear view of how your organization's SLOs are performing. It focuses on three main areas to help you manage SLOs:

  • Service and SLO health monitoring
    • Shows the current health status of services and SLOs.
    • Lets you track error budget usage to find SLOs whose error budget is at risk or exhausted.
    • Provides detailed lists of services and SLOs to help you identify health issues.
  • SLO review workflow
    • Includes a review process to keep SLOs accurate and up to date, incorporating the review flow into service and SLO lists and details.
    • Allows assigning responsible people at the service level, with ownership automatically inherited by all underlying SLOs.
    • Keeps a record of all SLO review events with editable annotations.
  • Data integrity diagnostics
    • Provides insights into detected data anomalies to help improve SLO quality.
    • Annotates SLI charts when data anomalies occur, providing context for troubleshooting.
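
The ownership rule in the review workflow (responsible users are assigned to a service and inherited by its SLOs) can be pictured with a small data model. This is only an illustrative sketch: the `Service` and `SLO` classes, the `responsible_for` helper, and the example names are hypothetical and do not represent the product's API or storage format.

```python
from dataclasses import dataclass, field


@dataclass
class SLO:
    name: str


@dataclass
class Service:
    name: str
    # Responsible users are set once, at the service level (hypothetical model).
    responsible: list[str] = field(default_factory=list)
    slos: list[SLO] = field(default_factory=list)


def responsible_for(service: Service, slo_name: str) -> list[str]:
    """Return the users responsible for an SLO, inherited from its service."""
    if not any(slo.name == slo_name for slo in service.slos):
        raise KeyError(f"{slo_name!r} is not part of service {service.name!r}")
    # Return a copy so callers cannot change the service-level assignment.
    return list(service.responsible)


checkout = Service(
    name="checkout",
    responsible=["alice@example.com"],
    slos=[SLO("latency"), SLO("availability")],
)
print(responsible_for(checkout, "latency"))
```

Because SLOs carry no owner field of their own in this sketch, changing `checkout.responsible` immediately changes the answer for every SLO in the service, which mirrors the inheritance described above.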

Resource lists

While the SLO oversight dashboard highlights areas that require your attention, you may want to examine them more closely. Resource lists can help with this. In the service and SLO lists, you can find quick table presets that help you assess their current state in detail. The following presets are available:

  • Catalog > Services
    • Health overview
      This preset focuses on SLO and service health, displaying columns with exhausted, at-risk, and healthy SLOs according to their remaining error budget.
    • Completed reviews
      This preset focuses on reviewed SLOs. It includes the percentage of reviewed SLOs per service, review due dates, schedule details, and users responsible for each service.
    • Review debt
      This preset focuses on SLOs with the Overdue review status. It includes the percentage of overdue review SLOs per service, review due dates, schedule details, and users responsible for each service.
    • SLO quality
      This preset provides a deep dive into the SLO quality widget, showing how many SLOs have review debt and data anomalies, and the users responsible for each service.
  • Service Level Objectives > SLO list
    • Health overview
      This preset focuses on the remaining error budget. The table lists the SLOs with the most exhausted error budgets first and includes the Reliability and Error Budget columns.
    • SLO review
      This preset displays columns related to reviews: review status of SLOs, review notes, due dates, schedules, and users responsible for each SLO's service.
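
The Completed reviews and Review debt presets report per-service percentages of reviewed and overdue SLOs. The arithmetic behind such figures can be sketched as follows. The `review_summary` function and its input shape are assumptions made for illustration; only the status names come from this page.

```python
from collections import Counter

# Review statuses named on this page.
REVIEW_STATUSES = ("Reviewed", "To review", "Overdue", "Skipped", "Not started")


def review_summary(statuses: list[str]) -> dict:
    """Summarize the review statuses of one service's SLOs (hypothetical helper)."""
    unknown = set(statuses) - set(REVIEW_STATUSES)
    if unknown:
        raise ValueError(f"unknown review statuses: {sorted(unknown)}")
    total = len(statuses)
    if total == 0:
        # A service without SLOs has nothing to review.
        return {"total": 0, "reviewed_pct": 0.0, "overdue_pct": 0.0}
    counts = Counter(statuses)
    return {
        "total": total,
        "reviewed_pct": round(100 * counts["Reviewed"] / total, 1),
        "overdue_pct": round(100 * counts["Overdue"] / total, 1),
    }


print(review_summary(["Reviewed", "Reviewed", "Overdue", "To review"]))
# {'total': 4, 'reviewed_pct': 50.0, 'overdue_pct': 25.0}
```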

You can also customize the table view to fit your needs by selecting the available columns.

Table: Available columns in the service and SLO lists

Column | Description | List
Service | Service name or display name | Both
Project | Project where a service or SLO is located | Both
Labels | Labels attached to a service or SLO | Both
Responsible | Users responsible for a service | Both
Review due date | Due date of the next review | Both
Review schedule | Frequency of the review schedule | Both
Reviewed SLOs | Number of SLOs with the Reviewed status that a service contains | Service
To review SLOs | Number of SLOs with the To review status that a service contains | Service
Overdue review SLOs | Number of SLOs with the Overdue review status that a service contains | Service
Skipped review SLOs | Number of SLOs with the Skipped review status that a service contains | Service
Not started review SLOs | Number of SLOs with the Not started review status that a service contains | Service
Total SLOs | Total number of SLOs that a service contains | Service
SLOs to check | SLOs with review debt or data anomalies | Service
Service health | Exhausted, at-risk, or healthy, according to the Service health dashboard | Service
Exhausted SLOs | SLOs with an exhausted error budget | Service
At risk SLOs | SLOs with error budget remaining between the exhausted and healthy thresholds | Service
Healthy SLOs | SLOs with sufficient error budget remaining, according to the Service health dashboard thresholds | Service
SLO review status | Review status of an SLO | SLO
Review note | An annotation added to an SLO on its transition to the current status | SLO
Objectives | List of objectives in an SLO | SLO
Target | Desired level of SLIs set for an SLO objective | SLO
Reliability | Percentage of SLIs above the SLO objective's target | SLO
Error budget | Percentage of the error budget remaining | SLO
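
To make the Target, Reliability, and Error budget columns and the exhausted, at-risk, and healthy categories more concrete, here is a minimal sketch of one common way to derive the remaining error budget and map it to a category. The formula is the generic ratio-based calculation and is not confirmed by this page as the product's exact method. The threshold defaults (`exhausted_at=0.0`, `healthy_from=0.2`) are made-up placeholders, since the real values are whatever is configured for the Service health dashboard.

```python
from typing import Optional


def error_budget_remaining(target: float, reliability: float) -> float:
    """Fraction of the error budget left, given target and reliability as ratios.

    Generic formula (an assumption, not the documented product calculation):
    remaining = 1 - (1 - reliability) / (1 - target)
    """
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    allowed = 1 - target        # total share of bad events the target permits
    consumed = 1 - reliability  # share of bad events actually observed
    return 1 - consumed / allowed


def health_category(
    remaining: Optional[float],
    exhausted_at: float = 0.0,   # placeholder threshold
    healthy_from: float = 0.2,   # placeholder threshold
) -> str:
    """Map the remaining error budget to a health category."""
    if remaining is None:
        return "no data"  # not enough data to calculate the budget
    if remaining <= exhausted_at:
        return "exhausted"
    if remaining < healthy_from:
        return "at risk"
    return "healthy"


remaining = error_budget_remaining(target=0.99, reliability=0.995)
print(f"{remaining:.0%} remaining, {health_category(remaining)}")
```

With a 99% target, a reliability of 99.5% has consumed half of the allowed 1% of failures, so about half of the budget remains. A reliability below the target yields a negative value, which this sketch reports as exhausted.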

In addition, you can filter the service and SLO lists to narrow down the results. To do this, click Show filters above the list on the right and select the required resources.

Opening service list filters

The following filters are available:

Filter category | Description | List
Project | Display services or SLOs located in a specific project | Both
Service | Display specific services or SLOs located in specific services | Both
Labels | Display services or SLOs tagged with selected labels | Both
SLO type | Display multi-level hierarchy composite SLOs or single-level hierarchy standard SLOs | SLO
Error budget status | Display SLOs with healthy, at risk, or exhausted error budgets, or SLOs that lack data to calculate their remaining error budget. This filter uses the thresholds set for the Service health dashboard. | SLO
SLO review status | Display SLOs with particular review statuses: Overdue, To review, Skipped, Reviewed, and Not started | SLO
Reviewed within | Display SLOs with the latest review date within the last calendar month, quarter, six months, more than six months ago, or never reviewed | SLO
Responsible | Display SLOs located in services with responsible users that you select | SLO
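
As a rough mental model of how several filter categories narrow a list, the sketch below filters in-memory SLO records. It assumes that values selected within one category are alternatives and that categories combine with AND. This page does not state the combination rule, so treat it as an assumption. The record fields and the `matches` helper are hypothetical.

```python
def matches(slo: dict, filters: dict) -> bool:
    """Return True if the SLO satisfies every filter category that has a selection."""
    return all(
        slo.get(category) in allowed
        for category, allowed in filters.items()
        if allowed  # an empty selection means the category is not filtered
    )


slos = [
    {"name": "latency", "project": "shop",
     "error_budget_status": "at risk", "review_status": "Overdue"},
    {"name": "availability", "project": "shop",
     "error_budget_status": "healthy", "review_status": "Reviewed"},
    {"name": "throughput", "project": "billing",
     "error_budget_status": "exhausted", "review_status": "Overdue"},
]

filters = {
    "project": {"shop"},
    "review_status": {"Overdue", "To review"},
    "error_budget_status": set(),  # nothing selected, so not applied
}

selected = [slo["name"] for slo in slos if matches(slo, filters)]
print(selected)  # ['latency']
```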