SLO reviews
Enterprise

Service reviews in Nobl9 provide comprehensive SLO oversight and deliver actionable insights for continuous SLO improvement. Reviews help organizations keep their SLOs relevant, ensuring effective monitoring of service performance.

SLO governance and reviews are mutually connected: through automated data integrity tracking and service health analysis, teams can conduct systematic reviews to identify any inconsistencies in their SLOs and implement proactive service management strategies.

Review schedules

Review scheduling is available only in the Nobl9 Enterprise Edition.

Reviews in a nutshell

  • Set a review at the service level
    • Reviews are scheduled for services and applied to all SLOs they contain.
    • Moving SLOs to services with a review schedule applies the review settings to the moved SLOs.
    • Nobl9 tracks the review progress and provides details on review debt, frequency, and statuses.
  • Configure repeated reviews
    • Set the required cycle frequency, which defines the review due dates.
    • You can configure the Custom cycle frequency using a recurrence rule (RRULE) in the iCalendar format.
  • Monitor the review debt
  • Stay aware of review details
    • Review statuses, due dates, and other review schedule settings are displayed in the SLO and service lists.
Recommendations for effective reviews
  • Schedule reviews at least for your essential services.
  • Assign users responsible for these services; they can act as a point of contact if any issues with a service appear.
  • Keep review deadlines short and tailored to your business objectives.
  • Ensure SLOs with the To review status are reviewed before their due date.

Scheduling a review

To schedule a review, open the service wizard. Select Schedule review and configure the schedule start, time zone, and repeat options:

Scheduling a review

The review deadline depends on the Repeat settings: the due date matches the start date of the next cycle. For example, if the schedule starts on October 17 and repeats weekly, the due dates will be as follows:

| 1st due date | 2nd due date | 3rd due date | 4th due date |
|---|---|---|---|
| October 17 | October 24 | October 31 | November 7 |
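The weekly cadence above can be sketched in a few lines of Python; the year is an assumption added for illustration:

```python
from datetime import date, timedelta

# The schedule starts on October 17 and repeats weekly: each due
# date is the start of the next cycle (dates from the example above).
start = date(2025, 10, 17)  # assumed year, for illustration
due_dates = [start + timedelta(weeks=n) for n in range(4)]
print([d.isoformat() for d in due_dates])
# → ['2025-10-17', '2025-10-24', '2025-10-31', '2025-11-07']
```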
Custom repetition schedule

To customize the frequency of repetition, enter the recurrence rule in the iCalendar format (RRULE).

Need help? Use the rrule tool to generate your rule.
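For example, a hypothetical rule scheduling a review on the first Monday of every month could look like this (an illustrative RRULE, not a Nobl9 default):

```
RRULE:FREQ=MONTHLY;BYDAY=1MO
```

Here FREQ sets the repetition unit, and BYDAY=1MO selects the first Monday of each month.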

Once the review is scheduled, the To review status is applied to all SLOs in the service.

Reviewing SLOs

When reviewing an SLO, you may want to revisit its configuration and recent performance.

SLO guides

Refer to the SLO guides, where we explain budget-critical settings in detail.

Make the necessary fixes. Once done, set the SLO's status to Reviewed.

Tools for SLO reviews

| Tool | Description |
|---|---|
| SLI Analyzer | Pinpoint reasonable targets, assess incoming data, and tailor the time window and budgeting method for your SLO |
| Replay | Retrieve historical data for an SLO after you correct its configuration |
| Budget adjustments | Exclude events that might skew SLO performance data if these events don't reflect your service's actual reliability |
| Query checker | Validate that your SLOs for Datadog, Dynatrace, and New Relic are working correctly |
| Data source event logs | Diagnostic logs from an SLO's data source, available for data sources connected using the direct method |
| Agent metrics | Verify that an SLO's data source is collecting data correctly; available for data sources connected using the agent method |

On each review status change, Nobl9 automatically generates an SLO annotation of the Review note type. These annotations fall into the User category, and you can edit them.

As a best practice, we recommend adding a note after every transition in review status.
SLO annotations support Markdown.

If a review is not needed for an individual SLO, you can skip it. The status of such an SLO becomes Skipped, and it doesn't count toward the number of overdue SLOs.

An SLO with the To review status that hasn't been reviewed before the due date becomes Overdue. Overdue SLOs affect SLO quality. Their status doesn't change with a new review cycle or after editing the review schedule. To address this, review such an SLO and transition it to Reviewed.

When no review is scheduled for the service, all its SLOs have the Not started review status. You can still review such SLOs individually and transition them to Reviewed.

Status transitions

An SLO can have one of five review statuses: Not started, To review, Reviewed, Skipped, and Overdue.

The initial status is Not started, indicating that no review has been scheduled for the SLO's service and that the SLO hasn't been reviewed yet.

Once a review is scheduled, on its first due date, all Not started SLOs automatically become To review.

Status transitions in a nutshell

An SLO's review status can change in two ways: manually or automatically based on the review schedule.

Manual status transitions depend on two factors: whether a review is scheduled for the SLO's service and what its current status is.

The following table lists the allowed transitions between statuses, depending on whether a review is scheduled. Rolling back is available for all transitions listed.

Table: Allowed manual status transitions based on review schedule

| Is review scheduled? | From | To |
|---|---|---|
| No | Not started / notStarted | Reviewed / reviewed |
| Yes | To review / toReview | Reviewed / reviewed |
| Yes | To review / toReview | Skipped / skipped |
| Yes | Skipped / skipped | Reviewed / reviewed |
| Yes | Overdue / overdue | Reviewed / reviewed |
| Yes | Overdue / overdue | Skipped / skipped |
Manual status transition based on review schedule and status
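The allowed manual transitions can be modeled as a small lookup table. This is an illustrative sketch using the API-style status names from the table above, not Nobl9 code:

```python
# Allowed manual review-status transitions, keyed by whether a
# review is scheduled and the SLO's current status (sketch only).
ALLOWED = {
    (False, "notStarted"): {"reviewed"},
    (True, "toReview"): {"reviewed", "skipped"},
    (True, "skipped"): {"reviewed"},
    (True, "overdue"): {"reviewed", "skipped"},
}

def can_transition(scheduled: bool, current: str, target: str) -> bool:
    """Return True if the manual transition is listed as allowed."""
    return target in ALLOWED.get((scheduled, current), set())

print(can_transition(True, "toReview", "skipped"))    # True
print(can_transition(False, "notStarted", "skipped")) # False
```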

The tables below describe basic status transitions and scenarios where no automatic status change occurs.

Table: Possible review status transitions

| Status before | Scenario | Status after | Transition type |
|---|---|---|---|
| Not started | Add a review schedule; start date is now or earlier | To review | Auto |
| Not started | Review an SLO without the schedule | Reviewed | Manual |
| To review, Skipped, Overdue | Review an SLO | Reviewed | Manual |
| To review, Overdue | Skip an SLO review | Skipped | Manual |
| To review | Move an SLO to a service without a review schedule | Not started | Auto |
| To review | New review cycle starts; SLO hasn't been reviewed | Overdue | Auto |
| Skipped | New review cycle starts | To review | Auto |
| Reviewed | New review cycle starts | To review | Auto |

Table: Scenarios not resulting in status transition

| Scenario | Status |
|---|---|
| Set a review schedule whose start date is in the future | The Not started status doesn't change until the start date |
| Move an SLO to a service with a review due date in the future, or defer a review due date | All review statuses are retained until the start date |
| Discard a review schedule, or move an SLO to a service without a scheduled review | The Reviewed status doesn't change |
| A new review cycle starts, or the review schedule is modified | The Overdue status doesn't change |

Recommendations for an SLO review​

Based on industry-accepted SRE best practices, we recommend considering the following key areas when performing an SLO review:

  1. SLO performance and error budget analysis

    • Was the SLO met?
      This is the most fundamental check. Was the service reliable enough over the review period?
    • How much of the error budget was consumed?
      • High consumption
        If you're close to exhausting the budget, why? Was it due to a single major incident or an accumulation of many minor issues? Analyze the consumption trend to determine where to focus engineering efforts.
      • Low/No consumption
        If you're consistently consuming very little of your error budget, is the SLO too loose? An overly loose SLO fails to provide a useful signal and can obscure minor issues. It might be time to tighten the target.
    Hint: Check for Constant or No burn data anomalies detected for this SLO.
  2. SLO definition and relevance

    • Is this still the right SLO?
      Does it accurately reflect the user experience? For example, if you're measuring latency, is it still the most critical indicator of user satisfaction, or has another metric (like success rate) become more important?
    • Is the SLO target still appropriate?
      Is it still meaningful for the business and users? Has the service's criticality changed? A more critical service may require a stricter SLO, while a less critical one may not justify the effort needed to maintain a high target.
    • Does the measurement time window make sense?
      Is the current period still the right one for assessing performance?
    Hint: Use the SLI Analyzer to reevaluate the target, values, and the time window.
  3. Alerts and incident response

    • Did the SLO alerts fire when expected?
      If the error budget was significantly impacted, did you receive timely alerts? If not, consider tuning the alert policies applied to the SLO.
    • Were there any near-misses?
      Were there periods of high error budget burn that didn't trigger an alert but should have?
    • What was the outcome of incidents related to this SLO?
      Review the post-mortems for any incidents that consumed error budget. The SLO review is the perfect time to ensure that action items from these post-mortems are being implemented.
  4. User and business context

    • Is there a correlation between error budget burn and user-facing issues?
      Check if periods of high error budget burn correlate with an increase in customer support tickets, negative feedback, or other business metrics. If there's no correlation, your SLO might be reflecting the wrong thing.
    • What is the business impact of the current level of reliability?
      Is the business happy with the service's performance? The answer to this question is the ultimate test of your SLO's efficacy.
    Hint: Dive into SLO configuration details with our SLO guides section.
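As a back-of-the-envelope illustration of the error-budget questions in point 1, the consumed fraction of a budget can be computed like this; the target, window, and observed downtime are assumed numbers, not values from Nobl9:

```python
# Toy error-budget calculation with assumed numbers: a 99.9%
# target over a 30-day window leaves 0.1% of the time as budget.
target = 0.999
window_minutes = 30 * 24 * 60                   # 43,200 minutes
budget_minutes = (1 - target) * window_minutes  # 43.2 minutes
bad_minutes = 30.0                              # observed unreliable minutes
consumed = bad_minutes / budget_minutes
print(f"{consumed:.1%} of the error budget consumed")
# → 69.4% of the error budget consumed
```

If the consumed fraction stays near zero cycle after cycle, that is the "low/no consumption" signal above: the target may be too loose to be informative.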

An SLO review is a powerful, proactive process. It helps ensure that your SLOs are not just numbers, but are actively driving engineering decisions, prioritizing work, and ensuring your service meets user expectations.

Check out these related guides and references: