Service Health Dashboard

Service Health Dashboard (or Dashboard) provides a high-level overview of the reliability of your organization’s services. The Dashboard targets product managers or executives who do not require a granular view of each SLO and instead are looking for a holistic view of reliability within their organization. Engineers or SREs can also use it to see a snapshot of the current state of their environment and drill down for more information.

The Dashboard gives an aggregated view of the overall organizational health. It shows which Services are at risk or have burned through their error budget. All Services are grouped by Projects in the Dashboard.

Users can check the following things using the Dashboard:

  • Which Services are healthy so that the maintenance of other services can be prioritized.

  • Which Services are trending in the wrong direction so that the relevant engineering teams can step in to pre-empt a potential failure.

  • Which Services have burnt their error budget so that a relevant engineering team can fix them.

Monitoring Health of Services

The Dashboard displays Services according to their health, using the following color coding and naming:

  • Healthy: All SLOs in this Service have more than 20% of the error budget still available.
  • At risk: All SLOs in this Service still have a remaining error budget, and at least one SLO for this Service has less than 20% of the error budget left. For example, Service X displays an ‘At Risk’ status because SLO C and SLO D are under 20%:

    Service X    Remaining error budget
    SLO A        84%
    SLO B        45%
    SLO C         9%
    SLO D        19%
    SLO E        95%
  • Exhausted: At least one of the SLOs in this Service has burnt its entire error budget in the current time window.

    For example, Service Y displays an ‘Exhausted’ status because SLO B has already burnt its error budget in a specified time window:

    Service Y    Remaining error budget
    SLO A        84%
    SLO B       -55%
    SLO C         9%
    SLO D        19%
    SLO E        95%
  • No Data: There is no data available for the Service’s SLOs, or the error budget hasn’t been calculated yet.
note

Each Service is evaluated based on the current time window. This way, the Dashboard gives a high-level view of all Services in the user’s organization, even if the error budgets for each SLO are calculated differently (e.g., based on time slices or occurrences).
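
The status rules above can be sketched in a few lines of Python. This is an illustrative sketch, not Nobl9's implementation: the function name `service_status` and the fractional input format are hypothetical, and the boundary cases are assumptions (exactly 20% counts as Healthy, 0% or less counts as Exhausted, and SLOs without data are ignored when other SLOs in the Service do have data).

```python
from typing import Optional, Sequence

# 20% of the error budget, per the rules above.
AT_RISK_THRESHOLD = 0.20


def service_status(remaining_budgets: Sequence[Optional[float]]) -> str:
    """Classify a Service from its SLOs' remaining error budgets.

    Each value is a fraction (0.84 means 84% left); None means no budget
    has been calculated yet for that SLO.
    """
    known = [b for b in remaining_budgets if b is not None]
    if not known:
        return "No Data"
    # Assumption: a budget of 0% or less counts as burnt.
    if any(b <= 0 for b in known):
        return "Exhausted"
    # Assumption: exactly 20% is still treated as Healthy.
    if any(b < AT_RISK_THRESHOLD for b in known):
        return "At risk"
    return "Healthy"


# The two example Services from the tables above.
service_x = [0.84, 0.45, 0.09, 0.19, 0.95]
service_y = [0.84, -0.55, 0.09, 0.19, 0.95]
print(service_status(service_x))
print(service_status(service_y))
```

The checks run from most to least severe, so a Service with one burnt SLO is reported as Exhausted even if other SLOs are merely below the 20% threshold.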

Changing the View Mode

Users can adjust how the Services are displayed on the Dashboard and choose the view mode that suits their needs best. To change the display mode:

  1. Go to the ‘View’ drop-down list in the top pane of your Dashboard.

  2. Click the drop-down list.

  3. Choose between the available view options:

  • View-circle:

    Image 1: Services in View-circle mode
  • View-hexagon:

    Image 2: Services in View-hexagon mode
  • View-circle + icons view: We recommend this accessibility feature for users who are color-blind:

    Image 3: Services in View-circle and icons view

    In this view, a separate icon marks each Service state: healthy, at risk, and exhausted.

Accessing Service Details

The Dashboard enables a more detailed view as well. Clicking a specific Service on the Dashboard shows a summary of the SLOs with their remaining budgets. To access Service details:

  1. Go to the Dashboard in the main navigation pane.

  2. Click the relevant Service.

  3. A pop-up window will appear, showing a summary of the SLOs with the remaining error budget.

tip

You can see the details of the specific SLOs by clicking them on the pop-up list.

Video 1: Accessing SLO details from the Dashboard

Filtering Dashboard

You can filter the data displayed on the Service Health Dashboard to see only those services that are healthy, at risk, or exhausted. To apply filters, hover over the filter icon at the top of the screen and choose the relevant category.

Image 4: Applying filters on Service Health Dashboard

You can also filter the Service Health Dashboard using labels. This method enables filtering the Dashboard based on labels added to services, projects, and SLOs. As such, you can easily customize the set of Services displayed in the Dashboard.

tip

The applied filters are persisted in the Dashboard URL, so you can use the URL as a deep link to a selected filtered view.

Sorting Dashboard

The Dashboard also allows users to sort the display. This feature affects how the Projects are ordered on the Dashboard. By default, Projects are sorted by State. This display mode follows these rules (from the left-hand side to the right-hand side of the UI):

  • Projects with the highest number of Exhausted Services are displayed first.

  • They are followed by the Projects that have the highest number of At risk Services.

  • Projects in which all Services are Healthy are displayed last.

For example:

Image 5: Example of Projects sorted by State

The same rules apply to how the individual Services are ordered in a Project (from top to bottom):

  • Exhausted Services are displayed first.

  • They are followed by At risk Services.

  • Healthy Services are displayed last.

For example:

Image 6: Example of Services ordered in a Project

This way, users can quickly identify the Projects and Services that require immediate attention.
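
The State ordering described above can be approximated as a two-level sort. The following is a minimal sketch under stated assumptions, not the Dashboard's actual code: the function `sort_by_state` and the input shape are hypothetical, ties are broken by name, and No Data Services are placed last, which the rules above do not specify.

```python
from collections import Counter

# Lower rank sorts first within a Project.
# Assumption: "No Data" goes last; the documented rules do not cover it.
STATE_RANK = {"Exhausted": 0, "At risk": 1, "Healthy": 2, "No Data": 3}


def sort_by_state(projects: dict[str, dict[str, str]]) -> list[tuple[str, list[str]]]:
    """Order Projects, and the Services inside each Project, by state.

    `projects` maps a Project name to a {service name: status} mapping.
    """
    def project_key(item):
        name, services = item
        counts = Counter(services.values())
        # More Exhausted Services first, then more At risk Services.
        # Assumption: the Project name breaks ties.
        return (-counts["Exhausted"], -counts["At risk"], name)

    ordered = []
    for name, services in sorted(projects.items(), key=project_key):
        service_names = sorted(services, key=lambda s: (STATE_RANK[services[s]], s))
        ordered.append((name, service_names))
    return ordered


example = {
    "alpha": {"api": "Healthy", "db": "Healthy"},
    "beta": {"web": "At risk", "auth": "Exhausted"},
    "gamma": {"cache": "At risk"},
}
for project, services in sort_by_state(example):
    print(project, services)
```

In this example, `beta` comes first because it contains an Exhausted Service, `gamma` follows with one At risk Service, and the fully Healthy `alpha` is last.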

Projects can also be sorted alphabetically:

  • In an ascending (A-Z) order or

  • In a descending (Z-A) order.

This display mode affects only the order of Projects on the Dashboard. It does not affect the logic of displaying Services in a Project. This way, users can still readily see at the top of each Project those Services that are Exhausted or At risk.

Changing the Sorting Mode

To change the sorting mode:

  1. Click the ‘Sort by’ drop-down menu in the top-right corner of the Dashboard.

  2. Choose one of the following options:

  • Projects A-Z: sorts the Dashboard view by Project names in ascending order.

  • Projects Z-A: sorts the Dashboard view by Project names in descending order.

  • State: sorts the Dashboard view by the state of Services in each Project. Projects with the highest number of exhausted Services or Services at risk of burning their error budget are displayed first.

Remaining Error Budget

By clicking on a Service icon on the Dashboard, you can see the values of the Error Budget Remaining for all SLOs within that Service. Depending on the type of SLO (standard/composite), there's a difference in the logic of the displayed value:

  • For standard SLOs, Nobl9 displays the remaining error budget of the most burned objective.

  • For composite SLOs, Nobl9 displays the remaining budget of the composite SLO. Check the CompositeSLOs Guide for more details about error budget calculations for composite SLOs.
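
The selection of the displayed value can be sketched as follows. This is a hedged illustration only: the function `displayed_remaining_budget` and its parameters are hypothetical, and "most burned objective" is read here as the objective with the lowest remaining budget. How a composite SLO's own budget is calculated is outside the scope of this sketch.

```python
from typing import Mapping, Optional


def displayed_remaining_budget(
    objective_budgets: Mapping[str, float],
    composite_budget: Optional[float] = None,
) -> float:
    """Return the remaining error budget shown for one SLO.

    Budgets are fractions (0.45 means 45% left). Pass `composite_budget`
    only for a composite SLO (hypothetical parameter for this sketch).
    """
    if composite_budget is not None:
        # Composite SLO: its own remaining budget is shown as-is.
        return composite_budget
    # Standard SLO: the most burned objective has the least budget left.
    return min(objective_budgets.values())


# A standard SLO with two hypothetical objectives.
print(displayed_remaining_budget({"objective-1": 0.45, "objective-2": 0.09}))
```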

Searching for a Project

If you are searching for a specific Project, click the ‘Search’ box in the top-right corner of the Dashboard and enter the name of the Project. The screen will update once you enter the Project name.