Service Health Dashboard

The Service Health Dashboard (or Dashboard) provides a high-level overview of the reliability of your organization’s services. It is aimed at product managers and executives who do not need a granular view of each SLO and want a holistic view of reliability across their organization. Engineers and SREs can also use it to see a snapshot of the current state of their environment and drill down for more information.

The Dashboard gives an aggregated view of the overall organizational health. It shows which Services are at risk or have burned through their error budget. All Services are grouped by Projects in the Dashboard.

Users can use the Dashboard to check:

  • Which Services are healthy, so that the maintenance of other Services can be prioritized.

  • Which Services are trending in the wrong direction, so that the relevant engineering teams can step in to pre-empt a potential failure.

  • Which Services have burnt their error budget, so that the relevant engineering team can fix them.

Monitoring Health of Services

The Dashboard displays Services by health, using the following color and naming scheme:

  • Healthy: All SLOs in this Service have more than 20% of the error budget still available.
  • At risk: All SLOs in this Service still have a remaining error budget, and at least one SLO for this Service has less than 20% of the error budget left. For example, Service X displays an ‘At Risk’ status because SLO C and SLO D are under 20%:
    Service X    Remaining error budget
    SLO A        84%
    SLO B        45%
    SLO C        9%
    SLO D        19%
    SLO E        95%
  • Exhausted: At least one of the SLOs in this Service has burnt its error budget in the current time window.

    For example, Service Y displays an ‘Exhausted’ status because SLO B has already burnt its error budget in a specified time window:

    Service Y    Remaining error budget
    SLO A        84%
    SLO B        -55%
    SLO C        9%
    SLO D        19%
    SLO E        95%
  • No Data: There is no data available for the Service’s SLOs, or the error budget hasn’t been calculated yet.
note

Each Service is evaluated based on the current time window. This way, the Dashboard gives a high-level view of all Services in the user’s organization, even if the error budgets for each SLO are calculated differently (e.g., based on time slices or occurrences).
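The status rules above can be sketched in a few lines of Python. This is only an illustration, not the Dashboard's actual implementation: the function name `service_health` and the constant `AT_RISK_THRESHOLD` are hypothetical, and the handling of boundary values (exactly 0% or exactly 20%) and of SLOs without a calculated budget is an assumption, since the descriptions above do not specify it.

```python
from typing import Optional, Sequence

AT_RISK_THRESHOLD = 20.0  # percent, taken from the status definitions above


def service_health(remaining_budgets: Sequence[Optional[float]]) -> str:
    """Classify a Service from its SLOs' remaining error budgets (in percent)."""
    # Assumption: SLOs whose budget has not been calculated yet (None) are ignored.
    known = [b for b in remaining_budgets if b is not None]
    if not known:
        return "No Data"
    # Assumption: a remaining budget of exactly 0% counts as burnt.
    if any(b <= 0 for b in known):
        return "Exhausted"
    # Assumption: exactly 20% counts as At risk; the text leaves this boundary open.
    if any(b <= AT_RISK_THRESHOLD for b in known):
        return "At risk"
    return "Healthy"


service_x = [84, 45, 9, 19, 95]   # values from the Service X example
service_y = [84, -55, 9, 19, 95]  # values from the Service Y example
print(service_health(service_x))  # At risk
print(service_health(service_y))  # Exhausted
```

The Exhausted check runs first because a single burnt SLO determines the status regardless of how much budget the other SLOs have left.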

Changing the View Mode

Users can adjust how the Services are displayed on the Dashboard and choose the view mode that suits their needs best. To change the display mode:

  1. Go to the ‘View’ drop-down list in the top pane of your Dashboard.

  2. Click the drop-down list.

  3. Choose between the available view options:

  • View-circle: Services are displayed as circles.

  • View-hexagon: Services are displayed as hexagons.

  • View-circle + icons: Services are displayed as circles with a status icon. We recommend this accessibility feature for users who are color-blind. A distinct icon marks each state: healthy, at risk, and exhausted.

Accessing Service Details

The Dashboard enables a more detailed view as well. Clicking a specific Service on the Dashboard shows a summary of the SLOs with their remaining budgets. To access Service details:

  1. Go to the Dashboard in the main navigation pane.

  2. Click the relevant Service.

  3. A pop-up window will appear, showing a summary of the SLOs with the remaining error budget.

tip

You can see the details of the specific SLOs by clicking them on the pop-up list.

Filtering Dashboard

Users can filter the data displayed on the Service Health Dashboard to see only those services that are healthy, at risk, or exhausted. To apply filters, hover over the filter icon at the top of the screen and choose the relevant category.

tip

Filters can be combined.
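As a rough illustration of combined filters, the sketch below keeps every Service whose status matches any of the selected categories. The function name `filter_services` and the sample Service names are hypothetical, and treating combined filters as a union (and an empty selection as "no filter") is an assumption about the behavior.

```python
from typing import Dict, Set


def filter_services(statuses: Dict[str, str], selected: Set[str]) -> Dict[str, str]:
    """Keep only the Services whose status is in the selected categories."""
    # Assumption: an empty selection means that no filter is applied.
    if not selected:
        return dict(statuses)
    # Assumption: combined filters show Services matching ANY selected category.
    return {name: status for name, status in statuses.items() if status in selected}


statuses = {"checkout": "Exhausted", "search": "Healthy", "login": "At risk"}
print(filter_services(statuses, {"Exhausted", "At risk"}))
# {'checkout': 'Exhausted', 'login': 'At risk'}
```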

Sorting Dashboard

The Dashboard also allows users to sort the display. This feature affects how the Projects are ordered on the Dashboard. By default, Projects are sorted by State. This display mode follows these rules (from the left-hand side to the right-hand side of the UI):

  • Projects with the highest number of Exhausted Services are displayed first.

  • They are followed by the Projects that have the highest number of At risk Services.

  • Projects in which all Services are Healthy are displayed last.

The same rules apply to how the individual Services are ordered in a Project (from top to bottom):

  • Exhausted Services are displayed first.

  • They are followed by At risk Services.

  • Healthy Services are displayed last.

This way, users can quickly identify the Projects and Services that require immediate attention.
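The State ordering can be sketched as a two-level sort. The example below is a hypothetical model, not the Dashboard's code: `sort_by_state`, `STATE_RANK`, and the sample Project and Service names are made up for illustration. Placing No Data Services last and leaving tied Projects in their original order are assumptions.

```python
from typing import Dict, List, Tuple

# Assumption: No Data Services come after Healthy ones; the text does not say.
STATE_RANK = {"Exhausted": 0, "At risk": 1, "Healthy": 2, "No Data": 3}


def sort_by_state(projects: Dict[str, Dict[str, str]]) -> List[Tuple[str, List[str]]]:
    """Order Projects by Exhausted, then At risk counts; order Services by state."""

    def project_key(item: Tuple[str, Dict[str, str]]) -> Tuple[int, int]:
        states = list(item[1].values())
        # Negative counts so that Projects with more problem Services come first.
        return (-states.count("Exhausted"), -states.count("At risk"))

    # sorted() is stable, so tied Projects keep their original relative order.
    ordered = sorted(projects.items(), key=project_key)
    return [
        (name, sorted(services, key=lambda s: STATE_RANK[services[s]]))
        for name, services in ordered
    ]


projects = {
    "web": {"cdn": "Healthy", "frontend": "At risk"},
    "payments": {"ledger": "Healthy", "checkout": "Exhausted", "fraud": "At risk"},
    "internal": {"wiki": "Healthy"},
}
for name, services in sort_by_state(projects):
    print(name, services)
# payments ['checkout', 'fraud', 'ledger']
# web ['frontend', 'cdn']
# internal ['wiki']
```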

Projects can also be sorted alphabetically:

  • In an ascending (A-Z) order or

  • In a descending (Z-A) order.

This display mode affects only the order of Projects on the Dashboard. It does not affect the logic of displaying Services in a Project, which are still ordered by state. This way, users can readily see the Services that are Exhausted or At risk within each Project.

Changing the Sorting Mode

To change the sorting mode:

  1. Click the ‘Sort by’ drop-down menu in the top-right corner of the Dashboard.

  2. Choose one of the following options:

  • Projects A-Z: sorts the Dashboard view by Project names in ascending order.

  • Projects Z-A: sorts the Dashboard view by Project names in descending order.

  • State: sorts the Dashboard view by the state of Services in each Project. Projects with the highest number of exhausted Services or Services at risk of burning their error budget are displayed first.
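To make the difference between the three options concrete, here is a small sketch that orders Project names for each mode. The function `sort_projects` and its input shape (Project name mapped to a pair of Exhausted and At risk counts) are hypothetical, and case-insensitive name comparison is an assumption.

```python
from typing import Dict, List, Tuple


def sort_projects(counts: Dict[str, Tuple[int, int]], mode: str) -> List[str]:
    """Order Project names; counts maps a name to (exhausted, at_risk) totals."""
    names = list(counts)
    if mode == "Projects A-Z":
        # Assumption: names are compared case-insensitively.
        return sorted(names, key=str.casefold)
    if mode == "Projects Z-A":
        return sorted(names, key=str.casefold, reverse=True)
    if mode == "State":
        # Most Exhausted Services first, then most At risk Services.
        return sorted(names, key=lambda n: (-counts[n][0], -counts[n][1]))
    raise ValueError(f"Unknown sorting mode: {mode}")


counts = {"web": (0, 1), "Payments": (1, 1), "internal": (0, 0)}
print(sort_projects(counts, "Projects A-Z"))  # ['internal', 'Payments', 'web']
print(sort_projects(counts, "Projects Z-A"))  # ['web', 'Payments', 'internal']
print(sort_projects(counts, "State"))         # ['Payments', 'web', 'internal']
```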

Searching for a Project

If you are looking for a specific Project, click the ‘Search’ box in the top-right corner of the Dashboard and enter the name of the Project. The screen updates once you enter the Project name.