Replay restrictions

Limitations and RBAC

Two slots are allocated for every organization. This means you can run two Replays at a time.
Track job progress in the Job Status widget.
To replay SLOs, queue Replays, remove them from the queue, and cancel data import, your role must be Organization admin, Project owner, or Project editor.
Learn more about RBAC.
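
The slot limit and role requirement above can be pictured with a minimal sketch. The `ReplayScheduler` class, its method names, and the first-in-first-out queue order are hypothetical and only illustrate the rules stated here. They are not the product's implementation or API.

```python
from collections import deque

MAX_SLOTS = 2  # from the text: two Replay slots per organization
ALLOWED_ROLES = {"Organization admin", "Project owner", "Project editor"}


class ReplayScheduler:
    """Hypothetical model of Replay slots for a single organization."""

    def __init__(self):
        self.running = []      # SLOs currently being replayed
        self.queued = deque()  # SLOs waiting for a free slot (FIFO is an assumption)

    def submit(self, slo_name, role):
        # Only the three roles named in the text may queue Replays.
        if role not in ALLOWED_ROLES:
            raise PermissionError(f"role {role!r} cannot run Replay")
        if len(self.running) < MAX_SLOTS:
            self.running.append(slo_name)
            return "running"
        self.queued.append(slo_name)
        return "queued"

    def finish(self, slo_name):
        # Free the slot and start the next queued Replay, if any.
        self.running.remove(slo_name)
        if self.queued:
            self.running.append(self.queued.popleft())
```

In this model, a third Replay submitted while two are running waits in the queue and starts once one of the running Replays finishes.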

Data sources can alter data from the past

Metric-gathering systems usually downsample older data, either by applying aggregate functions such as mean or sum or by simply dropping data points. This saves space but can change the result of a query made against a time range in the past. Refer to the documentation of your data source for more details.
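
The effect of downsampling on a past query can be illustrated with a minimal sketch. The sample values, the threshold of 250, and the two-point buckets are hypothetical and do not reflect any particular data source.

```python
from statistics import mean


def downsample(points, bucket_size, aggregate):
    """Split points into consecutive buckets and aggregate each bucket."""
    return [
        aggregate(points[i:i + bucket_size])
        for i in range(0, len(points), bucket_size)
    ]


def good_ratio(values, threshold):
    """Share of points at or below the threshold."""
    return sum(v <= threshold for v in values) / len(values)


raw = [100, 230, 270, 220, 130, 280]  # hypothetical per-minute values
threshold = 250                       # hypothetical threshold

by_mean = downsample(raw, 2, mean)    # two-minute buckets, averaged

raw_ratio = good_ratio(raw, threshold)       # 4 of 6 raw points are good
mean_ratio = good_ratio(by_mean, threshold)  # all 3 averaged points are good
```

Averaging hides the two points above the threshold, so the same query returns a different ratio once the data is downsampled.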

Replaying a single SLO may take up to an hour depending on:

The length of the replayed period
The number of objectives in your SLO
The number of unique queries used in your SLO

Replay impact on connected resources

Running Replay for an existing SLO has important consequences for SLI data and reports, alerts, and composite SLOs.

Impact on SLI data and reports

Live data is gathered while Replay is in progress but isn't considered in calculating the SLO's error budget until the process is complete.
Replay queries your data source once again for the entire selected historical period, even if data for part of this period is already collected by your SLO.
It can completely replace existing SLI data.
Data resolution can be lowered due to the downsampling of historical data. It depends on the data source.
As a result, the SLI chart can look different upon replaying, even when the query remains the same.
Replay won't always fill in missing data points. If there are gaps in data, Replay instead marks these gaps as shown in the examples below.
This happens when the data source doesn't retain data for the whole period you're trying to retrieve, for example, because of its data retention policy.
The Maximum Period for Historical Data Retrieval setting prevents exceeding the data source's retention period.

An example of missing SLI data before and after replaying:

Original input SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
2023-01-01 01:22:00 = 270
2023-01-01 01:24:00 = 220
2023-01-01 01:25:00 = 130
2023-01-01 01:26:00 = 280
2023-01-01 01:27:00 = 200

Retrieved SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
[...] # Gap in the data stream
2023-01-01 01:28:00 = 90
2023-01-01 01:29:00 = 220
2023-01-01 01:30:00 = 270
2023-01-01 01:31:00 = 190

Replayed SLI data
2023-01-01 01:20:00 = 100
2023-01-01 01:21:00 = 230
[...] # Gap in the data stream
2023-01-01 01:28:00 = 90
2023-01-01 01:29:00 = 220
2023-01-01 01:30:00 = 270
2023-01-01 01:31:00 = 190
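
To spot such gaps in your own exported data, a check like the following may help. It is a minimal sketch that assumes one data point per minute. The `find_gaps` helper is hypothetical and unrelated to how Replay itself marks gaps.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(minutes=1)):
    """Return (first_missing, last_missing) pairs for each gap wider than step."""
    gaps = []
    for previous, current in zip(timestamps, timestamps[1:]):
        if current - previous > step:
            gaps.append((previous + step, current - step))
    return gaps


# Timestamps from the "Replayed SLI data" example above.
replayed = [
    datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    for value in (
        "2023-01-01 01:20:00",
        "2023-01-01 01:21:00",
        "2023-01-01 01:28:00",
        "2023-01-01 01:29:00",
        "2023-01-01 01:30:00",
        "2023-01-01 01:31:00",
    )
]

gaps = find_gaps(replayed)
for first_missing, last_missing in gaps:
    print(f"No data from {first_missing} to {last_missing}")
```

For the example data, the check reports a single gap covering 01:22:00 through 01:27:00.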

Reports depend on accurate SLI data to provide insights on service performance or assess overall system health. Since Replay overrides previously collected data, and this can alter data resolution or introduce gaps, the calculations and insights provided by reports might differ from previous iterations, impacting trend observation.

Impact on alerts

Alerting is suspended for the entire Replay duration and resumed once Replay is complete.
Once Replay is complete:
- You won't receive again the alerts you already received for the recalculated historical period.
- You receive missed alerts: the alerts triggered when Replay was running.
  These alerts are triggered based on the recalculated data.

Impact on composite SLOs

Currently, you can't replay a composite SLO, but only its components.

Replaying components of a composite causes no retroactive changes to the composite data. The replayed component stops reporting data until the process is complete. It is treated according to your maxDelay setting and, if Replay lasts longer than maxDelay, your whenDelayed setting. The overall composite error budget calculations depend on the duration of the Replay process, the component's maxDelay settings, and the existence of components without a delay in the composite.

Non-delayed components?	Replay vs. `maxDelay`	Result
Yes	Replay<`maxDelay`	The composite pauses for the duration of Replay. Component's data collected after replaying is considered in calculations as usual.
Yes	Replay>`maxDelay`	Component's data is considered in calculations according to `whenDelayed`. Data delayed for the time surplus (once `maxDelay` ends) is calculated as usual.
No	Any ratio	The composite pauses for the duration of Replay. Upon replaying, the component's data fills the no-data gap.
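
The table above can be read as a small decision function. The following is a minimal sketch of that reading. The `Outcome` names and the `composite_behavior` function are hypothetical, and the case where Replay lasts exactly as long as `maxDelay` is not covered by the table, so the sketch reports it as unspecified.

```python
from datetime import timedelta
from enum import Enum


class Outcome(Enum):
    # Composite pauses; component data collected after Replay is used as usual.
    PAUSE_THEN_RESUME = "pause, then calculate as usual"
    # Component data is handled per whenDelayed once maxDelay is exceeded.
    APPLY_WHEN_DELAYED = "apply whenDelayed"
    # Composite pauses; the component's data fills the no-data gap afterwards.
    PAUSE_THEN_FILL_GAP = "pause, then fill the no-data gap"
    # Not described in the table (Replay duration equals maxDelay).
    UNSPECIFIED = "not covered by the table"


def composite_behavior(has_non_delayed_components, replay_duration, max_delay):
    """Map the three table rows to an outcome."""
    if not has_non_delayed_components:
        return Outcome.PAUSE_THEN_FILL_GAP  # row 3: any ratio
    if replay_duration < max_delay:
        return Outcome.PAUSE_THEN_RESUME    # row 1: Replay < maxDelay
    if replay_duration > max_delay:
        return Outcome.APPLY_WHEN_DELAYED   # row 2: Replay > maxDelay
    return Outcome.UNSPECIFIED


result = composite_behavior(True, timedelta(minutes=90), timedelta(hours=1))
print(result.value)
```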

No support for composite SLOs 1.0

Turning an existing SLO into a composite 1.0, while this SLO is being replayed, results in the following:

Replay continues for the original objectives of this SLO
Historical data won't be considered in calculating the composite SLO 1.0 error budget. It's calculated without Replay, from the moment of creating the composite 1.0 objective.

Useful links

For a more in-depth look, consult additional resources:

Replay

Replaying SLOs

sloctl replay

Replaying SLOs from other SLOs
