Replay
Replay provides the ability to retrieve SLI data collected by your data source in the past.
Use Replay to recover missing or corrupted SLI data from your data source. You can also use it to backfill a newly created SLO with data from the period before the SLO existed, so you don't have to wait for data to accumulate over time.
So, if you have a backlog of SLI data from the past few days or weeks, you can retrieve it using Replay. This data is then used to calculate the remaining error budget for your SLO.
How far back in time Replay can retrieve SLI data depends on the maximum period for historical data retrieval initially allowed by your data source.
You might want to limit this maximum duration even further since Replay can consume significant resources. Set your preferred Maximum Period for Historical Data Retrieval for your data source integrated with Nobl9.
At the same step, you can also set a default period for all SLOs using this data source to replay them at creation. This default period can be overridden for individual SLOs.
Replay always retrieves SLI data up to the moment it is launched; you select only how far back the data is retrieved.
Every organization in Nobl9 is limited to two concurrent Replays.
Subsequent Replays are queued and processed in order as Replay slots open up.
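The slot limit and queueing described above can be pictured with a small model. This is only an illustrative sketch: the `ReplayScheduler` class and its methods are hypothetical names, not part of Nobl9 or sloctl, and it assumes a plain first-in, first-out queue.

```python
from collections import deque

MAX_CONCURRENT_REPLAYS = 2  # per-organization limit stated above


class ReplayScheduler:
    """Hypothetical model of Replay slots; not Nobl9 code."""

    def __init__(self, limit=MAX_CONCURRENT_REPLAYS):
        self.limit = limit
        self.running = []      # SLO names with an ongoing Replay
        self.queued = deque()  # SLO names waiting for a free slot, oldest first

    def submit(self, slo):
        """Start the Replay if a slot is free; otherwise queue it."""
        if len(self.running) < self.limit:
            self.running.append(slo)
        else:
            self.queued.append(slo)

    def finish(self, slo):
        """Release a slot and start the oldest queued Replay, if any."""
        self.running.remove(slo)
        if self.queued:
            self.running.append(self.queued.popleft())


scheduler = ReplayScheduler()
for name in ["slo-a", "slo-b", "slo-c", "slo-d"]:
    scheduler.submit(name)
print(scheduler.running, list(scheduler.queued))

scheduler.finish("slo-a")
print(scheduler.running, list(scheduler.queued))
```

In this model, the third and fourth Replays wait until one of the first two finishes, and the oldest queued Replay takes the freed slot.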
Replay at a glance
- With Replay, you can access historical data for new and existing SLOs.
- Replay fetches historical data while your SLO collects new data in real-time. The historical and current data are merged, producing an error budget calculated for the entire period.
- Data sources limit the data retrieval period. This period can be limited even more for your data source in Nobl9.
- Replay always retrieves data until now.¹
- You can run two Replays simultaneously; further Replays are queued.
- You can track Replays in the Job Status widget.
- You can remove queued Replays from the queue and cancel ongoing Replays in the Job Status widget and with `sloctl`.
- SLOs with ongoing Replays display no data in the tiles and charts over the retrieved period until Replay is complete. Queued Replays don't affect the SLO's tiles and charts.
- Cancelling Replay is available only at the data import step.
¹ "Now" means the moment Replay is triggered.
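Because the end of the retrieved period is always the trigger time, the Replay window is fully determined by how far back you choose to go. A minimal sketch of that arithmetic follows; `replay_window` is a hypothetical helper for illustration, and it assumes the period is expressed in whole days.

```python
from datetime import datetime, timedelta, timezone


def replay_window(triggered_at, days_back):
    """Return (start, end) of the period a Replay would cover.

    The end is fixed at the trigger time; only the start is selectable.
    """
    if days_back <= 0:
        raise ValueError("days_back must be a positive whole number")
    return triggered_at - timedelta(days=days_back), triggered_at


triggered = datetime(2024, 5, 15, 12, 0, tzinfo=timezone.utc)  # example timestamp
start, end = replay_window(triggered, days_back=14)
print(start.isoformat(), "->", end.isoformat())
```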
Scope of support
Supported data sources for Replay, their minimum required agent versions, and the maximum period for historical data retrieval:
Data source name | Nobl9 agent minimum version | Maximum period for historical data retrieval |
---|---|---|
Amazon CloudWatch | 0.65.0 | 15 days |
Amazon Prometheus | 0.65.0 | 30 days |
AppDynamics | 0.68.0 | 30 days |
Azure Monitor | 0.69.0-beta01 | 30 days |
Azure Monitor managed service for Prometheus | 0.78.0-beta | 30 days |
Coralogix | 0.65.0 | 30 days |
Datadog | 0.65.0 | 30 days |
Dynatrace | 0.66.0 | 28 days |
Elasticsearch | 0.88.0 | 30 days |
Google Cloud Monitoring | 0.80.0 | 30 days |
Graphite | 0.65.0 | 30 days |
LogicMonitor | 0.81.0-beta | 30 days |
New Relic | 0.65.0 | 30 days |
Prometheus | 0.65.0 | 30 days |
ServiceNow Cloud Observability | 0.65.0 | 30 days |
Splunk | 0.82.2 | 30 days |
ThousandEyes | 0.97.0-beta | 30 days |
Replay configuration
You can configure Replay for a data source in Nobl9 Web or with `sloctl`.
In the Data source wizard, define the Advanced settings > Historical data retrieval:
- The Maximum period for historical data retrieval value sets the per-integration limit on how far back Nobl9 can query data from your data source.
- The Default period for historical data retrieval value defines the period applied by default when replaying SLOs that use this data source. This value is suggested by default as the Period for historical data retrieval.
Requirements:
- Each value must be a non-negative whole number.
- Default period for historical data retrieval must not exceed the maximum period set for this data source. You can override this value when creating SLOs.
- Maximum period for historical data retrieval must not exceed the maximum period your data source initially allows.
- To enable Replay for a data source, set its Maximum period for historical data retrieval to a non-zero value.
In your data source's YAML definition for the agent or direct connection method, define the values for `historicalDataRetrieval`:
- `maxDuration` sets the limit on how far back Nobl9 can query data from your data source. The maximum period must be less than or equal to the period initially allowed by your data source.
- `defaultDuration` defines the period applied by default when replaying SLOs that use this data source.
```yaml
- apiVersion: n9/v1alpha
  # One of: Agent, Direct
  kind: Direct
  metadata:
    displayName: My data source with replay
    name: my-data-source
    project: my-project
  spec:
    # Your data source-specific fields
    # ...
    historicalDataRetrieval:
      defaultDuration:
        unit: Day
        value: 0
      maxDuration:
        unit: Day
        value: 30
    # Query parameters configuration: query interval, query delay, jitter, and timeout
    interval:
      unit: Minute
      value: 1
    queryDelay:
      unit: Minute
      value: 5
    jitter:
      unit: Second
      value: 45
    timeout:
      unit: Second
      value: 50
    # Event logs available for direct connections only
    logCollectionEnabled: true
    # One of: beta, stable. Use beta for Replay
    releaseChannel: beta
```
Run `sloctl apply` to proceed.
Learn more about YAML configuration for data sources connected using the agent or direct methods.
Requirements:
- Each value must be a non-negative whole number.
- The value for `defaultDuration` must not exceed the maximum period set for this data source.
- The `maxDuration` value must not exceed the maximum period your data source initially allows.
- To enable Replay for a data source, set its `maxDuration` to a non-zero value.
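The requirements above can be checked before you apply a definition. The sketch below is a hypothetical pre-flight check, not Nobl9's own validation: the function names are made up for illustration, and it assumes all three periods are expressed in days.

```python
def validate_replay_periods(default_days, max_days, source_limit_days):
    """Return a list of violated requirements (empty if the values are valid).

    source_limit_days is the maximum period the data source itself allows,
    for example 15 for Amazon CloudWatch or 28 for Dynatrace.
    """
    errors = []
    for name, value in (("defaultDuration", default_days), ("maxDuration", max_days)):
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            errors.append(f"{name} must be a non-negative whole number")
    if errors:
        return errors
    if default_days > max_days:
        errors.append("defaultDuration must not exceed maxDuration")
    if max_days > source_limit_days:
        errors.append("maxDuration must not exceed the data source limit")
    return errors


def replay_enabled(max_days):
    """Replay is enabled only when maxDuration is non-zero."""
    return max_days > 0


print(validate_replay_periods(7, 15, 15))
print(validate_replay_periods(20, 15, 15))
print(validate_replay_periods(0, 30, 28))
print(replay_enabled(0))
```

The first call passes, the second fails because the default period exceeds the configured maximum, and the third fails because the configured maximum exceeds what the data source allows.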
You can also set up replaying SLOs immediately on creation with the Nobl9 Terraform provider.
For this, specify the value for `retrieve_historical_data_from` in your SLO definition.