Skip to main content

Query delay

Reading time: 0 minute(s) (0 words)

The query delay (or query customization) feature functions as an offset to your queries. Instead of requesting data for the last minute, Nobl9 can ask your data source for a specific time interval based on the Query delay to ensure SLI data consistency

note

Take the following example:

  • Suppose it is 15:00 UTC: without a custom query delay; the Nobl9 agent would ask the data source for data from 14:59-15:00.
  • If you set the query delay to 10 minutes, the Nobl9 agent will instead ask the data source for data from 14:49-14:50.

Setting a longer query delay can be helpful when your data source may not have a complete data set during the standard Nobl9 agent workflow. Setting a query delay can help with all kinds of SLI problems, for example:

  • Reliability Burn Down chart indicating over 100%.
    This might happen in cases when the Nobl9 agent hasn't collected some of the data points due to a delay greater than one minute, and your ratio metric is collecting more good data points than the total:

    replay source config
    Image 1: Reliability Burndown Chart showing reliability >100%
  • Differences in raw data between Nobl9 and a data source.
    Sometimes the data in the observability platform may not be available. Specifying a longer Query delay allows you to pull data from a time further than one minute in the past.

note

You can also configure query delay via Agent's environment variables.

Configuring query delay

Configuring query delay in the UI

You can configure query delay in the Source Configuration Wizard in Integrations > Sources:

replay source config
Image 2: Configuring query delay in the UI

Enter the required value for the query delay and click the Save Data Source button to finish your configuration.

Configuring query delay via sloctl

You can configure query delay via sloctl by defining the spec[n].queryDelay.unit and spec[n].queryDelay.value fields in your YAML.

Refer to the specific Source's documentation for more details. The following are YAML samples containing the configuration of query delay for Datadog's agent and direct connections:

Here's a YAML snippet with a defined queryDelay for Datadog agent:

- apiVersion: n9/v1alpha
kind: Agent
metadata:
name: datadog-agent
project: default
spec:
datadog:
site: com
sourceOf:
- Metrics
releaseChannel: beta # string, one of: beta || stable
queryDelay:
unit: Minute # string, one of: Second || Minute
value: 720 # numeric, must be a number less than 1440 minutes (24 hours)
status:
agentType: Datadog
note

If you select a value greater than 24 hours (only Minutes and Seconds are allowed) sloctl will return the following error:

Error: agent.yaml: Key: 'Agent.spec.QueryDelayDuration' Error:Field validation for 'QueryDelayDuration' failed on the 'queryDelayDurationBiggerThanMaximumAllowed' tag
tip

We recommend selecting more sensible query delay values. For most data sources, query delay should be set to at most 15 minutes, but it can vary per specific use case.

Default values for query delay

The following table lists default values for query delay for all available data sources (for Agent and Direct connection methods). m stands for minute(s), s stands for second(s).

SourceDefault query delay
Amazon Redshift30s
Amazon CloudWatch1m
AppDynamics1m
Azure Monitor5m
BigQuery0s
Datadog1m
Dynatrace2m
Elasticsearch1m
Google Cloud Monitoring2m
GrafanaLoki1m
Graphite1m
InfluxDB1m
Instana1m
ServiceNow Cloud Observability2m
NewRelic1m
OpenTSDB1m
Pingdom1m
Prometheus 10s
Splunk5m
Splunk Observability5m
Sumo Logic4m
ThousandEyes1m
1 Applicable to Amazon Prometheus and Prometheus.

Impact of query delay on the SLI data

Editing query delay

caution

Editing query delay might lead to inconsistencies and periods of incorrect results in Nobl9.

Reducing the query delay from a higher value to a lower value may result in data gaps. For instance, if the query delay is currently set to 10 minutes, and you reduce it to 2 minutes, Nobl9 will miss 8 minutes of data, as the periods do not overlap.

Furthermore, increasing the query delay may lead to duplicated points. Ideally, data sources would respond with points with precisely identical timestamps, but the Nobl9 agent cannot guarantee this. What might happen is that those duplicated points could be dropped by Nobl9 agent (since this specific period was already queried).

The following SLI Graph illustrates data gaps caused by shortening the query delay from 4 to 15 minutes:

replay source config
Image 3: Data gap in the SLI Graph
caution

To ensure the health of your SLOs, do not change query delay too often.

As the best practice, we advise figuring out the correct value of query delay on some test data and then using this information to set a proper custom query delay per each Agent or Direct production instance.

Changes in query delay will affect all your SLOs attached to the data source where you've customized it.

Impact on alerting

caution

When you set query delay to a high value, your SLO alerting capabilities will also be delayed.

Since the query delay acts as a data offset, setting an extended query delay will affect Nobl9's alerting capabilities. After setting query delay, Nobl9 agent will retrieve data with a specific delay - and so live processing and alerting is based on not-so-live data.

For example, if you set the query delay to 10 minutes, Nobl9 can only generate alerts for that period - it can't predict the future, even though it is the "present state" of your data source.

Impact on Replay

In the Custom query delay setting, high query delay (in this context, over 15 minutes) values might lead to inconsistencies and short periods of incorrect results for Replay-based SLOs. It may happen that for narrow Replay time windows (in this context, depending on a data source, but less than 3 hours, for example), Replay could finish importing historical data faster than the query delay set in the Agent and, in return, overwrite a small portion of the data already gathered by Replay.

Query delay best practices

Since query delay takes place at the Agent or Direct level, when you change its value, it is applied to all SLOs on that data source.

In such cases, we recommend creating a new data source with the customized query delay to support only the SLOs requiring it (for example, those SLIs with long transaction times). We don't recommend changing the configuration of your existing data source and altering all existing SLOs by doing so.

You can quickly move an SLO to a new data source using sloctl. To achieve this, copy the YAML configuration from SLO Details > View YAML in the UI, modify the metricSource[n].name attribute in the YAML file, and then apply the new YAML using the sloctl apply command.

For more details on copying and moving SLOs, check the SLO troubleshooting Q&A.