Amazon CloudWatch
Amazon CloudWatch is a monitoring and observability service and a repository that aggregates data from more than 70 AWS data sources. CloudWatch also allows users to publish custom metrics from their services. Creating SLOs using this data is a powerful tool to monitor large portfolios of products.
Nobl9 integration with CloudWatch supports CloudWatch Metrics Insights. Leveraging Metrics Insights, Nobl9 users can retrieve metrics even faster and gain added flexibility in querying raw service level indicator (SLI) data to use for their SLOs.
Using CloudWatch as a Source in Nobl9, users can configure their SLOs by leveraging data in CloudWatch-specific groupings, that is, by region, namespace, and dimensions.
Scope of support
The following CloudWatch metric features are not supported:
- High-resolution metrics (for details, see Put Metric Data | Amazon CloudWatch documentation)
- Metrics that use more than one Unit
AWS cross-account observability
Nobl9 supports AWS cross-account observability for CloudWatch through the AWS Account ID parameter that you can enter in Step 2 of the SLO wizard.
The AWS Account ID is an optional parameter for CloudWatch (Direct or Agent connection methods). Use it if you'd like to access your SLO data from multiple accounts within a Region. It is the 12-digit identification number of your AWS account. Check AWS Documentation to learn more about the Account ID.
AWS cross-account observability is available for Configuration and JSON metric types. SQL and SQL within JSON metrics for CloudWatch do not support AWS cross-account observability.
Nobl9 accepts only a numeric form of an AWS account ID (AWS account alias isn't accepted).
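The Account ID rule above (12 digits, numeric only, no alias) is easy to check before a value reaches the SLO wizard or a YAML file. The following is a minimal sketch; `is_valid_account_id` is a hypothetical helper name and is not part of Nobl9 or AWS tooling.

```python
import re

# Hypothetical helper for illustration; not a Nobl9 or AWS API.
_ACCOUNT_ID = re.compile(r"[0-9]{12}")


def is_valid_account_id(value: str) -> bool:
    """Return True only for a 12-digit, purely numeric AWS account ID."""
    return _ACCOUNT_ID.fullmatch(value) is not None


print(is_valid_account_id("123456789012"))   # numeric, 12 digits
print(is_valid_account_id("my-team-alias"))  # account alias, rejected
```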
Authentication
Cross-Account IAM roles
You can activate cross-account access in AWS using the External ID and Nobl9 AWS Account ID. Copy these values from the Data source wizard. You need them to create an IAM role ARN with cross-account access.
You can also retrieve both values with the sloctl aws-iam-ids direct [direct-name] command, which returns the External ID and Nobl9 AWS Account ID for the specified direct.
IAM role ARN creation
Check Cross Account Resource Access in IAM | AWS documentation to learn more.
Sign in to the AWS Management Console and open the IAM console.
- Choose Roles on the navigation pane. The Roles section opens.
- Click Create Role. To create the access role, select a trusted entity first.
- Choose AWS account role.
- Choose Another AWS account. Paste the Nobl9 Account ID you copied in the Nobl9 Data source wizard. This is the account you're granting access to your resources.
- Select Require External ID. Paste the Nobl9 External ID you copied in the Nobl9 Data source wizard. This option automatically adds a condition to the trust policy, allowing users to assume the role only if the request includes the correct sts:ExternalId.
- Click Next.
- Attach the CloudWatchReadOnlyAccess permission for your account.
- Click Next and save the role. Then, copy its IAM Role ARN to the Data source wizard in the Nobl9 UI.
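For orientation, the trust policy that the console produces in the steps above has roughly the shape sketched below. This is an illustration only: `build_trust_policy` is a hypothetical helper, and the account ID and External ID are placeholders, not real Nobl9 values. Use the values shown in your own Data source wizard.

```python
import json


def build_trust_policy(nobl9_account_id: str, external_id: str) -> dict:
    """Build a cross-account trust policy that requires an External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The account that is allowed to assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{nobl9_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can be assumed only when this External ID is sent.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values for illustration only.
policy = build_trust_policy("123456789012", "example-external-id")
print(json.dumps(policy, indent=2))
```

The read permissions themselves come from the attached CloudWatchReadOnlyAccess policy; the trust policy only controls who may assume the role.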
Adding Amazon CloudWatch as a data source
To ensure data transmission between Nobl9 and your data source, it may be necessary to list Nobl9 IP addresses as trusted.
- 18.159.114.21
- 18.158.132.186
- 3.64.154.26
You can add the Amazon CloudWatch data source using the direct or agent connection methods. For both methods, start with these steps:
- Navigate to Integrations > Sources.
- Click .
- Click the relevant Source icon.
- Choose a relevant connection method (Agent or Direct), then configure the source as described below.
CloudWatch direct
Direct configuration in the UI
Direct connection to CloudWatch requires users to enter their credentials which Nobl9 stores safely. To set up this type of connection:
- Select one of the following Release Channels:
  - The stable channel is fully tested by the Nobl9 team. It represents the final product; however, this channel does not contain all the new features of a beta release. Use it to avoid crashes and other limitations.
  - The beta channel is under active development. Here, you can check out new features and improvements without the risk of affecting any viable SLOs. Remember that features in this channel may be subject to change.
- Enter the IAM Role ARN. Check the instructions above for more details.
- Select a Project. Specifying a project is helpful when multiple users are spread across multiple teams or projects. When the Project field is left blank, Nobl9 uses the default project.
- Enter a Display Name. You can enter a user-friendly name with spaces in this field.
- Enter a Name. The name is mandatory and can only contain lowercase alphanumeric characters and dashes (for example, my-project-1). Nobl9 duplicates the display name here, transforming it into the supported format, but you can edit the result.
- Enter a Description. Here you can add details such as who is responsible for the integration (team/owner) and the purpose of creating it.
- Specify the Query delay to set a customized delay for queries when pulling the data from the data source.
  - The default value in the Amazon CloudWatch integration for Query delay is 1 minute.
  - Changing the Query delay may affect your SLI data. For more details, check the Query delay documentation.
- Enter a Maximum Period for Historical Data Retrieval.
  - This value defines how far back in the past your data will be retrieved.
  - The value for the maximum period of data retrieval depends on the data source. Check the Replay documentation for details.
  - A greater period can extend the loading time when creating an SLO.
  - The value must be a positive integer.
- Enter a Default Period for Historical Data Retrieval.
  - It is used by SLOs connected to this data source.
  - The value must be a positive integer or 0.
  - By default, this value is set to 0. When you set it to a value greater than 0, you will create SLOs with Replay.
- Click Add Data Source.
The value for the Maximum Period for Historical Data Retrieval for CloudWatch Configurations queries is 15 days.
Replay for CloudWatch doesn't support SQL and JSON queries.
If you set the Default Period for Historical Data Retrieval to a value greater than 0, you won't be able to use JSON and SQL queries.
Direct using CLI - YAML
The YAML for setting up a Direct connection to CloudWatch looks like this:
apiVersion: n9/v1alpha
kind: Direct
metadata:
  name: cloudwatch-direct
  displayName: CloudWatch direct
  project: cloudwatch-direct
spec:
  description: Direct integration with CloudWatch
  sourceOf:
    - Metrics
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  cloudWatch:
    roleARN: ""
  logCollectionEnabled: false
  historicalDataRetrieval:
    maxDuration:
      value: 15 # Max duration for CloudWatch is 15 days
      unit: Day
    defaultDuration:
      value: 0
      unit: Day
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second, Minute. Check the query delay documentation for the default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. Must be a number less than 1440 minutes (24 hours). Check the query delay documentation for the default unit of query delay for each source. |
logCollectionEnabled optional | boolean | Optional. Defaults to false. Set to true if you'd like your direct to collect event logs. Beta functionality available only through the direct release channel. Reach out to support@nobl9.com to activate it. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta, stable. |
Source-specific fields | | |
cloudWatch.roleARN mandatory | string | See the authentication section above for more details. |
Replay-related fields | | |
historicalDataRetrieval optional | n/a | Optional structure for Replay configuration. Use only with supported sources. If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
maxDuration.value optional | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ 0. See the Replay documentation for values of max duration per data source. |
maxDuration.unit optional | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
defaultDuration.value optional | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ 0 and ≤ maxDuration. |
defaultDuration.unit optional | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
If you set the value for the Default Period for Historical Data Retrieval to a value greater than 0, you won't be able to use JSON and SQL queries. Refer to the Replay documentation for more details.
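The numeric constraints from the field table above (query delay below 1440 minutes, integer durations of at least 0, default duration not exceeding the maximum, and the 15-day CloudWatch limit) can be checked before applying the YAML. The sketch below is a hypothetical pre-flight check; the function and constant names are illustrative and do not come from Nobl9 or sloctl.

```python
# Hypothetical pre-flight check; names are illustrative, not a Nobl9 API.
SECONDS_PER_UNIT = {"Second": 1, "Minute": 60}
MINUTES_PER_UNIT = {"Minute": 1, "Hour": 60, "Day": 1440}
CLOUDWATCH_MAX_REPLAY_MINUTES = 15 * 1440  # 15 days, as stated in the text


def duration_minutes(duration):
    """Convert {"value": int, "unit": str} to minutes, or None if malformed."""
    value, unit = duration.get("value"), duration.get("unit")
    if not isinstance(value, int) or value < 0 or unit not in MINUTES_PER_UNIT:
        return None
    return value * MINUTES_PER_UNIT[unit]


def check_source_settings(query_delay, max_duration, default_duration):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    delay_seconds = query_delay["value"] * SECONDS_PER_UNIT[query_delay["unit"]]
    if delay_seconds >= 1440 * 60:
        problems.append("queryDelay must be less than 1440 minutes")
    max_minutes = duration_minutes(max_duration)
    default_minutes = duration_minutes(default_duration)
    if max_minutes is None or default_minutes is None:
        problems.append("durations need an integer value >= 0 and a valid unit")
        return problems
    if max_minutes > CLOUDWATCH_MAX_REPLAY_MINUTES:
        problems.append("maxDuration exceeds 15 days")
    if default_minutes > max_minutes:
        problems.append("defaultDuration exceeds maxDuration")
    return problems


# Values taken from the Direct YAML sample above.
print(check_source_settings(
    {"unit": "Minute", "value": 720},
    {"unit": "Day", "value": 15},
    {"unit": "Day", "value": 0},
))
```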
CloudWatch agent
Agent configuration in the UI
Follow the instructions below to create your CloudWatch agent connection. Refer to the section above for the description of the fields.
- Enter a Project.
- Enter a Name.
- Create a Description.
- In the Advanced Settings you can:
  - Enter a Maximum Period for Historical Data Retrieval.
  - Enter a Default Period for Historical Data Retrieval.
- Click Add Data Source.
See the notes above for the Maximum Period for Historical Data Retrieval for CloudWatch.
Agent using CLI - YAML
The YAML for setting up an Agent connection to CloudWatch looks like this:
apiVersion: n9/v1alpha
kind: Agent
metadata:
  name: cloudwatch
  displayName: AWS CloudWatch
  project: cloudwatch
spec:
  description: Integration with CloudWatch
  sourceOf:
    - Metrics
  releaseChannel: beta
  queryDelay:
    unit: Minute
    value: 720
  cloudWatch: {}
  historicalDataRetrieval:
    maxDuration:
      value: 15
      unit: Day
    defaultDuration:
      value: 0
      unit: Day
Field | Type | Description |
---|---|---|
queryDelay.unit mandatory | enum | Specifies the unit for the query delay. Possible values: Second, Minute. Check the query delay documentation for the default unit of query delay for each source. |
queryDelay.value mandatory | numeric | Specifies the value for the query delay. Must be a number less than 1440 minutes (24 hours). Check the query delay documentation for the default unit of query delay for each source. |
releaseChannel mandatory | enum | Specifies the release channel. Accepted values: beta, stable. |
Replay-related fields | | |
historicalDataRetrieval optional | n/a | Optional structure for Replay configuration. Use only with supported sources. If omitted, Nobl9 uses the default values of value: 0 and unit: Day for maxDuration and defaultDuration. |
maxDuration.value optional | numeric | Specifies the maximum duration for historical data retrieval. Must be an integer ≥ 0. See the Replay documentation for values of max duration per data source. |
maxDuration.unit optional | enum | Specifies the unit for the maximum duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
defaultDuration.value optional | numeric | Specifies the default duration for historical data retrieval. Must be an integer ≥ 0 and ≤ maxDuration. |
defaultDuration.unit optional | enum | Specifies the unit for the default duration of historical data retrieval. Accepted values: Minute, Hour, Day. |
You can deploy only one Agent in one YAML file by using the sloctl apply command.
Deploying CloudWatch agent
When you add the data source, Nobl9 automatically generates a Kubernetes configuration and a Docker command line for you to use to deploy the Agent. Both of these are available in the web UI, under the Agent Configuration section. Be sure to swap in your credentials (for example, if you are using an AWS Access Key ID and Secret Access Key, replace AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the following deployment descriptions).
Ensure the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables are set appropriately if you are using Access/Secret Keys. If these variables are not set, the Default Credential Provider Chain is used.
If you use Kubernetes, you can apply the supplied YAML config file to a Kubernetes cluster to deploy the Agent. It will look something like this:
# DISCLAIMER: This deployment description contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply k8s deployment description, and the client_id and client_secret are only exemplary values.
apiVersion: v1
kind: Secret
metadata:
  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
  namespace: default
type: Opaque
stringData:
  aws_access_key_id: <AWS_ACCESS_KEY_ID>
  # This key must match the secretKeyRef key used by the Deployment below.
  aws_secret_access_key: <AWS_SECRET_ACCESS_KEY>
  client_id: "unique_user_id"
  client_secret: "unique_client_secret"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      nobl9-agent-name: cloudwatch
      nobl9-agent-project: cloudwatch
      nobl9-agent-organization: nobl9-dev
  template:
    metadata:
      labels:
        nobl9-agent-name: cloudwatch
        nobl9-agent-project: cloudwatch
        nobl9-agent-organization: nobl9-dev
    spec:
      containers:
        - name: agent-container
          image: nobl9/agent:0.73.2
          resources:
            requests:
              memory: "350Mi"
              cpu: "0.1"
          env:
            - name: N9_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  key: client_id
                  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
            - name: N9_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  key: client_secret
                  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  key: aws_access_key_id
                  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: aws_secret_access_key
                  name: nobl9-agent-nobl9-dev-cloudwatch-cloudwatch
            # The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
            # The 9090 is the default value and can be changed.
            # If you don't want the metrics to be exposed, comment out or delete the N9_METRICS_PORT variable.
            - name: N9_METRICS_PORT
              value: "9090"
If you use Docker, you can run the Docker command to deploy the Agent. It will look something like this:
# DISCLAIMER: This docker command contains only the fields necessary for the purpose of this demo.
# It is not a ready-to-apply docker command.
# The N9_METRICS_PORT is a variable specifying the port to which the /metrics and /health endpoints are exposed.
# The 9090 is the default value and can be changed.
# If you don't want the metrics to be exposed, delete the N9_METRICS_PORT line.
docker run -d --restart on-failure \
  --name nobl9-agent-nobl9-dev-cloudwatch-cloudwatch \
  -e N9_CLIENT_ID="unique_client_id" \
  -e N9_CLIENT_SECRET="unique_client_secret" \
  -e N9_METRICS_PORT=9090 \
  -e AWS_ACCESS_KEY_ID="<AWS_ACCESS_KEY_ID>" \
  -e AWS_SECRET_ACCESS_KEY="<AWS_SECRET_ACCESS_KEY>" \
  nobl9/agent:0.73.2
Creating SLOs with CloudWatch
Using Amazon CloudWatch, you can create SLOs by:
- Entering standard threshold and ratio metrics
- Entering an SQL query
- Entering multiple queries through JSON
All three methods are available both in the UI and through applying YAML (see the Creating CloudWatch SLOs - YAML section).
Creating SLOs in the UI
Follow the instructions below to create your SLOs with CloudWatch in the UI:
- Navigate to Service Level Objectives.
- Click . The SLO wizard opens.
- In step 1, select the Service the SLO will be associated with.
- In step 2, select Amazon CloudWatch as the data source for your SLO.
- Specify the Metric. You can choose either:
  - A Threshold Metric, where a single time series is evaluated against a threshold.
  - A Ratio Metric, which allows you to enter two time series for comparison. You can choose one of the following metric types:
    - Good Metric, meaning a ratio of good requests to total requests
    - Bad Metric, meaning a ratio of bad requests to total requests
  - Choose the Data Count Method for your ratio metric:
    - Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
    - Incremental: counts the incoming metric values incrementally, adding every next value to the previous values. It results in a constantly increasing SLO graph.
  - SLI values for good and total: when choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:
    - Must be positive.
    - While we recommend using integers, fractions are also acceptable.
    - If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
    - Shouldn't be larger than 1e+20.
- Configure the metric. CloudWatch allows you to create your query in the following ways:
  - Enter standard threshold and ratio metrics (click Configurations)
  - Enter an SQL query
  - Enter multiple queries through JSON
  Read Entering CloudWatch query for detailed instructions.
- In step 3, define a Time Window for the SLO.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
- In step 5, add a Name, Description, and other details about your SLO. You can also select Alert Policies and Labels on this screen.
- When you're done, click Create SLO.
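The guidance on SLI values for good and total in the metric step above can be expressed as a small check. This is a sketch under assumptions: `check_ratio_value` is a hypothetical helper, it reads "must be positive" literally (so zero is flagged), and it treats the 1e-4 lower bound as a recommendation rather than a hard limit, as the text does.

```python
# Hypothetical helper for illustration; not part of Nobl9 tooling.
MIN_RECOMMENDED = 1e-4   # fractions are recommended to be larger than this
MAX_ALLOWED = 1e20       # values should not be larger than this


def check_ratio_value(label: str, value: float) -> list:
    """Return warnings for a single good or total data point."""
    warnings = []
    if value <= 0:
        warnings.append(f"{label}: must be positive")
    elif value <= MIN_RECOMMENDED:
        warnings.append(f"{label}: fractions should be larger than 1e-4")
    elif value > MAX_ALLOWED:
        warnings.append(f"{label}: should not be larger than 1e+20")
    return warnings


for label, value in (("good", 980), ("total", 1000), ("good", 0.00001)):
    print(label, value, check_ratio_value(label, value))
```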
Entering CloudWatch query
Both Ratio and Threshold metrics for a standard CloudWatch metric use the same parameters. For the Ratio Metric, choose one of the following metric types and define the parameters separately:
- Good Metric, meaning a ratio of good requests to total requests
- Bad Metric, meaning a ratio of bad requests to total requests
Standard configuration:
- Enter an Account ID (optional). Use Account ID to access your SLO data from multiple accounts within a Region. An AWS account ID is a 12-digit identification number of your AWS account. Check AWS Documentation to learn more.
  AWS cross-account observability is available for configuration-type metrics only. This field is supported only through the Beta release channel.
- Add a Region. It is a region code in AWS. Use one of the regional codes that are listed here.
- Add a Namespace (mandatory, max. number of characters 255). A namespace can contain alphanumeric characters, a period, a hyphen, an underscore, a forward slash, a hash, or a colon. A Namespace is a container for CloudWatch metrics. For further details, see CloudWatch Concepts | Amazon CloudWatch Documentation.
- Add a Metric Name (mandatory, max. number of characters 255).
- Add a Statistic function. Statistic functions are aggregations of metric data over specified periods. For example, you can use Maximum, Minimum, Sum, Average. To see all statistics supported by CloudWatch for metrics, go to Statistics Definition | Amazon CloudWatch Documentation.
- Add Dimensions (optional, list). A dimension is a name/value pair that is part of the identity of a metric. Users can assign a max. of 10 dimensions to a metric.
  - Add a Name (mandatory, max. number of characters 255, whitespace is not trimmed). The name of the dimension. Dimension names must contain only ASCII characters and must include at least one non-whitespace character.
  - Add a Value (mandatory, max. number of characters 255). It is the value of the dimension. Dimension values must contain only ASCII characters and must include at least one non-whitespace character.
SQL query:
- Select SQL in the feature toggle.
- Select a Region.
- Select a type of Metric, and enter a Query. Sample SQL queries for CloudWatch:
  - SQL Threshold metric for CloudWatch:
    - Query: SELECT AVG(CPUUtilization) FROM "AWS/EC2"
  - SQL Ratio metric for CloudWatch:
    - Good Query: SELECT AVG(CPUUtilization) FROM "AWS/EC2"
    - Total Query: SELECT MAX(CPUUtilization) FROM "AWS/EC2"
JSON:
CloudWatch integration lets you query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics. You can do this by entering multiple JSON queries:
- Choose JSON in the feature toggle.
- Choose a Region.
- Select a type of Metric, and enter a Query.
- Enter your JSON query.
  - For samples of multiple JSON queries, refer to the Amazon CloudWatch JSON Queries section in the Nobl9 Documentation.
  - For further details on CloudWatch metric math functions, go to Using Metric Math | Amazon CloudWatch Documentation.
Your query must be a valid JSON query. It must contain arrays of metrics. Refer to the CloudWatch Metrics Insights Queries | Amazon CloudWatch Documentation for more detailed information.
SLO using CloudWatch - YAML samples
SLO using CloudWatch - standard configuration
Here's an example of CloudWatch using a rawMetric (threshold metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-occurrences-threshold
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.8
      op: lte
      rawMetric:
        query:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/RDS
            metricName: ReadLatency
            stat: Average
            dimensions:
              - name: DBInstanceIdentifier
                value: <identifier_of_your_db_instance> # replace with value that corresponds to your DBInstanceIdentifier
      value: 0.0004
  timeWindows:
    - calendar:
        startTime: "2020-11-14 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-timeslices-threshold
  project: cloudwatch
spec:
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.8
      op: lte
      rawMetric:
        query:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/RDS
            metricName: ReadLatency
            stat: Average
            dimensions:
              - name: DBInstanceIdentifier
                value: <identifier_of_your_db_instance> # replace with value that corresponds to your DBInstanceIdentifier
      value: 0.0004
      timeSliceTarget: 0.5
  timeWindows:
    - calendar:
        startTime: "2020-11-14 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
Here's an example of CloudWatch using countMetrics (ratio metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-calendar-occurrences-ratio
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.9
      countMetrics:
        good:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: HTTPCode_Target_2XX_Count
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/prod-default-appingress
        incremental: false
        total:
          cloudwatch:
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: RequestCount
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/prod-default-appingress
      displayName: ""
      value: 1
  timeWindows:
    - calendar:
        startTime: "2020-11-14 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-rolling-occurrences-ratio
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.7
      countMetrics:
        good:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: HTTPCode_Target_2XX_Count
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/prod-default-appingress
        incremental: false
        total:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: RequestCount
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/prod-default-appingress
      displayName: ""
      value: 1
  timeWindows:
    - count: 1
      isRolling: true
      unit: Hour
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-calendar-timeslices-ratio
  project: cloudwatch
spec:
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.5
      countMetrics:
        good:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: HTTPCode_Target_2XX_Count
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/main-default-appingress
        incremental: false
        total:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: RequestCount
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/main-default-appingress
      displayName: ""
      timeSliceTarget: 0.5
      value: 1
  timeWindows:
    - calendar:
        startTime: "2020-11-14 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-rolling-timeslices-ratio
  project: cloudwatch
spec:
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.5
      countMetrics:
        good:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: HTTPCode_Target_2XX_Count
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/main-default-appingress
        incremental: false
        total:
          cloudwatch:
            # accountID: "123456789012" # optional
            region: eu-central-1
            namespace: AWS/ApplicationELB
            metricName: RequestCount
            stat: SampleCount
            dimensions:
              - name: LoadBalancer
                value: app/main-default-appingress
      timeSliceTarget: 0.5
      value: 1
  timeWindows:
    - count: 1
      isRolling: true
      unit: Hour
Important notes:
Both ratio and threshold metrics for CloudWatch use the same parameters.
For ratio metric, define these parameters separately for the good/bad metric and total metric.
- region is required. It is a region code in AWS. Use one of the regional codes listed here.
- namespace is required (string, max. number of characters 255). It can contain alphanumeric characters, period (.), hyphen (-), underscore (_), forward slash (/), hash (#), or colon (:). A namespace is a container for CloudWatch metrics. For further details, see CloudWatch Concepts | Amazon CloudWatch documentation. Example: AWS/ApplicationELB.
- metricName is required (string, max. number of characters 255).
- stat is required. Statistics are aggregations of metric data over specified periods of time. To see what statistics are supported by CloudWatch for metrics, go to Statistics Definitions | Amazon CloudWatch documentation. Examples: SampleCount, Average, p95, TC(0.005:0.030).
- dimensions is optional (list). A dimension is a name/value pair that is part of the identity of a metric. Users can assign a max. of 10 dimensions to a metric.
  - name is required (string, max. number of characters 255). Dimension names must contain only ASCII characters and must include at least one non-whitespace character.
  - value is required (string, max. number of characters 255). Dimension values must contain only ASCII characters and must include at least one non-whitespace character.
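The field rules in the notes above translate directly into a validation routine. The sketch below is illustrative only: `check_metric` is a hypothetical function, the input is assumed to be the `cloudwatch` query block already parsed into a dictionary, and Nobl9's own validation may differ in detail.

```python
import re

# Allowed namespace characters per the notes: alphanumerics . - _ / # :
_NAMESPACE = re.compile(r"[A-Za-z0-9._\-/#:]{1,255}")


def _valid_dimension_text(text: str) -> bool:
    """ASCII only, at least one non-whitespace character, at most 255 chars."""
    return text.isascii() and bool(text.strip()) and len(text) <= 255


def check_metric(cfg: dict) -> list:
    """Return a list of problems found in a standard CloudWatch metric block."""
    problems = []
    for field in ("region", "namespace", "metricName", "stat"):
        if not cfg.get(field):
            problems.append(f"{field} is required")
    namespace = cfg.get("namespace", "")
    if namespace and not _NAMESPACE.fullmatch(namespace):
        problems.append("namespace has invalid characters or length")
    if len(cfg.get("metricName", "")) > 255:
        problems.append("metricName is longer than 255 characters")
    dimensions = cfg.get("dimensions", [])
    if len(dimensions) > 10:
        problems.append("more than 10 dimensions")
    for dim in dimensions:
        for key in ("name", "value"):
            if not _valid_dimension_text(dim.get(key, "")):
                problems.append(f"dimension {key} is invalid")
    return problems


# Values taken from the ratio metric sample above.
print(check_metric({
    "region": "eu-central-1",
    "namespace": "AWS/ApplicationELB",
    "metricName": "RequestCount",
    "stat": "SampleCount",
    "dimensions": [{"name": "LoadBalancer", "value": "app/prod-default-appingress"}],
}))
```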
SLO using CloudWatch SQL query
Here's an example of a CloudWatch SQL query using rawMetric (threshold metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-occurrences-threshold-via-sql
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.8
      op: lte
      rawMetric:
        query:
          cloudwatch:
            region: us-east-1
            sql: 'SELECT AVG(CPUUtilization) FROM "AWS/EC2"'
      value: 0.0004
  timeWindows:
    - calendar:
        startTime: "2021-10-01 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
Here's an example of a CloudWatch SQL query using countMetrics (good ratio metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-calendar-occurrences-ratio-sql
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  service: cloudwatch-service
  objectives:
    - target: 0.9
      countMetrics:
        good:
          cloudwatch:
            region: eu-central-1
            sql: 'SELECT AVG(CPUUtilization) FROM "AWS/EC2"'
        incremental: false
        total:
          cloudwatch:
            region: eu-central-1
            sql: 'SELECT MAX(CPUUtilization) FROM "AWS/EC2"'
      displayName: ""
      value: 1
  timeWindows:
    - calendar:
        startTime: "2020-11-14 12:30:00"
        timeZone: Etc/UTC
      count: 1
      isRolling: false
      unit: Day
Here's an example of a CloudWatch SQL query using countMetrics (bad ratio metric):
- apiVersion: n9/v1alpha
  kind: SLO
  metadata:
    name: cloudwatch-sql-bad-over-total
    project: cloudwatch-direct
  spec:
    alertPolicies: []
    budgetingMethod: Occurrences
    description: ""
    indicator:
      metricSource:
        kind: Direct
        name: cloudwatch-direct
        project: cloudwatch-direct
    objectives:
      - countMetrics:
          bad:
            cloudWatch:
              region: us-east-1
              sql: SELECT COUNT(HTTPCode_Target_4XX_Count) FROM "AWS/EC2"
          incremental: false
          total:
            cloudWatch:
              region: us-east-1
              sql: SELECT COUNT(RequestCount) FROM "AWS/EC2"
        displayName: ""
        name: bad
        target: 0.1
        value: 0.9
      - countMetrics:
          good:
            cloudWatch:
              region: us-east-1
              sql: SELECT COUNT(HTTPCode_Target_2XX_Count) FROM "AWS/EC2"
          incremental: false
          total:
            cloudWatch:
              region: us-east-1
              sql: SELECT COUNT(RequestCount) FROM "AWS/EC2"
        displayName: ""
        name: good
        target: 0.1
        value: 1
    service: cloudwatch-direct-service
    timeWindows:
      - count: 28
        isRolling: true
        period:
          begin: "2023-08-01T06:00:16Z"
          end: "2023-08-29T06:00:16Z"
        unit: Day
Important notes:
Both ratio and threshold metrics for CloudWatch use the same parameters.
For ratio metric, define these parameters separately for the good/bad metric and total metric.
When using an SQL query, only these fields are required:
- region is mandatory. It is a regional code in AWS. Use one of the regional codes listed here. Note: CloudWatch SQL query is available in all AWS Regions, except China.
- sql is mandatory. It is an SQL query to compare, aggregate, and group metrics by labels to gain real-time operational insights.
CloudWatch SLOs using multiple metrics (JSON)
Here's an example of a CloudWatch JSON query using rawMetric (threshold metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-rawmetric-via-json
  project: cloudwatch
spec:
  budgetingMethod: Occurrences
  description: ""
  indicator:
    metricSource:
      kind: Agent
      name: cloudwatch
      project: cloudwatch
  objectives:
    - displayName: ""
      op: lte
      rawMetric:
        query:
          cloudWatch:
            json: |-
              [
                {
                  "Id": "e1",
                  "Expression": "m1 / m2",
                  "Period": 60
                },
                {
                  "Id": "m1",
                  "MetricStat": {
                    "Metric": {
                      "Namespace": "AWS/ApplicationELB",
                      "MetricName": "HTTPCode_Target_2XX_Count",
                      "Dimensions": [
                        {
                          "Name": "LoadBalancer",
                          "Value": "app/main-default-appingress-350b/904311bedb964754"
                        }
                      ]
                    },
                    "Period": 60,
                    "Stat": "SampleCount"
                  },
                  "ReturnData": false
                },
                {
                  "Id": "m2",
                  "MetricStat": {
                    "Metric": {
                      "Namespace": "AWS/ApplicationELB",
                      "MetricName": "RequestCount",
                      "Dimensions": [
                        {
                          "Name": "LoadBalancer",
                          "Value": "app/main-default-appingress-350b/904311bedb964754"
                        }
                      ]
                    },
                    "Period": 60,
                    "Stat": "SampleCount"
                  },
                  "ReturnData": false
                }
              ]
            region: eu-central-1
      target: 0.8
      value: 0.9
  service: cloudwatch-service
  timeWindows:
    - count: 1
      isRolling: true
      period:
        begin: "2021-11-10T14:49:37Z"
        end: "2021-11-10T15:49:37Z"
      unit: Hour
Here's an example of a CloudWatch JSON query using countMetrics (ratio metric):
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: cloudwatch-timeslices-json
  project: cloudwatch
spec:
  budgetingMethod: Timeslices
  description: ""
  indicator:
    metricSource:
      name: cloudwatch
  objectives:
    - countMetrics:
        good:
          cloudWatch:
            json: |
              [
                {
                  "Id": "e1",
                  "MetricStat": {
                    "Metric": {
                      "Namespace": "AWS/ApplicationELB",
                      "MetricName": "HTTPCode_Target_2XX_Count",
                      "Dimensions": [
                        {
                          "Name": "LoadBalancer",
                          "Value": "app/main-default-appingress-350b/123456789"
                        }
                      ]
                    },
                    "Period": 60,
                    "Stat": "SampleCount"
                  }
                }
              ]
            region: eu-central-1
        incremental: false
        total:
          cloudWatch:
            json: |
              [
                {
                  "Id": "e2",
                  "MetricStat": {
                    "Metric": {
                      "Namespace": "AWS/ApplicationELB",
                      "MetricName": "RequestCount",
                      "Dimensions": [
                        {
                          "Name": "LoadBalancer",
                          "Value": "app/main-default-appingress-350b/123456789"
                        }
                      ]
                    },
                    "Period": 60,
                    "Stat": "SampleCount"
                  }
                }
              ]
            region: eu-central-1
      displayName: ""
      target: 0.5
      timeSliceTarget: 0.5
      value: 1
  service: cloudwatch-service
  timeWindows:
    - count: 1
      isRolling: true
      period:
        begin: "2021-11-10T12:19:58Z"
        end: "2021-11-10T13:19:58Z"
      unit: Hour
Important notes:
Both ratio and threshold metrics for CloudWatch use the same parameters.
For ratio metric, define these parameters separately for the good/bad metric and total metric.
When using multiple queries (JSON), keep the following in mind:
- region is mandatory. It is a regional code in AWS. Use one of the regional codes listed here.
- json is mandatory. It is a JSON query that lets you query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics.
The following JSON validation applies:
- The JSON query must be valid.
- The JSON query should be an array of metrics.
- Only one ReturnData field can be set to true (when it is not set, it is true by default), and the ReturnData fields in all other metrics have to be set explicitly to false.
- The Period field in MetricStat is required, and it has to be equal to 60. If MetricStat does not exist, the Period field should be set to 60 in the base object.
For further details on CloudWatch metric math functions, go to Using Metric Math | Amazon CloudWatch documentation.
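The JSON validation rules listed above can be approximated locally before pasting a query into the wizard or a YAML file. The sketch below is an assumption-laden illustration: `check_json_query` is a hypothetical function, and it interprets "only one ReturnData can be true" as "exactly one entry returns data". Nobl9's actual validator may behave differently.

```python
import json


def check_json_query(raw: str) -> list:
    """Return a list of rule violations for a CloudWatch JSON query string."""
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg}"]
    if not isinstance(entries, list) or not all(isinstance(e, dict) for e in entries):
        return ["query must be a JSON array of metric objects"]

    problems = []
    # ReturnData defaults to true when it is not set.
    returning = [e.get("Id") for e in entries if e.get("ReturnData", True)]
    if len(returning) != 1:
        problems.append(f"exactly one entry must return data, found {len(returning)}")
    for entry in entries:
        # Period lives in MetricStat when present, otherwise in the base object.
        holder = entry["MetricStat"] if "MetricStat" in entry else entry
        if holder.get("Period") != 60:
            problems.append(f"{entry.get('Id')}: Period must be 60")
    return problems


sample = json.dumps([
    {"Id": "e1", "Expression": "m1 / m2", "Period": 60},
    {"Id": "m1", "MetricStat": {"Period": 60, "Stat": "SampleCount"}, "ReturnData": False},
    {"Id": "m2", "MetricStat": {"Period": 60, "Stat": "SampleCount"}, "ReturnData": False},
])
print(check_json_query(sample))
```

The sample mirrors the structure of the rawMetric JSON example above, with the Metric details omitted for brevity.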
Querying the CloudWatch server
Once the SLO is set up, Nobl9 queries the CloudWatch server every 60 seconds.
CloudWatch API rate limits
For the GetMetricData API, CloudWatch has a default limit of 50 TPS per Region. This is the maximum number of operation requests you can make per second. For more information, refer to the CloudWatch service quotas | CloudWatch documentation.
The minimum query and storage period in CloudWatch is one second. By default, CloudWatch stores data with a 1-minute period.
CloudWatch retains metric data for different lengths of time depending on the storage period. For more information, refer to the GetMetricData | CloudWatch documentation.
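To get a feel for what the default quota means in combination with the 60-second query interval, a rough back-of-the-envelope calculation can help. This sketch rests on assumptions that the text does not confirm: that each SLI query costs one GetMetricData request per cycle, that requests are spread evenly over the interval, and that nothing else in the account calls the same API. `max_queries_per_region` is a hypothetical helper.

```python
DEFAULT_TPS = 50          # default GetMetricData limit per Region, per the text
QUERY_INTERVAL_S = 60     # Nobl9 queries CloudWatch every 60 seconds


def max_queries_per_region(tps: int = DEFAULT_TPS,
                           interval_s: int = QUERY_INTERVAL_S,
                           requests_per_query: int = 1) -> int:
    """Upper bound on SLI queries per Region under the stated assumptions."""
    return (tps * interval_s) // requests_per_query


print(max_queries_per_region())                      # 3000
print(max_queries_per_region(requests_per_query=2))  # 1500
```

Treat the result as an upper bound only; bursts and other consumers of the same quota lower the practical limit.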
Known limitations
CloudWatch SQL query is available in all AWS Regions, except China.
Useful links
Put Metric Data | Amazon CloudWatch documentation
Get Metric Data | Amazon CloudWatch documentation
Amazon CloudWatch Concepts | Amazon CloudWatch documentation
CloudWatch Statistics Definitions | Amazon CloudWatch documentation
AWS Regional Endpoints | Amazon CloudWatch documentation
CloudWatch Metrics Insights | Amazon CloudWatch documentation