Instana
Instana is an observability platform that delivers automated Application Performance Monitoring (APM), used for website, infrastructure, and application monitoring.
Instana parameters and supported features in Nobl9
- General support:
- Release channel: Stable, Beta
- Connection method: Agent, Direct
- Replay and SLI Analyzer: Not supported
- Event logs: Supported
- Query checker: Not supported
- Query parameters retrieval: Not supported
- Timestamp cache persistence: Supported
- Query parameters:
- Query interval: 1 min
- Query delay: 1 min
- Jitter: 15 sec
- Timeout: 30 sec
- Agent details and minimum required versions for supported features:
- Plugin name: n9instana
- Query delay environment variable: INSTANA_QUERY_DELAY
- Timestamp cache persistence: 0.65.0
- Additional notes:
- No support for website and application (ratio) monitoring metrics
Creating SLOs with Instana
Instana allows you to create SLOs based on:
- Infrastructure metrics (for infrastructure components)
- Application metrics (for defined applications, discovered services, and endpoints)
Infrastructure metrics can be defined as either Threshold metrics or Ratio metrics, but Application metrics can only be defined as Threshold metrics.
See the instructions in the following sections for more details.
Nobl9 Web
- Threshold – Infrastructure
- Threshold – Application
- Ratio – Infrastructure
Follow the instructions below to create an SLO based on a Threshold metric using the Infrastructure type:
- Navigate to Service Level Objectives.
- Click the button.
- In step 1 of the SLO wizard, select the Service the SLO will be associated with.
- In step 2, select Instana as the data source for your SLO, then specify the Metric.
- Select Threshold metric > Infrastructure.
- Enter the Plugin ID (the ID of the plugin available in your monitored system for which you want to retrieve the metric). For more information, refer to the Instana metrics.
- Enter the Metric ID, meaning the ID of the metric you want to retrieve. For more information, refer to the Instana metrics.
- From the Metric Retrieval Method picklist, select a method to obtain the specific metrics with:
- Query, using Dynamic Focus search and filter function:
- To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.
- Snapshot ID, a unique immutable set of metadata, a snapshot:
- You can get the Snapshot ID from the URL in Instana’s UI by looking for the snapshotId=[SNAPSHOT_ID] parameter, for example, GbMUvWHy12TTRsIm3Lko4LDAklw.
- Enter the Query or the Snapshot ID.
- In step 3, define a Time Window for the SLO.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
- In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
- When you've finished, click Create SLO.
For more information, refer to the Instana metrics.
Follow the instructions below to create an SLO based on a Threshold metric using the Application type:
- Navigate to Service Level Objectives.
- Click the button.
- In step 1 of the SLO wizard, select the Service the SLO will be associated with.
- In step 2, select Instana as the data source for your SLO, then specify the Metric.
- Select Threshold metric > Application.
- Select the Metric ID you want to use from the following list:
- Calls - to monitor the number of received calls
- Erroneous Calls - to monitor the number of erroneous calls
- Erroneous Calls Rate - to monitor the error rate of received calls
- Latency - to monitor the latency of received calls in milliseconds
- Select the Aggregation. The following list shows the aggregations available for each Metric ID:
- Calls: sum
- Erroneous Calls: sum
- Erroneous Calls Rate: mean
- Latency: sum, mean, max, min, p25, p50, p75, p90, p95, p98, p99
- Enter the API query. You must create this in the Instana UI and copy and paste it here. There are two methods you can use to specify the query:
- In the Filter field, you can already see a partially defined query. The applied filter must point to the exact entity you want to be observed, for example:
- Any additional manual selections from the left panel in Instana UI will be included in the API query, for example, HTTP Status or Technologies.
- Decide which of the hidden calls you’d like to include - Synthetic or Internal. They are not included in the API query and need to be passed to Nobl9 manually.
- Copy the API query. Make sure you have the toggle Include filter sidebar items on, otherwise, the additional manual selections won’t be included in the API query.
- Select the Group, meaning the single entity you want to be observed. Group by the most specific parameter in the created filter. You can always view the resulting groups below the charts in the Analytics view.
- A combination of filters and group elements must point to a single entity.
- Group by the last logical element in the defined filter. Be as specific as possible.
- The state of the monitored system may change and more than one group can be associated with the previously applied grouping. Therefore, group by entity names as accurately as possible.
- You may have to change the group element accordingly when you change the applied filter.
- Enter the Tag, Tag Entity, and Tag Second Level Key (if applicable). You can get the Group details in two ways:
- In the Instana UI, look for the groupBy=(...) section in the URL.
- Parse the deeplink by using the script below and convert it to YAML for the Nobl9 sloctl.
- If you want to include the hidden calls, check the Synthetic or Internal checkbox in the Include Hidden Calls field.
- In step 3, define a Time Window for the SLO.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
- In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
- When you’ve finished, click Create SLO.
Note that the value in the Aggregation field is selected by default for Calls, Erroneous Calls, and Erroneous Calls Rate Metric ID values.
Method 1:
From the Applications tab, by selecting the Application, Service, or Endpoint you want to observe and clicking the Analyze Calls button.
Filter by entity name and be as specific as possible. Providing only the endpoint name or a service name in the filter will most likely be insufficient: many GET / endpoints can belong to different services and applications. Likewise, the same service name can appear in various applications.
Follow these guidelines to apply the correct grouping:
Method 2:
Go to Analytics, specify the query, and follow the instructions above.
For more information, refer to the Application Analyze | Instana Documentation.
Note that field names vary between the Instana API and the Nobl9 API:
Field name in Instana | Field name in Nobl9 |
---|---|
groupbyTag | tag |
groupbyTagEntity | tagEntity |
groupbyTagSecondLevelKey | tagSecondLevelKey |
#!/bin/bash
# Expects the Instana deeplink in the DEEPLINK environment variable.
echo "$DEEPLINK" |
  sed 's/;/\n/g' |
  grep groupBy |
  sed 's/groupBy=(//' |
  sed 's/).*//' |
  tr -d '\n' |
  awk '
    BEGIN {RS = "~"; print "groupBy:"}
    {ctr++}
    ctr == 1 {key = $0}
    ctr == 2 {value = $0; printf " %s: \"%s\"\n", key, value; ctr = 0}' |
  sed 's/groupbyT/t/g' |
  awk -F':' '
    BEGIN {tagEntityExists = 0}
    $1 ~ "tagEntity" {tagEntityExists = 1}
    {print}
    END {
      if (tagEntityExists == 0)
        {print " tagEntity: \"NOT_APPLICABLE\""}
    }'
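The same parsing can be sketched in Python. This is a minimal illustration, not part of Nobl9 or Instana tooling: the function name `group_by_from_deeplink` and the sample deeplinks are hypothetical, and the sketch assumes the deeplink is not URL-encoded and carries its grouping as `groupBy=(key~value~key~value)`, which is the shape the shell script above expects.

```python
import re

# Instana field name -> Nobl9 field name, as listed in the table above.
FIELD_MAP = {
    "groupbyTag": "tag",
    "groupbyTagEntity": "tagEntity",
    "groupbyTagSecondLevelKey": "tagSecondLevelKey",
}


def group_by_from_deeplink(deeplink: str) -> dict:
    """Extract the groupBy=(...) section and rename its keys for Nobl9."""
    match = re.search(r"groupBy=\(([^)]*)\)", deeplink)
    if match is None:
        raise ValueError("no groupBy=(...) section found in the deeplink")
    parts = match.group(1).split("~")
    # The parts alternate: key, value, key, value, ...
    pairs = dict(zip(parts[0::2], parts[1::2]))
    group_by = {FIELD_MAP.get(key, key): value for key, value in pairs.items()}
    # Mirror the shell script: default tagEntity when it is absent.
    group_by.setdefault("tagEntity", "NOT_APPLICABLE")
    return group_by


if __name__ == "__main__":
    # Hypothetical deeplink, shortened for illustration.
    link = "https://instana.example.com/#/analyze;groupBy=(groupbyTag~endpoint.name~groupbyTagEntity~DESTINATION)"
    print(group_by_from_deeplink(link))
```

The returned dictionary maps directly onto the `groupBy` block of the SLO YAML.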
Follow the instructions below to create an SLO based on a Ratio metric using the Infrastructure type:
- Navigate to Service Level Objectives.
- Click the button.
- In step 1 of the SLO wizard, select the Service the SLO will be associated with.
- In step 2, select Instana as the data source for your SLO, then specify the Metric.
- Select Ratio metric > Infrastructure.
- Choose the Data Count Method:
- Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
- Incremental: counts the incoming metric values incrementally, adding each new value to the previous ones. This results in a constantly increasing SLO graph.
- Enter the Plugin ID for Good and Total metrics (that is, the ID of the plugin available in your monitored system for which you want to retrieve the metric). For more information, refer to the Instana Metrics | Nobl9 Documentation.
- Enter the Metric ID for the Good and Total metrics (the ID of the metric you want to retrieve). For more information, refer to the Instana Metrics | Nobl9 Documentation.
- From the Metric Retrieval Method picklist, select a method to obtain the specific metrics with:
- Query using Dynamic Focus search and filter function
- Snapshot ID, a unique immutable set of metadata, a snapshot.
- Enter the Query or the Snapshot ID for the good and total metrics:
- To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.
- You can get the Snapshot ID from the URL in Instana’s UI by looking for the snapshotId=[SNAPSHOT_ID] parameter, for example, GbMUvWHy12TTRsIm3Lko4LDAklw.
- In step 3, define a Time Window for the SLO.
- In step 4, specify the Error Budget Calculation Method and your Objective(s).
- In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
- When you’ve finished, click Create SLO.
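The difference between the two Data Count Methods in the steps above can be illustrated with a short sketch. It only shows the shape of the two series (per-interval values versus a running total); it is not a description of how Nobl9 processes the data internally, and the function names are hypothetical.

```python
from itertools import accumulate


def as_incremental(values):
    """Turn per-interval (non-incremental) counts into a running total."""
    return list(accumulate(values))


def as_non_incremental(running_totals):
    """Recover per-interval counts from a running (incremental) total."""
    previous = 0
    deltas = []
    for total in running_totals:
        deltas.append(total - previous)
        previous = total
    return deltas


if __name__ == "__main__":
    per_minute = [3, 0, 5, 1]          # spike-shaped series
    print(as_incremental(per_minute))  # constantly non-decreasing series
```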
Build the query using entity name filters and be as specific as possible. Since Nobl9 can only process a single dataset and there is no aggregation on the Nobl9 side, the applied filter must point to the exact entity you want to be observed.
For more information, refer to the Instana metrics.
Note: A Ratio metric allows you to combine different Metric Retrieval Methods. For example, you can use the Query method for the good metric and the Snapshot ID method for the total metric.
For Ratio metrics (countMetrics), keep in mind that the values resulting from the query for both good and total:
- Must be positive.
- While we recommend using integers, fractions are also acceptable.
- If using fractions, we recommend them to be larger than 1e-4 = 0.0001.
- Shouldn't be larger than 1e+20.
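The constraints above can be expressed as a small pre-check. This is a sketch with a hypothetical function name. It flags only negative values as invalid; whether zero counts as "positive" is not stated in the text, so accepting zero here is an assumption.

```python
def check_ratio_value(value: float) -> list:
    """Return warnings for a single good or total value (empty list when fine)."""
    warnings = []
    if value < 0:
        # Assumption: zero is accepted; the text only says "positive".
        warnings.append("value must be positive")
    elif value > 1e20:
        warnings.append("value should not be larger than 1e+20")
    elif 0 < value < 1e-4:
        warnings.append("fractions are recommended to be larger than 1e-4")
    return warnings


if __name__ == "__main__":
    for sample in (120, 0.5, 0.00005, -3, 1e21):
        print(sample, check_ratio_value(sample))
```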
sloctl
Generic schema with a description of objects and field validations:
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: string
  displayName: string # optional
  project: string
spec:
  description: string # optional
  service: [service name] # name of the service you defined in the same project as the SLO
  indicator:
    metricSource:
      name: [datasource name] # name of the data source you defined
      project: [project name] # optional if not defined, project is the same as the SLO
    rawMetric:
      # exactly one of possible source types which depends on selected metricSource for the SLO
      instana: # application XOR infrastructure
        metricType: oneOf{"application", "infrastructure"} # mandatory
        infrastructure:
          metricRetrievalMethod: oneOf{"query", "snapshot"} # mandatory
          query: "string" # XOR with snapshotId
          snapshotId: "string" # XOR with query
          metricId: "string" # mandatory
          pluginId: "string" # mandatory
        application:
          metricId: # mandatory, oneOf{"calls", "erroneousCalls", "errors", "latency"}
          aggregation: "" # mandatory, value depends on the metricId type. See notes below
          groupBy: # mandatory
            tag: "" # mandatory
            tagEntity: "" # mandatory, oneOf{"DESTINATION", "SOURCE", "NOT_APPLICABLE"}
            tagSecondLevelKey: "" # mandatory
          apiQuery: "{}" # mandatory, API query user passes in a JSON format. Must be a valid JSON
          includeInternal: false # optional, default value is false
          includeSynthetic: false # optional, default value is false
Notes:
- aggregation - Depends on the value specified for metricId:
- For calls and erroneousCalls: use sum.
- For errors: use mean.
- For latency: use one of the values sum, mean, min, max, p25, p50, p75, p90, p95, p98, p99.
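The aggregation rules in the notes above can be captured as a lookup table, which is handy for checking a YAML definition before applying it. This is a minimal sketch; the names `ALLOWED_AGGREGATIONS` and `check_aggregation` are hypothetical and not part of sloctl.

```python
# Allowed aggregations per application metricId, as listed in the notes above.
ALLOWED_AGGREGATIONS = {
    "calls": {"sum"},
    "erroneousCalls": {"sum"},
    "errors": {"mean"},
    "latency": {"sum", "mean", "min", "max",
                "p25", "p50", "p75", "p90", "p95", "p98", "p99"},
}


def check_aggregation(metric_id: str, aggregation: str) -> bool:
    """Return True when the aggregation is listed for the given metricId."""
    return aggregation in ALLOWED_AGGREGATIONS.get(metric_id, set())


if __name__ == "__main__":
    print(check_aggregation("latency", "p99"))
    print(check_aggregation("calls", "mean"))
```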
Threshold (rawMetric)
- Application query
- Infrastructure query
- Infrastructure snapshot
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: application
            application:
              metricId: calls
              aggregation: sum
              groupBy:
                tag: application.name
                tagEntity: DESTINATION
              apiQuery: |
                {
                  "type": "EXPRESSION",
                  "logicalOperator": "AND",
                  "elements": [
                    {
                      "type": "TAG_FILTER",
                      "name": "kubernetes.cluster.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "n9-dev-tooling-cluster"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "kubernetes.container.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "data-node"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "call.type",
                      "operator": "EQUALS",
                      "entity": "NOT_APPLICABLE",
                      "value": "HTTP"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "endpoint.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "GET /"
                    }
                  ]
                }
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01T00:00:00.000Z
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: max_request_latency
              pluginId: zooKeeper
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01 00:00:00
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: max_request_latency
              pluginId: zooKeeper
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01 00:00:00
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
Ratio (countMetric)
- Infrastructure query
- Infrastructure snapshot
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: error_requests_count
              pluginId: zooKeeper
        total:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: total_requests_count
              pluginId: zooKeeper
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01T00:00:00.000Z
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
---
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1.0
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: error_requests_count
              pluginId: zooKeeper
        total:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: total_requests_count
              pluginId: zooKeeper
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: 2022-12-01 00:00:00
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
Instana metrics
pluginId
Plugins are entities for which metrics are collected. You cannot get the pluginId from the Instana UI. To fetch the list of available plugins, use the following API request:
curl --request GET \
--url https://${BASE_URL}/api/infrastructure-monitoring/catalog/plugins \
--header "authorization: apiToken ${API_TOKEN}"
This request returns a list of plugins. The plugin value is the pluginId, while label is used in the Instana UI as a display name:
[
  {
    "plugin": "zooKeeper",
    "label": "ZooKeeper"
  }
]
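Because the Instana UI shows only the label, a lookup from label to pluginId can be useful. The sketch below assumes the response body has the shape shown above (a JSON array of objects with `plugin` and `label`); the function name is hypothetical.

```python
import json


def plugin_ids_by_label(response_body: str) -> dict:
    """Map each UI display label to the pluginId expected by Nobl9."""
    return {item["label"]: item["plugin"] for item in json.loads(response_body)}


if __name__ == "__main__":
    body = '[{"plugin": "zooKeeper", "label": "ZooKeeper"}]'
    print(plugin_ids_by_label(body)["ZooKeeper"])
```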
metricId
The metricId is the ID of the metric you want to retrieve. You can get the metricId by using the following API request:
curl --request GET \
--url https://${BASE_URL}/api/infrastructure-monitoring/catalog/metrics/${PLUGIN_ID} \
--header "authorization: apiToken ${API_TOKEN}"
PLUGIN_ID is the ID of the plugin you want to retrieve the metricId for. It's the plugin value from the /api/infrastructure-monitoring/catalog/plugins response.
This request returns a list of all available metrics for this specific plugin. The field you are looking for in the response is metricId. Here's an example:
[
  {
    "formatter": "UNDEFINED",
    "label": "ZooKeepers Max request latency",
    "description": "Max request latency",
    "metricId": "max_request_latency",
    "pluginId": "zooKeeper",
    "custom": false
  }
]
Currently, Nobl9 only allows you to retrieve one metric at a time.
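The catalog request and the extraction of metricId values can be sketched with the Python standard library. The sketch only builds the request object and parses a response body; it does not send anything. The function names and the sample host and token are hypothetical, while the URL path and header mirror the curl example above.

```python
import json
import urllib.request


def catalog_metrics_request(base_url: str, plugin_id: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) the metrics catalog request for one plugin."""
    url = f"https://{base_url}/api/infrastructure-monitoring/catalog/metrics/{plugin_id}"
    return urllib.request.Request(url, headers={"authorization": f"apiToken {api_token}"})


def metric_ids(response_body: str) -> list:
    """Collect the metricId of every entry in the catalog response."""
    return [entry["metricId"] for entry in json.loads(response_body)]


if __name__ == "__main__":
    request = catalog_metrics_request("instana.example.com", "zooKeeper", "secret")
    print(request.full_url)
    # To send it: urllib.request.urlopen(request).read().decode()
```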
query
The Query is built using Instana's Dynamic Focus search and filter function.
Dynamic Focus queries use Lucene query syntax. The query must be constructed in the Instana UI and copied unchanged. For more information, refer to the Dynamic Focus Query | Instana documentation.
To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.
Instana does not allow aggregation of infrastructure metrics. Since Nobl9 can only process a single dataset and there is no aggregation on the Nobl9 side, you have to make sure your query is specific and includes, for example, the name of the target cluster, zone, or node.
snapshotId
A snapshot represents static information about an entity as it was at a specific point in time. For more information, refer to the Search Snapshots | Instana API Docs.
You can get the snapshotId from the URL in Instana's UI by looking for the snapshotId=${SNAPSHOT_ID} parameter. For example, for this URL:
https://${BASE_URL}/#/physical/dashboard?timeline.ws=1728000000&timeline.to=1642719600000&timeline.fm=1642719600000&timeline.ar=false&snapshotId=GbMUvQHy12TTRsIm3Lko4LDAklw
the snapshotId is GbMUvQHy12TTRsIm3Lko4LDAklw.
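Pulling the snapshotId out of a copied URL can be automated with a regular expression. This is a minimal sketch with a hypothetical function name; it assumes the parameter appears as `snapshotId=<value>` after a `?`, `&`, or `;` separator, as in the URL above.

```python
import re
from typing import Optional


def snapshot_id_from_url(url: str) -> Optional[str]:
    """Return the value of the snapshotId parameter, or None when it is absent."""
    match = re.search(r"[?&;]snapshotId=([^&;#]+)", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    # instana.example.com stands in for ${BASE_URL}.
    url = ("https://instana.example.com/#/physical/dashboard?timeline.ws=1728000000"
           "&timeline.ar=false&snapshotId=GbMUvQHy12TTRsIm3Lko4LDAklw")
    print(snapshot_id_from_url(url))
```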
Currently, Nobl9 only allows you to retrieve one snapshotId at a time.
Changing the metadata may result in changing the snapshot. When a new snapshotId is generated for the entity you monitor, you must update your SLO. Otherwise, Nobl9 cannot collect any more measurements.
API query example
Here's an example API query for a Threshold > Application metric:
{
  "type":"EXPRESSION",
  "logicalOperator":"AND",
  "elements": [
    {
      "type":"TAG_FILTER",
      "name":"call.inbound_of_application",
      "operator":"EQUALS",
      "entity":"NOT_APPLICABLE",
      "value":"All Services"
    },
    {
      "type":"TAG_FILTER",
      "name":"service.name",
      "operator":"EQUALS",
      "entity":"DESTINATION",
      "value":"datanode"
    },
    {
      "type":"TAG_FILTER",
      "name":"endpoint.name",
      "operator":"EQUALS",
      "entity":"DESTINATION",
      "value":"GET /"
    }
  ]
}
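Since the API query must be valid JSON and the grouping should follow the last logical element of the filter, a copied query can be checked with a short helper. This is a sketch under assumptions: the function name is hypothetical, and it handles only a flat `elements` list of `TAG_FILTER` objects like the example above, not nested expressions.

```python
import json


def last_filter_tag(api_query: str) -> dict:
    """Parse the API query and suggest groupBy fields from its last tag filter."""
    query = json.loads(api_query)  # raises json.JSONDecodeError on invalid JSON
    filters = [element for element in query.get("elements", [])
               if element.get("type") == "TAG_FILTER"]
    if not filters:
        raise ValueError("no TAG_FILTER elements found in the API query")
    last = filters[-1]
    return {"tag": last["name"], "tagEntity": last["entity"]}


if __name__ == "__main__":
    sample = """
    {"type": "EXPRESSION", "logicalOperator": "AND", "elements": [
      {"type": "TAG_FILTER", "name": "service.name", "operator": "EQUALS",
       "entity": "DESTINATION", "value": "datanode"},
      {"type": "TAG_FILTER", "name": "endpoint.name", "operator": "EQUALS",
       "entity": "DESTINATION", "value": "GET /"}
    ]}
    """
    print(last_filter_tag(sample))
```

The suggestion is only a starting point; confirm the resulting groups in the Instana Analytics view.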
Querying the Instana server
Nobl9 queries the Instana server on a per-minute basis. This allows Nobl9 to collect up to one data point per minute.
Instana API rate limits
The following rate limits apply to the Instana API:
- Up to 5000 calls per hour can be made. For more information, refer to the Rate Limits | Instana documentation.
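The per-minute query interval and the hourly rate limit can be combined into a rough capacity estimate. The sketch below assumes that each metric query costs exactly one Instana API call per minute and that a Ratio SLO issues two queries (good and total). Neither assumption is stated in this document, so treat the result as an upper-bound illustration only.

```python
def max_metric_queries(calls_per_hour: int = 5000, queries_per_hour_per_metric: int = 60) -> int:
    """Number of per-minute metric queries that fit in the hourly call budget."""
    return calls_per_hour // queries_per_hour_per_metric


if __name__ == "__main__":
    threshold_slos = max_metric_queries()   # one query per Threshold SLO (assumed)
    ratio_slos = max_metric_queries() // 2  # two queries per Ratio SLO (assumed)
    print(threshold_slos, ratio_slos)
```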