Instana

Instana is an observability platform that delivers automated Application Performance Monitoring (APM), used for website, infrastructure, and application monitoring.

Instana parameters and supported features in Nobl9

General support:
Release channel: Stable, Beta
Connection method: Agent, Direct
Replay and SLI Analyzer: Not supported
Event logs: Supported
Query checker: Not supported
Query parameters retrieval: Not supported
Timestamp cache persistence: Supported

Query parameters:
Query interval: 1 min
Query delay: 1 min
Jitter: 15 sec
Timeout: 30 sec

Agent details and minimum required versions for supported features:
Plugin name: n9instana
Query delay environment variable: INSTANA_QUERY_DELAY
Timestamp cache persistence: 0.65.0

Additional notes:
No support for website and application (ratio) monitoring metrics

Creating SLOs with Instana

Instana allows you to create SLOs based on:

Infrastructure metrics (for infrastructure components)
Application metrics (for defined applications, discovered services, and endpoints)

Infrastructure metrics can be defined as either Threshold metrics or Ratio metrics, but Application metrics can only be defined as Threshold metrics.

See the instructions in the following sections for more details.

Nobl9 Web

Threshold – Infrastructure
Threshold – Application
Ratio – Infrastructure

Follow the instructions below to create an SLO based on a Threshold metric using the Infrastructure type:

Navigate to Service Level Objectives.
Click the button.
In step 1 of the SLO wizard, select the Service the SLO will be associated with.
In step 2, select Instana as the data source for your SLO, then specify the Metric.
Select Threshold metric > Infrastructure.
Enter the Plugin ID (the ID of the plugin available in your monitored system for which you want to retrieve the metric). For more information, refer to the Instana metrics.
Enter the Metric ID, meaning the ID of the metric you want to retrieve. For more information, refer to the Instana metrics.
From the Metric Retrieval Method picklist, select how the metric is retrieved:

Query, using Dynamic Focus search and filter function:

To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.

Snapshot ID, which identifies a snapshot (a unique, immutable set of metadata):

You can get the Snapshot ID from the URL in Instana’s UI by looking for the snapshotId=[SNAPSHOT_ID] parameter, for example, GbMUvWHy12TTRsIm3Lko4LDAklw.

For more information, refer to the Instana metrics.

Enter the Query or the Snapshot ID.
In step 3, define a Time Window for the SLO.
In step 4, specify the Error Budget Calculation Method and your Objective(s).
In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
When you've finished, click Create SLO.
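The choice made in the Metric Retrieval Method step corresponds to the query and snapshotId fields of the sloctl schema, where exactly one of the two may be set. The sketch below shows that rule in Python. The helper name infrastructure_metric is hypothetical and not part of any Nobl9 tooling; the field names are the ones used in the sloctl schema on this page.

```python
from typing import Optional


def infrastructure_metric(plugin_id: str, metric_id: str,
                          query: Optional[str] = None,
                          snapshot_id: Optional[str] = None) -> dict:
    """Build the `instana` part of an infrastructure metric (hypothetical helper)."""
    # Exactly one of query / snapshot_id must be provided (XOR).
    if (query is None) == (snapshot_id is None):
        raise ValueError("provide exactly one of query or snapshot_id")
    infra = {"metricId": metric_id, "pluginId": plugin_id}
    if query is not None:
        infra["metricRetrievalMethod"] = "query"
        infra["query"] = query
    else:
        infra["metricRetrievalMethod"] = "snapshot"
        infra["snapshotId"] = snapshot_id
    return {"metricType": "infrastructure", "infrastructure": infra}


if __name__ == "__main__":
    print(infrastructure_metric(
        "zooKeeper", "max_request_latency",
        query="entity.selfType:zookeeper AND entity.label:replica.1"))
```

Raising an error when both or neither value is given mirrors the XOR comments in the schema, so a mistake surfaces before the YAML is applied.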

Follow the instructions below to create an SLO based on a Threshold metric using the Application type:

Navigate to Service Level Objectives.

Click the button.

In step 1 of the SLO wizard, select the Service the SLO will be associated with.

In step 2, select Instana as the data source for your SLO, then specify the Metric.

Select Threshold metric > Application.

Select the Metric ID you want to use from the following list:

Calls - to monitor the number of received calls
Erroneous Calls - to monitor the number of erroneous calls
Erroneous Calls Rate - to monitor the error rate of received calls
Latency - to monitor the latency of received calls in milliseconds

Select the Aggregation. The following list shows the aggregations available for each Metric ID:

Calls: sum
Erroneous Calls: sum
Erroneous Calls Rate: mean
Latency: sum, mean, max, min, p25, p50, p75, p90, p95, p98, p99

Note that for the Calls, Erroneous Calls, and Erroneous Calls Rate Metric IDs, the Aggregation value is preselected, because each of them has only one available aggregation.

Enter the API query. You must create this in the Instana UI and copy and paste it here. There are two methods you can use to specify the query:

Method 1:
On the Applications tab, select the Application, Service, or Endpoint you want to observe, then click the Analyze Calls button.

In the Filter field, you can already see a partially defined query. The applied filter must point to the exact entity you want to be observed, for example:

Filter using entity name - be as specific as possible. Providing only the endpoint name or a service name in the filter will most likely be insufficient. There can be a lot of GET / endpoints belonging to different services and applications. Likewise, the same service name can appear in various applications.

Any additional manual selections from the left panel in Instana UI will be included in the API query, for example, HTTP Status or Technologies.

Decide which of the hidden calls you’d like to include - Synthetic or Internal. They are not included in the API query and need to be passed to Nobl9 manually.

Copy the API query. Make sure the Include filter sidebar items toggle is on; otherwise, the additional manual selections won’t be included in the API query.

Select the Group, meaning the single entity you want to be observed. Group by the most specific parameter in the created filter. You can always view the resulting groups below the charts in the Analytics view.

Follow these guidelines to apply the correct grouping:

A combination of filters and group elements must point to a single entity.
Group by the last logical element in the defined filter. Be as specific as possible.
The state of the monitored system may change and more than one group can be associated with the previously applied grouping. Therefore, group by entity names as accurately as possible.
You may have to change the group element accordingly when you change the applied filter.

Method 2:
Go to Analytics, specify the query, and follow the instructions above.
For more information, refer to the Application Analyze | Instana Documentation.

Enter the Tag, Tag Entity, and Tag Second Level Key (if applicable). You can get the Group details in two ways:

In the Instana UI, look for the groupBy=(...) section in the URL.

Note that field names vary between the Instana API and the Nobl9 API:

Field name in Instana	Field name in Nobl9
`groupbyTag`	`tag`
`groupbyTagEntity`	`tagEntity`
`groupbyTagSecondLevelKey`	`tagSecondLevelKey`

Parse the deeplink by using the following script and convert it to YAML for the Nobl9 sloctl:

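#!/bin/bash
# Expects the Instana deeplink in the DEEPLINK environment variable.
# Written for GNU sed: the "\n" in the first sed replacement is not portable.
echo "$DEEPLINK" |
  sed 's/;/\n/g' |          # one URL section per line
  grep groupBy |            # keep only the groupBy=(...) section
  sed 's/groupBy=(//' |     # drop the leading "groupBy=("
  sed 's/).*//' |           # drop the closing ")" and anything after it
  tr -d '\n' |
  # Split on "~" and print alternating key/value items as YAML pairs.
  awk '
    BEGIN {RS = "~"; print "groupBy:"}
    {ctr++}
    ctr == 1 {key = $0}
    ctr == 2 {value = $0; printf "  %s: \"%s\"\n", key, value; ctr = 0}' |
  sed 's/groupbyT/t/g' |    # groupbyTag -> tag, groupbyTagEntity -> tagEntity, ...
  # Add the default tagEntity when the deeplink does not provide one.
  awk -F':' '
    BEGIN {tagEntityExists = 0}
    $1 ~ "tagEntity" {tagEntityExists = 1}
    {print}
    END {
      if(tagEntityExists == 0)
        {print "  tagEntity: \"NOT_APPLICABLE\""}
    }'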

If you want to include the hidden calls, check the Synthetic or Internal checkbox in the Include Hidden Calls field.

In step 3, define a Time Window for the SLO.

In step 4, specify the Error Budget Calculation Method and your Objective(s).

In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.

When you’ve finished, click Create SLO.
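The deeplink parsing from the Tag step above can also be sketched in Python. This is a minimal sketch that assumes the deeplink contains a section shaped like groupBy=(key~value~key~value), which is what the shell script in that step expects; the function name parse_group_by and the sample deeplink in the usage lines are made up for illustration.

```python
import re

# Instana deeplink field name -> Nobl9 field name (from the field name table above).
FIELD_NAMES = {
    "groupbyTag": "tag",
    "groupbyTagEntity": "tagEntity",
    "groupbyTagSecondLevelKey": "tagSecondLevelKey",
}


def parse_group_by(deeplink: str) -> dict:
    """Extract the groupBy=(...) section and rename its keys for Nobl9."""
    match = re.search(r"groupBy=\(([^)]*)\)", deeplink)
    if match is None:
        raise ValueError("no groupBy=(...) section found in the deeplink")
    parts = match.group(1).split("~")
    # Items alternate key, value, key, value; an unpaired trailing item is ignored.
    pairs = zip(parts[0::2], parts[1::2])
    group_by = {FIELD_NAMES.get(key, key): value for key, value in pairs}
    # Same default as the shell script: tagEntity falls back to NOT_APPLICABLE.
    group_by.setdefault("tagEntity", "NOT_APPLICABLE")
    return group_by


if __name__ == "__main__":
    # Hypothetical deeplink, shortened for readability.
    link = ("https://instana.example.com/#/analyze;dataSource=calls;"
            "groupBy=(groupbyTag~endpoint.name~groupbyTagEntity~DESTINATION)")
    print(parse_group_by(link))
```

The returned dictionary has the same keys as the groupBy block in the sloctl schema, so it can be copied into the YAML by hand or serialized with a YAML library of your choice.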

Follow the instructions below to create an SLO based on a Ratio metric using the Infrastructure type:

Navigate to Service Level Objectives.

Click the button.

In step 1 of the SLO wizard, select the Service the SLO will be associated with.

In step 2, select Instana as the data source for your SLO, then specify the Metric.

Select Ratio metric > Infrastructure.

Choose the Data Count Method:

Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.

Incremental: counts incoming metric values cumulatively, adding each new value to the previous ones. This results in a constantly increasing SLO graph.

Enter the Plugin ID for Good and Total metrics (that is, the ID of the plugin available in your monitored system for which you want to retrieve the metric).

For more information, refer to the Instana metrics.

Enter the Metric ID for the Good and Total metrics (the ID of the metric you want to retrieve).

For more information, refer to the Instana metrics.

From the Metric Retrieval Method picklist, select how the metric is retrieved:

Query, using the Dynamic Focus search and filter function.

Snapshot ID, which identifies a snapshot (a unique, immutable set of metadata).

Enter the Query or the Snapshot ID for the good and total metrics:

To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.

Build the query using entity names filters - be as specific as possible. Since Nobl9 can only process a single dataset and there is no aggregation on the Nobl9 side, the applied filter must point to the exact entity you want to be observed.

You can get the Snapshot ID from the URL in Instana’s UI by looking for the snapshotId=[SNAPSHOT_ID] parameter, for example, GbMUvWHy12TTRsIm3Lko4LDAklw.

For more information, refer to the Instana metrics.

Note: A Ratio metric lets you combine different Metric Retrieval Methods. For example, you can use the Query method for the good metric and the Snapshot ID method for the total metric.

In step 3, define a Time Window for the SLO.

In step 4, specify the Error Budget Calculation Method and your Objective(s).

In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.

When you’ve finished, click Create SLO.
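As a rough illustration of the two Data Count Methods chosen in the steps above (not Nobl9's actual implementation), the sketch below shows how a series of per-interval values relates to its running total. The sample numbers and both function names are made up.

```python
from itertools import accumulate
from typing import List


def to_incremental(per_interval: List[float]) -> List[float]:
    """Running total: each point adds the new value to all previous ones."""
    return list(accumulate(per_interval))


def to_non_incremental(running_total: List[float]) -> List[float]:
    """Per-interval values recovered as differences between neighbors."""
    previous = [0] + running_total[:-1]
    return [current - prev for current, prev in zip(running_total, previous)]


if __name__ == "__main__":
    counts = [3, 0, 5, 2]            # made-up per-minute values
    print(to_incremental(counts))    # never decreases for non-negative input
    print(to_non_incremental(to_incremental(counts)))
```

A series that only ever grows suggests the Incremental method, while a series that rises and falls from point to point suggests Non-incremental.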

SLI values for good and total

When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:

Must be positive.
Should preferably be integers, although fractions are also acceptable. If you use fractions, we recommend values larger than 1e-4 = 0.0001.
Shouldn't be larger than 1e+20.
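The constraints above can be checked before a query is used in an SLO. The following is a minimal sketch; the function name check_sli_value is hypothetical, and it reads "positive" literally, so a value of zero is reported as a problem.

```python
from typing import List

MIN_RECOMMENDED_FRACTION = 1e-4  # fractions should be larger than this
MAX_RECOMMENDED_VALUE = 1e20     # values shouldn't be larger than this


def check_sli_value(value: float) -> List[str]:
    """Return the problems found for a single good or total value."""
    if value <= 0:
        return ["must be positive"]
    problems = []
    # Only non-integer values are subject to the minimum-fraction recommendation.
    if not float(value).is_integer() and value <= MIN_RECOMMENDED_FRACTION:
        problems.append("fraction should be larger than 1e-4")
    if value > MAX_RECOMMENDED_VALUE:
        problems.append("should not be larger than 1e+20")
    return problems


if __name__ == "__main__":
    for sample in (10, 0.5, 0.00001, -1, 1e21):
        print(sample, check_sli_value(sample))
```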

sloctl

Generic schema with a description of objects and field validations:

apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: string
  displayName: string # optional
  project: string
spec:
  description: string # optional
  service: [service name] # name of the service you defined in the same project as the SLO
  indicator:
    metricSource:
      name: [datasource name] # name of the data source you defined
      project: [project name] # optional; if not defined, the project is the same as the SLO's
    rawMetric:
      # exactly one source type, depending on the metricSource selected for the SLO
      instana: # application XOR infrastructure
        metricType: oneOf{"application", "infrastructure"} # mandatory
        infrastructure:
          metricRetrievalMethod: oneOf{"query", "snapshot"} # mandatory
          query: "string" # XOR with snapshotId
          snapshotId: "string" # XOR with query
          metricId: "string" # mandatory
          pluginId: "string" # mandatory
        application:
          metricId: # mandatory, oneOf{"calls", "erroneousCalls", "errors", "latency"}
          aggregation: "" # mandatory, value depends on the metricId type. See notes below
          groupBy: # mandatory
            tag: "" # mandatory
            tagEntity: "" # mandatory, oneOf{"DESTINATION", "SOURCE", "NOT_APPLICABLE"}
            tagSecondLevelKey: "" # mandatory
          apiQuery: "{}" # mandatory, the API query passed as a string; must be valid JSON
          includeInternal: false # optional, default value is false
          includeSynthetic: false # optional, default value is false

Notes:

aggregation - Depends on the value specified for metricId:
- For calls and erroneousCalls: use sum.
- For errors: use mean.
- For latency: use one of the values sum, mean, min, max, p25, p50, p75, p90, p95, p98, p99.
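The aggregation rules in the notes above can be written down as a lookup table. This is a small sketch for checking a metricId and aggregation pair before putting it into YAML; the names ALLOWED_AGGREGATIONS and is_valid_aggregation are hypothetical, while the values come from the notes.

```python
# Allowed aggregation values per application metricId, as listed in the notes.
ALLOWED_AGGREGATIONS = {
    "calls": {"sum"},
    "erroneousCalls": {"sum"},
    "errors": {"mean"},
    "latency": {"sum", "mean", "min", "max",
                "p25", "p50", "p75", "p90", "p95", "p98", "p99"},
}


def is_valid_aggregation(metric_id: str, aggregation: str) -> bool:
    """True when the aggregation is allowed for the given application metricId."""
    return aggregation in ALLOWED_AGGREGATIONS.get(metric_id, set())


if __name__ == "__main__":
    print(is_valid_aggregation("latency", "p99"))
    print(is_valid_aggregation("calls", "mean"))
```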

Threshold (rawMetric)

Application query
Infrastructure query
Infrastructure snapshot

Sample Instana threshold SLO with the metricType: application
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: application
            application:
              metricId: calls
              aggregation: sum
              groupBy:
                tag: application.name
                tagEntity: DESTINATION
              apiQuery: |
                {
                  "type": "EXPRESSION",
                  "logicalOperator": "AND",
                  "elements": [
                    {
                      "type": "TAG_FILTER",
                      "name": "kubernetes.cluster.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "n9-dev-tooling-cluster"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "kubernetes.container.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "data-node"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "call.type",
                      "operator": "EQUALS",
                      "entity": "NOT_APPLICABLE",
                      "value": "HTTP"
                    },
                    {
                      "type": "TAG_FILTER",
                      "name": "endpoint.name",
                      "operator": "EQUALS",
                      "entity": "DESTINATION",
                      "value": "GET /"
                    }
                  ]
                }
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Instana threshold SLO with the metricType: infrastructure with the query retrieval method
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: max_request_latency
              pluginId: zooKeeper
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Instana threshold SLO with the metricType: infrastructure with the snapshot ID retrieval method
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: max_request_latency
              pluginId: zooKeeper
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Ratio (countMetric)

Infrastructure query
Infrastructure snapshot

Sample Instana ratio SLO with the metricType: infrastructure with the query retrieval method
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: error_requests_count
              pluginId: zooKeeper
        total:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: query
              query: entity.selfType:zookeeper AND entity.label:replica.1
              metricId: total_requests_count
              pluginId: zooKeeper
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Instana ratio SLO with the metricType: infrastructure with the snapshot ID retrieval method
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Instana SLO
  indicator:
    metricSource:
      name: instana
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1.0
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: error_requests_count
              pluginId: zooKeeper
        total:
          instana:
            metricType: infrastructure
            infrastructure:
              metricRetrievalMethod: snapshot
              snapshotId: 00u2y4e4atkzaYkXP4x8
              metricId: total_requests_count
              pluginId: zooKeeper
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Instana metrics

`pluginId`

Plugins are entities for which metrics are collected. You cannot get the pluginId from the Instana UI. To fetch the list of available plugins, use the following API request:

curl --request GET \
       --url https://${BASE_URL}/api/infrastructure-monitoring/catalog/plugins \
       --header "authorization: apiToken ${API_TOKEN}"

This request returns a list of plugins. In each entry, the plugin value is the pluginId, while label is the display name used in the Instana UI:

[
    {
      "plugin": "zooKeeper",
      "label": "ZooKeeper"
    }
]
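Since the Instana UI shows only the label, it can help to look up the pluginId by that display name. The sketch below does this on an already downloaded response; the function name plugin_id_for_label is hypothetical, and the sample data is the response shown above.

```python
import json
from typing import List, Optional


def plugin_id_for_label(plugins: List[dict], label: str) -> Optional[str]:
    """Return the `plugin` value whose `label` matches, ignoring case."""
    for entry in plugins:
        if entry.get("label", "").lower() == label.lower():
            return entry.get("plugin")
    return None


if __name__ == "__main__":
    response_body = '[{"plugin": "zooKeeper", "label": "ZooKeeper"}]'
    print(plugin_id_for_label(json.loads(response_body), "ZooKeeper"))
```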

`metricId`

The metricId is the ID of the metric you want to retrieve. You can get the metricId by using the following API request:

curl --request GET \
       --url https://${BASE_URL}/api/infrastructure-monitoring/catalog/metrics/${PLUGIN_ID} \
       --header "authorization: apiToken ${API_TOKEN}"

Note: PLUGIN_ID is the ID of the plugin you want to retrieve the metricId for. Use the plugin value from the /api/infrastructure-monitoring/catalog/plugins response.

This request returns a list of all available metrics for the specified plugin. The value you need from the response is metricId. Here's an example:

[
    {
      "formatter": "UNDEFINED",
      "label": "ZooKeepers Max request latency",
      "description": "Max request latency",
      "metricId": "max_request_latency",
      "pluginId": "zooKeeper",
      "custom": false
    }
]
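The same catalog request can be prepared from Python with the standard library. This is a sketch that mirrors the curl command above; the function names are hypothetical, and BASE_URL and API_TOKEN are placeholders you supply, as in the curl example.

```python
import json
import urllib.request
from typing import List
from urllib.parse import quote


def catalog_metrics_request(base_url: str, api_token: str,
                            plugin_id: str) -> urllib.request.Request:
    """Prepare (but do not send) the metrics catalog request for one plugin."""
    url = (f"https://{base_url}/api/infrastructure-monitoring/"
           f"catalog/metrics/{quote(plugin_id, safe='')}")
    headers = {"authorization": f"apiToken {api_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def metric_ids(response_body: str) -> List[str]:
    """Collect the metricId values from the catalog response body."""
    return [metric["metricId"] for metric in json.loads(response_body)]


# Sending the request requires network access and a valid token:
#   with urllib.request.urlopen(catalog_metrics_request(...)) as response:
#       print(metric_ids(response.read().decode("utf-8")))
```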

Note: Currently, Nobl9 only allows you to retrieve one metric at a time.

`query`

The Query is built using Instana's Dynamic Focus search and filter function.

Dynamic Focus queries use Lucene query syntax. The query must be constructed in the Instana UI and copied unchanged. For more information, refer to the Dynamic Focus Query | Instana documentation.

To provide the query, go to Infrastructure > Map in the Instana UI and build the query in the input field, for example, entity.selfType:zookeeper AND entity.label:replica.1.

Note: Instana does not allow aggregation of infrastructure metrics. Since Nobl9 can only process a single dataset and there is no aggregation on the Nobl9 side, you have to make sure your query is specific and includes, for example, the name of the target cluster, zone, or node.

`snapshotId`

A snapshot represents static information about an entity as it was at a specific point in time. For more information, refer to the Search Snapshots | Instana API Docs.

You can get the snapshotId from the URL in Instana's UI by looking for the snapshotId=${SNAPSHOT_ID} parameter. For example, for this URL:

https://${BASE_URL}/#/physical/dashboard?timeline.ws=1728000000&timeline.to=1642719600000&timeline.fm=1642719600000&timeline.ar=false&snapshotId=GbMUvQHy12TTRsIm3Lko4LDAklw

the snapshotId is GbMUvQHy12TTRsIm3Lko4LDAklw.
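Extracting the parameter can be scripted as well. In the example URL, the parameters follow the # character, so they are part of the URL fragment and not the regular query string. The sketch below checks both places; the function name snapshot_id_from_url and the host name in the usage lines are made up.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def snapshot_id_from_url(url: str) -> Optional[str]:
    """Return the snapshotId parameter of an Instana UI URL, if present."""
    parts = urlsplit(url)
    # Parameters after "#/...?" live in the fragment; check it first,
    # then fall back to the regular query string.
    fragment_query = parts.fragment.partition("?")[2]
    for query in (fragment_query, parts.query):
        values = parse_qs(query).get("snapshotId")
        if values:
            return values[0]
    return None


if __name__ == "__main__":
    url = ("https://instana.example.com/#/physical/dashboard"
           "?timeline.ws=1728000000&timeline.ar=false"
           "&snapshotId=GbMUvQHy12TTRsIm3Lko4LDAklw")
    print(snapshot_id_from_url(url))
```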

Note: Currently, Nobl9 only allows you to retrieve one snapshotId at a time.

Warning: Changing the metadata may result in changing the snapshot. When a new snapshotId is generated for the entity you monitor, you must update your SLO. Otherwise, Nobl9 cannot collect any more measurements.

API query example

Here's an example API query for a Threshold > Application metric:

{
  "type":"EXPRESSION",
  "logicalOperator":"AND",
  "elements": [
    {
        "type":"TAG_FILTER",
        "name":"call.inbound_of_application",
        "operator":"EQUALS",
        "entity":"NOT_APPLICABLE",
        "value":"All Services"
    },
    {
        "type":"TAG_FILTER",
        "name":"service.name",
        "operator":"EQUALS",
        "entity":"DESTINATION",
        "value":"datanode"
    },
    {
        "type":"TAG_FILTER",
        "name":"endpoint.name",
        "operator":"EQUALS",
        "entity":"DESTINATION",
        "value":"GET /"
    }
  ]
}
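Because the apiQuery must be valid JSON and should point to a single entity, it can be worth inspecting the copied query before pasting it into an SLO. The sketch below parses the query and lists its TAG_FILTER elements; the function name tag_filters is hypothetical, and the only structure assumed is the one visible in the example above (type, name, entity, value, and nested elements).

```python
import json
from typing import List, Tuple


def tag_filters(api_query: str) -> List[Tuple]:
    """Return (name, entity, value) for every TAG_FILTER in the query."""
    try:
        root = json.loads(api_query)
    except json.JSONDecodeError as err:
        raise ValueError(f"apiQuery is not valid JSON: {err}") from err

    found = []

    def walk(node):
        if not isinstance(node, dict):
            return
        if node.get("type") == "TAG_FILTER":
            found.append((node.get("name"), node.get("entity"), node.get("value")))
        # EXPRESSION nodes hold further filters or expressions in "elements".
        for child in node.get("elements", []):
            walk(child)

    walk(root)
    return found


if __name__ == "__main__":
    query = """{"type": "EXPRESSION", "logicalOperator": "AND", "elements": [
      {"type": "TAG_FILTER", "name": "service.name", "operator": "EQUALS",
       "entity": "DESTINATION", "value": "datanode"}]}"""
    for name, entity, value in tag_filters(query):
        print(name, entity, value)
```

A listing with only an endpoint name, for example, is a hint that the filter may match endpoints of several services.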

Querying the Instana server

Nobl9 queries the Instana server once per minute, which allows it to collect up to one data point per minute.

Instana API rate limits

The following rate limits apply to the Instana API:

Up to 5000 calls per hour can be made. For more information, refer to the Rate Limits | Instana documentation.
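To get a feel for how the per-minute query interval relates to this limit, the arithmetic can be sketched as follows. The sketch assumes that each queried metric costs exactly one Instana API call per query interval and that nothing else uses the same API token. Neither assumption is stated in the source, so treat the result as a rough estimate only.

```python
CALLS_PER_HOUR_LIMIT = 5000   # Instana API rate limit quoted above
QUERY_INTERVAL_MINUTES = 1    # Nobl9 query interval for Instana


def max_metrics_within_limit(limit_per_hour: int = CALLS_PER_HOUR_LIMIT,
                             interval_minutes: int = QUERY_INTERVAL_MINUTES) -> int:
    """Largest number of metrics that fits in the hourly limit.

    ASSUMPTION: one API call per metric per query interval.
    """
    calls_per_metric_per_hour = 60 // interval_minutes
    return limit_per_hour // calls_per_metric_per_hour


if __name__ == "__main__":
    metrics = max_metrics_within_limit()
    print("threshold metrics:", metrics)          # one query each
    print("ratio objectives:", metrics // 2)      # good + total = two queries
```

Under these assumptions, 60 calls per metric per hour leave room for 83 metrics, because 83 × 60 = 4980 calls stays under the limit and 84 × 60 = 5040 does not.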

Useful links

For a more in-depth look, consult additional resources:

Add Instana as a data source | Adding data sources
Infrastructure metrics | Instana docs
Application analyze | Instana docs
Dynamic focus query | Instana docs
IBM observability | Instana docs
Rate limits | Instana docs
Creating SLOs via Terraform | Terraform
