Sumo Logic

Sumo Logic is an observability platform that provides visibility into AWS, Azure, and GCP cloud applications and infrastructure.

Sumo Logic parameters and supported features in Nobl9

General support:
- Release channel: Stable, Beta
- Connection method: Agent, Direct
- Replay and SLI Analyzer: Not supported
- Event logs: Supported
- Query checker: Not supported
- Query parameters retrieval: Supported
- Timestamp cache persistence: Supported

Query parameters:
- Query interval: 2 min
- Query delay: 4 min
- Jitter: 30 sec
- Timeout: 30 sec

Agent details and minimum required versions for supported features:
- Plugin name: n9sumologic
- Query delay environment variable: SUMOLOGIC_QUERY_DELAY
- Query parameters retrieval: 0.73.2
- Timestamp cache persistence: 0.65.0

Additional notes:
- Supported authentication using <accessId>:<accessKey>
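
As a rough illustration of the <accessId>:<accessKey> format, the Python sketch below builds an HTTP Basic Authorization header from such a pair. It assumes the credentials are sent as HTTP Basic authentication; the helper name basic_auth_header and the credential values are hypothetical placeholders, not part of Nobl9 or Sumo Logic tooling.

```python
import base64


def basic_auth_header(access_id: str, access_key: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from <accessId>:<accessKey>."""
    credentials = f"{access_id}:{access_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials (hypothetical); never hard-code real keys.
headers = basic_auth_header("myAccessId", "myAccessKey")
```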

Creating SLOs with Sumo Logic

With Sumo Logic as a data source, you can create both threshold and ratio SLOs by:

Entering logs
Entering metrics

See the instructions in the following sections for more details.

Nobl9 Web

Threshold – Metrics
Threshold – Logs
Ratio – Metrics
Ratio – Logs

Follow the instructions below to create a Sumo Logic threshold metric using the Metrics type:

Navigate to Service Level Objectives.
Click the button.
In step 1 of the SLO wizard, select the Service the SLO will be associated with.
In step 2, select Sumo Logic as the data source for your SLO, then specify the Metric.
Select Threshold metric > Metrics.
Select value and units for Quantization.

In Sumo Logic, quantization is the process of aggregating metric data points for time series over an interval of time. The minimum value for this field is 15s.
For more details, refer to the Sumo Logic documentation.

Select value for Rollup. Rollup is an aggregation function Sumo Logic uses when quantizing metrics.

Select one of the following values: avg, sum, min, max, count, none.
Default value is none.

Enter a Query.

Sample query for a Sumo Logic threshold metric (Metrics type): metric=CPU_usage

In step 3, define a Time Window for the SLO.
In step 4, specify the Error Budget Calculation Method and your Objective(s).
In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
When you’re done, click Create SLO.

Follow the instructions below to create a Sumo Logic threshold metric using the Logs type:

Navigate to Service Level Objectives.
Click the button.
In step 1 of the SLO wizard, select the Service the SLO will be associated with.
In step 2, select Sumo Logic as the data source for your SLO, then specify the Metric.
Select Threshold metric > Logs.
Enter a Query.

The Query must contain the keyword timeslice.
Sample query for a Sumo Logic threshold metric (Logs type):

_sourceCategory=uploads/nginx
| timeslice 1m as n9_time
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)
| if (status_code matches "20*" or status_code matches "30*",1,0) as resp_ok
| sum(resp_ok) as n9_value by n9_time
| sort by n9_time asc

In step 3, define a Time Window for the SLO.
In step 4, specify the Error Budget Calculation Method and your Objective(s).
In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
When you’re done, click Create SLO.

Follow the instructions below to create a Sumo Logic ratio metric using the Logs type:

Navigate to Service Level Objectives.
Click the button.
In step 1 of the SLO wizard, select the Service the SLO will be associated with.
In step 2, select Sumo Logic as the data source for your SLO, then specify the Metric.
Select Ratio metric > Logs.
Choose the Data Count Method.

Non-incremental: counts incoming metric values one by one, so the resulting SLO graph is spike-shaped.
Incremental: counts the incoming metric values incrementally, adding each new value to the previous ones. This results in a constantly increasing SLO graph.

Enter a Query. The query must contain the keyword timeslice.

Good query for the ratio metric (Logs type):

_sourceCategory=uploads/nginx
| timeslice 1m as n9_time
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)
| if (status_code matches "20*" or status_code matches "30*",1,0) as resp_ok
| sum(resp_ok) as n9_value by n9_time
| sort by n9_time asc

Total query for the ratio metric (Logs type):

_sourceCategory=uploads/nginx
| timeslice 1m as n9_time
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)
| count(*) as n9_value by n9_time
| sort by n9_time asc

In step 3, define a Time Window for the SLO.
In step 4, specify the Error Budget Calculation Method and your Objective(s).
In step 5, add a Name, Description, and other details about your SLO. You can also select Alert policies and Labels on this screen.
When you’re done, click Create SLO.
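
To illustrate the two Data Count Methods from the steps above, the following Python sketch shows how the same per-interval counts look as a non-incremental series and as an incremental (cumulative) one. The sample values are made up, and the code only demonstrates the concept; it is not Nobl9's internal processing.

```python
from itertools import accumulate

# Hypothetical per-interval counts of good events.
per_interval = [2, 58, 46, 12]

# Non-incremental: each point is reported on its own, so the graph shows spikes.
non_incremental = list(per_interval)

# Incremental: each point adds to the running total, so the series never decreases.
incremental = list(accumulate(per_interval))

# Per-interval deltas can be recovered from an incremental series.
deltas = [current - previous for previous, current in zip([0] + incremental, incremental)]
```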

SLI values for good and total

When choosing the query for the ratio SLI (countMetrics), keep in mind that the values resulting from that query for both good and total:

Must be positive.
Should preferably be integers, although fractions are also acceptable. If you use fractions, we recommend values larger than 1e-4 = 0.0001.
Shouldn't be larger than 1e+20.
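
A minimal Python sketch of these constraints follows. The function name check_sli_value is hypothetical, and the handling of zero is an assumption: the list above does not say how zero counts are treated, so the sketch flags only negative values.

```python
MIN_RECOMMENDED_FRACTION = 1e-4  # recommended lower bound for fractional values
MAX_VALUE = 1e20                 # values should not exceed this


def check_sli_value(value: float) -> list[str]:
    """Return warnings for a single good or total value (empty list if none)."""
    problems = []
    # Assumption: zero is not flagged, only negative values are.
    if value < 0:
        problems.append("must be positive")
    is_fraction = not float(value).is_integer()
    if is_fraction and 0 < value <= MIN_RECOMMENDED_FRACTION:
        problems.append("fractions should be larger than 1e-4")
    if value > MAX_VALUE:
        problems.append("should not be larger than 1e+20")
    return problems
```

For example, check_sli_value(58) returns an empty list, while check_sli_value(0.00001) returns the warning about small fractions.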

sloctl

Sumo Logic metrics

Threshold (rawMetric)
Ratio (countMetric)

Sample Sumo Logic threshold metrics SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Sumo Logic SLO
  indicator:
    metricSource:
      name: sumo-logic
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200.0
      name: ok
      target: 0.95
      rawMetric:
        query:
          sumoLogic:
            type: metrics
            query: metric=CPU_Usage
            quantization: 15s
            rollup: Avg
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Sumo Logic ratio metrics SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Sumo Logic SLO
  indicator:
    metricSource:
      name: sumo-logic
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1.0
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          sumoLogic:
            type: metrics
            query: metric=Mem_Used
            quantization: 15s
            rollup: Avg
        total:
          sumoLogic:
            type: metrics
            query: metric=Mem_Total
            quantization: 15s
            rollup: Avg
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: "2022-12-01 00:00:00"
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Mandatory requirements for Sumo Logic metrics SLOs

Specification for Sumo Logic metrics has the following mandatory fields:

sumoLogic
- type - string field. Select only one of the following values: metrics or logs.
- quantization - the period of data aggregation, given as an integer with a time unit (e.g., s, h).
  - In Sumo Logic, quantization is the process of aggregating metric data points for time series over an interval of time. The minimum value for this field is 15s.
  - For more details, refer to the Metric Quantization | Sumo Logic documentation.
- rollup - string field.
  Rollup is an aggregation function Sumo Logic uses when quantizing metrics. Choose one of the values below (default is none):
  - avg, sum, min, max, count, none.
  - For more details, refer to the Rollup Types | Sumo Logic documentation.
- query - string field.
  Your custom query. Example: metric=CPU_usage
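
To make the relationship between quantization and rollup more concrete, here is a conceptual Python sketch that groups data points into fixed time buckets and aggregates each bucket. It is an illustration only, not Sumo Logic's implementation: the function name quantize and the sample points are hypothetical, and the none rollup is not modeled.

```python
from collections import defaultdict
from statistics import mean

# Rollup names from the list above mapped to Python aggregation functions.
ROLLUPS = {"avg": mean, "sum": sum, "min": min, "max": max, "count": len}


def quantize(points, interval_s=15, rollup="avg"):
    """Group (unix_seconds, value) points into buckets and aggregate each bucket."""
    if interval_s < 15:
        raise ValueError("quantization must be at least 15s")
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = timestamp - timestamp % interval_s
        buckets[bucket_start].append(value)
    aggregate = ROLLUPS[rollup]
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


# Hypothetical data points: (seconds, value).
points = [(0, 10.0), (5, 20.0), (14, 30.0), (15, 40.0), (29, 60.0)]
averaged = quantize(points, interval_s=15, rollup="avg")
```

With a 15s quantization, the five sample points collapse into two buckets, and the rollup decides which single value represents each bucket.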

Sumo Logic logs

Threshold (rawMetric)
Ratio (countMetric)

Sample Sumo Logic threshold logs SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Sumo Logic SLO
  indicator:
    metricSource:
      name: sumo-logic
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 200
      name: ok
      target: 0.95
      rawMetric:
        query:
          sumoLogic:
            type: logs
            query: >-
              _sourceCategory=uploads/nginx

              | timeslice 1m as n9_time

              | parse "HTTP/1.1\" * * *" as (status_code, size, tail)

              | if (status_code matches "20*" or status_code matches "30*",1,0)
              as resp_ok

              | sum(resp_ok) as n9_value by n9_time

              | sort by n9_time asc
      op: lte
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Sample Sumo Logic ratio logs SLO
apiVersion: n9/v1alpha
kind: SLO
metadata:
  name: api-server-slo
  displayName: API Server SLO
  project: default
  labels:
    area:
      - latency
      - slow-check
    env:
      - prod
      - dev
    region:
      - us
      - eu
    team:
      - green
      - sales
  annotations:
    area: latency
    env: prod
    region: us
    team: sales
spec:
  description: Example Sumo Logic SLO
  indicator:
    metricSource:
      name: sumo-logic
      project: default
      kind: Agent
  budgetingMethod: Occurrences
  objectives:
    - displayName: Good response (200)
      value: 1
      name: ok
      target: 0.95
      countMetrics:
        incremental: true
        good:
          sumoLogic:
            type: logs
            query: |-
              _collector="app-cluster" _source="logs"
              | json "log"
              | timeslice 15s as n9_time
              | parse "level=* *" as (log_level, tail)
              | if (log_level matches "error" ,0,1) as log_level_not_error
              | sum(log_level_not_error) as n9_value by n9_time
              | sort by n9_time asc
        total:
          sumoLogic:
            type: logs
            query: |-
              _collector="app-cluster" _source="logs"
              | json "log"
              | timeslice 15s as n9_time
              | parse "level=* *" as (log_level, tail)
              | count(*) as n9_value by n9_time
              | sort by n9_time asc
      primary: true
  service: api-server
  timeWindows:
    - unit: Month
      count: 1
      isRolling: false
      calendar:
        startTime: '2022-12-01 00:00:00'
        timeZone: UTC
  alertPolicies:
    - fast-burn-5x-for-last-10m
  attachments:
    - url: https://docs.nobl9.com
      displayName: Nobl9 Documentation
  anomalyConfig:
    noData:
      alertMethods:
        - name: slack-notification
          project: default
      alertAfter: 1h

Mandatory requirements for Sumo Logic Logs queries

query:
- Must contain the keyword timeslice:
  - Sumo Logic supports only integer values (for example, 15s, 1m, 1050ms).
  - The minimum value for timeslice is 15 sec.
- Must contain n9_time and n9_value: n9_time is the actual time, and n9_value is the metric value. n9_time must be a Unix timestamp and n9_value must be a float value.
- Must contain an aggregation keyword, such as count(*) as n9_value by n9_time.
- Alias fields in your query with the as operator to ensure that n9_time and n9_value are returned by your query. For details on the as operator, refer to the Sumo Logic documentation.

For more details on constructing Sumo Logic queries, see the Querying for logs section below.
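
The requirements above can be approximated with a simple pre-flight check before saving an SLO. The Python sketch below is a heuristic under stated assumptions: the function name check_logs_query is hypothetical, the unit table is an assumption about timeslice units, and the regular expressions do not parse the full Sumo Logic query language.

```python
import re

# Assumed timeslice units and their length in seconds.
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def check_logs_query(query: str) -> list[str]:
    """Return a list of problems found in a logs query (empty if none)."""
    problems = []
    match = re.search(r"\|\s*timeslice\s+(\d+)(ms|s|m|h|d)\b", query)
    if match is None:
        problems.append("missing timeslice with an integer interval")
    elif int(match.group(1)) * UNIT_SECONDS[match.group(2)] < 15:
        problems.append("timeslice is shorter than 15 seconds")
    # Both fields must be produced through the as operator.
    for field in ("n9_time", "n9_value"):
        if not re.search(rf"\bas\s+{field}\b", query):
            problems.append(f"no field aliased as {field}")
    return problems


sample = "\n".join([
    "_sourceCategory=uploads/nginx",
    "| timeslice 1m as n9_time",
    "| count(*) as n9_value by n9_time",
    "| sort by n9_time asc",
])
issues = check_logs_query(sample)
```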

Querying for logs

Sumo Logic Search Syntax is based on Pipelines. Queries work similarly to Pipelines in Unix-like operating systems:

operator1 | operator2 | operator3

Operators are separated by the | sign. Each operator passes its result to the next one, progressively filtering the data until you get the desired result.

All queries begin with a keyword or string search. Special characters:

* - a wildcard, for zero or more characters.
? - a question mark, for a single character.

An example of a Sumo Logic query looks like this:

_sourceCategory=uploads/nginx
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)

In the example above, the first wildcard is evaluated as status_code, the second as size, and the third stores the remainder of the message.
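
As a rough analogy for how the wildcards in the parse expression map to fields, the Python sketch below translates the pattern into a regular expression and applies it to a log line. This only mimics the behavior described above and is not how Sumo Logic implements parse; the helper name and the sample log line are hypothetical.

```python
import re


def parse_pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a pattern with * wildcards into a regex with one group per wildcard."""
    pieces = [re.escape(piece) for piece in pattern.split("*")]
    # Each wildcard becomes a non-greedy group; the trailing $ lets the
    # last group capture the rest of the line.
    return re.compile("(.*?)".join(pieces) + "$")


# Hypothetical nginx access log line.
line = '127.0.0.1 - - [20/Feb/2022:15:46:00 +0000] "GET /upload HTTP/1.1" 200 512 "-" "curl/7.68.0"'

regex = parse_pattern_to_regex('HTTP/1.1" * * *')
status_code, size, tail = regex.search(line).groups()
```

Here status_code holds the text after the anchor up to the first space, size holds the next token, and tail holds everything that remains.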

An example good query for count metrics (SLO based on HTTP status codes) looks like this:

_sourceCategory=uploads/nginx
| timeslice 1m as n9_time
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)
| if (status_code matches "20*" or status_code matches "30*",1,0) as resp_ok
| sum(resp_ok) as n9_value by n9_time
| sort by n9_time asc

That will produce the following output:

"n9_time","n9_value"
"1645371960000","2.0"
"1645372020000","58.0"
"1645372080000","46.0"
"1645372140000","12.0"
"1645372200000","12.0"
"1645372260000","12.0"
"1645372320000","14.0"
"1645372380000","22.0"

A similar query, but for Total instead of Good:

_sourceCategory=uploads/nginx
| timeslice 1m as n9_time
| parse "HTTP/1.1\" * * *" as (status_code, size, tail)
| count(*) as n9_value by n9_time
| sort by n9_time asc

For the full specification on Sumo Logic queries, refer to the official documentation.
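
The sample output shown earlier for the good query can be read programmatically. The Python sketch below parses the first rows of that output and converts n9_time into UTC datetimes. It assumes the 13-digit n9_time values are Unix timestamps in milliseconds; the function name read_points is hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# First rows of the sample output, copied from the example above.
raw = '''"n9_time","n9_value"
"1645371960000","2.0"
"1645372020000","58.0"
"1645372080000","46.0"
'''


def read_points(text: str):
    """Convert query output rows into (UTC datetime, float value) pairs."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        # Assumption: n9_time is expressed in milliseconds since the epoch.
        (datetime.fromtimestamp(int(row["n9_time"]) / 1000, tz=timezone.utc),
         float(row["n9_value"]))
        for row in rows
    ]


points = read_points(raw)
```

Consecutive points are one minute apart, which matches the timeslice 1m used in the query.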

Querying the Sumo Logic server

Nobl9 queries Sumo Logic leveraging the Search Job API or Metrics Query API every two minutes with a query delay of four minutes. The maximum resolution of the response must be 4 data points.

The query's time range spans from the beginning to the end of the 2-minute time window being queried.
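
The timing described above can be sketched as follows. This Python example assumes that the query delay shifts the end of the window back from the current time and that the window length equals the query interval; the exact alignment used by Nobl9 is not specified here, and the function name query_window is hypothetical.

```python
from datetime import datetime, timedelta, timezone

QUERY_INTERVAL = timedelta(minutes=2)
QUERY_DELAY = timedelta(minutes=4)


def query_window(now: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of the window queried at the given moment."""
    # Assumption: the window ends QUERY_DELAY before the current time.
    end = now - QUERY_DELAY
    start = end - QUERY_INTERVAL
    return start, end


now = datetime(2022, 2, 20, 12, 10, tzinfo=timezone.utc)
start, end = query_window(now)
```

Under these assumptions, a query issued at 12:10 UTC covers the data between 12:04 and 12:06 UTC.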

Sumo Logic API rate limits

Sumo Logic's Search Job API requests are rate limited (see Rate limit throttling | Sumo Logic documentation).

The Nobl9 agent requests several endpoints to gather data points according to the Process Flow described in the documentation. The Nobl9 agent distributes the required requests within the two-minute interval to reduce the number of requests per second.

To prevent Sumo Logic rate limit issues:

Prefer metrics queries over logs queries. Logs are at least 4 times more expensive than metrics (see how to convert your logs to metrics).
Keep the execution time of logs queries under two minutes. Sumo Logic partitions and scheduled views can help significantly.
If you're using the Nobl9 agent for Sumo Logic, stick to a single agent as your data source, which allows Nobl9 to orchestrate querying the Sumo Logic API. This does not apply to direct connections: having several of them doesn't affect rate limiting orchestration.
Keep the number of Sumo Logic logs objectives in check with your API limits (see Number of directed objectives).
Contact Sumo Logic customer support to increase your rate limits and prevent conflicts.

Number of directed objectives

Sumo Logic allows for a total of 240 requests per minute to its APIs combined. The Nobl9 agent for Sumo Logic has a 2-minute query interval. This means that Nobl9 can make up to 480 API requests to Sumo Logic within each query interval.

Querying for metrics

Querying metrics is synchronous: you query, and the API responds with data.

This means you could have at most 480 unique metrics queries run against the Sumo Logic API.

Querying for logs

Querying logs is more complicated. The following shows the lifecycle of obtaining the data:

Create a search logs job.
Wait 20 seconds and query if the job is completed (repeat until the process is completed).
Fetch data for the finished job.
Delete the job.

Each step uses one request to the Sumo Logic API, so the optimistic count for a single logs query is 4. Step 2 (listed above) may, and most probably will, be repeated, as logs queries usually need more processing time. In the pessimistic case, step 2 is repeated 6 times, using up to 9 API requests for a single logs query.
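
The request counts above can be reproduced with a small simulation. The Python sketch below uses an in-memory stand-in instead of the real Search Job API: the class FakeSearchJobClient, its method names, and run_logs_query are all hypothetical, and no network calls are made.

```python
class FakeSearchJobClient:
    """In-memory stand-in for a search job API; it only counts requests."""

    def __init__(self, polls_until_done: int):
        self.polls_left = polls_until_done
        self.requests = 0

    def create_job(self, query: str) -> str:
        self.requests += 1
        return "job-1"

    def is_done(self, job_id: str) -> bool:
        self.requests += 1
        self.polls_left -= 1
        return self.polls_left <= 0

    def fetch(self, job_id: str) -> list:
        self.requests += 1
        return []

    def delete(self, job_id: str) -> None:
        self.requests += 1


def run_logs_query(client, query, wait=lambda seconds: None):
    """Follow the lifecycle: create, poll until done, fetch, delete."""
    job_id = client.create_job(query)
    while True:
        wait(20)  # the real flow waits 20 seconds between status checks
        if client.is_done(job_id):
            break
    records = client.fetch(job_id)
    client.delete(job_id)
    return records


optimistic = FakeSearchJobClient(polls_until_done=1)
run_logs_query(optimistic, "example query")

pessimistic = FakeSearchJobClient(polls_until_done=6)
run_logs_query(pessimistic, "example query")
```

After both runs, optimistic.requests is 4 and pessimistic.requests is 9, matching the counts described above.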

This means that you can have anywhere from roughly 53 (480 / 9, rounded down) to 120 (480 / 4) logs queries.
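
The figures in this section follow from simple division of the request budget. The Python sketch below reproduces the arithmetic using the numbers stated above; the variable and function names are illustrative. Note that integer division yields 53 for the pessimistic case, so the lower bound should be read as approximate.

```python
REQUESTS_PER_MINUTE = 240    # combined Sumo Logic API limit stated above
QUERY_INTERVAL_MINUTES = 2   # Nobl9 query interval

# Total requests available within one query interval.
budget = REQUESTS_PER_MINUTE * QUERY_INTERVAL_MINUTES


def max_queries(requests_per_query: int) -> int:
    """How many queries fit in the budget at a given cost per query."""
    return budget // requests_per_query


metrics_queries = max_queries(1)  # one synchronous request per metrics query
logs_best_case = max_queries(4)   # create + 1 poll + fetch + delete
logs_worst_case = max_queries(9)  # create + 6 polls + fetch + delete
```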

Limitations

For direct connections, we only support orchestration of querying Sumo Logic within the same release channel. Having the direct connections both in the Stable and Beta release channels causes desynchronization of querying and may result in failures.

Useful links

For a more in-depth look, consult additional resources:

Add Sumo Logic as a data source (Adding data sources)
'as' operator (External docs)
Metric quantization (External docs)
Rate limit throttling (External docs)
Process Flow (External docs)
Creating SLOs via Terraform (Terraform)
