Sumo Logic

0. Required Metrics

Container and pod metrics are used to determine recommendations for individual workloads. Node and cluster metrics are used to determine cost and overall cluster health.

If you have custom metric names, please contact us for further assistance.

All queries that Flightcrew runs (filtered by cluster):

Metric Type              Query
CPU Capacity             {"A": "metric=kube_node_status_capacity resource=cpu | sum by node"}
CPU Capacity             {"A": "metric=kube_node_status_capacity resource=cpu | sum"}
CPU Limit                {"A": "metric=kube_pod_container_resource_limits resource=cpu | sum by namespace,pod,container"}
CPU Request              {"A": "metric=kube_pod_container_resource_requests resource=cpu | sum by namespace,pod,container"}
CPU Usage                {"A": "metric=container_cpu_usage_seconds_total | rate counter over 1m | sum by namespace, pod, container"}
Container Restart Count  {"A": "metric=kube_pod_container_status_restarts_total | rate counter over 1m | eval _value * 60 | sum by namespace, pod, container"}
Memory Capacity          {"A": "metric=kube_node_status_capacity resource=memory | sum by node"}
Memory Capacity          {"A": "metric=kube_node_status_capacity resource=memory | sum"}
Memory Limit             {"A": "metric=kube_pod_container_resource_limits resource=memory | sum by namespace,pod,container"}
Memory Request           {"A": "metric=kube_pod_container_resource_requests resource=memory | sum by namespace,pod,container"}
Memory Usage             {"A": "metric=container_memory_working_set_bytes | avg by namespace,pod,container"}
Status Phase             {"A": "metric=kube_pod_status_phase | max by namespace,pod,phase | filter latest > 0"}
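
The queries above are listed without their cluster filter. Judging by the verification requests in step 3, the filter is a cluster=<name> selector placed before the first pipe. To try one of these queries by hand, the filter could be added along these lines. This is a minimal sketch: add_cluster_filter is a hypothetical helper, not part of Flightcrew or Sumo Logic, and it assumes pipeline stages are separated by " | " exactly as in the list above.

```shell
# Hypothetical helper: insert a cluster selector before the first pipeline stage.
# Assumes stages are separated by " | ", as in the queries listed above.
add_cluster_filter() {
  local query="$1"
  local cluster="$2"
  case "$query" in
    *' | '*)
      # Selector part, then the cluster filter, then the remaining stages.
      printf '%s cluster=%s | %s\n' "${query%% | *}" "$cluster" "${query#* | }"
      ;;
    *)
      # No pipeline stages: append the filter to the selector.
      printf '%s cluster=%s\n' "$query" "$cluster"
      ;;
  esac
}
```

For example, add_cluster_filter 'metric=kube_pod_info' "${SUMO_CLUSTER_DISPLAY_NAME}" yields the same query string that the first verification request sends.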

1. Set up kube-state-metrics

Ensure your Sumo Logic instance includes kube-state-metrics by following Sumo Logic's documentation.

2. Create access keys and set environment variables

  • Create an access ID and access key in Sumo Logic by clicking the + Add Access Key button.
    • Note: If you are unable to create an access key, you may have insufficient privileges. See the Sumo Logic documentation.

3. Verify API Access

We'll set up some environment variables and curl Sumo Logic's API directly, to make sure the response has the data Flightcrew will be looking for.

Name                       Description
SUMO_ACCESS_ID             Set to the access ID value created above.
SUMO_ACCESS_KEY            Set to the access key value created above.
SUMO_CLUSTER_DISPLAY_NAME  In the Sumo Logic explorer, find the name of the cluster you're installing on.
SUMO_REGION_CODE           The part of your Sumo Logic URL between service and sumologic. For example, in https://service.jp.sumologic.com, the region code is jp. If the URL is simply https://service.sumologic.com/, leave the region code empty.
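
The region rule above can also be expressed in shell rather than by editing the URL by hand. This is a minimal sketch: sumo_api_endpoint is a hypothetical helper name, and the two URL shapes are the ones shown in this guide.

```shell
# Hypothetical helper: build the metrics query endpoint from an optional region code.
sumo_api_endpoint() {
  local region="${1:-}"
  if [ -n "$region" ]; then
    # Regional deployment, e.g. region code "jp".
    printf 'https://api.%s.sumologic.com/api/v1/metricsQueries\n' "$region"
  else
    # Empty region code: use the default endpoint.
    printf 'https://api.sumologic.com/api/v1/metricsQueries\n'
  fi
}

# Derive the endpoint from SUMO_REGION_CODE (unset or empty means the default).
SUMO_API_ENDPOINT="$(sumo_api_endpoint "${SUMO_REGION_CODE:-}")"
export SUMO_API_ENDPOINT
```

If you set the endpoint this way, skip the manual export shown next so the derived value is not overwritten.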
# If you need to set the region code, the URL will look like:
# https://api.${SUMO_REGION_CODE}.sumologic.com/api/v1/metricsQueries
export SUMO_API_ENDPOINT="https://api.sumologic.com/api/v1/metricsQueries"

# Query kube_pod_info for the cluster over the last 3 minutes.
# head -c is used instead of --bytes so the command also works with BSD/macOS head.
curl --silent -u "${SUMO_ACCESS_ID}:${SUMO_ACCESS_KEY}" -X POST "${SUMO_API_ENDPOINT}" --header 'Content-Type: application/json' --data '{
  "queries": [
    {
      "rowId": "A",
      "query": "metric=kube_pod_info cluster='"${SUMO_CLUSTER_DISPLAY_NAME}"'",
      "quantization": 60000,
      "rollup": "Avg",
      "timeshift": 0
    }
  ],
  "timeRange": {
    "type": "BeginBoundedTimeRange",
    "from": {
      "type": "RelativeTimeRangeBoundary",
      "relativeTime": "-3m"
    }
  }
}' | head -c 2000

# Query per-container CPU usage for the cluster over the last 3 minutes.
curl --silent -u "${SUMO_ACCESS_ID}:${SUMO_ACCESS_KEY}" -X POST "${SUMO_API_ENDPOINT}" --header 'Content-Type: application/json' --data '{
  "queries": [
    {
      "rowId": "A",
      "query": "metric=container_cpu_usage_seconds_total cluster='"${SUMO_CLUSTER_DISPLAY_NAME}"' | rate counter over 1m | sum by namespace, pod, container",
      "quantization": 60000,
      "rollup": "Avg",
      "timeshift": 0
    }
  ],
  "timeRange": {
    "type": "BeginBoundedTimeRange",
    "from": {
      "type": "RelativeTimeRangeBoundary",
      "relativeTime": "-3m"
    }
  }
}' | head -c 2000

The output should contain data in the "timeSeries" section. For example:

{"queryResult":[{"rowId":"A","timeSeriesList":{"timeSeries":[{"metricDefinition":{...}, "points":{"timestamps":[1704994080000,1704994140000],"values":[1.0,1.0]}}, ...]}}]}

If so, verification is done.
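
If you would rather not inspect the output by eye, the check can be roughly automated. The following is a minimal sketch, not an official tool: check_sumo_response is a hypothetical helper, and it uses plain text matching rather than a JSON parser because the commands above truncate the response to 2000 bytes, which may leave it as invalid JSON. It only looks for the markers described in this guide.

```shell
# Hypothetical helper: classify a (possibly truncated) metricsQueries response.
# Prints "ok", "no_results", "empty", or "unknown"; succeeds only for "ok".
check_sumo_response() {
  local compact
  # Strip all whitespace so compact and pretty-printed JSON match the same patterns.
  compact=$(printf '%s' "$1" | tr -d '[:space:]')
  case "$compact" in
    *'"code":"metrics:no_results"'*) echo "no_results"; return 1 ;;
    *'"timeSeries":[]'*)             echo "empty"; return 1 ;;
    *'"points":{'*)                  echo "ok"; return 0 ;;
    *)                               echo "unknown"; return 1 ;;
  esac
}
```

To use it, capture the output of one of the curl commands above in a variable, for example response=$(curl ... | head -c 2000), and then run check_sumo_response "$response".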

Troubleshooting

If the response instead contains only "timeSeries":[] together with an error whose code is metrics:no_results, one of the environment variables is most likely incorrect, and Flightcrew will not be able to interact successfully with the Sumo Logic API. Such a response looks like the following:

{
  "queryResult": [
    {
      "rowId": "A",
      "timeSeriesList": { "timeSeries": [] },
      "errors": {
        "id": "ABCDE-FGHIJ-LMNOP",
        "errors": [
          {
            "code": "metrics:no_results",
            "message": "No data in any of the time-series selected",
            "detail": null,
            "meta": {}
          }
        ]
      }
    }
  ]
}
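
Since an incorrect environment variable is the most likely cause, a quick first step is to confirm that none of the required variables is unset or empty. The following is a minimal sketch: check_sumo_env is a hypothetical helper, and it cannot tell whether a value that is set is actually correct. SUMO_REGION_CODE is deliberately not checked, because it may legitimately be empty.

```shell
# Hypothetical helper: report required variables that are unset or empty.
# Returns 0 when all are set, 1 otherwise. Values are never printed.
check_sumo_env() {
  local name value missing=0
  for name in SUMO_ACCESS_ID SUMO_ACCESS_KEY SUMO_CLUSTER_DISPLAY_NAME SUMO_API_ENDPOINT; do
    # Indirect lookup of the variable named by $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

If check_sumo_env reports nothing, double-check the cluster display name and region code against the Sumo Logic UI, since a set but mistyped value produces the same metrics:no_results error.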