
Stackdriver (Google Cloud Operations)

1. Set up kube-state-metrics

Skip this step if: kube-state-metrics is already set up within GKE. In that case, this GKE resource dashboard should already contain data.

Actions: Set up metrics collection within Stackdriver:

  1. Enable system metrics collection by following the documentation here.
  2. Set up kube-state-metrics either by self-hosting, or by enabling GCP-managed Prometheus here.

Verify: Check the dashboard link above to ensure data appears in the graphs.

2. Set up Workload Identity

To give the Control Tower access to Stackdriver metrics, we need to give it access to a GCP Service Account that can read monitoring data. The documentation for the following steps can be found in GKE's Workload Identity docs.

NOTE: These steps grant monitoring access for one project only. Although it's possible to use one GCP Service Account across an entire organization, we recommend keeping permissions isolated and doing this same setup for each project.

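  1. Set the project environment variable used in the subsequent commands. Use the project ID, since the Service Account email and the Workload Identity pool in the later steps are both derived from it.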

    PROJECT_NAME="your-gcp-project-name"
  2. Create a GCP Service Account:

    gcloud iam service-accounts create flightcrew-monitoring-viewer --project=${PROJECT_NAME}
  3. Give the GCP Service Account permissions to read monitoring data:

    gcloud projects add-iam-policy-binding ${PROJECT_NAME} --member "serviceAccount:flightcrew-monitoring-viewer@${PROJECT_NAME}.iam.gserviceaccount.com" --role roles/monitoring.viewer
  4. Allow the Control Tower's K8s Service Account to access the GCP Service Account's permissions:

    gcloud iam service-accounts add-iam-policy-binding flightcrew-monitoring-viewer@${PROJECT_NAME}.iam.gserviceaccount.com --role roles/iam.workloadIdentityUser --member "serviceAccount:${PROJECT_NAME}.svc.id.goog[flightcrew/control-tower]"

    NOTE: In this command, "flightcrew/control-tower" is the default namespace and deployment name for the helm chart that will be installed in the next step. If you plan to rename either of these, update this command to match.
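
If the namespace or name differs from the defaults, the member string in step 4 changes with it. The following is a minimal sketch, not part of the official setup, that derives the Service Account email and the Workload Identity member from one place. The variables K8S_NAMESPACE and K8S_SERVICE_ACCOUNT and the function bind_workload_identity are hypothetical names introduced here for illustration; the gcloud command inside the function is the same one shown in step 4.

```shell
# Placeholder values (assumptions): replace with your own.
PROJECT_NAME="your-gcp-project-name"
K8S_NAMESPACE="flightcrew"            # hypothetical variable: namespace used by the helm chart
K8S_SERVICE_ACCOUNT="control-tower"   # hypothetical variable: name used by the helm chart

# Both identifiers are derived from the values above.
GSA_EMAIL="flightcrew-monitoring-viewer@${PROJECT_NAME}.iam.gserviceaccount.com"
WI_MEMBER="serviceAccount:${PROJECT_NAME}.svc.id.goog[${K8S_NAMESPACE}/${K8S_SERVICE_ACCOUNT}]"

# Print the derived values so they can be reviewed before anything is changed.
echo "GCP Service Account:      ${GSA_EMAIL}"
echo "Workload Identity member: ${WI_MEMBER}"

# Hypothetical helper wrapping the step 4 command. It is only defined here;
# call it explicitly once the printed values look right.
bind_workload_identity() {
  gcloud iam service-accounts add-iam-policy-binding "${GSA_EMAIL}" \
    --role roles/iam.workloadIdentityUser \
    --member "${WI_MEMBER}"
}
```

With the defaults left unchanged, calling bind_workload_identity issues the same binding as step 4. After a rename, only the two variables at the top need to change.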

Troubleshooting

Permission changes in GCP usually take effect quickly, but can occasionally take a few hours to propagate. If metrics are still not being read in after 12 hours, please contact us in Slack or at support@flightcrew.io.
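
Before contacting support, it may help to confirm that the bindings from section 2 are actually in place. The following is a rough sketch, assuming the Service Account name used above. The function names list_project_roles and show_service_account_policy are hypothetical helpers introduced here, wrapping the standard gcloud get-iam-policy commands.

```shell
# Placeholder value (assumption): replace with your own project.
PROJECT_NAME="your-gcp-project-name"
GSA_EMAIL="flightcrew-monitoring-viewer@${PROJECT_NAME}.iam.gserviceaccount.com"

# List the project-level roles granted to the GCP Service Account.
# roles/monitoring.viewer is expected to appear in the output.
list_project_roles() {
  gcloud projects get-iam-policy "${PROJECT_NAME}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:${GSA_EMAIL}" \
    --format="value(bindings.role)"
}

# Show the IAM policy attached to the GCP Service Account itself.
# A roles/iam.workloadIdentityUser binding for the Control Tower member
# is expected to appear in the output.
show_service_account_policy() {
  gcloud iam service-accounts get-iam-policy "${GSA_EMAIL}" \
    --project="${PROJECT_NAME}"
}
```

Run list_project_roles and show_service_account_policy after defining them. If either expected binding is missing, repeating the corresponding step in section 2 is a reasonable first thing to try.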