Install the Control Tower
Using Helm (directly)
To install Flightcrew via Helm on the command line:
- (Optional) Create a separate namespace for the deployment to live in:
kubectl create namespace flightcrew
- Determine values to fill into the Helm chart based on your metric provider:
| Requirement | Field Name | Description |
| --- | --- | --- |
| Always required | fc_api_key | Your Flightcrew API key, a unique, private key that can be found in your instance after signing up. Use the same key for all Control Towers. |
| Always required | cluster_name | Your Kubernetes cluster's name, which can be found by using aws eks list-clusters or gcloud container clusters list |
| Always required | cloud_provider | For EKS clusters, use provider:aws/platform:eks. For GKE clusters, use provider:gcp/platform:gke |
| Always required | metric_providers | Metric provider name: "datadog", "prometheus", "stackdriver", or "sumologic" |
| Datadog only | datadog_api_key | API key created in the Datadog setup steps |
| Datadog only | datadog_app_key | Application key created in the Datadog setup steps |
| Prometheus only | prometheus_url | The Control Tower reads from the DNS name for the Prometheus service, in the form http://[SERVICE].[NAMESPACE].svc.cluster.local:[PORT], where SERVICE is the Prometheus service name, NAMESPACE is its namespace, and PORT is spec.ports.port from the Prometheus service config. The final URL should look something like http://prometheus-service.monitoring.svc.cluster.local:9090 |
| Stackdriver only | stackdriver_service_account | Set to the service account created in the Stackdriver setup steps, which should look like iam.gke.io/gcp-service-account=flightcrew-monitoring-viewer@PROJECT_NAME.iam.gserviceaccount.com |
| Sumo Logic only | sumo_access_id | Access ID created in the Sumo Logic setup steps |
| Sumo Logic only | sumo_access_key | Access Key created in the Sumo Logic setup steps |
| Sumo Logic only | sumo_cluster_display_name | Cluster name, described in Sumo Logic verification steps |
| Sumo Logic only | sumo_region_code | Region code, described in Sumo Logic verification steps |

- Install the Helm chart by running the following commands, setting the values from above. Here's an example for Datadog:
helm repo add flightcrew https://flightcrew-helm-charts.storage.googleapis.com/control-tower/stable
helm repo update
helm install control-tower flightcrew/control-tower \
  --namespace flightcrew \
  --set fc_api_key=123-abc-xyz \
  --set cluster_name=my-prod-cluster \
  --set cloud_provider=provider:aws/platform:eks \
  --set metric_providers="datadog" \
  --set dd_api_key=000111 \
  --set dd_app_key=222333
After a minute or two, your Kubernetes resources and metrics should begin to populate in the dashboard. If so, we're done!
Flightcrew will take a few hours to build a full understanding of your cloud infrastructure and begin surfacing insights. Recommendations will continue to improve over time as more data is gathered.
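For Prometheus users, the prometheus_url value described above is assembled from three parts. The following is a minimal sketch of that assembly; the helper name prometheus_url is hypothetical, and the service name, namespace, and port are the placeholder values from the example URL, not values from your cluster.

```shell
#!/bin/sh
# Hypothetical helper: build the in-cluster Prometheus URL from its parts.
# Usage: prometheus_url SERVICE NAMESPACE PORT
prometheus_url() {
  service="$1"
  namespace="$2"
  port="$3"
  # Refuse to build a URL if any part is missing.
  if [ -z "$service" ] || [ -z "$namespace" ] || [ -z "$port" ]; then
    echo "usage: prometheus_url SERVICE NAMESPACE PORT" >&2
    return 1
  fi
  printf 'http://%s.%s.svc.cluster.local:%s\n' "$service" "$namespace" "$port"
}

# Placeholder values taken from the example URL in this guide.
prometheus_url prometheus-service monitoring 9090
```

If you are unsure of the port, something like kubectl get service prometheus-service --namespace monitoring -o jsonpath='{.spec.ports[0].port}' should print it, assuming the service exposes Prometheus on its first listed port.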
Using Helm (via Terraform)
This example shows how to set up the .tf file with Amazon EKS and Sumo Logic.
Replace the values to match your setup, and then run terraform apply.
resource "helm_release" "flightcrew" {
  name             = "control-tower"
  namespace        = "flightcrew"
  repository       = "https://flightcrew-helm-charts.storage.googleapis.com/control-tower/stable"
  chart            = "control-tower"
  create_namespace = true

  set {
    name  = "fc_api_key"
    value = var.FC_API_KEY
  }
  set {
    name  = "cluster_name"
    value = var.cluster_name
  }
  set {
    name  = "cloud_provider"
    value = "provider:aws/platform:eks"
  }
  set {
    name  = "metric_providers"
    value = "sumologic"
  }
  set {
    name  = "sumo_access_id"
    value = var.SUMOLOGIC_ACCESSID
  }
  set {
    name  = "sumo_access_key"
    value = var.SUMOLOGIC_ACCESSKEY
  }
  set {
    name  = "sumo_cluster_display_name"
    value = var.cluster_name
  }
  set {
    name  = "sumo_region_code"
    value = "us2"
  }
}
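The var.* references in this example need values at apply time. One way to supply them, sketched below, is through TF_VAR_-prefixed environment variables, which Terraform reads for declared input variables. This assumes your configuration already declares variables named FC_API_KEY, cluster_name, SUMOLOGIC_ACCESSID, and SUMOLOGIC_ACCESSKEY; every value shown is a placeholder.

```shell
#!/bin/sh
# Placeholder values only; replace each one with your own.
# The part after TF_VAR_ must match the declared variable name exactly,
# including case.
export TF_VAR_FC_API_KEY="123-abc-xyz"
export TF_VAR_cluster_name="my-prod-cluster"
export TF_VAR_SUMOLOGIC_ACCESSID="example-access-id"
export TF_VAR_SUMOLOGIC_ACCESSKEY="example-access-key"

# Then, from the directory containing the .tf file:
#   terraform init
#   terraform apply
```

Environment variables keep the keys out of the .tf file itself. A .tfvars file or a secrets manager would work as well, depending on how your team handles credentials.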
Troubleshooting
CrashLoopBackOff
The Control Tower will fail loudly if it's misconfigured or cannot reach the APIs it needs. Run the following command to check the error logs:
kubectl logs deployment/control-tower --namespace flightcrew --previous --tail=20
Feel free to send the error logs to the Flightcrew team on Slack or support@flightcrew.io for help debugging.
Out of Memory
Resource usage varies slightly with the size of the cluster. Memory usage very rarely exceeds 500Mi, but if the pod is hitting its memory limit and getting OOMKilled, the resources can be increased by updating fields on the Helm chart, for example:
helm upgrade control-tower flightcrew/control-tower --namespace flightcrew --reuse-values \
  --set resources.limits.memory="2000Mi" \
  --set resources.requests.memory="1000Mi"
Please still reach out to support@flightcrew.io if this happens so we can take a closer look, as something else may be going wrong.
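To confirm that a restart was caused by the memory limit rather than something else, you can inspect the last termination reason that Kubernetes recorded for each container. The sketch below is one way to do that, not an official Flightcrew tool: the function names are hypothetical, and it assumes the Control Tower runs in the flightcrew namespace and that kubectl is pointed at the right cluster.

```shell
#!/bin/sh
# Hypothetical helper: print "<pod name><TAB><last termination reason>"
# for every pod in the namespace (defaults to flightcrew).
last_termination_reasons() {
  namespace="${1:-flightcrew}"
  kubectl get pods --namespace "$namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'
}

# Hypothetical helper: succeeds if any container in the namespace was
# last terminated with reason OOMKilled.
was_oom_killed() {
  last_termination_reasons "$1" | grep -q 'OOMKilled'
}

# Example usage (requires cluster access):
#   if was_oom_killed flightcrew; then
#     echo "Control Tower was OOMKilled; consider raising the memory limit."
#   fi
```

Note that this checks every pod in the namespace, so if other workloads share it, review the output of last_termination_reasons directly.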