Canary analysis with Prometheus Operator
This guide shows you how to use Prometheus Operator for canary analysis.

Prerequisites

Flagger requires a Kubernetes cluster v1.16 or newer and Prometheus Operator v0.40 or newer.
Install Prometheus Operator with Helm v3:
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

    kubectl create ns monitoring
    helm upgrade -i prometheus prometheus-community/kube-prometheus-stack \
      --namespace monitoring \
      --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false \
      --set fullnameOverride=prometheus
The prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false option lets Prometheus Operator watch ServiceMonitor objects outside of its own namespace.
Install Flagger by setting the metrics server to Prometheus:
    helm repo add flagger https://flagger.app

    kubectl create ns flagger-system
    helm upgrade -i flagger flagger/flagger \
      --namespace flagger-system \
      --set metricsServer=http://prometheus-prometheus.monitoring:9090 \
      --set meshProvider=kubernetes
Install Flagger's load tester:
    helm upgrade -i loadtester flagger/loadtester \
      --namespace flagger-system
Install the podinfo demo app:
    helm repo add podinfo https://stefanprodan.github.io/podinfo

    kubectl create ns test
    helm upgrade -i podinfo podinfo/podinfo \
      --namespace test \
      --set service.enabled=false

Service monitors

The demo app is instrumented with Prometheus, so you can create ServiceMonitor objects to scrape podinfo's metrics endpoint:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: podinfo-canary
      namespace: test
    spec:
      endpoints:
      - path: /metrics
        port: http
        interval: 5s
      selector:
        matchLabels:
          app: podinfo-canary
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: podinfo-primary
      namespace: test
    spec:
      endpoints:
      - path: /metrics
        port: http
        interval: 5s
      selector:
        matchLabels:
          app: podinfo
We set interval: 5s so that Prometheus scrapes podinfo more frequently. If you do not define it, use a longer metric interval in the Canary object so that each query window still contains enough samples.
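The relationship between the scrape interval and the query window can be checked with simple arithmetic. The sketch below is illustrative Python, not part of Flagger or Prometheus; the function names and the 30s scrape interval in the last call are assumptions. It relies on the fact that rate() needs at least two samples inside its range window to return a value.

```python
def guaranteed_samples(window_seconds: int, scrape_interval_seconds: int) -> int:
    """Minimum number of scrapes that always fall inside a range window."""
    return window_seconds // scrape_interval_seconds


def rate_has_enough_samples(window_seconds: int, scrape_interval_seconds: int) -> bool:
    """rate() needs at least two samples in the window to produce a value."""
    return guaranteed_samples(window_seconds, scrape_interval_seconds) >= 2


# 5s scrapes inside the 30s metric interval used later in this guide.
print(guaranteed_samples(30, 5))        # 6
print(rate_has_enough_samples(30, 5))   # True
# Hypothetical 30s scrape interval with the same 30s window.
print(rate_has_enough_samples(30, 30))  # False
```

With a slower scrape interval, widening the metric interval in the Canary object restores the two-sample minimum.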

Metric templates

Create a metric template to measure the HTTP requests error rate:
    apiVersion: flagger.app/v1beta1
    kind: MetricTemplate
    metadata:
      name: error-rate
      namespace: test
    spec:
      provider:
        address: http://prometheus-prometheus.monitoring:9090
        type: prometheus
      # sum() aggregates across series so the ratio compares totals;
      # without it, PromQL matches each series only against itself.
      query: |
        100 - sum(
          rate(
            http_requests_total{
              namespace="{{ namespace }}",
              job="{{ target }}-canary",
              status!~"5.*"
            }[{{ interval }}]
          )
        )
        /
        sum(
          rate(
            http_requests_total{
              namespace="{{ namespace }}",
              job="{{ target }}-canary"
            }[{{ interval }}]
          )
        ) * 100
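The arithmetic behind the error-rate query can be checked by hand. The sketch below is illustrative Python, not Flagger code; the function name and the sample request rates are assumptions. It also assumes that a check passes when the value does not exceed the max of the threshold range.

```python
def error_rate_percent(non_5xx_rate: float, total_rate: float) -> float:
    """Mirror the query's arithmetic: 100 - (non-5xx rate / total rate) * 100."""
    if total_rate == 0:
        # Choice made for this sketch: with no traffic the ratio is undefined.
        raise ValueError("no requests in the interval")
    return 100 - (non_5xx_rate / total_rate) * 100


# Hypothetical per-second request rates over one metric interval.
healthy = error_rate_percent(non_5xx_rate=19.9, total_rate=20.0)    # about 0.5 %
degraded = error_rate_percent(non_5xx_rate=19.0, total_rate=20.0)   # about 5 %

MAX_ERROR_RATE = 1  # thresholdRange.max used for this metric in the Canary
print(healthy <= MAX_ERROR_RATE)   # True: the check passes
print(degraded <= MAX_ERROR_RATE)  # False: the check fails
```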
Create a metric template to measure the 99th percentile (P99) of the HTTP request duration:
    apiVersion: flagger.app/v1beta1
    kind: MetricTemplate
    metadata:
      name: latency
      namespace: test
    spec:
      provider:
        address: http://prometheus-prometheus.monitoring:9090
        type: prometheus
      query: |
        histogram_quantile(0.99,
          sum(
            rate(
              http_request_duration_seconds_bucket{
                namespace="{{ namespace }}",
                job="{{ target }}-canary"
              }[{{ interval }}]
            )
          ) by (le)
        )
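To see what the latency query returns, it helps to work through how a quantile is estimated from cumulative histogram buckets. The sketch below is a simplified Python model of the linear interpolation that histogram_quantile applies to classic histograms; it is not Prometheus code, it skips several edge cases, and the bucket values are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    buckets: (upper_bound, cumulative_count) pairs sorted by upper_bound,
    with at least one finite bucket and a final (math.inf, total) entry.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile lands in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Hypothetical per-bucket request rates, summed by the "le" label.
buckets = [(0.1, 60.0), (0.25, 90.0), (0.5, 98.0), (1.0, 100.0), (math.inf, 100.0)]
p99 = histogram_quantile(0.99, buckets)
print(round(p99, 3))  # 0.75
```

A P99 of 0.75 seconds would exceed a max of 0.5, so the latency check would fail for this sample data.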

Canary analysis

Using the metric templates, you can configure the canary analysis with HTTP error rate and latency checks:
    apiVersion: flagger.app/v1beta1
    kind: Canary
    metadata:
      name: podinfo
      namespace: test
    spec:
      provider: kubernetes
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: podinfo
      progressDeadlineSeconds: 60
      service:
        port: 80
        targetPort: http
        name: podinfo
      analysis:
        interval: 30s
        iterations: 10
        threshold: 2
        metrics:
        - name: error-rate
          templateRef:
            name: error-rate
          thresholdRange:
            max: 1
          interval: 30s
        - name: latency
          templateRef:
            name: latency
          thresholdRange:
            max: 0.5
          interval: 30s
        webhooks:
        - name: load-test
          type: rollout
          url: "http://loadtester.flagger-system/"
          timeout: 5s
          metadata:
            type: cmd
            cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test/"
Based on the above specification, Flagger creates the primary and canary Kubernetes ClusterIP services.
During the canary analysis, Prometheus will scrape the canary service and Flagger will use the HTTP error rate and latency queries to determine if the release should be promoted or rolled back.
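The promote-or-rollback decision can be pictured as a loop over periodic checks. The sketch below is a simplified Python model, not Flagger's controller logic: it assumes that the release is rolled back once the number of failed checks reaches threshold, and promoted once the number of successful checks reaches iterations. The function name and the sample check outcomes are hypothetical.

```python
def simulate_analysis(check_results, iterations=10, threshold=2):
    """Return 'promote', 'rollback', or 'in-progress' for a run of check outcomes."""
    passed = failed = 0
    for ok in check_results:
        if ok:
            passed += 1
        else:
            failed += 1
        if failed >= threshold:
            return "rollback"
        if passed >= iterations:
            return "promote"
    return "in-progress"


# Ten passing checks in a row promote the canary.
print(simulate_analysis([True] * 10))                # promote
# A second failed check triggers a rollback.
print(simulate_analysis([True, False, True, False])) # rollback

# Shortest possible analysis with interval: 30s and iterations: 10.
minimum_seconds = 30 * 10
print(minimum_seconds)                               # 300
```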