Alertmanager template problem with Prometheus rule

Darryl Lee · June 28, 2022

Hey folks,
I have Alertmanager integrated with Opsgenie and it works well. However, I'm running into a problem when more than one alert fires for the same rule.

For example, I have a PrometheusRule that monitors Kubernetes pods in a crash/pending state. If more than one pod is having problems, the description annotation below does not appear in Opsgenie; only runbook and dashboard do. If only one pod is having problems, the description shows up in Opsgenie as expected.

This is the annotation that goes missing in Opsgenie when more than one pod is firing:

description: Pod {{ $labels.pod }} in the namespace {{ $labels.namespace }}

I guess it's something related to arrays, but I'm not sure where or how to fix it.

Alertmanager template config

	config:
	  global: {}
	  receivers:
	  - name: opsgenie
	    opsgenie_configs:
	    - api_key: ${opsgenie_key}
	      description: |-
	        {{ range .CommonAnnotations.SortedPairs }}
	        - {{ .Name }} = {{ .Value }}
	        {{- end }}
	      message: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.alertname }}'
	      priority: '{{ if .GroupLabels.priority }}{{ .GroupLabels.priority }}{{ else }}p2{{ end }}'
	      responders:
	      - name: '{{ if .GroupLabels.responders }}{{ .GroupLabels.responders }}{{ else }}platform{{ end }}'
	        type: team

Prometheus Rule

	apiVersion: monitoring.coreos.com/v1
	kind: PrometheusRule
	metadata:
	  labels:
	    app: kube-prometheus-stack
	    release: kube-prometheus-stack
	  name: kube-pod-crash-looping-platform
	  namespace: platform
	spec:
	  groups:
	  - name: eks
	    rules:
	    - alert: KubePodCrashLooping
	      annotations:
	        description: Pod {{ $labels.pod }} in the namespace {{ $labels.namespace }}
	        runbook: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping
	        dashboard: https://my-grafana-url
	      expr: max_over_time(kube_pod_container_status_waiting_reason{pod=~"liftbridge-.*|nats-.*|redis-.*|consul-server-.*|vault-0|vault-1|vault-2|vault-agent-injector-.*|argocd-.*|argo-rollouts-.*|coredns-.*|istio-.*|istiod-.*|hubbble-.*|external-.*|keda-.*", reason="CrashLoopBackOff"}[10m]) >= 1
	      for: 10m
	      labels:
	        env: dev
	        priority: p2
	        responders: platform
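
A likely cause, going by how Alertmanager builds its template data: `.CommonAnnotations` only holds annotations whose value is identical across every alert in the group. With one pod firing, `description` is trivially common. With two or more, each alert carries a different pod name, so `description` drops out, while `runbook` and `dashboard` (the same for every alert) remain. A minimal sketch of one way around it, assuming the rest of the receiver stays as posted, is to keep the common annotations and additionally loop over `.Alerts.Firing` to print each alert's own `description`:

```yaml
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: ${opsgenie_key}
    # First loop: annotations shared by every alert in the group
    # (runbook, dashboard). Second loop: the per-alert description,
    # which differs per pod and so never lands in .CommonAnnotations
    # once more than one pod is firing.
    description: |-
      {{ range .CommonAnnotations.SortedPairs }}
      - {{ .Name }} = {{ .Value }}
      {{- end }}
      {{ range .Alerts.Firing }}
      - description = {{ .Annotations.description }}
      {{- end }}
```

With a single firing pod this prints the description twice (once as a common annotation, once from the per-alert loop), so the first loop may need trimming to taste. Another option, if one Opsgenie alert per pod is acceptable, would be to add `pod` to the route's `group_by` so that every group contains a single pod; the route is not shown in the post, so that part is an assumption.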
