Alertmanager template problem with Prometheus rule

daniel.rosa June 28, 2022

Hey folks,
I have Alertmanager integrated with Opsgenie and it works well. However, I'm running into a problem when an alert with annotations fires for more than one record.

For example, I have a PrometheusRule that monitors Kubernetes pods in a crash/pending state. If more than one pod is having problems, the description annotation below does not appear in Opsgenie; only runbook and dashboard do. If only one pod is having problems, the description shows up in Opsgenie as expected.

The annotation that does not appear in Opsgenie when more than one pod is firing:

description: Pod {{ $labels.pod }} in the namespace {{ $labels.namespace }}

I guess it is something related to arrays, but I'm not sure where or how to fix it.



Alertmanager template config

 

config:
  global: {}
  receivers:
  - name: opsgenie
    opsgenie_configs:
    - api_key: ${opsgenie_key}
      description: |-
        {{ range .CommonAnnotations.SortedPairs }}
        - {{ .Name }} = {{ .Value }}
        {{- end }}
      message: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.alertname }}'
      priority: '{{ if .GroupLabels.priority }}{{ .GroupLabels.priority }}{{ else }}p2{{ end }}'
      responders:
      - name: '{{ if .GroupLabels.responders }}{{ .GroupLabels.responders }}{{ else }}platform{{ end }}'
        type: team
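
A possible explanation, offered as a sketch rather than a confirmed diagnosis: in Alertmanager templates, `.CommonAnnotations` holds only the annotations whose values are identical across every alert in the notification group. With a single firing pod the rendered `description` is trivially common. With two or more pods each alert carries a different `description`, so it drops out of `.CommonAnnotations`, while `runbook` and `dashboard` stay because they are the same for every alert. Assuming that is what happens here, one option is to print the per-alert description by ranging over `.Alerts.Firing` and to keep the common annotations for everything else:

```yaml
receivers:
- name: opsgenie
  opsgenie_configs:
  - api_key: ${opsgenie_key}
    # First loop: one line per firing alert, using that alert's own
    # description annotation (it differs per pod).
    # Second loop: the annotations shared by all alerts in the group,
    # skipping description so it is not printed twice when only one pod fires.
    description: |-
      {{ range .Alerts.Firing }}
      - {{ .Annotations.description }}
      {{- end }}
      {{ range .CommonAnnotations.SortedPairs }}
      {{- if ne .Name "description" }}
      - {{ .Name }} = {{ .Value }}
      {{- end }}
      {{- end }}
```

Because the first loop uses `.Alerts.Firing`, resolved notifications would list no per-pod lines; ranging over `.Alerts` instead would include resolved alerts as well.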


Prometheus Rule


apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app: kube-prometheus-stack
    release: kube-prometheus-stack
  name: kube-pod-crash-looping-platform
  namespace: platform
spec:
  groups:
  - name: eks
    rules:
    - alert: KubePodCrashLooping
      annotations:
        description: Pod {{ $labels.pod }} in the namespace {{ $labels.namespace }}
        runbook: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodcrashlooping
        dashboard: https://my-grafana-url
      expr: max_over_time(kube_pod_container_status_waiting_reason{pod=~"liftbridge-.*|nats-.*|redis-.*|consul-server-.*|vault-0|vault-1|vault-2|vault-agent-injector-.*|argocd-.*|argo-rollouts-.*|coredns-.*|istio-.*|istiod-.*|hubbble-.*|external-.*|keda-.*", reason="CrashLoopBackOff"}[10m]) >= 1
      for: 10m
      labels:
        env: dev
        priority: p2
        responders: platform
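
Since this rule produces one alert per pod, the grouping in the Alertmanager route decides whether those alerts arrive in Opsgenie as one notification or several. The route is not shown in this post, so the fragment below is an assumption: it guesses that the current `group_by` contains `alertname`, `priority`, and `responders` (the template reads those from `.GroupLabels`) and adds `namespace` and `pod`, so that each pod forms its own group and its `description` stays common within that group.

```yaml
config:
  route:
    # Assumed receiver name, taken from the receivers list in the config.
    receiver: opsgenie
    # Hypothetical grouping: adding namespace and pod gives each pod its own
    # notification group, so every annotation is common within the group.
    group_by: ['alertname', 'namespace', 'pod', 'priority', 'responders']
```

The trade-off is one Opsgenie alert per affected pod instead of a single grouped alert, which may be noisier when many pods crash at once.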

Answer accepted
Darryl Lee
Atlassian Team
June 29, 2022

Hi @daniel.rosa,

This is Darryl. I am here to help. 😃

I understand that you would like to know why some details from Prometheus Alertmanager are not rendered on Opsgenie alerts.

To dig deeper into the logs, we will need your consent to access your Opsgenie account, and it would be much more efficient to communicate over a support request.

Please consider raising a support request to our team via this link.

Thanks.

Kind regards,
Darryl Lee
Support Engineer, Atlassian
