Prometheus Alertmanager Alert priority is not being parsed correctly

Swapna November 15, 2021

Hi there,

I am running the Prometheus community chart in k8s using Helm, and I pass my alert rules and Alertmanager files as values to Helm during upgrade. As such, I am having difficulty using the examples listed here; in particular, I get the following error:

"{\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"priority\":\"should be one of [ P1, P2, P3, P4, P5 []\"},\"took\":0.001,\"requestId\":\"ca7xxx3a0\"}"

I have tried escaping the curly brackets, using square brackets, single quotes, double quotes and many other syntax variations; however, I can't seem to get past this hurdle. What would be really useful is to see how OpsGenie is receiving the payload, so I can get more feedback on my changes.

Is there a way to see what the 'priority' key is being set to?

I have tried using the requests API but it doesn't give me the .Alert payload.

My alerts have this configured;

labels:
  severity: info

And the alertmanager config at the moment is;

priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}'

I've also tried setting the label key as 'priority' instead of 'severity', and this label is available in the alertmanager TSDB, but OpsGenie is still not seeing it.

Thanks!

Swapna

2 answers

1 vote
Swapna November 23, 2021

Hi @Connor Eyles 

Just an update, I was playing around again trying to get this working, and was about to go down the whole 'Relabel Config' route, when I came across this post, which I had seen before but it made more sense to me now;

https://github.com/prometheus/alertmanager/issues/2088

He is saying that with {{ range .Alerts }} at the beginning, the template cycles over all the alerts that fired in the given interval and prints a priority level for each one. These get concatenated, i.e. you'll get a long string like 'P1P1P2P3P2' instead of just P1.

So I tried it with this new config;

priority: '{{ if eq .GroupLabels.severity "critical" }}P1{{ else if eq .GroupLabels.severity "warning" }}P2{{ else }}P3{{- end -}}'

And also I had to group by severity (I was already grouping by alertname and environment);

route:
  group_by: [alertname, environment, severity]

And now, I am able to see alerts with priority levels set by the label.severity key value! :D
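
For anyone who wants to see the concatenation outside Alertmanager, here is a minimal Go sketch using the standard text/template package, which is what Alertmanager notification templates are built on. The `alert` and `data` types are simplified stand-ins made up for illustration (not Alertmanager's real types), and the two sample alerts are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alert and data are hypothetical, simplified stand-ins for the data
// Alertmanager passes to its templates.
type alert struct {
	Labels map[string]string
}

type data struct {
	Alerts      []alert
	GroupLabels map[string]string
}

// perAlert emits one priority per alert in the group.
const perAlert = `{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{ end }}{{ end }}`

// perGroup emits a single priority based on the group's severity label.
const perGroup = `{{ if eq .GroupLabels.severity "critical" }}P1{{ else if eq .GroupLabels.severity "warning" }}P2{{ else }}P3{{- end -}}`

// render executes tmpl against d and returns the resulting string.
func render(tmpl string, d data) string {
	var sb strings.Builder
	t := template.Must(template.New("priority").Parse(tmpl))
	if err := t.Execute(&sb, d); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	// Two critical alerts that ended up in the same group.
	d := data{
		Alerts: []alert{
			{Labels: map[string]string{"severity": "critical"}},
			{Labels: map[string]string{"severity": "critical"}},
		},
		GroupLabels: map[string]string{"severity": "critical"},
	}
	fmt.Println(render(perAlert, d)) // P1P1
	fmt.Println(render(perGroup, d)) // P1
}
```

With two alerts in the group, the range version produces 'P1P1', which is not one of the accepted priority values, while the group-label version produces a single 'P1'.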

Do you think that your example code works because of the group_by: ['...'] setting? I believe this means that alerts are not grouped, so each alert comes in individually (thus creating a lot more noise) and the concatenation doesn't occur. In which case your code will probably also work without the {{ range .Alerts }}?

Thanks for all your help!

Swapna

Connor Eyles
Atlassian Team
Atlassian Team members are employees working across the company in a wide variety of roles.
November 23, 2021

Hi @Swapna 

That makes a lot more sense here! 

I appreciate you sharing your findings on the grouping, and on why the priority was not being set correctly: the priorities for the whole range of alerts were being concatenated.

You are correct in your understanding of grouping. This is what I can see in the Prometheus documentation:

To aggregate by all possible labels use the special value '...' as the sole label name, for example:
# group_by: ['...']
# This effectively disables aggregation entirely, passing through all
# alerts as-is.

 

Thanks,

Connor

1 vote
Connor Eyles
Atlassian Team
November 15, 2021

Hey @Swapna 

Thanks for reaching out! 

I have a sample config file I captured some time back that shows how to use dynamic priorities. Let me know if this is useful!

global:
  resolve_timeout: 1m
  opsgenie_api_url: https://api.opsgenie.com/
  opsgenie_api_key: <my-top-secret-api-key>
receivers:
- opsgenie_configs:
  - responders:
    - name: "DailyBugle"
      type: "team"
    priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P1{{else if eq .Labels.severity "warning"}}P2{{else if eq .Labels.severity "info"}}P3{{else}}P4{{end}}{{end}}'
  name: opsgenie
route:
  group_by: ['...']
  receiver: opsgenie
  repeat_interval: 5m

Thanks,

Connor

Swapna November 17, 2021

Thanks Connor!

I had also come across this snippet, which looked like it should work. I tried it just to double-check the result, and I am still seeing the same error in the Alertmanager log.

What would be great is to get more insight into what OpsGenie has received in terms of the payload, rather than a message saying 'this payload isn't correct'. I want to essentially see how the various square brackets, escape symbols etc are being parsed, so that I can tweak accordingly.

The issue here is that I am using Helm, which itself runs the .yaml through a parser and either strips off the brackets (I think this is how it behaves) or does something else. It's the combination of Helm and Alertmanager -> OpsGenie.

This is what I have configured and you can see what else I have tried commented out;

```

- name: OpsGenie-Test
  opsgenie_configs:
  - message: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{- end -}}] Alert: {{ .GroupLabels.alertname }} - Cluster: {{ .GroupLabels.environment }}'
    description: "{{ range .Alerts }}- {{ .Annotations.identifier }}\n {{ end }}"
    details:
      num_firing: "{{ .Alerts.Firing | len }}"
      num_resolved: "{{ .Alerts.Resolved | len }}"
    api_key: my_api_key
    priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P1{{else if eq .Labels.severity "warning"}}P2{{else if eq .Labels.severity "info"}}P3{{else}}P4{{end}}{{end}}'
    # priority: '[{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}]'
    # priority: "{{ range .Alerts }}{{ .Labels.priority }}{{ end }}"
    # priority: '[{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}]'
    # priority: '{{"{{"}} range .Alerts {{"}}"}}{{"{{"}} if eq .Labels.severity "critical" {{"}}"}}P1{{"{{"}} else if eq .Labels.severity "warning" {{"}}"}}P2{{"{{"}} else if eq .Labels.severity "info" {{"}}"}}P3{{"{{"}} else {{"}}"}}P4{{"{{"}} end {{"}}"}}'

```

All the other keys are being parsed ok - and you can see that a range of double quotes, single quotes, square brackets etc are being used.  
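
On the escaped variant in the last commented-out line: the sketch below shows what a single pass of Go's text/template engine does to that kind of escaping. It assumes that something (for example a chart that renders its values as templates, which depends on the chart and is not confirmed here) runs one template pass before Alertmanager sees the config. `renderOnce` is a hypothetical helper name, and the shortened template is only an illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderOnce runs src through one pass of text/template with no data,
// standing in for a templating step that happens before Alertmanager.
func renderOnce(src string) string {
	var sb strings.Builder
	t := template.Must(template.New("values").Parse(src))
	if err := t.Execute(&sb, nil); err != nil {
		panic(err)
	}
	return sb.String()
}

func main() {
	// Each {{"{{"}} and {{"}}"}} is an action that prints a literal delimiter.
	escaped := `{{"{{"}} range .Alerts {{"}}"}}{{"{{"}} .Labels.priority {{"}}"}}{{"{{"}} end {{"}}"}}`
	fmt.Println(renderOnce(escaped))
	// Prints: {{ range .Alerts }}{{ .Labels.priority }}{{ end }}
}
```

So the escaping only helps if there really is an earlier template pass. If there is none, Alertmanager itself would be the one evaluating the escapes, and the priority would come out as the literal template text rather than P1 to P5.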

Thanks so much for your help!

Swapna

Connor Eyles
Atlassian Team
November 17, 2021

Hi @Swapna 

What you can do is use a test webhook endpoint to see how the payload is being sent. A common one I use for testing is https://beeceptor.com/ 

To send a test you can post info to the webhook endpoint via this

It's not the format that's used in Opsgenie, but we can use it as a platform to identify whether Helm is doing any parsing.

I have also verified that the config I sent you is in the correct format.

Hope this helps with identifying the issue!

Thanks,

Connor

Swapna November 19, 2021

Hi @Connor Eyles 

Thanks for the suggestion, I will give that a go, it looks very useful!

Regards,

Swapna

