Hi there,
I am running prometheus community in k8s using Helm, and I pass my alert rules and Alertmanager files to Helm as values during upgrade. As such, I am having difficulty using the examples listed here. In particular, I get the following error:
"{\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"priority\":\"should be one of [ P1, P2, P3, P4, P5 ]\"},\"took\":0.001,\"requestId\":\"ca7xxx3a0\"}"
I have tried escaping the curly brackets, using square brackets, single quotes, double quotes and many other syntax variations however I can't seem to get past this hurdle. What would be really useful is to be able to see how OpsGenie is receiving the payload, so I can get more feedback to my changes.
Is there a way to see what the 'priority' key is being set to?
I have tried using the requests API but it doesn't give me the .Alert payload.
My alerts have this configured;
labels:
  severity: info
And the alertmanager config at the moment is;
priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}'
I've also tried setting the label key as 'priority' instead of 'severity', and this label is available in the alertmanager TSDB, but OpsGenie is still not seeing it.
Thanks!
Swapna
Just an update, I was playing around again trying to get this working, and was about to go down the whole 'Relabel Config' route, when I came across this post, which I had seen before but it made more sense to me now;
https://github.com/prometheus/alertmanager/issues/2088
He is saying that by having the {{ range .Alerts }} at the beginning, the template cycles over all the alerts that fired in the given interval and prints the priority level for each one, so they get concatenated, i.e. you'll get a long string like 'P1P1P2P3P2' instead of just 'P1'.
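To make the concatenation concrete, here is a minimal sketch using Go's text/template package, which Alertmanager templates are built on. The `alert` and `notification` types and the three sample alerts are hypothetical stand-ins for the real template data, not Alertmanager's own types.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// alert and notification are simplified, hypothetical stand-ins for the
// data Alertmanager passes to templates; only the fields used here exist.
type alert struct {
	Labels map[string]string
}

type notification struct {
	Alerts []alert
}

// perAlertPriority ranges over every alert in the group and emits one
// priority per alert, with nothing separating them.
const perAlertPriority = `{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{ end }}{{ end }}`

// renderPriority executes tmplText against data and returns the output.
func renderPriority(tmplText string, data notification) (string, error) {
	tmpl, err := template.New("priority").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Three alerts that ended up in the same notification group.
	group := notification{Alerts: []alert{
		{Labels: map[string]string{"severity": "critical"}},
		{Labels: map[string]string{"severity": "warning"}},
		{Labels: map[string]string{"severity": "critical"}},
	}}
	got, err := renderPriority(perAlertPriority, group)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", got) // prints "P1P2P1"
}
```

With a single alert in the group the same template yields a valid value such as "P1", which would explain why it only breaks once several alerts are grouped together.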
So I tried it with this new config;
priority: '{{ if eq .GroupLabels.severity "critical" }}P1{{ else if eq .GroupLabels.severity "warning" }}P2{{ else }}P3{{- end -}}'
And also I had to group by severity (I was already grouping by alertname and environment);
route:
  group_by: [alertname, environment, severity]
And now, I am able to see alerts with priority levels set by the label.severity key value! :D
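As a rough check of why grouping by severity matters, the sketch below renders the .GroupLabels template with Go's text/template package. The `groupData` type, the alert name and the label values are hypothetical; the only assumption about Alertmanager is that a label appears in .GroupLabels only when it is listed in group_by.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// groupData is a simplified, hypothetical stand-in for Alertmanager's
// template data: one set of group labels shared by every alert in the group.
type groupData struct {
	GroupLabels map[string]string
	Alerts      []map[string]string // per-alert labels, unused by the template
}

// groupPriority reads the shared group label once instead of ranging
// over the alerts, so it renders exactly one priority.
const groupPriority = `{{ if eq .GroupLabels.severity "critical" }}P1{{ else if eq .GroupLabels.severity "warning" }}P2{{ else }}P3{{- end -}}`

// renderGroupPriority executes groupPriority against data.
func renderGroupPriority(data groupData) (string, error) {
	tmpl, err := template.New("priority").Parse(groupPriority)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	alerts := []map[string]string{
		{"severity": "warning"},
		{"severity": "warning"},
		{"severity": "warning"},
	}
	// severity is listed in group_by, so it shows up in GroupLabels.
	withSeverity := groupData{
		GroupLabels: map[string]string{"alertname": "HighLatency", "severity": "warning"},
		Alerts:      alerts,
	}
	// severity is not listed in group_by, so GroupLabels has no such key
	// and the template falls through to the final else branch.
	withoutSeverity := groupData{
		GroupLabels: map[string]string{"alertname": "HighLatency"},
		Alerts:      alerts,
	}
	for _, d := range []groupData{withSeverity, withoutSeverity} {
		out, err := renderGroupPriority(d)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", out) // prints "P2", then "P3"
	}
}
```

The second case shows why adding severity to group_by was needed: without it, every notification would quietly get the fallback priority.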
Do you think that your example code works because of the group labels [...]? I believe this means that alerts are not grouped, and so each alert comes in individually (thus creating a lot more noise) and so the concatenation doesn't occur? In which case your code will also probably work without the {{ range .Alerts }}?
Thanks for all your help!
Swapna
Hi @Swapna
That makes a lot more sense here!
I appreciate you sharing your findings on grouping and on why the priority was not being set correctly: the priorities for the whole range of alerts were being concatenated.
You are correct in your understanding of grouping. This is what I can see in the Prometheus documentation:
To aggregate by all possible labels use the special value '...' as the sole label name, for example:
# group_by: ['...']
# This effectively disables aggregation entirely, passing through all
# alerts as-is.
Thanks,
Connor
Hey @Swapna
Thanks for reaching out!
I have a sample config file I captured some time back which could be useful here on how to use dynamic priorities, let me know if this is useful!
global:
  resolve_timeout: 1m
  opsgenie_api_url: https://api.opsgenie.com/
  opsgenie_api_key: <my-top-secret-api-key>
receivers:
- opsgenie_configs:
  - responders:
    - name: "DailyBugle"
      type: "team"
    priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P1{{else if eq .Labels.severity "warning"}}P2{{else if eq .Labels.severity "info"}}P3{{else}}P4{{end}}{{end}}'
  name: opsgenie
route:
  group_by: ['...']
  receiver: opsgenie
  repeat_interval: 5m
Thanks,
Connor
Thanks Connor!
I had also come across this snippet which looked like it should work, and I have tried this just to double check the result, and indeed I am still seeing the same error in the alertmanager log.
What would be great is to get more insight into what OpsGenie has received in terms of the payload, rather than a message saying 'this payload isn't correct'. I want to essentially see how the various square brackets, escape symbols etc are being parsed, so that I can tweak accordingly.
The issue here is that I am using Helm, which itself runs the .yaml through a parser and is either stripping off the brackets (I think this is how it behaves) or doing something else. It's the combination of Helm and Alertmanager -> OpsGenie.
This is what I have configured and you can see what else I have tried commented out;
```
- name: OpsGenie-Test
  opsgenie_configs:
  - message: '[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{- end -}}] Alert: {{ .GroupLabels.alertname }} - Cluster: {{ .GroupLabels.environment }}'
    description: "{{ range .Alerts }}- {{ .Annotations.identifier }}\n {{ end }}"
    details:
      num_firing: "{{ .Alerts.Firing | len }}"
      num_resolved: "{{ .Alerts.Resolved | len }}"
    api_key: my_api_key
    priority: '{{ range .Alerts }}{{ if eq .Labels.severity "critical"}}P1{{else if eq .Labels.severity "warning"}}P2{{else if eq .Labels.severity "info"}}P3{{else}}P4{{end}}{{end}}'
    # priority: '[{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}]'
    # priority: "{{ range .Alerts }}{{ .Labels.priority }}{{ end }}"
    # priority: '[{{ range .Alerts }}{{ if eq .Labels.severity "critical" }}P1{{ else }}P2{{- end -}}]'
    # priority: '{{"{{"}} range .Alerts {{"}}"}}{{"{{"}} if eq .Labels.severity "critical" {{"}}"}}P1{{"{{"}} else if eq .Labels.severity "warning" {{"}}"}}P2{{"{{"}} else if eq .Labels.severity "info" {{"}}"}}P3{{"{{"}} else {{"}}"}}P4{{"{{"}} end {{"}}"}}'
```
All the other keys are being parsed ok - and you can see that a range of double quotes, single quotes, square brackets etc are being used.
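For reference, this is roughly what the escaped `{{"{{"}}` form does when one extra Go-template pass runs over it. Helm charts are rendered with Go templates, but whether this particular chart actually templates the Alertmanager values is an assumption that has not been confirmed here. The `render` helper and the sample group labels below are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render runs one text/template pass over src with the given data.
func render(src string, data interface{}) (string, error) {
	tmpl, err := template.New("pass").Parse(src)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Escaped form: each {{"{{"}} and {{"}}"}} is an action that prints a
	// literal pair of braces.
	escaped := `{{"{{"}} if eq .GroupLabels.severity "critical" {{"}}"}}P1{{"{{"}} else {{"}}"}}P2{{"{{"}} end {{"}}"}}`

	// First pass: stands in for an outer templating layer (assumed: Helm).
	inner, err := render(escaped, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(inner)
	// prints: {{ if eq .GroupLabels.severity "critical" }}P1{{ else }}P2{{ end }}

	// Second pass: stands in for Alertmanager, with hypothetical labels.
	data := map[string]map[string]string{
		"GroupLabels": {"severity": "critical"},
	}
	priority, err := render(inner, data)
	if err != nil {
		panic(err)
	}
	fmt.Println(priority) // prints: P1
}
```

The escaping only helps if two passes really happen. If the values reach Alertmanager untouched, its single pass would turn the escaped form into the literal template text rather than a priority, and that text would not be one of P1 to P5.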
Thanks so much for your help!
Swapna
Hi @Swapna
What you can do is use a test webhook endpoint to see how the payload is being sent. A common one I use for testing is https://beeceptor.com/
To send a test you can post info to the webhook endpoint via this
It's not the format that's used in Opsgenie, but we can use it as a platform to identify whether Helm is doing any parsing.
I have also verified that the config I sent you is in the correct format.
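If a hosted endpoint is not an option, a small local server can play the same role as the test webhook. The sketch below is a hypothetical stand-in and is not part of Beeceptor or Opsgenie; the port and the `formatPayload` and `dumpHandler` names are arbitrary choices. It logs whatever body it receives. Pointing a webhook receiver at it, or temporarily pointing `opsgenie_api_url` at it to capture the Opsgenie-formatted request, are assumptions that have not been verified against this setup.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log"
	"net/http"
)

// formatPayload pretty-prints body if it is valid JSON and returns it
// unchanged otherwise.
func formatPayload(body []byte) string {
	var pretty bytes.Buffer
	if err := json.Indent(&pretty, body, "", "  "); err != nil {
		return string(body)
	}
	return pretty.String()
}

// dumpHandler logs the method, path and body of every request so the
// rendered notification can be inspected, then replies with 200 OK.
func dumpHandler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	log.Printf("%s %s\n%s", r.Method, r.URL.Path, formatPayload(body))
	w.WriteHeader(http.StatusOK)
}

func main() {
	// Port 8080 is an arbitrary choice for this sketch.
	http.HandleFunc("/", dumpHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

Any field that arrives with template braces still in it, or with several priorities run together, would show up directly in the logged body.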
Hope this helps with identifying the issue!
Thanks,
Connor
Thanks for the suggestion, I will give that a go, it looks very useful!
Regards,
Swapna