We had an incident recently that Statuspage shows as affecting three components, when in reality it only affected one. Is there a way to edit an incident and correct which components were affected? It doesn't seem possible: the edit icons on the various stages of the incident only allow editing the text associated with them. We've backfilled an incident on a specific component that was affected but wasn't showing in the histograph, but this did not resolve the original issue.
Hi Cory - Welcome to Statuspage!
It's not possible to edit the components that were included in an Incident, and it would not be ideal to allow this to happen.
Those entries in an incident are part of the historical record, and show which components were included and which ones notifications were sent for. If you were able to edit which components were included in an incident, the notifications sent for the affected components would differ from what the incident shows, and this would imply that your incident reporting is unreliable.
A better way of handling this would be to note in your resolved message, or in your postmortem, that these components were not actually affected and remained available during the incident.
What you can do is edit/correct the uptime for that component, by editing the component and editing the historical uptime data - as detailed in https://community.atlassian.com/t5/Statuspage-articles/Edit-your-historical-uptime-data-for-a-more-accurate-Uptime/ba-p/961136
Cheers,
Scot.
Hi Scot,
Completely disagree with this. When an incident occurs it's often unclear what the affected component actually is. An incident manager can guess it's about a specific component or set of components, but it's not until later in the incident that they realize, no, it was something else. That's why there's the investigating phase, right? We are trying to identify the cause and not the symptoms.
Yes, we can edit the component to adjust the downtime, but then on the UI you'd hover over a red block or red line and have no related incidents. Why was this set of days red? No idea, because there's no information about it. I can't read the post-mortem because there is none. There's no association with an incident.
The idea that you're divorcing the visual UI from the intent of the UI and asking folks to instead just plug a bunch of text into the post-mortem is really short-sighted. One core audience for Statuspage is our executive team. They need easy-to-follow data that very clearly links why an outage occurred on a given day. Asking them to dig through post-mortems isn't a winning strategy.
I love most of the Atlassian stack, but it's bizarre Atlassian would draw the line on allowing us to edit the component. Even if you guys don't think it's best practice - so what? Give your customers the option to make that determination. I can't give a true and accurate representation of what really happened when I can't even go back and correct a set of components.
Without this feature, I don't see how I can rely on Statuspage to actually produce accurate data. If Atlassian applied this logic to JIRA, why bother having custom fields? After all, if a customer changed the custom field it would potentially make the report unreliable. Come on guys. You're overthinking this.
Hi Tim!
You're absolutely able to update an Incident while it is in progress: you can include new components, remove components, and update the status of each component at any stage. You're completely correct in thinking that while the incident is in the Investigating phase it should be feasible for you to modify which components are being affected by the incident, and we support that by allowing you to update the incident to change these details.
Our product designers made the decision to not allow you to edit an incident once it is closed, because it would be a confusing experience for users (especially SMS subscribers) to receive notices after they have been told that an incident is over. There's also the issue that your users will have received incident notifications at various points in time, and if it were possible to modify an incident post closure, these notifications would not match what the incident says happened.
We encourage our customers to use the Monitoring status for incidents until they are sure everything has been identified and commented on, before they close them out and turn them to a read-only status.
Thinking about this as a feature request, where a customer should be able to add a component as affected to a closed-out incident: we would consider that functionality to be available to you by creating a backfilled incident which includes those other components, so they are associated with an incident.
The other feature request I believe may be warranted here is that you should be allowed to modify the start time for a component when it is added to an existing incident. Currently, when we add a component to an incident, the outage for that component starts from the point we update the incident, but your discussion indicates that you want to be able to set this to an earlier point in time (perhaps the incident start). The other side of this is that if I update an incident to remove a component, there should be the option to remove the outage associated with the incident, since the component wasn't being affected by the incident.
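To make the two requested behaviours concrete, here is a rough sketch of how per-component outage windows could be derived from an incident's updates. This is a toy model and not Statuspage's actual implementation or API: the `Update` class, the `outage_windows` function, and the `backdate_to_start` and `drop_removed` flags are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Update:
    """One incident update (hypothetical model, not a Statuspage type)."""
    at: datetime
    components: frozenset  # components marked as affected as of this update


def outage_windows(updates, resolved_at, backdate_to_start=False, drop_removed=False):
    """Return {component: [(start, end), ...]} for one incident.

    Default behaviour mirrors what is described above: a component's outage
    starts at the update that added it. The two flags model the requested
    options and are assumptions, not existing product features.
    """
    windows = {}
    open_since = {}
    incident_start = updates[0].at
    for u in updates:
        # Components newly added by this update open an outage window.
        for c in u.components - set(open_since):
            open_since[c] = incident_start if backdate_to_start else u.at
        # Components removed by this update close (or discard) their window.
        for c in set(open_since) - u.components:
            start = open_since.pop(c)
            if not drop_removed:
                windows.setdefault(c, []).append((start, u.at))
    # Anything still affected stays in outage until the incident resolves.
    for c, start in open_since.items():
        windows.setdefault(c, []).append((start, resolved_at))
    return windows
```

With an incident opened at 10:00 on `api`, an update at 10:30 adding `db`, and resolution at 11:00, the default gives `db` a window starting at 10:30, while `backdate_to_start=True` would start it at 10:00. Likewise, `drop_removed=True` would discard the recorded outage for a component that was later removed from the incident.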
Would these features cover the scenarios you are envisioning?
Scot.
I've hit the exact scenario that @Tim Fostik has explained. Understanding the correct service impact could take a while post resolution. There should be a few roles that can add/update/remove status page service components with the "Do not notify" option. This way corrections can be made in the backend for accurate service reporting and users don't have to be notified.
PS: I did try to backfill an incident and currently there's no option to choose a service component.
There are plenty of good examples of why being able to edit and change anything outside of an incident makes total sense.
If there is a failure in some aspect of automation or process or human error, then an incident could be updated (without notifications so as not to confuse, we can already do that) so that the statuspage accurately represents that incident, for the record.
Statuspage isn't just an in-the-moment tool, it also acts as an historical record for others who are doing root cause analysis and want to correlate possible causes across services.
It's really unclear why this isn't just left to the service provider to manage how they see fit. Ideally, we'd just be enabled to configure and edit whatever we want, like we do throughout the rest of your products. That would be the most consistent admin experience.