
Incident communication is *hard*: FullStory’s Head of Support shares 4 ways to ease the pain

shannyshan Atlassian Team Jun 18, 2019

If you’ve ever been tasked with writing a status update when things are on fire, you know that it’s hard. It’s often urgent and stressful, and it requires conveying clear and accurate information (often before you’re clear on the details yourself).

When we saw this tweet from Ben McCormack, Head of Support at FullStory, we knew we had to dig into the topic further:

[Image: tweet from Ben McCormack]

What makes one-to-many communication so much harder than one-to-one communication? And how can we ease this angst for folks tasked with the job of getting the word out to customers when things go wrong?

While (thankfully) incidents are rare for FullStory, Ben and his team have developed helpful practices to ensure downtime doesn’t equal disaster for their support team or their customers. He was kind enough to sit down with us and share their techniques for combating stress and writer’s block during an incident:

1. Define the situations that warrant customer communication (before an incident strikes)

When things are on fire, you don’t want to waste time determining whether you should communicate the problem to customers. This ‘should we/shouldn’t we’ debate is harmful in a couple of ways: 1) It keeps customers in the dark longer than necessary if you end up determining that comms are needed, and 2) It causes internal confusion that could have been sorted out before the incident, saving your team time and strife.

Ben’s team recognized this and decided to create a “Statuspage Constitution.” This constitution is essentially a document that lists the things that must be true in order for the team to post to Statuspage during an incident. The questions focus on incident severity, expected duration, and customer impact:

  • A core piece of FullStory functionality is broken, nonfunctional, or experiencing significant performance degradation.
  • The incident persists over a non-negligible period of time.
  • The incident impacts a large number of users/customers.

For each bullet point, the Statuspage Constitution includes further definition (e.g. what does “large number of customers” mean?) and examples of what might qualify or be disqualified. It’s a lot of work up front, but the added clarity lets them move fast if something comes up.

As soon as the team is alerted about an issue, they answer the questions together in Slack. If they determine that communication is necessary, they spring into action and get the right people on the communication front lines.
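The checklist above can be sketched as a small decision function. This is only an illustration, not FullStory's actual tooling: the field names, the function name, and both thresholds are hypothetical placeholders, since the article does not say how the team defines “non-negligible” or “large number.” The one thing taken from the source is that every criterion must be true before posting.

```python
from dataclasses import dataclass


@dataclass
class IncidentAssessment:
    """Answers to the three constitution questions (field names are hypothetical)."""
    core_functionality_degraded: bool  # broken, nonfunctional, or significantly slow
    minutes_elapsed: int               # how long the incident has persisted so far
    affected_customers: int            # best current estimate of customer impact


# Placeholder thresholds: the article does not give FullStory's real values.
MIN_DURATION_MINUTES = 10
MIN_AFFECTED_CUSTOMERS = 100


def should_post_status_update(a: IncidentAssessment) -> bool:
    """Return True only when all three criteria hold at once."""
    return (
        a.core_functionality_degraded
        and a.minutes_elapsed >= MIN_DURATION_MINUTES
        and a.affected_customers >= MIN_AFFECTED_CUSTOMERS
    )
```

For example, `should_post_status_update(IncidentAssessment(True, 25, 4000))` returns `True` under these placeholder thresholds, while the same incident caught two minutes in would not yet qualify. Writing the thresholds down ahead of time is the point: the debate happens when the constants are chosen, not during the incident.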

2. Make incident communication part of your on-call schedule

It’s easy to let communication fall by the wayside during incident response. The dev team is focused on fixing the problem, while the support team is trying to handle the surge in inbound tickets and emails.

The FullStory team ensures communication remains front and center by embedding their support team into their on-call process. There are always support team members on call alongside the dev team, and at least one ‘Incident Hugger’ gets paged when comms are needed. The Incident Hugger’s job is to stay heads-down on customer communication, allowing engineers to stay focused on resolving the incident.

3. Practice, practice, practice

When you work on a support team, communicating with customers on support tickets becomes second nature. Since incidents (hopefully) aren’t an everyday thing, status update practice is not inherent in the role.

As Ben told us:

“To actually sit down and write a status update – even if you’re an expert at customer communication – is such a different type of communication and audience you’re trying to speak to. You can’t rely on your expertise in other types of comms.”

That’s why setting aside time to practice writing incident updates is so crucial. Ben’s team recommends holding mock incidents or fire drills to get your team comfortable with writing under pressure. They are even working on a series of playbooks for incident response, which will include incident communication fire drills.

4. Breathe, and focus on your goals

Even the most practiced team will encounter stressful situations that they may not feel ready for. In these cases, Ben recommends pausing, taking a breath, and refocusing on your goal. For the FullStory team, the goal is to deliver quick and accurate information to customers while precluding additional follow-up. If they are trying as hard as they can to meet this goal, the rest should fall into place.


💡 How do you ease the pain of incident communication? Have any of Ben’s tips worked well for you? Let us know in the comments.

 

 
