Further: the information presented on the status page shows no progress whatsoever. Check the screenshot below.
I am fairly certain progress is being made behind the scenes, but please inform us so we can in turn inform our customers about what's going on. Compared to other providers of professional services, the information level here is abysmal. As a former president would put it: Sad!
So Day 6:
35% of affected customers have now been restored in some way. That is +4% compared to yesterday. Extrapolating that restore speed, the last customers would be restored in (100% - 35%) / 4% per day = 16.25, so roughly 16 days' time.
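For anyone who wants to redo this back-of-the-envelope estimate as the numbers change, here is a minimal sketch. It assumes the restore rate stays linear at the latest daily gain, which may well not hold, and the function name is my own, not anything from Atlassian.

```python
def days_until_fully_restored(restored_pct: float, daily_gain_pct: float) -> float:
    """Linear extrapolation: remaining percentage divided by the daily gain.

    Assumes the restore pace stays constant, which is only a rough guess.
    """
    if daily_gain_pct <= 0:
        raise ValueError("daily gain must be positive to extrapolate")
    remaining_pct = 100.0 - restored_pct
    return remaining_pct / daily_gain_pct


# Day 6 figures from the status updates: 35% restored, +4% since yesterday.
estimate = days_until_fully_restored(restored_pct=35, daily_gain_pct=4)
print(f"Estimated days until the last customer is restored: {estimate:.2f}")
```

With the Day 6 figures this gives 16.25 days, the "roughly 16 days" above.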
I find the following sentences from today's support ticket update rather annoying:
This incident was not the result of a cyberattack and there has been no unauthorized access to your data. As part of scheduled maintenance on selected cloud products, our team ran a script to delete legacy data. This data was from a deprecated service that had been moved into the core datastore of our products. Instead of deleting the legacy data, the script erroneously deleted sites, and all associated products for that site including connected products, users, and third-party applications. We maintain extensive backup and recovery systems, and there has been no data loss for customers that have been restored to date.
Let me ask the Atlassian team: who is responsible for Disaster Recovery and how often do you test the DR process?
It's becoming apparent that they had planned for a complete loss of their data and not for partial loss of data.
This is an architectural problem as well as a DR disaster. Since a subset of their users were impacted, and they probably didn't know it had happened until a period of time had elapsed, they couldn't fall back on (what I presume is their) DR plan of simply restoring everything back to the previous set point.
So instead, they are sacrificing us and manually rebuilding our sites one at a time. They are not restoring from backups. They are manually copying data from a restored instance into the prod system. Which is SUPER SCARY.
At least that's my guess based on the information coming out.
We're taking steps in our company to stand up an alternative system so we can meet our release dates. I really feel for the companies who relied on this service for their livelihood.
We are (were?) Insight users, but I think Atlassian have had the last of any money from me. Not for the outage, more for the completely amateur way it's been handled. I can no longer trust them with any of our processes.
How can a company write ~20 emails that say nothing?
How can you need "hundreds of engineers" to fix "a handful of sites"?
This whole thing stinks.
I have tried submitting a request on Atlassian support, but it will not allow me to do so because our site is down. This is ridiculous - our entire organization depends on this and it has been down for ~36 hours with no information about when it is expected to be restored.
I would recommend handing out a copy of the book Radical Candor to the Atlassian support management team.
I am unable to report an issue because the URL for our Jira Service Management site is not recognized (due to the error we are all experiencing).
So please help those of us who are impacted with a solution other than the ones you are proposing!
Our URL is: https://hwbgsupport.atlassian.net
So my group just had our conversation with Trisha and Diane from Atlassian, and here are my takeaways.
Once we are notified that our data is being restored, all of the components will be restored at the same time (Confluence, Service Management, and Insight in our case). Once notification occurs it will take roughly 4 to 5 days before we are able to access our site.
Due to how the data was deleted, the site and URL were also deleted. The site and URL will be recreated/restored, and we were told that we are on their radar even though our URL is no longer visible.
Atlassian is restoring in batches of 60 customer sites, but they are unable to tell us what batch we are a part of or when to anticipate the initial email stating that recovery of our site is beginning.
The restoration is a manual process, and they are re-adding the deleted sites into the data stores of the unaffected sites. When we asked about the possibility of restoring all 400 affected sites to their own data store to get users up quicker, we did not get an answer.
So for now, we will wait until our ticket number is called (an email stating data restoration has begun) and then head up to the counter to wait the 4 to 5 days until we can begin to access/verify our data.
Hello,
We apologize for the length of this ongoing incident and for not being more proactive in our communication with you. We understand how mission-critical our products are for you and want to make sure we are relaying the most accurate information possible.
While we really want to get in front of you live to answer your questions, we are prioritizing getting customers up and running first and foremost. We will host an AMA (Ask Me Anything) after we get all of our customers fully restored.
In the meantime, please add your questions here and we will respond as quickly and transparently as we can. Some questions may not be answered until we do an official PIR, but we will let you know that and answer as much as we can now.
Our Chief Technology Officer Sri Viswanath has posted a blog about this incident with more background and details, and our team is working around the clock to move through the various stages for restoration.
We are working 24/7 to restore your service. Thank you to those who have been sharing all the information you know from your support requests. Thank you for being open, honest, and caring for your fellow Community members. This speaks a lot about what makes Community a special place.
@Olimpia Estela Cáceres-Brown Thank you for raising this post.
@Shane Doerksen Thank you for being open and honest with your feedback and updates.
@Karla Thank you for the feedback and updates.
@Ulf Sahlin We hear you and appreciate your feedback and updates.
@paul.fritz Thank you for the honest, direct, and needed feedback.
@Karim Abrik We've heard your feedback and appreciate it.
Lastly, if you're unsure if a support request was raised on your behalf, please let me know, and I will personally check. My team and I went through all Community posts related to this incident to ensure everyone's site had a support request.
Regards,
Stephen Sifers | Product Lead, Community
Were any of you guys using the old Insight and were forced to move to the new/built-in one? The end-of-life of the old Insight was March 31, 2022.
We moved from the old to the new Insight and we are affected by this incident.
And 2 points to Gryffindor for @Ulf Sahlin, who worked out 4 DAYS AGO that the Insight retirement was to blame for all of our outages. As far as I’ve seen, that was only actually confirmed by Atlassian today. The amount of deductive reasoning that I’ve seen from this forum's community members has been nothing short of impressive. Thank you.
@Olimpia Estela Cáceres-Brown -
I would recommend submitting a formal support request to Atlassian Support (https://support.atlassian.com/) for direct assistance.
Best, Joseph Chung Yin
Jira/JSM Functional Lead, Global Infrastructure Applications Team
Viasat Inc.