Curious how other teams handle Definition of Done - sharing what worked for us after three failed at

May 28, 2026

What I learned about Definition of Done after three failed attempts

Six years ago I joined a team where every sprint retro had the same item near the top: "tickets keep getting reopened next sprint." We tried writing acceptance criteria more carefully. We added a QA column. We made the PO sign off twice. None of it stuck for more than a couple of sprints.

The actual fix took us three tries. Two of them made things worse before they got better. This is what I wish someone had told me before we wasted those six months.

Attempt 1: the Confluence page nobody read

Our first DoD lived in Confluence. We wrote it at an offsite, everyone agreed, the page got a banner that said "READ BEFORE CLOSING ANY TICKET." Three sprints later I asked the team in standup who'd actually looked at it that week. One person had. They were new.

The Confluence-DoD failure mode is so common it should have a name. The page exists, the team "has" a DoD, leadership can point at it. But the page is not where work happens. Work happens on the Jira issue. Anything that lives somewhere else might as well not exist.

We deprecated the page in month four. We just stopped pretending.

Attempt 2: the checklist plugin that broke us slowly

We installed a Jira checklist plugin. Every Story now had a 12-item DoD checklist attached. Tick the boxes before transitioning to Done. The plugin would even prevent the transition if anything was unchecked.

For about three weeks this worked. Then I noticed everyone was ticking all twelve boxes in about four seconds. The QA box was always ticked. The "PR approved" box was always ticked. The "documentation updated" box was always ticked.

When I dug into a couple of stories that had been "done" but reopened, the checklist for one of them had been completed by the developer about two minutes after they created the ticket. They hadn't done the work yet. The checkboxes don't know.

That's the second failure mode. A manual checklist depends on people honestly ticking boxes. The reviewer-of-record always says yes, because saying no creates friction. Boxes get ticked reflexively. The signal-to-noise ratio ends up worse than with the wiki page, because now the boxes are loudly saying "done" while reality drifts.

We kept the plugin for another two sprints out of sunk cost, then turned it off.

Attempt 3: making the boxes evaluate themselves

The breakthrough was obvious in hindsight. Most of our DoD criteria aren't opinions. They are facts that Jira already knows.

"All subtasks done"? Jira knows this. Don't ask a human to confirm.

"Resolution field is set"? Jira knows. Don't ask.

"PR linked and merged"? The Development panel knows. Don't ask.

"No blocking issues still open"? Issue Links knows. Don't ask.

Once we stopped putting a human between Jira and the answer, the dishonest-ticking problem disappeared. The check is either true or it isn't. Nobody can lie to themselves about whether the subtasks are actually Done. The query returns what it returns.

For criteria that genuinely require human judgement (more on those below), we kept the manual checkbox. The point isn't "everything must be auto." The point is to stop forcing humans to tick boxes for things the system already knows.
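To make "facts that Jira already knows" concrete, here is a minimal Python sketch of four of those checks written as plain functions over an issue payload. The dict shape (fields.status.statusCategory.key, fields.resolution, fields.subtasks, fields.issuelinks with an inwardIssue) follows my understanding of what the Jira Cloud REST API returns for an issue, so treat it as an assumption and verify it against your own instance. The function names and the DEMO-* keys are made up for illustration. The "PR linked and merged" check is left out because the Development panel data does not come back in this payload.

```python
from typing import Any, Dict

Issue = Dict[str, Any]


def status_is_done(issue: Issue) -> bool:
    """True when the issue's status belongs to the Done status category."""
    return issue["fields"]["status"]["statusCategory"]["key"] == "done"


def resolution_is_set(issue: Issue) -> bool:
    """True when the Resolution field holds any value."""
    return issue["fields"].get("resolution") is not None


def all_subtasks_done(issue: Issue) -> bool:
    """True when every subtask is Done (also true when there are none)."""
    return all(status_is_done(sub) for sub in issue["fields"].get("subtasks", []))


def no_open_blockers(issue: Issue) -> bool:
    """True when no 'is blocked by' link points at an issue that is not Done."""
    for link in issue["fields"].get("issuelinks", []):
        blocker = link.get("inwardIssue")
        if blocker is None:
            continue  # outward link: this issue affects another one
        if link["type"]["inward"] == "is blocked by" and not status_is_done(blocker):
            return False
    return True


def evaluate(issue: Issue) -> Dict[str, bool]:
    """Run every auto-check and return the result per criterion."""
    return {
        "Status is in the Done category": status_is_done(issue),
        "Resolution is set": resolution_is_set(issue),
        "All subtasks Done": all_subtasks_done(issue),
        "No open blockers": no_open_blockers(issue),
    }


def _status(category_key: str) -> Dict[str, Any]:
    """Helper for building the hypothetical example payload below."""
    return {"statusCategory": {"key": category_key}}


# Hypothetical issue: dragged to Done, but no Resolution and an open blocker.
example: Issue = {
    "key": "DEMO-1",
    "fields": {
        "status": _status("done"),
        "resolution": None,
        "subtasks": [{"key": "DEMO-2", "fields": {"status": _status("done")}}],
        "issuelinks": [
            {
                "type": {"inward": "is blocked by"},
                "inwardIssue": {
                    "key": "DEMO-3",
                    "fields": {"status": _status("indeterminate")},
                },
            }
        ],
    },
}

results = evaluate(example)
```

Nobody ticks anything here. The example issue fails "Resolution is set" and "No open blockers" because of what the payload says, regardless of what anyone believes about the ticket.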

What our DoD looks like now, with the reasoning

After three years of iteration, here's where we landed. Six of the ten criteria are auto-evaluated against Jira state; the other four are manual.

1. Status is in the Done category. Auto. Sounds trivial, but in practice a chunk of "I dragged it to the Done column" stories were never transitioned through the workflow. Status-category check catches them.

2. Resolution is set. Auto. If you don't set Resolution on Done, your velocity reports lie and your "show me done stories" JQL is broken. We forced this years ago and it was the single biggest reporting fix we ever made.

3. All subtasks Done. Auto. A parent issue can't be more done than its children.

4. Acceptance criteria verified. Manual checkbox. The "verified" part requires a human looking at the running thing and saying yes. Auto-checks can't catch this.

5. PR linked and merged. Auto via Development panel. Caught a surprising number of stories where the dev forgot to actually merge.

6. Tests added or updated. Manual. Auto-checking that the right tests exist is hard. We ask the reviewer to confirm.

7. Documentation updated. Manual checkbox, with a comment linking to the doc. If a story touches user-facing behaviour, there's a public docs check. If it's purely backend infrastructure, we skip it for that issue type.

8. Deployed to staging. Auto via Fix Version or a deploy label. "Done" without a deploy is "merged" — different state.

9. No "blocked by" links to open issues. Auto. Catches the edge case where a dependency got marked Done late and we close the dependent without re-checking.

10. Stakeholder sign-off. Manual. For the work the PO specifically requested. We track this with a label the PO adds, not a checkbox a dev can tick. Different incentives, different ownership.
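The list above can also be written down as data, which makes the auto-versus-manual ownership explicit. This is a rough Python sketch, not how any particular app stores it: the Criterion class, the DOD list and the unmet and can_close helpers are names I made up for the example, and the "source" strings just paraphrase the list.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    name: str
    mode: str    # "auto" (read from Jira state) or "manual" (asserted by a person)
    source: str  # where the answer comes from


DOD: List[Criterion] = [
    Criterion("Status is in the Done category", "auto", "status category"),
    Criterion("Resolution is set", "auto", "Resolution field"),
    Criterion("All subtasks Done", "auto", "subtask statuses"),
    Criterion("Acceptance criteria verified", "manual", "checkbox"),
    Criterion("PR linked and merged", "auto", "Development panel"),
    Criterion("Tests added or updated", "manual", "reviewer confirmation"),
    Criterion("Documentation updated", "manual", "checkbox plus comment linking the doc"),
    Criterion("Deployed to staging", "auto", "Fix Version or deploy label"),
    Criterion("No blocked-by links to open issues", "auto", "issue links"),
    Criterion("Stakeholder sign-off", "manual", "label added by the PO"),
]


def unmet(results: Dict[str, bool]) -> List[str]:
    """Names of criteria that are false or missing; no result counts as unmet."""
    return [c.name for c in DOD if not results.get(c.name, False)]


def can_close(results: Dict[str, bool]) -> bool:
    """An issue is closable only when every criterion has a true result."""
    return not unmet(results)


# Hypothetical story: everything holds except the reviewer's test confirmation.
results = {c.name: True for c in DOD}
results["Tests added or updated"] = False

blocking = unmet(results)
```

Treating a missing result as unmet is a deliberate choice in this sketch: a criterion nobody evaluated should not count as passed.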

What I'd do differently knowing what I know now

Two things.

First, I'd start smaller. We launched the new system with twelve criteria. We could have started with four — Status, Resolution, Subtasks Done, PR linked — and added the rest over a quarter. A 12-item DoD on day one is too much process for a team that hasn't built the habit yet. People look at it and either ignore it or game it.

Second, I'd visually separate the auto-criteria from the manual ones from the start. We mixed them in the same checklist for almost a year. The result was that nobody distinguished between "the system confirmed this" and "a human said this is true" — the cognitive weight felt identical. Now we show them in different sections. Auto-criteria show as a green dashboard. Manual criteria are a deliberate "I am asserting this is true" action. Different visual treatment for different types of trust.
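As a small illustration of that separation, here is a Python sketch that renders the same results in two sections instead of one mixed checklist. The section titles, the tuple layout and the render function are assumptions for the example; the real display in our setup is a dashboard, not text.

```python
from typing import List, Tuple

# (criterion name, "auto" or "manual", passed)
Result = Tuple[str, str, bool]

SECTIONS = [
    ("auto", "Confirmed by Jira"),
    ("manual", "Asserted by a person"),
]


def render(results: List[Result]) -> str:
    """Group results by who vouches for them and return a plain-text report."""
    lines: List[str] = []
    for mode, title in SECTIONS:
        lines.append(title)
        for name, result_mode, passed in results:
            if result_mode == mode:
                lines.append(f"  [{'x' if passed else ' '}] {name}")
    return "\n".join(lines)


report = render([
    ("Resolution is set", "auto", True),
    ("PR linked and merged", "auto", False),
    ("Acceptance criteria verified", "manual", True),
])
```

The grouping matters more than the formatting. A reader can see at a glance which ticks came from the system and which came from someone's assertion.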

The honest tradeoffs

Two things I haven't solved.

One: auto-criteria are only as good as the Jira fields they read. If your team is sloppy about updating Fix Version, the "deployed" check is junk. The system can't compensate for upstream data quality, and there's no clever fix for that. You can either commit to updating the field or accept the criterion doesn't work for you.

Two: not every team has the granularity to auto-check. If you don't use subtasks, the "subtasks Done" criterion is irrelevant. If your repo isn't connected to the Jira Development panel, "PR linked" can't fire. Auto-evaluation has prerequisites that aren't free.

For the parts of the org that couldn't auto-check, we kept the manual checklist. At least they knew it was manual, and we revisited it quarterly to see what could migrate to auto. Two of our teams have moved 80% of their criteria over in the last year. One team is still mostly manual and probably will be forever, which is fine.
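One way to make those prerequisites explicit is to write them down per team and derive each criterion's mode from them. The Python sketch below is hypothetical: TeamSetup, its three flags, and the plan function are invented for the example, and a real setup would have more prerequisites than these.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TeamSetup:
    uses_subtasks: bool
    repo_connected: bool         # repo connected to the Jira Development panel
    maintains_fix_version: bool  # Fix Version is reliably updated on deploy


# criterion -> (prerequisite flag, what to do when the prerequisite is missing)
PREREQUISITES: Dict[str, Tuple[str, str]] = {
    "All subtasks Done": ("uses_subtasks", "skip"),
    "PR linked and merged": ("repo_connected", "manual"),
    "Deployed to staging": ("maintains_fix_version", "manual"),
}


def plan(team: TeamSetup) -> Dict[str, str]:
    """Decide per criterion whether it runs as auto, manual, or is skipped."""
    modes: Dict[str, str] = {}
    for criterion, (flag, fallback) in PREREQUISITES.items():
        modes[criterion] = "auto" if getattr(team, flag) else fallback
    return modes


# Hypothetical team: no subtasks, repo connected, Fix Version not maintained.
team_plan = plan(TeamSetup(uses_subtasks=False, repo_connected=True,
                           maintains_fix_version=False))
```

Re-running a plan like this at the quarterly review gives a simple list of which criteria are still manual and which prerequisite is holding each one back.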

A question for the community

Curious how other teams handle the auto-versus-manual split. Where do you draw the line? Any criteria you wish you could auto-evaluate but can't because of how your repo or workflow is set up?

The criterion I keep going back and forth on is "test coverage didn't drop." We can read it from the CI, but plumbing it into Jira is annoying. If anyone has a clean pattern for that, I'd love to hear it.

If the longer reference version with the Jira-field mapping per criterion is useful, I wrote up the full table over on our team's site: Definition of Done template for Jira. Disclosure: I build ReDo, the Jira app that grew out of practising this pattern.
