Bitbucket Tests: Introducing the Fix Flaky test AI Agent

Hi Bitbucket Community!

We're excited to announce an AI agent to fix your flaky tests in one click. Here's Senior Engineering Manager, Rajkumar Singh, to tell you more:

Earlier this year, we launched Automatic Flaky Test Detection in Bitbucket Tests, enabling teams to spot and quarantine flaky tests right inside their pipelines, with no need to manually triage failures across thousands of test cases. That was a game-changer for visibility. But detection was only half the battle.

Today, we're delivering the other half:

Bitbucket Tests can now fix your flaky tests automatically — with a single click.

Powered by agentic pipelines, a pre-configured AI agent diagnoses the root cause, writes a targeted fix, and opens a pull request for your review — all automatically.

What this means for your team

No scripts. No plugins. No context-switching. Just click "Fix flaky test" on any flagged test, and the AI agent does the rest:

  • One-click remediation: Click "Fix flaky test" on any test marked as flaky. An agentic pipeline spins up, diagnoses the issue, and implements a fix.

  • PR-based workflow: The agent opens a draft pull request with a clear explanation of the root cause, the fix, and verification steps. You review and merge on your terms.

  • Bring your own agent: The built-in agent works out of the box, but you can define custom agents with your own prompts and coding standards for full control.

  • Human always in the loop: The agent never commits directly to your branch — every fix goes through a PR for review.

The result? Flaky tests go from detected to fixed in minutes, not days.

Getting started

If you're already using Bitbucket Tests with flaky test detection, you're almost there:

  1. Add the fix-flaky-test trigger and pipeline to your bitbucket-pipelines.yml (see the docs for a ready-to-use config snippet).

  2. Open the Tests tab in your repo and find any test marked as flaky.

  3. Click "Fix flaky test" — the agent takes it from there.

  4. Review the draft PR, approve, and merge.

New to Bitbucket Tests? Start with our introductory blog post to set up test tracking in your pipelines first.

Share your feedback

AI-powered flaky test remediation is live in open beta for all Bitbucket Pipelines users. Give it a spin and tell us what you think: your feedback directly shapes what comes next.

2 comments

Carter Gray
I'm New Here
May 25, 2026

This looks really useful, especially for teams dealing with large test suites where flaky tests eat up a ton of debugging time. I like that the fixes still go through a PR review instead of auto-merging changes directly into the branch. The “bring your own agent” part is probably the most interesting feature here since every team has different coding standards and test patterns. Curious to see how well it handles more complex flaky cases like timing issues or external dependency failures in real-world projects.

Rajkumar Singh
Atlassian Team
May 26, 2026

Thanks @Carter Gray for the thoughtful feedback! Glad the PR-based workflow resonates. Keeping a human in the loop was a deliberate design choice, not a constraint: we consciously didn't want the agent to start merging code without review. That said, we'd love to hear your thoughts.

On the complex cases you mentioned — timing issues, race conditions, external dependency failures — the agent has actually been tested and trained extensively on these exact scenarios. They're some of the most common flakiness patterns we see, so handling them well was a core goal from day one. The root-cause diagnosis step gives the agent enough context to go beyond surface-level fixes (like blind retries) and apply more targeted remediations.

Would love to hear how it performs on your test suites — real-world feedback on edge cases is what shapes where we take it next.
