Is AI-coding making us faster or just more expensive? Token costs, rework rates, and PR cycle time

May 21, 2026

How do you know that AI-assisted coding is efficient?

Because the number of releases increased, often at decent quality? Makes sense, but do you know at what cost (how many dollars were spent on feature development) and why, for example, $1,000/month was spent instead of $500? How do you know that in cases involving

1. scalability, maintainability, or security

2. race conditions, distributed system failures, infrastructure interactions, flaky tests, or performance

3. institutional/tribal knowledge

4. crypto, auth flows, compliance-heavy logic, financial systems, or privacy-critical areas

and under certain conditions (timelines, quality, resources), you needed to use AI-coding at all?

Misuse of AI in many of those areas can easily cost $15,000/month.

Tokenmaxxing and distorted incentives

At the same time, many organizations adopted a tokenmaxxing approach, which led to:

• “Amazon employees admit to using AI unnecessarily to pump up internal usage scores — workers complain of intense pressure to use AI tools”

• “Meta employees used a total of 60.2 trillion AI tokens in 30 days, within the same tokenmaxxing initiative Microsoft burned a lot of AI tokens”

• The message Salesforce sends to staff: “use a minimum of $170/month tokens or be flagged.”

Meanwhile, others ask these kinds of questions:

Head of Engineering at Shopify, Farhan Thawar:

• "I want to see why they spent say $1,000 a month in credits for Cursor. Maybe that’s because they’re building something great and they have an agent workforce underneath them!"

AI Code Pulse: measuring AI usage in context

These are the problems that AI Code Pulse solves.

The app associates:

• Token input

• Token output

• Token Cache Create

• Token Cache Read

• Token Cost ($)

with arbitrary nesting structures:

• AI Model (Model → Repo → File → Author → Date)

• Author (Author → Repo → File → Date)

• Repo (Repo → File → Author → Date)

• File (File (repo) → Author → Date)

• Task (Task → Repo → File → Author → Date)

• Epic (Epic → Task → Repo → File → Author → Date)

• Date (Date → Repo → File → Author)

This allows organizations to analyze AI usage in context rather than as an isolated metric.
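As a minimal sketch of how such a nesting could be computed, the snippet below sums token cost into a tree keyed by any ordered list of dimensions. The `UsageRecord` shape, its field names, and the sample values are assumptions for illustration, not AI Code Pulse's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical record shape; field names are assumptions, not the product's schema.
    model: str
    author: str
    repo: str
    file: str
    task: str
    date: str  # ISO date, e.g. "2026-05-01"
    cost_usd: float


def rollup(records, levels):
    """Sum cost_usd into a nested dict keyed by the given attribute names, in order."""
    tree = {}
    for rec in records:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(getattr(rec, level), {})
        leaf = getattr(rec, levels[-1])
        node[leaf] = node.get(leaf, 0.0) + rec.cost_usd
    return tree


def total(node):
    """Total cost under any node of the rollup tree."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


# Made-up sample data.
records = [
    UsageRecord("model-a", "alice", "api", "auth.py", "PROJ-1", "2026-05-01", 1.50),
    UsageRecord("model-a", "bob", "api", "auth.py", "PROJ-1", "2026-05-02", 0.50),
    UsageRecord("model-b", "alice", "web", "app.ts", "PROJ-2", "2026-05-02", 2.00),
]

# Task → Repo → File → Author → Date
by_task = rollup(records, ["task", "repo", "file", "author", "date"])
# Author → Repo → File → Date
by_author = rollup(records, ["author", "repo", "file", "date"])
```

The same `rollup` call serves every grouping in the list above; only the order of `levels` changes.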

The screenshot shows the Task grouping (Task → Repo → File → Author → Date) associated with Token Cost ($):

To understand what drives the cost, you can use the following approaches, depending on your case:

1. correlating JIRA tasks or commits with AI cost

2. benchmarking similar classes of tasks or epics

3. analyzing cost per repository or per file

4. comparing usage patterns across engineering seniority levels in correlation with features
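Benchmarking similar classes of tasks can be sketched as follows: compare each task's AI cost with the median of its class and flag the ones that are far above it. The class labels, the sample costs, and the 2x threshold are all assumptions chosen for illustration.

```python
from statistics import median


def flag_outliers(task_costs, task_class, factor=2.0):
    """Return task ids whose cost exceeds `factor` times the median of their class.

    task_costs: {task_id: cost_usd}
    task_class: {task_id: class label}; labels are hypothetical, e.g. taken from Jira.
    """
    by_class = {}
    for task_id, cost in task_costs.items():
        by_class.setdefault(task_class[task_id], []).append(cost)
    baselines = {label: median(costs) for label, costs in by_class.items()}
    return sorted(
        task_id
        for task_id, cost in task_costs.items()
        if cost > factor * baselines[task_class[task_id]]
    )


# Made-up sample data.
costs = {"PROJ-1": 10.0, "PROJ-2": 12.0, "PROJ-3": 40.0, "PROJ-4": 8.0}
classes = {
    "PROJ-1": "crud-endpoint",
    "PROJ-2": "crud-endpoint",
    "PROJ-3": "crud-endpoint",
    "PROJ-4": "migration",
}
expensive = flag_outliers(costs, classes)
```

A flagged task is not necessarily wasteful; it is a prompt to look at why it cost more than its peers.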

The spending-by-model view looks like this:

Tokenmaxxing advocates might want to group by “Authors” to see the total token cost per author (or set the metric to Total tokens to see token consumption by author).

However, token economics alone still don't answer the next engineering question.

Even in some organizations with strong engineering hygiene and culture, PRs get overlooked because of their density, or sometimes simply because there are too many of them.

In some less mature companies, the pace of AI adoption makes the problem even more pressing: engineers tend to be less conscious of what they write and how it affects the overall system or future maintainability. Many engineers minimize self-testing, over-rely on AI, become less diligent, or are simply pushed in that direction by managers. There is even a term for it: feature factory. But if you take away the SDLC practices that a fair number of people advocate for, you can easily end up with a mess that you will have to pay off later, just as it has always been.

So, what might the criterion be?

Rework rate: how much code is wasted rather than surviving, and how many missing implementation parts have to be added afterwards. It is one of the few crucial signals for measuring AI coding efficiency. It must be understood in the context of a company, but a high rework rate normally signals burned tokens, wasted reviewer time and compute resources, and sometimes wasted time for QA, DevSecOps, SRE, and other teams.

Even at Meta, where refactoring is encouraged, it still has healthy boundaries around it.

A high rework rate also often signals exposure to substantial bugs and therefore reputational risks for a company.
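Rework rate can be defined in several ways. A minimal sketch, assuming it means the share of newly added lines that are rewritten or deleted within a fixed window after they land (the 21-day window and the `AddedLine` shape are assumptions, not AI Code Pulse's definition):

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AddedLine:
    # Hypothetical per-line history, e.g. derived from git blame or diff data.
    added_on: date
    changed_on: Optional[date] = None  # rewritten or deleted on this date; None if it survives


def rework_rate(lines, window_days=21):
    """Fraction of added lines that were changed again within `window_days`."""
    if not lines:
        return 0.0
    window = timedelta(days=window_days)
    reworked = sum(
        1
        for ln in lines
        if ln.changed_on is not None and ln.changed_on - ln.added_on <= window
    )
    return reworked / len(lines)


# Made-up sample: two of four lines are reworked inside the window.
sample = [
    AddedLine(date(2026, 5, 1), date(2026, 5, 5)),
    AddedLine(date(2026, 5, 1)),
    AddedLine(date(2026, 5, 1), date(2026, 7, 1)),  # changed, but outside the window
    AddedLine(date(2026, 5, 1), date(2026, 5, 10)),
]
rate = rework_rate(sample)
```

The window matters: too short and late rewrites are missed, too long and ordinary evolution of the code starts to count as rework.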

Microsoft Research has one strong finding:

High rework rate predicts bugs better than many complexity metrics.

Typical signals:

• files modified frequently

• many developers touching the same file

• recent heavy rewrites

These correlate strongly with defect density.
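The three signals above can be derived from plain commit history. The sketch below assumes commits are available as `(file, author, date, lines_changed)` tuples and treats "recent" as the last 30 days; both are illustrative assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta


def churn_signals(commits, today, recent_days=30):
    """Per-file counts of modifications, distinct authors, and recently changed lines.

    commits: iterable of (file, author, commit_date, lines_changed) tuples.
    """
    cutoff = today - timedelta(days=recent_days)
    stats = defaultdict(
        lambda: {"modifications": 0, "authors": set(), "recent_lines_changed": 0}
    )
    for path, author, committed_on, lines_changed in commits:
        entry = stats[path]
        entry["modifications"] += 1
        entry["authors"].add(author)
        if committed_on >= cutoff:
            entry["recent_lines_changed"] += lines_changed
    # Replace the author sets with their sizes for reporting.
    return {path: {**e, "authors": len(e["authors"])} for path, e in stats.items()}


# Made-up sample data.
history = [
    ("auth.py", "alice", date(2026, 5, 1), 120),
    ("auth.py", "bob", date(2026, 5, 10), 80),
    ("auth.py", "alice", date(2026, 3, 1), 10),
    ("app.ts", "carol", date(2026, 5, 12), 5),
]
signals = churn_signals(history, today=date(2026, 5, 21))
```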

Below are two metrics, rework rate and PR cycle time:
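PR cycle time is not defined precisely here. As a minimal sketch, assuming it means the time from a pull request being opened to it being merged (other definitions start from the first commit), the median over a set of PRs could be computed like this:

```python
from datetime import datetime
from statistics import median


def pr_cycle_hours(opened_at, merged_at):
    """Hours between a PR being opened and merged."""
    return (merged_at - opened_at).total_seconds() / 3600


def median_cycle_hours(prs):
    """Median cycle time in hours; prs is an iterable of (opened_at, merged_at or None).

    Unmerged PRs are skipped. Returns None if nothing has been merged.
    """
    durations = [pr_cycle_hours(o, m) for o, m in prs if m is not None]
    return median(durations) if durations else None


# Made-up sample data: 6 h, 24 h, one still open, 12 h.
prs = [
    (datetime(2026, 5, 1, 9), datetime(2026, 5, 1, 15)),
    (datetime(2026, 5, 2, 9), datetime(2026, 5, 3, 9)),
    (datetime(2026, 5, 3, 9), None),
    (datetime(2026, 5, 4, 9), datetime(2026, 5, 4, 21)),
]
typical = median_cycle_hours(prs)
```

The median is used rather than the mean so that a single long-lived PR does not dominate the number.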

How it works

Data Flow:
1. AI Code Pulse Tracker (npm) installed on a developer's workstation transmits AI usage data, but never source code or prompt text.

2. Deterministic heuristics in AI Code Pulse associate AI usage with commits from GitHub or Bitbucket.
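The actual heuristics are not described here. Purely as an illustration of what a deterministic association could look like, the sketch below attributes an AI usage session to the first commit by the same author in the same repository within a fixed time window after the session ends. The `Session` and `Commit` shapes and the 4-hour window are assumptions, not the product's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Session:
    # Hypothetical usage session as reported by a tracker (no source code, no prompts).
    author: str
    repo: str
    ended_at: datetime
    cost_usd: float


@dataclass
class Commit:
    sha: str
    author: str
    repo: str
    committed_at: datetime


def attribute(session, commits, window=timedelta(hours=4)) -> Optional[str]:
    """Return the sha of the commit this session is attributed to, or None."""
    candidates = [
        c
        for c in commits
        if c.author == session.author
        and c.repo == session.repo
        and timedelta(0) <= c.committed_at - session.ended_at <= window
    ]
    if not candidates:
        return None
    # Earliest commit wins; ties are broken by sha so the result is deterministic.
    return min(candidates, key=lambda c: (c.committed_at, c.sha)).sha


# Made-up sample data.
commits = [
    Commit("a1", "alice", "api", datetime(2026, 5, 1, 11, 0)),
    Commit("b2", "alice", "api", datetime(2026, 5, 1, 12, 0)),
    Commit("c3", "bob", "api", datetime(2026, 5, 1, 10, 30)),
]
session = Session("alice", "api", datetime(2026, 5, 1, 10, 0), 1.25)
matched = attribute(session, commits)
```

Because the rule uses only author, repository, and timestamps, the same inputs always produce the same attribution.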

As of now, AI Code Pulse supports:
- Claude Code
- GitHub
- Bitbucket
- Jira

If your toolset is different, leave a comment and I will add support for it shortly.

Free trial

Did you know that organizations lose 7 hours per team member weekly to AI-related inefficiencies?
Start a free trial now to surface them and more.
