Because the number of releases increased, often at decent quality? Makes sense. But do you know at what cost (how many dollars were spent on feature development), and why, for example, $1,000/month was spent instead of $500? How do you know that in cases involving
1. scalability, maintainability, or security
2. race conditions, distributed system failures, infrastructure interactions, flaky tests, or performance
3. institutional/tribal knowledge
4. crypto, auth flows, compliance-heavy logic, financial systems, or privacy-critical areas
and under given constraints (timelines, quality, resources), you needed to use AI coding at all?
Misuse of AI in many of those areas can easily cost $15,000/month.
At the same time, many organizations adopted a tokenmaxxing approach, which led to:
• “Amazon employees admit to using AI unnecessarily to pump up internal usage scores — workers complain of intense pressure to use AI tools”
• “Meta employees used a total of 60.2 trillion AI tokens in 30 days, within the same tokenmaxxing initiative Microsoft burned a lot of AI tokens”
• The message Salesforce sends to staff: “use a minimum of $170/month tokens or be flagged.”
Meanwhile, others ask these kinds of questions:
Head of Engineering at Shopify, Farhan Thawar:
• "I want to see why they spent say $1,000 a month in credits for Cursor. Maybe that’s because they’re building something great and they have an agent workforce underneath them!"
These are the problems that AI Code Pulse solves.
The app associates:
• Token input
• Token output
• Token Cache Create
• Token Cache Read
• Token Cost ($)
with arbitrary nesting structures:
• AI Model (Model → Repo → File → Author → Date)
• Author (Author → Repo → File → Date)
• Repo (Repo → File → Author → Date)
• File (File (repo) → Author → Date)
• Task (Task → Repo → File → Author → Date)
• Epic (Epic → Task → Repo → File → Author → Date)
• Date (Date → Repo → File → Author)
This allows organizations to analyze AI usage in context rather than as an isolated metric.
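As a rough illustration of how such a nested view could be computed, here is a minimal Python sketch. The `UsageRecord` shape, its field names, and the `rollup` helper are assumptions made for this example, not AI Code Pulse's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical record shape; all field names are illustrative assumptions.
    model: str
    author: str
    repo: str
    file: str
    task: str
    date: str  # ISO date, e.g. "2025-01-15"
    input_tokens: int
    output_tokens: int
    cache_create_tokens: int
    cache_read_tokens: int
    cost_usd: float


def rollup(records, levels, metric="cost_usd"):
    """Sum `metric` at every level of the hierarchy named by `levels`."""
    tree = {}
    for rec in records:
        children = tree
        for level in levels:
            key = getattr(rec, level)
            entry = children.setdefault(key, {"total": 0.0, "children": {}})
            entry["total"] += getattr(rec, metric)
            children = entry["children"]
    return tree


# Made-up sample data, for illustration only.
records = [
    UsageRecord("model-a", "alice", "api", "auth.py", "TASK-1", "2025-01-15", 1000, 200, 0, 500, 1.5),
    UsageRecord("model-a", "bob", "api", "auth.py", "TASK-1", "2025-01-16", 2000, 300, 0, 0, 2.5),
    UsageRecord("model-a", "alice", "web", "app.ts", "TASK-2", "2025-01-16", 500, 100, 0, 0, 0.5),
]

# Task → Repo → File → Author → Date, with Token Cost ($) as the metric.
by_task = rollup(records, ["task", "repo", "file", "author", "date"])
```

Changing `levels` to `["author", "repo", "file", "date"]` would give the Author view from the list above, and passing `metric="input_tokens"` sums tokens instead of dollars.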
In the screenshot you can see Task (Task → Repo → File → Author → Date) associated with Token Cost ($):
To analyze the reasoning behind a cost, you can use the following approaches, depending on the nuances of your case:
1. correlating Jira tasks or commits with AI cost
2. benchmarking similar classes of tasks or epics
3. analyzing cost per repository or per file
4. comparing usage patterns across engineering seniority levels in correlation with features
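Benchmarking similar classes of tasks, for example, could be sketched like this in Python. The task IDs, the class labels, and the "more than 2x the class median" threshold are all assumptions chosen for illustration; a real threshold would depend on your own data.

```python
from statistics import median


def flag_cost_outliers(task_costs, task_class, ratio=2.0):
    """Return task IDs whose AI cost exceeds `ratio` times the median of their class."""
    costs_by_class = {}
    for task_id, cost in task_costs.items():
        costs_by_class.setdefault(task_class[task_id], []).append(cost)
    medians = {cls: median(costs) for cls, costs in costs_by_class.items()}
    return sorted(
        task_id
        for task_id, cost in task_costs.items()
        if cost > ratio * medians[task_class[task_id]]
    )


# Made-up monthly AI cost per task (USD) and a hypothetical task classification.
task_costs = {"PAY-1": 40.0, "PAY-2": 50.0, "PAY-3": 180.0, "UI-1": 10.0, "UI-2": 12.0}
task_class = {"PAY-1": "payments", "PAY-2": "payments", "PAY-3": "payments",
              "UI-1": "ui", "UI-2": "ui"}

outliers = flag_cost_outliers(task_costs, task_class)
```

A flagged task is not necessarily wasteful; it is a prompt to look at what was built for that money.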
The spending-by-model view looks like this:
Tokenmaxxing advocates might want to group by “Authors” to see the total token cost per author (or select metric = Total tokens to see token consumption by author).
However, token economics alone still don't answer the next engineering question.
Even in some organizations with strong engineering hygiene and culture, PRs get overlooked because of their density, or sometimes simply because of the flood of them.
In some less mature companies, the pace of AI adoption makes the problem even more pressing: engineers tend to be less conscious of what they write and how it affects the overall system or future maintainability. Many engineers minimize self-testing, over-rely on AI, become less diligent, or are simply pushed in that direction by managers. We even have a term for it now: feature factory. But if you take away the SDLC practices that a fair number of people advocate for, you can easily end up with a mess you will have to pay off later, just as it has always been.
Rework rate: code waste/survivability and missing implementation parts. It is one of the few crucial signals for measuring AI coding efficiency. The metric must be understood in the context of a company, but a high rework rate normally signals burned tokens, wasted reviewer time and compute resources, and sometimes wasted time for QA, DevSecOps, SRE, and other teams.
Even at Meta, where refactoring is encouraged, there are still healthy boundaries around it.
A high rework rate also often signals exposure to substantial bugs and therefore reputational risks for a company.
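To make the metric concrete, here is a minimal Python sketch under one possible definition: rework rate as the share of added lines that were rewritten or deleted within some follow-up window. That definition, the `Change` record, and the prorated cost estimate are assumptions for illustration, not necessarily how AI Code Pulse calculates it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    # Hypothetical per-commit record; how reworked lines are counted
    # (e.g. from git history within a chosen window) is outside this sketch.
    lines_added: int
    lines_reworked: int
    ai_cost_usd: float


def rework_rate(changes):
    """Share of added lines that were later rewritten or deleted."""
    changes = list(changes)
    added = sum(c.lines_added for c in changes)
    reworked = sum(c.lines_reworked for c in changes)
    return reworked / added if added else 0.0


def reworked_cost(changes):
    """Rough proxy: AI cost prorated by each change's own reworked share."""
    return sum(
        c.ai_cost_usd * c.lines_reworked / c.lines_added
        for c in changes
        if c.lines_added
    )


# Made-up sample data.
changes = [Change(100, 20, 4.0), Change(50, 25, 2.0)]
rate = rework_rate(changes)      # 45 reworked lines out of 150 added
wasted = reworked_cost(changes)  # 4.0 * 0.2 + 2.0 * 0.5
```

Prorating cost by reworked lines is a simplification, since tokens are not spent evenly per line, but it gives a first-order figure to compare across tasks or teams.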
Microsoft Research has one strong finding:
High rework rate predicts bugs better than many complexity metrics.
Typical signals:
• files modified frequently
• many developers touching the same file
• recent heavy rewrites
These correlate strongly with defect density.
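The three signals above can be derived from an ordinary commit log. The following Python sketch assumes a simplified commit record (`file`, `author`, `day`, `lines_changed`) and a 30-day definition of "recent"; both are illustrative choices, not a prescribed method.

```python
from collections import defaultdict
from datetime import date, timedelta


def file_risk_signals(commits, today, recent_days=30):
    """Per file: number of changes, distinct authors, and lines changed recently."""
    cutoff = today - timedelta(days=recent_days)
    stats = defaultdict(lambda: {"changes": 0, "authors": set(), "recent_churn": 0})
    for c in commits:
        s = stats[c["file"]]
        s["changes"] += 1                # files modified frequently
        s["authors"].add(c["author"])    # many developers touching the same file
        if c["day"] >= cutoff:
            s["recent_churn"] += c["lines_changed"]  # recent heavy rewrites
    return {
        f: {"changes": s["changes"], "authors": len(s["authors"]),
            "recent_churn": s["recent_churn"]}
        for f, s in stats.items()
    }


# Made-up commit log.
commits = [
    {"file": "auth.py", "author": "alice", "day": date(2025, 2, 20), "lines_changed": 120},
    {"file": "auth.py", "author": "bob", "day": date(2025, 2, 25), "lines_changed": 80},
    {"file": "auth.py", "author": "alice", "day": date(2024, 12, 1), "lines_changed": 10},
    {"file": "utils.py", "author": "carol", "day": date(2024, 11, 1), "lines_changed": 5},
]

signals = file_risk_signals(commits, today=date(2025, 3, 1))
```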
Below you can find two metrics, rework rate and PR cycle time:
Data Flow:
1. AI Code Pulse Tracker (npm) installed on a developer's workstation transmits AI usage data, but never source code or prompt text.
2. Deterministic heuristics in AI Code Pulse associate AI usage with commits from GitHub or Bitbucket.
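The exact heuristics are not described here, so the following Python sketch shows only one hypothetical rule of this kind: attribute each usage event to the earliest commit by the same author in the same repository within a fixed time window after the event. The record fields and the 8-hour window are assumptions.

```python
from datetime import datetime, timedelta


def attribute_usage(usage_events, commits, window=timedelta(hours=8)):
    """For each usage event, return the SHA of the matching commit, or None."""
    result = []
    for ev in usage_events:
        candidates = [
            c for c in commits
            if c["author"] == ev["author"]
            and c["repo"] == ev["repo"]
            and ev["at"] <= c["at"] <= ev["at"] + window
        ]
        # Deterministic tie-break: earliest commit first, then SHA.
        best = min(candidates, key=lambda c: (c["at"], c["sha"]), default=None)
        result.append(best["sha"] if best else None)
    return result


# Made-up sample data; note that no source code or prompt text is involved.
usage_events = [
    {"author": "alice", "repo": "api", "at": datetime(2025, 1, 15, 10, 0)},
    {"author": "bob", "repo": "web", "at": datetime(2025, 1, 15, 10, 0)},
    {"author": "alice", "repo": "api", "at": datetime(2025, 1, 15, 16, 0)},
]
commits = [
    {"sha": "a1", "author": "alice", "repo": "api", "at": datetime(2025, 1, 15, 11, 30)},
    {"sha": "a2", "author": "alice", "repo": "api", "at": datetime(2025, 1, 15, 15, 0)},
    {"sha": "b1", "author": "bob", "repo": "api", "at": datetime(2025, 1, 15, 10, 30)},
]

attributed = attribute_usage(usage_events, commits)
```

Because the rule uses only metadata (author, repository, timestamps), the same inputs always produce the same attribution, which is what makes it deterministic.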
As of now, AI Code Pulse supports:
• Claude Code
• GitHub
• Bitbucket
• Jira
If your toolset is different, leave a comment and I will add support for it shortly.
Did you know that organizations lose 7 hours per team member weekly to AI-related inefficiencies?
Start a free trial now to surface them and more.
Alexey Pavlenko _App Developer_