
🧠 Introducing Atlassian's Inference Engine: Our self-hosted AI inference service

Hi everyone! 👋 👋 👋

I’m Jordan Leventis, an ML Systems Engineer at Atlassian, focused on building the infrastructure that brings AI features to life across our products. My team and I have been working hard to overcome some of the obstacles we ran into as Atlassian's AI capabilities scaled. Today, I'm proud to introduce Atlassian's Inference Engine!

🚀 TL;DR

Atlassian’s Inference Engine is our custom-built, self-hosted AI inference platform* that powers everything from search models to content moderation. We built this platform to enable faster, more reliable, and more flexible AI-powered experiences across our apps.

 


❓Why We Built It

As I mentioned above, we hit some big challenges as Atlassian’s AI capabilities grew:

  • Latency that didn’t meet our standards

  • Rigid deployment experiences

  • Vendor lock-in and limited ability to troubleshoot

  • Costs that didn’t match our optimization goals

So, we rolled up our sleeves and built the Inference Engine from the ground up to give us full control over scale, reliability, and observability—without the constraints of third-party platforms.

 


⚙️ How It Works

Atlassian’s Inference Engine is built on a modern, cloud-native stack:

  • Kubernetes for container orchestration

  • Karpenter for dynamic node provisioning

  • ArgoCD for GitOps-style deployments

  • Helm for version-controlled model rollouts

This setup lets us automate cluster provisioning, model rollouts, and versioning, while giving us fine-grained control over GPU spend and operational observability. You're already reaping the benefits!
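
To make that a bit more concrete, here's a minimal sketch of the kind of model-serving workload this stack orchestrates. This isn't our production code; the Flask app, the /v1/predict route, and the stand-in scoring function are placeholders for whatever model a team packages into a container image and rolls out via Helm and ArgoCD.

    # Illustrative only: a tiny model-serving app of the sort the platform
    # schedules on Kubernetes. The "model" is a stand-in function, not one
    # of Atlassian's production models.
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def load_model():
        # A real workload would load trained weights once at startup
        # (e.g. from object storage) so every request reuses them.
        return lambda text: {"label": "ok", "score": 0.93}

    model = load_model()

    @app.route("/v1/predict", methods=["POST"])
    def predict():
        payload = request.get_json(force=True)
        return jsonify(model(payload.get("text", "")))

    if __name__ == "__main__":
        # The container image built around this file is what Helm versions
        # and ArgoCD syncs onto the cluster; Karpenter provisions the
        # (GPU) nodes it lands on.
        app.run(host="0.0.0.0", port=8080)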

 


🔥 What's the Real-World Impact?

  • Latency Improvements: Up to 63% reduction for non-LLM workloads and up to 38% for LLM workloads, compared to previous third-party solutions.

  • Cost Savings: Up to 81% reduction for non-LLM workloads and up to 79% for LLM workloads.

  • Full Observability: We monitor everything from latency and throughput to GPU utilization and error rates, so we can keep tuning for performance.
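
To give you a flavour of what that observability looks like at the request level, here's a minimal, hypothetical sketch using the prometheus_client library. The metric names and the fake inference call are made up for illustration; they aren't our actual metrics or dashboards.

    # Hypothetical instrumentation sketch; metric names are illustrative.
    import random
    import time

    from prometheus_client import Counter, Histogram, start_http_server

    INFERENCE_LATENCY = Histogram(
        "inference_latency_seconds", "Time spent serving one inference request"
    )
    INFERENCE_ERRORS = Counter(
        "inference_errors_total", "Number of failed inference requests"
    )

    def run_inference(payload):
        # Stand-in for a real model call.
        time.sleep(random.uniform(0.01, 0.05))
        return {"score": random.random()}

    def handle_request(payload):
        with INFERENCE_LATENCY.time():  # records per-request latency
            try:
                return run_inference(payload)
            except Exception:
                INFERENCE_ERRORS.inc()
                raise

    if __name__ == "__main__":
        start_http_server(9100)  # exposes /metrics for Prometheus to scrape
        while True:
            handle_request({"text": "hello"})

Node-level signals like GPU utilization are usually scraped from the nodes themselves (for example via NVIDIA's DCGM exporter) rather than from application code.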

 


🎉 Up Next!

  • Continuous improvement in reliability and scaling

  • Expanding support for new model types and modalities

  • Even richer observability and automation in our inference pipeline

 


I hope you love Atlassian’s new Inference Engine! Comments and especially compliments 😊 are welcome on this post. We'll take ideas too!

The good news is that we're just getting started. Stay tuned for more updates from me as we roll out those new features and capabilities. 

If you want a deep dive, check out the full engineering blog:

Atlassian's Inference Engine

 

*An inference engine manages the deployment and execution of trained machine learning (ML) models, enabling them to make predictions or generate outputs based on new input data. Essentially, it's the infrastructure that allows AI models to be used in real-world applications.
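
For a product feature calling the engine, that boils down to something like the hypothetical snippet below. The URL, model name, and response fields are invented for illustration and don't reflect our real API.

    # Hypothetical client call; the endpoint, model name and fields are invented.
    import requests

    def moderate_comment(text: str) -> bool:
        resp = requests.post(
            "https://inference.internal.example/v1/predict",  # placeholder URL
            json={"model": "content-moderation-v2", "input": text},
            timeout=2.0,
        )
        resp.raise_for_status()
        # The inference engine runs the trained model on the new input and
        # returns its prediction; the caller just consumes the output.
        return resp.json().get("flagged", False)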

 

3 comments

Patrick S. Stuckenberger
Contributor
August 6, 2025

Hi Jordan, 

Can you clarify the term "self-hosted"? Is this a solution that we, as customers, are allowed to run in our own environment (edge)?

Josh
Rising Star
August 7, 2025

This is really cool!

 

Can't wait to see those cost savings passed onto customers! j/k

Jordan Leventis
Atlassian Team
August 7, 2025

Hi Patrick,

 

When we say self-hosted, we mean that we are no longer relying on third parties for these workloads.

I can't speak to environment/edge setups; that isn't something we support, as far as I know.

 

Thanks! 
