Hi everyone! 👋 👋 👋
I’m Jordan Leventis, an ML Systems Engineer at Atlassian, focused on building the infrastructure that brings AI features to life across our products. My team and I have been working hard to overcome the obstacles we ran into as Atlassian's AI capabilities scaled. Today, I'm proud to introduce Atlassian's Inference Engine!
🚀 TL;DR
Atlassian’s Inference Engine is our custom-built, self-hosted AI inference platform* that powers everything from search models to content moderation. We built this platform to enable faster, more reliable, and more flexible AI-powered experiences across our apps.
❓Why We Built It
As I mentioned above, we hit some big challenges as Atlassian’s AI capabilities grew:
Latency that didn’t meet our standards
Rigid deployment experiences
Vendor lock-in and limited ability to troubleshoot issues ourselves
Costs that ran well above our optimization targets
So, we rolled up our sleeves and built the Inference Engine from the ground up to give us full control over scale, reliability, and observability—without the constraints of third-party platforms.
⚙️ How It Works
Atlassian’s Inference Engine is built on a modern, cloud-native stack:
Kubernetes for container orchestration
Karpenter for dynamic node provisioning
ArgoCD for GitOps-style deployments
Helm for version-controlled model rollouts
This setup lets us automate cluster provisioning, model rollouts, and versioning, while giving us fine-grained control over GPU spend and operational observability. You're already reaping the benefits!
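To make that concrete, here's a minimal sketch of what a version-controlled model rollout could look like. The chart name, release name, and version values here are hypothetical, and in our actual GitOps flow ArgoCD applies changes from a commit rather than a script; this just illustrates the moving parts Helm gives us.

```python
# A minimal sketch of a version-controlled model rollout. The chart
# ("atlassian/inference-model"), release name, and versions are hypothetical.
# In a GitOps setup, ArgoCD would apply this change from a commit rather
# than a script; this only shows what a pinned rollout boils down to.
import subprocess

def roll_out_model(release: str, model_version: str, chart_version: str) -> None:
    """Upgrade (or install) a model deployment to a pinned chart + model version."""
    subprocess.run(
        [
            "helm", "upgrade", "--install", release,
            "atlassian/inference-model",       # hypothetical chart name
            "--version", chart_version,        # pin the chart for reproducibility
            "--set", f"model.version={model_version}",
            "--atomic",                        # roll back automatically on failure
        ],
        check=True,  # raise if helm exits non-zero
    )

if __name__ == "__main__":
    roll_out_model("search-ranker", model_version="2024.06.1", chart_version="1.4.2")
```

The detail worth stealing is --atomic: a failed rollout rolls itself back, which keeps model deploys boring in the best way.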
🔥 What's the Real-World Impact?
Latency Improvements: Up to 63% reduction for non-LLM workloads and up to 38% for LLM workloads, compared to previous third-party solutions.
Cost Savings: Up to 81% reduction for non-LLM workloads and up to 79% for LLM workloads.
Full Observability: We monitor everything from latency and throughput to GPU utilization and error rates, so we can keep tuning for performance.
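For the curious, here's a rough sketch of what that kind of instrumentation looks like in Python with the prometheus_client library. The metric names and labels are illustrative, not our actual metric schema.

```python
# An illustrative sketch of inference-side instrumentation using
# prometheus_client. Metric names and labels here are made up for the example.
import time
from prometheus_client import Counter, Gauge, Histogram, start_http_server

INFERENCE_LATENCY = Histogram(
    "inference_latency_seconds", "Time spent serving one inference request", ["model"]
)
INFERENCE_ERRORS = Counter(
    "inference_errors_total", "Failed inference requests", ["model"]
)
GPU_UTILIZATION = Gauge(
    "gpu_utilization_ratio", "Most recent GPU utilization sample", ["gpu"]
)

def serve_request(model: str, run_model) -> object:
    """Run one inference call, recording latency and errors."""
    start = time.perf_counter()
    try:
        return run_model()
    except Exception:
        INFERENCE_ERRORS.labels(model=model).inc()
        raise
    finally:
        INFERENCE_LATENCY.labels(model=model).observe(time.perf_counter() - start)

if __name__ == "__main__":
    # Expose /metrics for Prometheus to scrape; a real server would then
    # keep running and handling inference requests.
    start_http_server(9100)
```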
🎉 Up Next!
Continuous improvement in reliability and scaling
Expanding support for new model types and modalities
Even richer observability and automation in our inference pipeline
I hope you love Atlassian’s new Inference Engine! Comments and especially compliments 😊 are welcome on this post. We'll take ideas too!
The good news is that we're just getting started. Stay tuned for more updates from me as we roll out those new features and capabilities.
If you want a deep dive, check out the full engineering blog:
*An inference engine manages the deployment and execution of trained machine learning (ML) models, enabling them to make predictions or generate outputs based on new input data. Essentially, it's the infrastructure that allows AI models to be used in real-world applications.
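If a toy picture helps, the core loop is just: load a trained model once, then reuse it to score new inputs. The model file and the sklearn-style predict call below are hypothetical; a production inference engine wraps this loop with batching, scaling, versioning, and monitoring.

```python
# A toy sketch of what "inference" means in practice. The model artifact
# ("model.pkl") and its predict() interface are hypothetical stand-ins.
import pickle

with open("model.pkl", "rb") as f:  # hypothetical trained model artifact
    model = pickle.load(f)

def predict(features: list[float]) -> float:
    """Score one new input with the already-trained model."""
    return model.predict([features])[0]
```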