Hi Community Leaders,
We’re working on a complex solution design for a large-scale deployment of Atlassian Data Center products (Jira and Confluence) in a customer-managed cloud environment (AWS, Azure, or GCP). The project requires high scalability, fault tolerance, and long-term resilience to support thousands of users and millions of annual interactions.
We’ve reviewed the available documentation, but several areas remain unclear and we’d appreciate deeper clarification, especially regarding:
Scalable Architecture:
Best-practice deployment patterns for massive workloads
Guidance on VM-based vs. containerized deployments (e.g., Kubernetes, ECS)
Load balancing and regional redundancy strategies
Infrastructure Planning:
Recommended instance types, DB setups, and storage/networking configurations
Options for auto-scaling and resource optimization
Resilience in multi-AZ/cloud-native environments
Container Orchestration:
Kubernetes vs. ECS/Fargate suitability for Atlassian DC
Reference implementations, Helm charts, or operators (if officially supported)
Support implications of running containerized workloads
Security & Compliance:
Deployment patterns that support enterprise-grade security (SSO, encryption, etc.)
Experience meeting compliance standards (e.g., GDPR, FedRAMP)
Tooling & Cost:
Infrastructure-as-Code templates (Terraform, Bicep, CloudFormation, etc.)
Monitoring integrations (CloudWatch, Azure Monitor, ELK stack)
Licensing or cost estimation tips for highly available environments
We’re hoping to align closely with Atlassian's best practices and would greatly value any input from experts or references to official guidance. If anyone has tackled similar challenges or can point us to relevant examples, we’d be grateful.
Thanks in advance for your support!