How to tune JIRA and Confluence configuration for high usage?

If you are using JIRA and/or Confluence in a corporate environment, you should expect to hit scalability issues quite quickly.

Here is our current list of configuration changes needed to scale the system for more users:

  • Switch from a VM to bare metal; get a machine with an SSD and plenty of RAM.
  • Install Oracle JVM 1.7 and keep it updated (you can do this with apt-get).
  • Never keep JIRA or Confluence files on network drives / NAS. You can safely mount the attachments directory from a NAS, but nothing else.
  • Also keep Atlassian products updated: never run a release older than 6 months.
  • Use Linux and PostgreSQL. Don't waste precious time with other configurations; these are the ones that work best and that Atlassian uses on all of its instances.
  • Reserve 50% of the memory for the JVMs, and leave at least 30% free or available to the OS for caching.
  • Enable validationQuery on the JDBC connection. Sooner or later you will lose DB connections, and you will hate your life if you have not done this.
  • Increase the maximum number of database connections in both JDBC and the database engine. Always configure a maximum in PostgreSQL that is 20-30 connections above your worst case.
  • Monitor performance, for example we use DataDog and monitor:
    • Machine memory
    • IO Usage
    • CPU Load
    • JVM memory
    • JVM threads
    • PostgreSQL connections per database
  • We use nginx in front of these services, which also adds the SSL layer. Nginx also serves temporary out-of-service messages and lets us throttle or even ban some HTTP clients when needed.

Examples of working setups that share a server with 48 GB RAM, 32 cores, and an SSD:

  • JIRA, 300k issues: 7500 MB RAM, 512 MB max perm size, maxThreads=250, JDBC maxActive=120
  • Confluence, 50-100 real users: 6000 MB RAM, 512 MB max perm size, maxProcessors=140
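
As a rough sanity check, the memory and connection rules from the list above can be applied to these numbers. The sketch below is only an illustration: the function names are made up for this example, the perm size and other non-heap JVM memory are ignored, and the Confluence pool size of 60 is a placeholder because that value is not given here.

```python
# Rule-of-thumb capacity check for one host shared by several JVMs.
# Function names are hypothetical; they are not part of any Atlassian tool.

def memory_budget(total_mb, jvm_mb, jvm_percent=50, os_percent=30):
    """Compare the JVM sizes with the 50% JVM / 30% OS-cache guideline."""
    jvm_cap = total_mb * jvm_percent // 100   # upper bound for all JVMs together
    os_floor = total_mb * os_percent // 100   # minimum left free or for OS caching
    used = sum(jvm_mb)
    return {
        "jvm_cap_mb": jvm_cap,
        "os_floor_mb": os_floor,
        "jvm_total_mb": used,
        "within_jvm_cap": used <= jvm_cap,
        "leaves_os_floor": total_mb - used >= os_floor,
    }


def postgres_max_connections(pool_maxes, headroom=30):
    """Worst case is every pool fully used; add 20-30 spare connections."""
    return sum(pool_maxes) + headroom


if __name__ == "__main__":
    # 48 GB host running JIRA (7500 MB) and Confluence (6000 MB).
    print(memory_budget(48 * 1024, [7500, 6000]))
    # JIRA maxActive=120 is from the setup above; 60 for Confluence is ASSUMED.
    print(postgres_max_connections([120, 60]))
```

With these inputs the two JVMs stay well under the 50% cap, which is consistent with the setup leaving most of the RAM to the OS cache.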

The number of cores is not so important. Atlassian products do not seem able to use them effectively: the average load is less than 3, whereas 100% usage would correspond to a load of 32.

Memory and IO speed are essential. The SSD gave a speedup of almost 10x for indexing and service start time.

While I do have full HTTP logs that include the backend response time for each request, I am still looking for a tool that can parse them and extract meaningful information.

We are not fully pleased with the performance, and I want realistic data about what is causing the slowdowns.
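
Until a proper tool turns up, a few lines of Python can already rank endpoints by backend time. This is a minimal sketch, not a finished analyzer. It ASSUMES each log line ends with the backend response time in seconds, right after the status code and byte count; the real "full" log format is not shown in this thread, so the regular expression will need adjusting. The name slowest_endpoints is made up for this example.

```python
import re
from collections import defaultdict

# ASSUMED line shape (adjust to the real log_format):
#   1.2.3.4 - - [13/Apr/2015:10:55:29 +0000] "GET /browse/ABC-1 HTTP/1.1" 200 5120 0.342
# The last field is taken to be the backend response time in seconds.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ (?P<rt>\d+(?:\.\d+)?)\s*$'
)


def slowest_endpoints(lines, top=10):
    """Group requests by path (query string dropped), rank by total backend time.

    Returns tuples of (path, request_count, total_seconds, approx_p95_seconds).
    """
    times = defaultdict(list)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        path = m.group("path").split("?", 1)[0]
        times[path].append(float(m.group("rt")))

    rows = []
    for path, rts in times.items():
        rts.sort()
        # Approximate 95th percentile by index into the sorted list.
        p95 = rts[min(len(rts) - 1, int(0.95 * len(rts)))]
        rows.append((path, len(rts), sum(rts), p95))

    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]
```

Passing an open log file (any iterable of lines) to slowest_endpoints returns the paths that consume the most backend time overall, which is usually a better first question than which single request was slowest.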

Feel free to add your own hints, so we can build a better tuning tutorial.

5 answers

1 accepted

At this moment we are using nginx in front of JIRA to provide the following:

  • Adding SSL layer (HTTPS over HTTP)
  • Enabling SPDY for speeding up HTTPS (even faster than HTTP)
  • Ability to hide/switch backend with DNS changes
  • Potential to split load over several servers (not used yet)
  • Provide a nice maintenance page when backend is down
  • Caching the number of requests
  • Limiting the number of requests 

 

# Redirect all plain-HTTP traffic to HTTPS on the canonical host name.
server {
    listen      80;
    server_name jira.example.com old-jira.example.com;
    rewrite     ^ https://jira.example.com$request_uri? permanent;
    # "full" is a custom log_format defined elsewhere (not shown here).
    access_log  /var/log/nginx/redirect_access.log full;
    error_log   /var/log/nginx/redirect_error.log;
}

server {
    listen 443 ssl spdy;
    server_name jira.example.com;
    root /etc/nginx/www/;
    gzip on;
    gzip_vary on;
    access_log /var/log/nginx/jira.access.log full;
    error_log /var/log/nginx/jira.error.log;
    client_max_body_size 150m;

    # Serve a static maintenance page when the backend is down.
    error_page 502 503 504 /www/maintenance.html;
    location /www/ {
        root /etc/nginx/;
    }

    # Everything else is proxied to the backend on port 8080.
    location / {
        proxy_redirect          off;
        proxy_next_upstream     error timeout invalid_header http_500;
        proxy_connect_timeout   5;
        proxy_pass              http://localhost:8080;
        #proxy_connect_timeout 120;
        proxy_read_timeout      10800s;   # 3 hours
        proxy_set_header        Host               $http_host;
        proxy_set_header        X-Real-IP          $remote_addr;
        proxy_set_header        X-Forwarded-Host   $host;
        proxy_set_header        X-Forwarded-Server $host;
        proxy_set_header        X-Forwarded-For    $proxy_add_x_forwarded_for;
    }
}

 

 

Have you been able to set up SPDY so that it is used by all requests? In my case, I noticed that only the first few requests (13) are made using SPDY, and plain HTTPS is used after that: https://www.dropbox.com/s/qlowz0gr7qhlza8/Zrzut%20ekranu%202015-04-13%2010.55.29.png?dl=0

Have you experimented with JIRA/Confluence Data Center yet? I would assume large installation optimizations would be in that deployment package.

https://www.atlassian.com/enterprise/data-center

Sorry, but considering that the hardware usage is below 15%, pointing to Data Center seems like an unfounded upselling proposal.

I have no relationship with Atlassian other than being a user of their products, so I was not trying to upsell anything to you. I was suggesting (hence the word "experimented") that changes in software architecture (active-active database) and algorithms should have a greater effect than throwing hardware at the problem.

Great question and insights, Sorin! Thanks a lot! Have you tried setting up caching of static resources by nginx? Does it/will it have an effect on performance/load?

@Rp Subhub I am sure that caching in nginx would speed them up considerably, but at this moment I do not have a set of nginx caching rules. The major problem is that Atlassian products have the bad habit of not using the HTTP caching/expiry headers properly, so adding caching is really tricky. If anyone has a set of working caching options, I would be more than happy to try them. My only experience with enabling caching was really awful: I ended up with users being switched, logging in as John and ending up as Mary. Still, it seems that Atlassian keeps the information about what can be cached really secret, probably as a selling point for TAM and Enterprise-level support.

I have found that the number of custom fields affects the size of the Lucene index. Really big indexes can be slower to update, which leads to other actions waiting longer.
