How to tune JIRA and Confluence configuration for high usage?

Sorin Sbarnea (Citrix)
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
September 27, 2014

If you are using JIRA and/or Confluence in a corporate environment you should expect to hit scalability issues quite fast.

Here is our current list of configuration changes needed to scale the system for more users:

  • Switch from VMs to bare metal; get a machine with SSDs and plenty of RAM.
  • Install Oracle JVM 1.7 and keep it updated (you can do this with apt-get).
  • Never keep JIRA or Confluence files on network drives / NAS. You can safely mount the attachments directory from a NAS, but nothing else.
  • Keep Atlassian products updated too; never run a release more than 6 months old.
  • Use Linux and PostgreSQL. Don't waste precious time on other configurations; these are the ones that work best and that Atlassian uses on all of its instances.
  • Reserve 50% of the memory for the JVMs, and leave at least 30% free for the OS to use for caching.
  • Enable validationQuery on the JDBC connection. Sooner or later you will lose DB connections, and you will hate your life if you have not done this.
  • Increase the maximum number of database connections in both JDBC and the database engine. Always set the PostgreSQL maximum 20-30 connections higher than your worst case.
  • Monitor performance, for example we use DataDog and monitor:
    • Machine memory
    • IO Usage
    • CPU Load
    • JVM memory
    • JVM threads
    • PostgreSQL connections per database
  • We use nginx in front of these services, which also adds the SSL layer. Nginx provides temporary out-of-service messages and lets us throttle or even ban some HTTP clients when needed.
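The connection-limit rule from the list above can be sketched in a few lines of Python. This is only an illustration: the function name and the Confluence pool size are assumptions, and only the JIRA figure (maxActive=120) comes from the setup described in this post.

```python
def required_max_connections(pool_maxima, headroom=30):
    """Return a PostgreSQL max_connections value: the sum of every
    application pool's maximum (the worst case) plus a safety margin."""
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    return sum(pool_maxima.values()) + headroom


# Worst case: every JDBC pool is fully used at the same time.
# The confluence value is a made-up placeholder.
pools = {"jira": 120, "confluence": 60}
print(required_max_connections(pools))
```

The headroom leaves room for psql sessions, backups and monitoring agents, which also consume connections.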

Examples of working setups that share one server with 48 GB RAM, 32 cores, and SSD:

  • JIRA, 300k issues: 7500 MB RAM, 512 MB max perm size, maxThreads=250, JDBC maxActive=120
  • Confluence, 50-100 real users: 6000 MB RAM, 512 MB max perm size, maxProcessors=140
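As a rough sanity check, the two setups above can be compared against the 50% / 30% memory rule. This sketch counts only heap plus max perm size per JVM and assumes the perm size is in MB; thread stacks and other native memory are ignored, so real usage will be somewhat higher. The function name is hypothetical.

```python
def memory_budget(total_mb, jvm_mb, jvm_share=0.5, os_share=0.3):
    """Check a host against the rule of thumb: JVMs get at most
    jvm_share of RAM and at least os_share stays free for the OS cache."""
    used = sum(jvm_mb.values())
    return {
        "jvm_total_mb": used,
        "jvm_fraction": used / total_mb,
        "jvm_within_limit": used <= total_mb * jvm_share,
        "os_cache_ok": total_mb - used >= total_mb * os_share,
    }


host_mb = 48 * 1024
# Heap + max perm size per JVM, taken from the example setups.
jvms = {"jira": 7500 + 512, "confluence": 6000 + 512}
report = memory_budget(host_mb, jvms)
print(report)
```

With these figures the JVMs take a little under a third of the machine, so both rules are satisfied with room to spare.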

The number of cores is not so important. Atlassian products do not seem able to use them effectively: the average load is less than 3, while 100% usage would correspond to a load of 32.
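To put that in numbers, here is a minimal sketch that turns a load average into an approximate CPU utilization. The function names are assumptions, and on Linux the load average also counts processes waiting on IO, so treat the result as an upper bound rather than a precise figure.

```python
import os


def load_utilization(load_avg, cores):
    """Approximate fraction of CPU capacity in use, given a load average."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_avg / cores


def current_utilization():
    """Same calculation for the local machine (Linux/macOS only)."""
    one_minute, _, _ = os.getloadavg()
    return load_utilization(one_minute, os.cpu_count())


# Figures from this post: load below 3 on a 32-core machine.
print(f"{load_utilization(3, 32):.1%}")
```

So a load of 3 on 32 cores means the CPUs are idle more than 90% of the time.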

Memory and IO speed are essential: SSDs gave a speed-up of almost 10x for indexing and service start time.

While I do have full HTTP logs that include the backend response time for each request, I am still looking for a tool that can parse them and extract some meaningful information.
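In the absence of a ready-made tool, a small script can already give a first picture. The sketch below is only a starting point: the log layout is an assumption (the combined format followed by rt=$request_time urt=$upstream_response_time), since the actual "full" format is not shown here, and the grouping by first path segment is an arbitrary choice.

```python
import re
import statistics
from collections import defaultdict

# Assumed line layout (hypothetical): combined log format followed by
#   rt=$request_time urt=$upstream_response_time
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*'
    r' urt=(?P<urt>[\d.]+)'
)


def bucket(path):
    """Group URLs by their first path segment, dropping the query string."""
    segment = path.split("?", 1)[0].strip("/").split("/", 1)[0]
    return "/" + segment


def summarize(lines):
    """Return {bucket: (count, median_seconds, max_seconds)}."""
    times = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # line does not match the assumed layout (or urt is "-")
        times[bucket(match["path"])].append(float(match["urt"]))
    return {
        key: (len(vals), statistics.median(vals), max(vals))
        for key, vals in times.items()
    }


# Made-up sample lines in the assumed layout.
sample = [
    '10.0.0.1 - - [04/Oct/2014:10:00:00 +0000] "GET /browse/ABC-1 HTTP/1.1" 200 512 "-" "curl" rt=0.120 urt=0.100',
    '10.0.0.2 - - [04/Oct/2014:10:00:01 +0000] "GET /browse/ABC-2?focus=1 HTTP/1.1" 200 640 "-" "curl" rt=0.320 urt=0.300',
    '10.0.0.3 - - [04/Oct/2014:10:00:02 +0000] "POST /rest/api/2/search HTTP/1.1" 200 2048 "-" "curl" rt=2.600 urt=2.500',
    'not a log line',
]
for name, (count, median, worst) in sorted(summarize(sample).items()):
    print(f"{name}: n={count} median={median:.3f}s max={worst:.3f}s")
```

For a real log file, pass the open file object to summarize() instead of the sample list, and sort the result by median or maximum time to find the slowest areas.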

We are not fully pleased with the performance, and I want realistic data about what is causing the slowdowns.

Feel free to add your own hints, so we can build a better tuning tutorial.

5 answers

1 accepted

3 votes
Answer accepted
Sorin Sbarnea (Citrix)
Rising Star
October 4, 2014

At the moment we are using nginx in front of JIRA to provide the following:

  • Adding SSL layer (HTTPS over HTTP)
  • Enabling SPDY for speeding up HTTPS (even faster than HTTP)
  • Ability to hide/switch backend with DNS changes
  • Potential to split load over several servers (not used yet)
  • Provide a nice maintenance page when backend is down
  • Caching the number of requests
  • Limiting the number of requests 

 

# Redirect all plain-HTTP traffic (including the old hostname) to HTTPS.
# "full" is a custom log_format defined elsewhere in the nginx configuration.
server {
    listen 80;
    server_name jira.example.com old-jira.example.com;
    rewrite     ^ https://jira.example.com$request_uri? permanent;
    access_log  /var/log/nginx/redirect_access.log full;
    error_log   /var/log/nginx/redirect_error.log;
}

server {
    # nginx 1.9.5 and later replaced the spdy parameter with http2.
    listen 443 ssl spdy;
    server_name jira.example.com;
    root /etc/nginx/www/;
    gzip on;
    gzip_vary on;
    access_log  /var/log/nginx/jira.access.log full;
    error_log   /var/log/nginx/jira.error.log;
    # Allow large attachment uploads.
    client_max_body_size 150m;
    # Serve a static maintenance page when the backend is down.
    error_page 502 503 504 /www/maintenance.html;
    location /www/ {
        root /etc/nginx/;
    }
    location / {
        proxy_redirect          off;
        proxy_next_upstream     error timeout invalid_header http_500;
        proxy_connect_timeout   5;
        proxy_pass              http://localhost:8080;
        #proxy_connect_timeout 120;
        proxy_read_timeout      10800s;
        proxy_set_header        Host               $http_host;
        proxy_set_header        X-Real-IP          $remote_addr;
        proxy_set_header        X-Forwarded-Host   $host;
        proxy_set_header        X-Forwarded-Server $host;
        proxy_set_header        X-Forwarded-For    $proxy_add_x_forwarded_for;
    }
}

 

 

Grzegorz Dubicki April 12, 2015

Have you been able to set up SPDY so that it is used for all requests? I noticed that only the first few requests (13 in my case) are made using SPDY, and then plain HTTPS is used: https://www.dropbox.com/s/qlowz0gr7qhlza8/Zrzut%20ekranu%202015-04-13%2010.55.29.png?dl=0

0 votes
MattS
Rising Star
April 13, 2015

I have found that the number of custom fields affects the size of the Lucene index. Really big indexes can be slower to update, which leads to other actions waiting longer.

0 votes
Sorin Sbarnea (Citrix)
Rising Star
October 7, 2014

@Rp Subhub I am sure that caching in nginx would speed them up considerably, but at the moment I do not have a set of nginx caching rules. The major problem is that Atlassian products have the bad habit of not using the HTTP caching/expiry headers properly, so adding caching is really tricky. If anyone has a set of working caching options, I would be more than happy to try them. My only experience with enabling caching was really awful, as I ended up with users being switched: logging in as John and ending up as Mary. Still, it seems that Atlassian keeps the information about what can be cached really secret, probably as a selling point for TAM and Enterprise-level support.
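One plausible cause of the switched-users problem is a shared cache storing responses that carry a session cookie. The sketch below shows a deliberately conservative test for whether a response could be stored by a shared cache. The function name and the policy (cache only responses explicitly marked public and without Set-Cookie) are assumptions for illustration, not a rule set published by Atlassian.

```python
def is_safe_to_cache(headers):
    """Conservative check: True only if the response is explicitly public
    and carries nothing tied to a single user."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    if "set-cookie" in lowered:
        return False  # a cached session cookie would be handed to other users
    directives = {
        part.strip().split("=", 1)[0]
        for part in lowered.get("cache-control", "").split(",")
    }
    if directives & {"private", "no-store", "no-cache"}:
        return False
    return "public" in directives


print(is_safe_to_cache({"Cache-Control": "public, max-age=31536000"}))
print(is_safe_to_cache({"Cache-Control": "public", "Set-Cookie": "JSESSIONID=abc"}))
```

Running a check like this over captured response headers would show which URLs, if any, are candidates for caching before any caching rule is enabled.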

0 votes
Rp Subhub
Rising Star
September 28, 2014

Great question and insights, Sorin! Thanks a lot! Have you tried setting up caching of static resources by nginx? Does it/will it have an effect on performance/load?

0 votes
Norman Abramovitz
Rising Star
September 28, 2014

Have you experimented with JIRA/Confluence Data Center yet? I would assume that optimizations for large installations would be in that deployment package.

https://www.atlassian.com/enterprise/data-center

Sorin Sbarnea (Citrix)
Rising Star
October 7, 2014

Sorry, but considering that the hardware usage is below 15%, moving to Data Center looks like an unfounded upselling proposal.

Norman Abramovitz
Rising Star
October 8, 2014

I have no relationship with Atlassian other than being a user of their products, so I was not trying to upsell anything to you. I was suggesting that maybe (hence the word "experimented") changes in software architecture (active-active database) and algorithms would have a greater effect than throwing hardware at the problem.
