Confluence : 100K users?

Andrei [errno]
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
February 7, 2012

Has anyone come across a huge Confluence user base? We are potentially looking at 100,000+ users, and I am wondering whether someone has already dealt with this and could share "lessons learned".

I am also looking at Crowd to provide all those accounts, either from a delegated LDAP directory or from an internal directory. Can Crowd handle this load?

I remember seeing a presentation from Accenture showing 90K users, but it has been pulled from the Atlassian site. Does anyone have an alternative link? (http://confluence.atlassian.com/display/CONFHOST/Examples+-+Intranet/ http://confluence.atlassian.com/display/CONFDEVAL/Accenture's+Use+of+Confluence)


Any pointers/suggestions would be very welcome.
Thanks!

2 answers

1 accepted

6 votes
Answer accepted
Brendan Patterson
February 8, 2012

Lots of detailed guidance here: http://confluence.atlassian.com/display/DOC/Operating+Large+or+Mission-Critical+Confluence+Installations

The SAP Developer Network rolled out Confluence for its wiki in 2006 and they have 500k+ users. (You can tell it is Confluence from the icons, layout and by doing a "view source").

http://wiki.sdn.sap.com/wiki/display/WHP/Home

Looks like Atlassian had to take down the Accenture video for some reason according to the comments in this blog.

http://blogs.atlassian.com/2009/01/back_by_popular_demand_accenture_webinar/

If I were you I'd do these things (just brainstorming highlights):

* get a really beefy dedicated box running some optimized version of Linux - 16+ GB of RAM with an SSD big enough to handle your content.

* run the database locally (same machine as Confluence) to minimize latency. Probably have a MySQL or PostgreSQL expert tune it.

* try to anticipate the actual load on your server and create a distributed load test using JMeter and scripts Atlassian provides to load test

* fire it up, run the tests and see if performance is good

* configure monitoring like Atlassian suggests in that first link

* configure and test a hot swap with a spare machine

* don't even think about using a VM (I'm sure all of these things are pretty obvious)
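Before investing in a full JMeter setup, a rough first probe of the "run the tests and see if performance is good" step can be sketched in a few lines of Python. This is not JMeter and not one of Atlassian's load-test scripts, just a stand-in using the standard library; the function names (`fetch_url`, `run_load`, `percentile`) are made up for illustration, and anonymous page fetches are only a crude approximation of real user traffic.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url, timeout=30):
    """Fetch one URL and return the HTTP status code.

    urlopen raises an exception on 4xx/5xx responses and on timeouts;
    this sketch lets those propagate instead of counting them.
    """
    with urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def timed_call(fetch, url):
    """Return the wall-clock seconds a single fetch took."""
    start = time.perf_counter()
    fetch(url)
    return time.perf_counter() - start


def run_load(urls, concurrency, fetch=fetch_url):
    """Request every URL using `concurrency` worker threads.

    Returns one latency (in seconds) per URL, in input order.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda url: timed_call(fetch, url), urls))


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

To try it, pass a list of page URLs from a test instance to `run_load(urls, concurrency=20)` and compare `percentile(latencies, 50)` with `percentile(latencies, 95)` as you raise the concurrency. The `fetch` parameter is there so the timing logic can be exercised with a fake function before pointing it at a real server.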

Alternatively:

* talk to Contegix about hosting it - they are world class to work with in every sense of the phrase. They probably have a few servers of this size already set up. This is really your best option: roll out with them to assure success, learn from their expertise, and if you want to, plan a transition to your own infrastructure down the road.

I would not automatically assume clustered is the way to go, but it might be with that size - you'll have to research and balance the good with the added complexity, costs (as in hardware) and maintenance challenges.

For tips I would watch this presentation once it's available:

http://www.atlassian.com/summit/2010/presentations/development-speed/performance-tuning-application-development.jsp

In the meantime, consider voting up my related question to get the videos back up (I'm sure it's a ton of work for Atlassian, but they are very valuable).

Most of the instances I've worked with are not that big - so no doubt other folks will have more tangible experience to make suggestions.

Hope something in there is useful :)

Andrei [errno]
February 8, 2012

thanks - it is helpful. + i voted on your episodic q. :)

2 votes
Sergey Svishchev
November 25, 2013

If you're not careful, the client part of Confluence (the JS code inside the browser) will poll the server every 30 seconds, looking for various updates:

  • Heartbeat from open editor windows -- every 30s, hardcoded into source (CONF-29749)
  • Workbox (in-app notifications) polling -- every 30s (adjustable).
  • Quick Reload feature -- used to flood the server under certain conditions (CONF-30741).
  • Drafts are also saved every 30s by default (adjustable).

Some of these polls used to reset the session timeout (CONF-26796).
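To get a feel for what those 30-second polls add up to, here is a back-of-the-envelope sketch in Python. The function name, the user counts, and the simplification that every open tab polls Workbox while only open editors send heartbeats and draft saves are assumptions for illustration, not figures from Atlassian.

```python
def polling_requests_per_second(open_tabs, editors_open,
                                workbox_interval_s=30.0,
                                heartbeat_interval_s=30.0,
                                draft_interval_s=30.0):
    """Estimate steady-state background requests per second from client polling.

    Assumes (hypothetically) one Workbox poll per open tab and one heartbeat
    plus one draft save per open editor, each at its own fixed interval.
    """
    workbox = open_tabs / workbox_interval_s
    heartbeat = editors_open / heartbeat_interval_s
    drafts = editors_open / draft_interval_s
    return workbox + heartbeat + drafts


if __name__ == "__main__":
    # Hypothetical load: 5% of 100,000 users have a tab open, 500 are editing.
    rate = polling_requests_per_second(open_tabs=5000, editors_open=500)
    print(f"{rate:.1f} background requests/second")
```

With those made-up numbers the server handles roughly 200 requests per second before anyone clicks anything, which is why the adjustable intervals are worth revisiting on a large instance.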

The Autowatch feature (which adds everyone who edits a page as its watcher) has a nasty side effect: when a page is saved, mail notifications are generated synchronously, and if the page is both popular (i.e. has many watchers) and large, saving it can be very, very slow (CONF-21846).

Sergey Svishchev
December 12, 2013

A few more notes:

The "Operating Large or Mission-Critical Confluence Installations" (OLoMCCI) page leaves out a few good ideas that knowledge base articles do cover:

- don't ever enable the Usage Tracking plugin as it has known performance issues
- change schedule for "Optimize Search Index" job to run only at night
- keep larger logs longer (edit confluence/WEB-INF/classes/log4j.properties)
- use JIRA's disk speed test tool if you suspect your filesystem is slow
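If you just want a quick sanity check of the filesystem before reaching for the JIRA tool, something like the following Python sketch gives rough per-operation timings. To be clear, this is not Atlassian's disk speed test and its numbers are not comparable with that tool's output; `probe_disk` and its parameters are hypothetical names, and the read timings will mostly reflect the OS page cache rather than the disk itself.

```python
import os
import tempfile
import time


def probe_disk(directory, files=100, size_bytes=4096):
    """Time small-file write (with fsync), read, and delete in `directory`.

    Returns a dict with the average seconds per operation.
    """
    payload = b"x" * size_bytes
    totals = {"write": 0.0, "read": 0.0, "delete": 0.0}
    # Work inside a scratch subdirectory that is cleaned up afterwards.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        for i in range(files):
            path = os.path.join(scratch, f"probe-{i}.tmp")

            start = time.perf_counter()
            with open(path, "wb") as handle:
                handle.write(payload)
                handle.flush()
                os.fsync(handle.fileno())  # force the data to the device
            totals["write"] += time.perf_counter() - start

            start = time.perf_counter()
            with open(path, "rb") as handle:
                handle.read()
            totals["read"] += time.perf_counter() - start

            start = time.perf_counter()
            os.remove(path)
            totals["delete"] += time.perf_counter() - start
    return {name: total / files for name, total in totals.items()}
```

Point it at the directory that holds your Confluence home (for example `probe_disk("/path/to/confluence-home")`, path hypothetical) and compare against a disk you know to be fast; large differences in the write and delete averages are the interesting signal.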

Plus, from time to time hidden gremlins strike you :)

- working with plugins (enabling/disabling them) and editing global settings affects performance (CONF-30110) so avoid doing that under heavy load.

- every displayed page makes a number of additional REST calls; one in particular, /rest/create-dialog/1.0/spaces, makes a query to the search engine. Normally that query is fast, but when it isn't, the entire app may slow down. Results of the call are apparently needed by the 'Create' button (to display the list of spaces), and are marked non-cacheable by the app.

- same deal with the JIRA plugin's Autoconvert feature -- it calls /rest/jiraanywhere/1.0/servers and the result is also non-cacheable (but fairly static -- how often do you add application links?)

- however, the Application Navigator (that little hamburger at the top left) is not using REST and may slow down every displayed page in exceptional circumstances. You cannot entirely disable the feature at the moment but Atlassian is aware of performance issues in the NavLink plugin which implements it.

Sergey Svishchev
January 13, 2014

More gremlins:

- Changing page restrictions on a large page hierarchy is known to cause problems (https://confluence.atlassian.com/x/YoX2Cw), so avoid doing that. Since the app queues items for reindexing synchronously AND gives no progress feedback to the user, they might try again and again, which only makes matters worse.

Sergey Svishchev
April 8, 2014
