In the past it seemed like a good solution to just throw more hardware resources towards a machine to make it go faster.
Now I've begun to notice on a couple of my systems that not only does that no longer work, it even seems to hurt the instance when you oversize it.
Using the following link as a guideline, https://confluence.atlassian.com/jirakb/jira-server-sizing-guide-975033809.html, it seems that you really don't need much to have a fast, well-working system.
My regular machines often have 32GB of memory with a heap size of over 20GB. This just seems overkill for 100-250 users, and I'm really wondering if I'm not hurting the instance more than actually doing any good.
So my question is: what's your experience regarding dimensions of Jira Server/DC hardware?
Any other tips/best practices to keep your on-premise system running the best it can?
I know a lot of things play into performance - memory, CPU, disk speed, database system, reverse proxy, customization, and so on - so even a general "I suggest x or y" would be welcome!
Not always. There's a specific case where it can make things a lot worse. Java apps have to do this "garbage collection" thing, which is not as simple as "stuff it all in a bag and throw it out". Long story short: if you have a system that's suffering because it's doing frequent GC, you can make it worse by giving it more memory. GC goes from a frequent but short freeze (Users: "Jira is slow all the time") to less frequent but longer freezes (Users: "Jira works ok most of the time, but 'crashes' and stops randomly").
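If you want to see which of those two failure modes you're in, GC logging will show you the pause pattern directly. A minimal sketch, assuming a Linux install and Java 11+ unified logging (the log path and rotation settings here are examples, not Jira defaults; on Java 8 you'd use `-XX:+PrintGCDetails` instead):

```shell
# Append to <jira-install>/bin/setenv.sh so the JVM logs every GC pause.
# JVM_SUPPORT_RECOMMENDED_ARGS is the variable Jira's setenv.sh already uses;
# the log path and rotation sizes are just examples - adjust to your install.
JVM_SUPPORT_RECOMMENDED_ARGS="${JVM_SUPPORT_RECOMMENDED_ARGS} \
  -Xlog:gc*:file=/var/atlassian/application-data/jira/log/gc.log:time,uptime:filecount=5,filesize=20M"
# Frequent short pauses vs. rare multi-second ones will be obvious in gc.log.
```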
For 250 users, a 20Gb heap probably is overkill. I'd generally start at 8 for that number, but the devil is in the detail of how they're using it, not so much the numbers of them. If they're all doing healthy stuff like updating issues, looking at well-formed scrum/kanban boards that suit the team's work, and using dashboards that help them look at what they're generally interested in, then it's fine. The thing to worry about is the memory/database/CPU-hogging processes: please stop doing massive Excel/CSV exports for "reporting" (you're doing it wrong anyway). Why the hell have you got 10,000 issues on your board? Why is your build server updating 10 issues every second? What the heck does THAT app do that's chewing up so much time? Are you sure 270 custom fields on that issue type are actually useful? A 30,000-comment issue can't be of much use to you, even if it were displayable without eating your CPU for 30 minutes... that sort of stuff.
If you do suspect memory is an issue, then the broad and simple advice we can give without doing any investigation is "change memory size in small steps". If you've got an 8Gb heap and you think memory might be an issue, don't just whack it up to 12Gb arbitrarily. Add 512Mb and give it a couple of days. If there is no improvement, give it another 512Mb. If still nothing at 9Gb, you're probably looking at the wrong thing, but you probably won't have made it worse either. If you see small improvements, keep adding 512Mb every few days until you reach a comfortable limit or you stop seeing improvements (or things like long freezes start happening).
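In Jira's setenv.sh that stepping looks something like the sketch below. The variable names are the ones setenv.sh actually uses; the values just follow the 8Gb starting point above (min and max kept equal, which is the usual practice for server heaps):

```shell
# <jira-install>/bin/setenv.sh - starting point: an 8Gb heap
JVM_MINIMUM_MEMORY="8192m"
JVM_MAXIMUM_MEMORY="8192m"

# One 512Mb step later, if nothing has improved:
#   JVM_MINIMUM_MEMORY="8704m"
#   JVM_MAXIMUM_MEMORY="8704m"
# ...and if you're at 9216m (9Gb) with still no change, look elsewhere.
```

A restart is needed after each change, which is another reason to give every step a couple of days rather than churning through values.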
But also, see the answer to the third question.
I cheat: I simply start with the Atlassian guidelines, because I can dump the responsibility on them. They wrote it, and they're aware of all the customers who've come to them about performance (a much larger number than have come to me), so I'd say they're best qualified.
But... that "but" leads me straight on to the answer to your next question.
Monitor it. A very broad answer, but yes, as you say, look at all the elements. Memory (and swap) is always the first point for Atlassian applications (I'd say that about most Java services, though), CPU next, then, in no particular order, network throughput, disk I/O, database access, etc. Even logging, and rummaging around in the application (JMX hooks if nothing else), can help you find resource hogs, performance issues and breaking points.
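For a quick look before setting up any monitoring stack, the JDK's own jstat will show heap and GC activity of the running instance. A sketch, assuming the JDK tools are on the path; the pgrep pattern is a guess at how your Jira process appears in the process list, so adjust it to match yours:

```shell
# Find the Jira/Tomcat JVM and print GC utilisation every 5 seconds.
# The pattern matches Tomcat's Bootstrap class - your process list may differ.
JIRA_PID=$(pgrep -f 'org.apache.catalina.startup.Bootstrap' | head -n 1)
jstat -gcutil "$JIRA_PID" 5000
# Classic memory-pressure signs: the O column (old gen %) repeatedly climbing
# towards 100, and FGCT (full GC time) growing quickly between samples.
```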
If you've got a good eye on how the system is performing, you can look for patterns. One that sticks in my mind was people complaining Jira was slow, but we were unable to get them to tell us exactly when in enough detail. We added basic monitoring and found a massive spike in all load throughout Tuesday afternoons, which was when the entire building did sprint planning, so everyone was messing with boards at once, often with compound filters on them. Once we knew that, we had options: revising the boards, staggering teams across the day so they didn't all hit at the same time, upping the memory/cpu for the afternoon or using it to upgrade from the couple of BBC Micros it was running on, sticking the index on an SSD instead of a physical hard drive, plugging it into a 10/100/1000 network instead of a 100, etc.
The key here is knowing what the problem is, and monitoring is the best way to find that.
One thing that often surprises people is that getting rid of minor but frequent errors can help a lot more than you'd think: it takes work to write an error message to a log, and some errors or even warnings can chew up more cpu/memory than you'd expect. Check your logs for errors (especially those big Java stack-trace ones). You might think a broken post-function that just throws a Java error but doesn't stop the workflow from working is trivial and not worth removing, but it's chewing up more resource than you'd think, so if it's happening a lot, it may be worth digging it out.
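A crude way to see which errors are the noisy ones is to count exception classes in the application log. A sketch, assuming the default Linux data directory (adjust the path to your install):

```shell
# Count the exception/error classes that appear most often in the Jira log,
# so the most frequent offenders float to the top.
LOG=/var/atlassian/application-data/jira/log/atlassian-jira.log
grep -oE '[A-Za-z0-9.$]+(Exception|Error)' "$LOG" | sort | uniq -c | sort -rn | head
```

An exception that appears a handful of times is probably noise; one that appears thousands of times a day is the kind of broken post-function worth digging out.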
This, a thousand times:
the devil is in the detail of how they're using it, not so much the numbers of them
It's tempting to want to simplify hardware requirements to plain numbers, but it really depends on each Jira installation. Of course, broad recommendations (like those in Atlassian's on-prem docs) are a good starting point.
You wouldn't want someone to install Jira on a single potato in a closet for a company of 5,000 people.
Also, I find that tuning the thread pools helps significantly. Increasing the Tomcat and DB thread pools is like adding lanes to a highway: they allow more traffic to pass through at the same time. I think the defaults are 48 Tomcat threads and 60 DB max connections.
Using incremental steps, increase these by 25% (while keeping DB max at 120% of Tomcat max, per Atlassian's recommendations). For example, increase Tomcat max to 100 and DB max to 120. Then monitor load on the system. Tune again if needed; rinse, repeat. If you're running on-prem behind load balancers or proxies, don't forget to tune them as well.
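The two settings live in different files: Tomcat's maxThreads in server.xml under the install directory, and the DB pool's pool-max-size in dbconfig.xml in the Jira home. A sketch of one tuning step, keeping the 120% ratio straight (the example values match the ones above):

```shell
# One tuning step: pick the new Tomcat maxThreads and derive the DB pool
# at 120% of it, then apply those numbers by hand to
# <jira-install>/conf/server.xml (maxThreads="...") and
# <jira-home>/dbconfig.xml (<pool-max-size>...</pool-max-size>).
TOMCAT_MAX=100
DB_MAX=$(( TOMCAT_MAX * 120 / 100 ))
echo "maxThreads=${TOMCAT_MAX} pool-max-size=${DB_MAX}"
# prints: maxThreads=100 pool-max-size=120
```

Remember the database server itself also has a connection limit (max_connections in PostgreSQL/MySQL); the pool can't usefully grow past what the database will accept.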