Attachment directory consuming a lot of disk space

Naren
Rising Star
Rising Stars are recognized for providing high-quality answers to other users. Rising Stars receive a certificate of achievement and are on the path to becoming Community Leaders.
February 13, 2013

Hi All,

We have a production JIRA instance with roughly 3 lakh (about 300,000) issues. The attachment directory in particular is consuming a lot of disk space, so we plan to move our production JIRA_HOME directory to SAN storage, but that will take some time to implement.

Meanwhile, our short-term plan is to move the attachments of older projects to a secondary server. The criterion our team has decided on is projects whose issues are resolved and have not been updated for over a year.

Can I move the attachments for all projects that meet this criterion using a script, or is there a JIRA plugin that can help me achieve this?

It would be a great help if anybody could share thoughts on the points below:

1. Is this approach feasible? Are there any areas to consider when following it?

2. How do I make the moved attachments available to a client again on request, given that they were moved based on the above criterion?

3. Is there an alternative way to stop the attachments directory from consuming so much disk space?

Any help on this issue is much appreciated :)

2 answers

1 accepted

1 vote
Answer accepted
Nic Brough -Adaptavist-
Community Leader
Community Leaders are connectors, ambassadors, and mentors. On the online community, they serve as thought leaders, product experts, and moderators.
February 13, 2013

Question 1:

Yes, this is a good idea. Moving the attachments to another disk with more space will solve the problem.

However, you suggest moving Jira_home to SAN. Do NOT do that without doing a LOT of testing. Jira_home contains other things - working files, plugins and so on. Critically, it contains your index. The index is read and written immensely heavily, and should always be placed on the fastest nastiest disk you can get your hands on. When I tested a SAN on versions 3 and 4 of Jira, the SAN disks were so relatively slow, Jira was utterly crippled. Attachments on the other hand - absolutely fine. It may be that our SAN was slow or old or something, but you really do need to load test if you're going to put the index on it.

Question 2:

I'm not sure what you mean by "make attachments available to the client on their request". The attachments are for Jira, and accessed via Jira. All you need to do is make sure that the directory that they live in is available to the Jira service in the right place. I tend to use symbolic links. If you want other systems to see attachments, that's fine as long as it's read-only (you must not delete attachments outside Jira or change the directory tree) and security is not a concern (the external system won't respect Jira permissions).
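The symbolic link approach can be sketched roughly as follows. This is a minimal illustration, not a tested migration procedure: the function name `relocate_attachments` and the paths in the usage comment are hypothetical, and it assumes Jira is stopped while the directory is being moved.

```shell
#!/usr/bin/env bash
set -euo pipefail

# relocate_attachments SRC DEST   (hypothetical helper, not part of Jira)
# Moves the directory SRC to DEST on the larger disk, then leaves a symbolic
# link at SRC so the Jira service still finds its attachments in the old place.
relocate_attachments() {
    local src="${1%/}" dest="${2%/}"
    if [ -L "$src" ]; then
        echo "already a symbolic link, nothing to do: $src" >&2
        return 0
    fi
    if [ -e "$dest" ]; then
        # Refuse to continue: mv would otherwise nest SRC inside DEST.
        echo "destination already exists: $dest" >&2
        return 1
    fi
    mkdir -p "$(dirname "$dest")"
    mv "$src" "$dest"
    ln -s "$dest" "$src"
}

# Example call (assumed paths, adjust to your installation):
#   relocate_attachments /path/to/jira_home/data/attachments /mnt/bigdisk/jira/attachments
```

After the move, the user that runs Jira still needs read and write access to the new location.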

Question 3:

Not really. Either you want the attachments, in which case you need to keep them on a disk Jira can see, or you don't, in which case you can delete them.

Naren
Rising Star
February 19, 2013

Hi Nic,

Thanks for the answer. I would also like to know whether we could implement something like a Linux shell script to move the attachments periodically based on the criterion: issues across projects in JIRA that have been inactive (i.e., resolved and not updated) for the last 365 days.

Is this a feasible approach? I am not sure how to:

  • Get the list of issues across all projects that have been inactive for the last 365 days, from a shell script.
  • Check, in the shell script, for the directory in /data/attachments that corresponds to each issue key.
  • Move the attachment directories for all these issues to backup storage.

Would be great if you can share your thoughts!

Nic Brough -Adaptavist-
Community Leader
February 19, 2013

It's feasible, but ugly:

  • Your script will need to read Jira to find out what it can move. It'll need to filter for "updated is over a year ago"
  • The attachments directory has a structure that's easy to parse once you have an issue key
  • Moving the attachments will break them in Jira - they will still appear there, but if someone clicks to download them, they'll get a "file missing" error

As you're using Linux, there's a far simpler solution - put ALL the attachments on a remote disk and mount it into the file system. That does all the work without losing the links or needing any scripting.

Naren
Rising Star
February 19, 2013

Yes, it will result in an error. Your approach is a good way to tackle this disk space issue, and we will try to implement it.

But let's assume that the approach I described above is the only one available.

What if I read the creation date of each directory under /data/attachments/<project_key> in a Linux shell script, instead of querying JIRA for the criteria, and check whether it was created more than a year ago? If so, the script would move it to the backup storage, preserving the same directory hierarchy as the production JIRA /data/attachments directory.
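The move step of such a script, keeping the <project_key>/<issue_key> hierarchy under the backup location, might look something like the sketch below. The function name `archive_issue_dir` and all paths are hypothetical, and the backup storage is assumed to be reachable as an ordinary directory on the server.

```shell
#!/usr/bin/env bash
set -euo pipefail

# archive_issue_dir ATTACH_ROOT BACKUP_ROOT ISSUE_DIR   (hypothetical helper)
# Moves ISSUE_DIR (a directory somewhere under ATTACH_ROOT) to the same
# relative path under BACKUP_ROOT, creating parent directories as needed.
archive_issue_dir() {
    local root="${1%/}" backup="${2%/}" dir="${3%/}"
    # Path of the issue directory relative to the attachments root,
    # e.g. PROJ/PROJ-7
    local rel="${dir#"$root"/}"
    if [ "$rel" = "$dir" ]; then
        echo "not under $root: $dir" >&2
        return 1
    fi
    if [ -e "$backup/$rel" ]; then
        # Refuse to overwrite or nest inside an earlier archive run.
        echo "already archived: $backup/$rel" >&2
        return 1
    fi
    mkdir -p "$backup/$(dirname "$rel")"
    mv "$dir" "$backup/$rel"
}

# Example call (assumed paths):
#   archive_issue_dir /data/attachments /backup/attachments /data/attachments/PROJ/PROJ-7
```

Because the relative layout is preserved, restoring an issue's attachments on request is the same move in the opposite direction.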

Nic Brough -Adaptavist-
Community Leader
February 19, 2013

Yes, that will work fine.

I'd do a little more checking though - don't just look at the creation date of each attachment individually. I'd look at all of them for the issue and work out the created date of the most recent one before deciding to archive all of them.

That's for neatness mostly. Either an issue has attachments or it'll break consistently on all attachments, so you know they're archived. Without that check, I can imagine people seeing "attachments 1 and 2 throw errors on download, but 3 and 4 are fine", which is confusing. You could still get that situation too - if someone attaches a file to an issue that you've archived the attachments for, but I imagine that would be infrequent, especially if you're using "a year" as the rule.
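A rough sketch of that "most recent attachment" check, under some assumptions: the function name `list_stale_issue_dirs` is hypothetical, the layout is taken to be <attachments_root>/<project_key>/<issue_key>/ as discussed in this thread, and file modification time is used as a stand-in for creation time, since `find` on Linux often cannot see a true creation date. It relies on GNU `find` for `-quit`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# list_stale_issue_dirs ATTACH_ROOT DAYS   (hypothetical helper)
# Prints every ATTACH_ROOT/<project_key>/<issue_key> directory that contains
# at least one file and in which NO file was modified within the last DAYS days.
# An issue is therefore listed only if its newest attachment is old enough.
list_stale_issue_dirs() {
    local root="${1%/}" days="$2" dir
    for dir in "$root"/*/*/; do
        [ -d "$dir" ] || continue
        # Ignore issue directories that hold no files at all.
        [ -n "$(find "$dir" -type f -print -quit)" ] || continue
        # Any file newer than the cutoff keeps the whole issue in place.
        if [ -z "$(find "$dir" -type f -mtime "-$days" -print -quit)" ]; then
            printf '%s\n' "${dir%/}"
        fi
    done
}

# Example call (assumed path):
#   list_stale_issue_dirs /data/attachments 365
```

Listing candidates separately from moving them makes it easy to review the output as a dry run before anything is archived.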

As an aside, I'm not sure why it's "the only approach" - if your server can reach a filesystem for a simple copy, it's very likely that it can reach it to be used as a remote file system and could be used for attachments!

2 votes
Renjith Pillai
Rising Star
February 16, 2013
