We have almost 600 Subversion repositories, some of which are 30 GB or more in size. The last time we tried to support FishEye, the scanning was taking over a month, and we decided it wasn't viable.
I'm wondering if Atlassian has considered supporting the use of Hadoop to improve scanning performance? Revisions seem like a unit of work that could be distributed to various nodes for processing.
Other than Hadoop, does anyone have other suggestions for improving the performance?
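To make the "revisions as a unit of work" idea a bit more concrete, here is a rough sketch of how a revision range might be split into chunks that separate workers could process. This is purely illustrative: `partition_revisions` and the chunk size are made-up names and values, and as far as I know FishEye does not expose anything like this. Each chunk could in principle be fed to something like `svn log -v -r START:END`, though copy and branch ancestry probably means revisions are not fully independent of each other.

```python
def partition_revisions(first_rev, last_rev, chunk_size):
    """Split the inclusive range [first_rev, last_rev] into (start, end) chunks.

    Hypothetical helper: each tuple is a contiguous block of revisions
    that one worker node could scan on its own.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    start = first_rev
    while start <= last_rev:
        # The last chunk may be shorter than chunk_size.
        end = min(start + chunk_size - 1, last_rev)
        chunks.append((start, end))
        start = end + 1
    return chunks


if __name__ == "__main__":
    # Example: 10 revisions handed out in blocks of 4.
    for start, end in partition_revisions(1, 10, 4):
        print(f"worker gets r{start}:r{end}")
```

Whether this would actually help depends on how much of the scan time is spent fetching from Subversion versus building the index, which I don't know.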
Bah, accidentally deleted my comment. Be sure that all your repos are structured in the way FishEye likes: https://answers.atlassian.com/questions/19281/how-can-i-reduce-the-size-of-the-fisheye-indexes
I ended up writing something that automatically generates the exclusion rules.
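For anyone wanting to try the same thing, a minimal sketch of such a generator might look like the following. This is not the actual tool mentioned above. It assumes exclusions are written as Ant-style glob patterns and that the goal is to skip `tags` directories under each project; `build_exclusion_patterns`, the directory names, and the pattern shape are all assumptions to adapt to your own layout.

```python
def build_exclusion_patterns(project_dirs, excluded_subdirs=("tags",)):
    """Return one glob pattern per (project, excluded subdirectory) pair.

    Hypothetical helper: project_dirs are repository-relative paths such
    as "projA" or "projA/", and the output uses Ant-style "**" globs.
    """
    patterns = []
    for project in sorted(project_dirs):
        project = project.strip("/")
        for sub in excluded_subdirs:
            if project:
                patterns.append(f"{project}/{sub}/**")
            else:
                # Empty project means the repository root itself.
                patterns.append(f"{sub}/**")
    return patterns


if __name__ == "__main__":
    for pattern in build_exclusion_patterns(["projA/", "projB"]):
        print(pattern)
```

The list of project directories could come from something like `svn ls` on the repository root; the patterns would then still need to be entered into the repository configuration by whatever means you normally use.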
Knowing the repos are 30 GB doesn't tell us much: if it's binary files, FishEye doesn't care; if it's metadata, it does.
Be sure all your repos are structured in the way that FishEye likes: https://answers.atlassian.com/questions/19281/how-can-i-reduce-the-size-of-the-fisheye-indexes . I ended up writing something to automatically generate the exclusions.
Knowing the repos are 30 GB doesn't really tell us anything useful. It could be binary files, in which case it makes no odds to FishEye, or metadata, in which case it will kill it.
We have also run into problems with big repositories.
We have over 300 repos, and each one is bigger than 10 GB.
We are also interested in your idea of using Hadoop.
Do you know how to use Hadoop with FishEye/Crucible?
How would Hadoop improve the scanning performance?