I hate to bring this up again, but I'm once more upset with the quantity of spam. I know you made it easy to instaban users, but this is ridiculous.
I mean, it makes sense for a spammer to target this community considering the subjects we're spammed with:
You can't fight popular culture, but I believe we can fight these posts. All of them have phone numbers and no-no keywords in common. I think it should be pretty trivial to filter such questions ... please.
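To make the suggestion concrete, here is a minimal sketch of such a filter. It is only an illustration, not how the site actually works: the keyword list, the phone pattern, and the name `looks_like_spam` are all assumptions, and the real spam terms are not given in this thread.

```python
import re

# Hypothetical keyword list; the thread does not name the actual terms.
BLOCKED_KEYWORDS = {"helpline", "toll free", "customer care"}

# Loose pattern: an optional "+", a digit, then at least 8 more digits,
# spaces, dots, dashes or parentheses, ending on a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{8,}\d")


def looks_like_spam(text):
    """Return True if the text has a phone-like number and a blocked keyword."""
    lowered = text.lower()
    # Require at least 10 digits so dates and version numbers are ignored.
    has_phone = any(
        sum(ch.isdigit() for ch in match.group()) >= 10
        for match in PHONE_RE.finditer(text)
    )
    has_keyword = any(keyword in lowered for keyword in BLOCKED_KEYWORDS)
    return has_phone and has_keyword
```

Requiring both signals keeps ordinary questions that merely mention a number or one of the words from being flagged. A filter this simple would probably be circumvented quickly, so it is a first line of defence at best.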
Thanks for your feedback and passion. We've made a number of changes recently in an attempt to defend against the spam we're receiving:
As is always the way with these things, new defensive techniques are initially very effective, but they eventually become less effective as the attackers learn how they work and how to circumvent them. When the Atlassian ID CAPTCHA was turned on, we had about 7 days straight of no spam (a new record!); however, we're now climbing back up to the original levels.
I think the next logical step is a manually maintained blacklist of words, phrases or URLs, but we've been shying away from that so far since we're really going to have to commit to someone spending a fair bit of time looking after this blacklist in order for it to remain effective. We've been investigating automated solutions before resorting to this.
I'll have a chat about this with the team and I'll come back to this Question when we decide what the next step will be.
How about this: when a power user instabans a user, take the texts and parse for their meaning using http://nlp.stanford.edu/software/lex-parser.shtml
For a medium-sized text, it will take 5-8 seconds to parse it. You can safely extract keywords from there, including URLs, because you will know the part of speech and how it is used. Hence, your list will be maintained automatically.
I played with it and it is really useful, but on the other hand, it consumes quite a lot of memory. Have fun.
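A rough sketch of the "list maintained automatically" idea follows. It does not use the Stanford parser: plain word and URL counting is swapped in for part-of-speech analysis to keep the example small. The function names, the `min_posts` threshold, and the comparison against normal posts are all assumptions made for illustration.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
WORD_RE = re.compile(r"[a-z]{4,}")


def extract_terms(text):
    """Return the distinct URLs and lowercase words (4+ letters) in a post."""
    urls = set(URL_RE.findall(text))
    # Remove URLs before tokenizing so their fragments are not counted as words.
    remainder = URL_RE.sub(" ", text).lower()
    return urls | set(WORD_RE.findall(remainder))


def build_blocklist(banned_posts, normal_posts, min_posts=2):
    """Terms seen in at least min_posts banned posts and in no normal post."""
    banned_counts = Counter()
    for post in banned_posts:
        banned_counts.update(extract_terms(post))
    normal_terms = set()
    for post in normal_posts:
        normal_terms |= extract_terms(post)
    return {
        term
        for term, count in banned_counts.items()
        if count >= min_posts and term not in normal_terms
    }
```

Each instaban would add the banned user's posts to `banned_posts` and the list would be rebuilt. Excluding any term that also appears in a normal post is a crude guard against blocking legitimate vocabulary.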
Looks like we had another big pile of spam overnight! I've just cleaned up some of it now.
Just giving you an update: we're continuing to work out the best way to stop this for good. Dennis is actually going to be splitting his time with the Atlassian ID team for a couple of months to see if we can develop some better anti-spam tools further upstream (stopping the spammers from creating an Atlassian ID account before they even get to Answers).