Suggestions for anti-spambot Q&A
- Andrew Lee
- Posts: 3070
- Joined: Sat Feb 04, 2006 9:19 am
- Contact:
Suggestions for anti-spambot Q&A
The spambots have again risen to the challenge and there has been a recent increase in spam posts (all caught at the initial post stage thankfully).
I need ideas for Q & A that's not too difficult for genuine humans who want to sign up, but difficult enough for spam farms in India not to bother us for a while.
The current questions are:
- Who is the author for TPFC's website design?
- When was the website design for TPFC last updated?
A little background to the spambot battle. Spamming used to be done by pure code aimed at defeating standard CAPTCHAs such as those included in phpBB. Such bots can be defeated by non-standard CAPTCHAs, such as asking "What is the next number in the sequence 3,5,7?", which we did in the earliest iterations of TPFC.
Later on, it seems that with the rise of more difficult CAPTCHAs, such as pictorial ones (pick the photos which depict cats) or those that require identifying numbers or letters on actual document fragments, the task of solving them was delegated to human farms in low-cost countries, where actual humans solve these CAPTCHAs for a pittance. We previously tried OCR CAPTCHAs in our comment section, and they were easily defeated, which is why we made the comment section member-only.
So we moved to a system whereby the questions are site-specific and difficult to solve without spending a little extra time looking around the actual site. The idea is that the CAPTCHA solvers won't spend the time and effort on such questions, since they are paid by the number of CAPTCHAs they solve, and this seemed to work.
However, it appears that from time to time someone will actually look at these failed CAPTCHAs more carefully and add the answers to their database. This seems to happen every 6 months or so. That's when the mods see a rise in spammers (since all new posters must be verified by a mod). Normal users don't notice this, but the workload for the mods increases.
Any suggestions?
Re: Suggestions for anti-spambot Q&A
This is an impossible task... Difficult questions would make registration for honest people simply annoying (we've seen a few cases already here)...
What I would do: I would make a specific (locked) topic in this forum area with answers to questions needed for registration. The question would change every month. The answers would be chosen completely at random. Better to work once a month to change that than every day to delete spam accounts.
Example: Q: What's the capital of France? A: Apple.
Regular people trying to register should have no problem finding the (sticky) topic with the answer available for the current month, because a tooltip with directions would be displayed during the registration process (search the forum site matters for clues).
P.S. @Andrew: if you could take a look at TC's forums, they have a section (first one) with a spam trap (everything posted there gets deleted); any idea how effective that is?
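The monthly rotation proposed in this post could be sketched roughly as follows. This is only an illustration, not how the forum actually does it: the word pool, the function names, and the in-memory `store` dictionary are all hypothetical stand-ins for whatever the board software would use.

```python
import datetime
import secrets

# Hypothetical pool of decoy words; any list of unrelated words would do.
ANSWER_POOL = ["apple", "harbor", "violin", "granite", "lantern", "meadow"]


def month_key(today: datetime.date) -> str:
    """Return a key such as '2017-01' identifying the rotation period."""
    return f"{today.year:04d}-{today.month:02d}"


def rotate_answer(store: dict, today: datetime.date) -> str:
    """Pick a fresh random answer the first time a new month is seen."""
    key = month_key(today)
    if key not in store:
        store[key] = secrets.choice(ANSWER_POOL)
    return store[key]


def check_answer(store: dict, today: datetime.date, submitted: str) -> bool:
    """Compare against this month's answer, ignoring case and outer spaces."""
    expected = rotate_answer(store, today)
    return submitted.strip().casefold() == expected.casefold()
```

Because the answer has nothing to do with the question, a solver cannot guess it from general knowledge and has to read the sticky topic for the current month.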
Re: Suggestions for anti-spambot Q&A
A couple of suggestions:
- In which programming language is TPFC written?
- What is the term for a program that doesn't leave traces behind?
Re: Suggestions for anti-spambot Q&A
Adding the answers in a dedicated thread mightn't be the best idea :p
Re: Suggestions for anti-spambot Q&A
How about making only one dedicated sub-forum visible for first-posters and requiring that at least two members validate them before granting full membership (or else it gets auto-erased periodically...)? I think I have been to forums that work something like that...
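A rough sketch of the two-member validation idea, for illustration only. The two-vouch threshold comes from the post; the 7-day grace period and all class and function names are assumptions, and a real phpBB setup would store this in its own user tables.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

REQUIRED_VOUCHES = 2             # "at least two members", as in the post
PURGE_AFTER = timedelta(days=7)  # hypothetical grace period before auto-erase


@dataclass
class PendingUser:
    name: str
    registered_at: datetime
    vouched_by: set = field(default_factory=set)

    def vouch(self, member: str) -> None:
        """Record a validation; the set ignores repeats from one member."""
        self.vouched_by.add(member)

    @property
    def is_full_member(self) -> bool:
        return len(self.vouched_by) >= REQUIRED_VOUCHES


def purge_unvalidated(pending: list, now: datetime) -> list:
    """Keep validated users and those still inside the grace period."""
    return [
        u for u in pending
        if u.is_full_member or now - u.registered_at < PURGE_AFTER
    ]
```

Running `purge_unvalidated` on a schedule is the part that would need automating, since it also implies deleting the posts of the purged accounts.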
-
- Posts: 26
- Joined: Sat Jan 07, 2017 8:27 pm
Re: Suggestions for anti-spambot Q&A
Hi, I'm a newly-registered user. I think the current set of anti-spambot questions is acceptable, but it can be improved to reduce the "annoyance" factor. For me, the main issue is that the questions are open-ended & do not specify how the answer should be structured.
Eg 1: "Who is the author for TPFC's website design?"
I browse TPFC whenever I get a chance to. So by happy happenstance, I know the answer to the above. The thing is: I can only recall offhand that it was yaP's developer. I don't remember the exact spelling of his "unusual" username, although I recognize it when I see it. So can I simply state: "yaP developer" or "user with black avatar" ?
Eg 2: "When was the website design for TPFC last updated?"
This is the anti-spam question I encountered yesterday & today when trying to register. Personally, I don't really remember numbers (including people's ages, even my own). Luckily, I provided the correct answer based on forum search & some logic. But I did wonder if I would need to provide the precise day-month-year (or would year alone suffice), & in what format (all numerals, numerals & text, American vs European date sequence).
A possible way to mitigate the open-ended problem:
- For every question, provide the user with a set of plausible choices drawn from a (much larger) pool of pre-defined custom statements from the backend (i.e. not posted anywhere on the forum or internet). The choices are randomly drawn & arranged every time the registration form is accessed. Note that the statements have to be plausible, so the wrong answers would still appear reasonable to a humanbot not familiar with TPFC.
- And depending on the question itself, more than 1 selection (tick) may be required to achieve the correct answer. Users still have to do the research, if they don't already know the info beforehand.
- On a semi-serious note: The above can perhaps be strengthened by then making the user play a custom 10-minute top-down game (e.g. catch 100 falling "TPFC"s to prevent them from going splat ...), where the user has to win before registration submission is enabled. I suppose human farmbots can't afford to keep playing 10-minute games the whole day. (Or can they?)
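The first two points in the list above (random plausible choices, possibly with more than one correct tick) might look something like this. It is a minimal sketch under assumptions: the function names and the idea of passing the statements in as plain lists are hypothetical, and a live form would probably use `random.SystemRandom()` rather than a seedable generator.

```python
import random


def build_choices(correct: list, decoys: list, n_choices: int,
                  rng: random.Random) -> list:
    """Return every correct statement plus random decoys, shuffled."""
    n_decoys = n_choices - len(correct)
    if n_decoys < 0 or n_decoys > len(decoys):
        raise ValueError("not enough room or decoys for the requested choices")
    choices = list(correct) + rng.sample(decoys, n_decoys)
    rng.shuffle(choices)
    return choices


def is_correct(selected: set, correct: list) -> bool:
    """Pass only if the ticked set equals the full set of correct statements."""
    return set(selected) == set(correct)
```

Requiring the exact set means that ticking everything, or only one of two correct statements, still fails.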
- Andrew Lee
- Posts: 3070
- Joined: Sat Feb 04, 2006 9:19 am
- Contact:
Re: Suggestions for anti-spambot Q&A
joby_toss wrote: What I would do: I would make a specific (locked) topic in this forum area with answers to questions needed for registration. The question would change every month. The answers would be chosen completely at random. Better to work once a month to change that than every day to delete spam accounts.
I think this is a great idea, and I'm willing to give it a try. Thanks!!
@Specular: Currently, the answers to all the questions are located at the footer of each TPFC page! How much more accessible can it get? I'm willing to gamble the spammers won't take the trouble to monitor a particular forum thread and track the ever-changing Q&A.
@HairyPorter: As I mentioned above, the answers to all the questions are located at the footer of each TPFC page. However, I don't blame you for not noticing that, because if I were not familiar with TPFC, I too would likely find it difficult to locate information that may be obvious to a regular.
Midas wrote: How about making only one dedicated sub-forum visible for first-posters and requiring that at least two members validate them before granting full membership (or else it gets auto-erased periodically...)? I think I have been to forums that work something like that...
I think this is also an excellent idea, but implementation is a bit more work than joby_toss' idea, since it requires automating the tasks of purging the offending user accounts and posts (manually doing them is a lot of work). I will KIV that as Plan B. Thanks!
-
- Posts: 26
- Joined: Sat Jan 07, 2017 8:27 pm
Re: Suggestions for anti-spambot Q&A
Andrew Lee wrote: the answers to all the questions are located at the footer of each TPFC page
Thanks, I finally see the info being subtly alluded to in the statement at the bottom of TPFC's homepage & software listing pages. I think the reason I never really noticed that statement before is that it's in tiny font, & also I'd never scrolled right to the bottom of the aforementioned webpages.
On a note of caution, I actually find the latest registration process very much easier, because the link to the anti-spam answers is given on the TPFC registration form itself, followed by a quick copy-&-paste (no need to sift through forum discussions). I guess I should've waited 1 more day before attempting to register. As such, I'm not sure if it's really humanbot-proof, unless that humanbot doesn't even see the obvious on the registration form before keying in the username. It doesn't take much time & effort to click that obvious link & read the brief contents.
https://www.portablefreeware.com/forums ... e=register
Registration
...
For Q&A answers and other important information for new users, please read this topic.
Username: _____________
...
Re: Suggestions for anti-spambot Q&A
Andrew Lee wrote:
Midas wrote: How about making only one dedicated sub-forum visible for first-posters and requiring that at least two members validate them before granting full membership (or else it gets auto-erased periodically...)? I think I have been to forums that work something like that...
I think this is also an excellent idea, but implementation is a bit more work than joby_toss' idea, since it requires automating the tasks of purging the offending user accounts and posts (manually doing them is a lot of work). I will KIV that as Plan B. Thanks!
Just declaring my full agreement with your appraisal and choice.
Re: Suggestions for anti-spambot Q&A
Andrew Lee wrote: @Specular: Currently, the answers to all the questions are located at the footer of each TPFC page! How much more accessible can it get?
I remember back when first registering searching the site for the user with the smiling Japanese character since I knew you were the admin (I believe that was the question at the time). Mustn't have noticed, haha.
Andrew Lee wrote: I'm willing to gamble the spammers won't take the trouble to monitor a particular forum thread and track the ever-changing Q&A.
I had thought from your OP comments about the previous system that every six months spammers would go manually looking for answers, unless you mean those answers were general CAPTCHAs not specific to this site. Guess we'll see how the new idea works out; it seems pretty reasonable.
Re: Suggestions for anti-spambot Q&A
"What is the next number in the sequence 3,5,7?"
the current question is very difficult
you might be losing potential members
Re: Suggestions for anti-spambot Q&A
bitcoin wrote: What is the next number in the sequence 3,5,7?
For forum regulars, I should note that the question is a little harder than 3, 5, 7.
- Andrew Lee
- Posts: 3070
- Joined: Sat Feb 04, 2006 9:19 am
- Contact:
Re: Suggestions for anti-spambot Q&A
The previous implementation was to have a link at the top of the registration page that led to the correct answer of the day, as suggested by joby_toss. The answers were mostly random stuff to throw the bad guys off the track.
Anyway, with the upgrade to the latest version of phpBB, this is now irrelevant. The default anti-bot mechanism is now Google's reCAPTCHA algorithm. I guess we will find out in due course how effective this method really is!
I still have a nagging suspicion that reCAPTCHA will prove to be no match for the human bots in low-wage countries like India. We may yet have to revert to joby's solution to fight those guys!
- JohnTHaller
- Posts: 718
- Joined: Wed Feb 10, 2010 4:44 pm
- Location: New York, NY
- Contact:
Re: Suggestions for anti-spambot Q&A
The only things that seem to work today are distributed blacklists and content analysis. You won't keep spambots out now because they farm out CAPTCHAs to real humans. Keep a basic CAPTCHA like Google's reCAPTCHA for account signups just to keep out the lowest automated ones, but don't worry beyond that.
For distributed blacklists, use something like StopForumSpam.com. They have a module for phpBB that will block known spam IPs from registering or posting as a guest: http://stopforumspam.com/mods#link_phpbb
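To illustrate the blacklist idea in general terms, here is a minimal sketch of checking a registering IP against a locally cached list of known-bad addresses. It does not use or represent the StopForumSpam module or its API; the function names are hypothetical, the sample addresses come from reserved documentation ranges, and how the list gets downloaded and refreshed is left out.

```python
import ipaddress


def load_blocklist(entries: list) -> list:
    """Parse single IPs or CIDR ranges into network objects."""
    return [ipaddress.ip_network(e, strict=False) for e in entries]


def is_blocked(ip: str, blocklist: list) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocklist)


# Example list: one single address and one range (documentation-only IPs).
blocklist = load_blocklist(["203.0.113.5", "192.0.2.0/24"])
```

A registration or guest-post handler would call `is_blocked` with the client address and reject the request before any CAPTCHA is shown.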
For content analysis, we use Mollom on PortableApps.com right now. Unfortunately, it doesn't work that well, is only for Drupal/WordPress, and is being discontinued. I'm unsure if there is something similar for phpBB.