
Captchas Don’t Work: How to Trick Spambots with a Smarter Honeypot

Is anyone else tired of being caught in the endless loop of captchas? It can be very hard to read the strange text, especially when it’s full of random letters that look like numbers and vice versa. I really hate it when the letters are so squashed together that you can’t read them. There has to be a better solution!

What is a captcha?

You’ve seen captchas in your internet travels. They are those tests to see if you are a human. Captchas can take the form of a simple math problem or swirly text.

How do you feel about those captcha tests? They make me feel like a site doesn’t trust me. They get between me and my goal, slow me down, and annoy me. I get a feeling that I’m not alone in this.

What is a spambot?

A spambot is a piece of software written with the specific purpose of filling out forms with information that benefits the spambot author. This usually takes the form of comments containing links that might help the author’s website with SEO (Search Engine Optimization). A clever author can extract the math problem from your form and calculate an answer. A very clever spambot author can also get around image captchas.

So, how do you stop spambots?

The most reliable way I know to stop spambots is the Akismet plugin. Unfortunately, this service now costs money, and I didn’t see any mention on their site of price cuts for non-profits. For a small personal blog or a small non-profit, this is cost-prohibitive.

It’s time for a creative programming solution! To stop a spambot you have to think like a programmer writing a spambot. The simplest of spambots see a form and fill in every field on the form. So, what’s the solution?

What is a honeypot?

A honeypot is a form field that users can’t see because CSS or JavaScript hides it. Honeypots are awesome because they don’t inconvenience users the way a captcha does, and they are a valid tool for thwarting spambots. Basically, a spambot fills in a field that valid users can’t see, alerting us to its activity. If the honeypot field is filled in, we can confidently reject the form as spam.
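Here is a minimal sketch of that server-side check in Python. The field name `website` and the function `is_spam` are made up for illustration; they are not taken from any particular plugin or framework.

```python
from typing import Mapping

# Hypothetical name for the hidden field; real users never see it.
HONEYPOT_FIELD = "website"


def is_spam(form_data: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field was filled in."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())


# A human leaves the hidden field empty; a naive bot fills in every field.
human = {"name": "Ada", "comment": "Nice post!", "website": ""}
bot = {"name": "x", "comment": "buy now", "website": "http://spam.example"}
```

With these two sample submissions, `is_spam(human)` is false and `is_spam(bot)` is true, so only the bot's comment would be rejected.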

After the honeypot was invented, the spambot authors got a little smarter. They added some code to detect these hidden fields. If the name of the field is always the same, then the field is really simple to detect.

Tricking Spambots with a Smarter Honeypot

It’s time to step up our game, programmers! Here’s a combination of spam thwarting techniques that makes a great spambot-proof form:

  1. Create a honeypot with the same name as one of the default fields. Make it look legit with a label. If you are using Bootstrap, make it look perfectly legit with a label and an icon. We don’t want to alert the bot in any way that this field is special.
  2. Place the honeypot in the form in a random location. Keep moving it around between the valid fields. We don’t want the spambot writer to simply ignore the same field based on index.
  3. Rename your default fields to something random. Keep in mind that you have to convert each one back to its proper name on the server side. Once the default fields have random names, the valid fields begin to look like honeypots to the spambot.
  4. Add an expiration to your form. This will keep spambots from using the same fields and submitting the form later.
  5. Hide your honeypot. You have to hide the honeypot to keep valid users from filling it out. In my form, I hide the honeypot with JavaScript. It is still valid to hide this field with CSS. If you use CSS, your best bet is to use a class name that contains a random word. If you call the class “hide”, the spambot author will pick it out easily.
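The first four steps can be sketched on the server side roughly as follows. This is one possible approach in Python, not the code from my plugin: every name here (`SECRET_KEY`, `REAL_FIELDS`, `build_form`, `parse_submission`) is hypothetical, and deriving the random-looking field names from an HMAC over a timestamp and a nonce is an assumption made so the server can map them back without storing anything. Hiding the decoy (step 5) happens in the browser and is left out.

```python
import hashlib
import hmac
import random
import secrets
import time

SECRET_KEY = b"change-me"                    # assumption: a server-side secret
REAL_FIELDS = ["author", "email", "comment"]  # the form's real fields
HONEYPOT_NAME = "email"                      # decoy borrows a default name
MAX_AGE_SECONDS = 3600                       # forms expire after an hour


def _mac(message: str) -> str:
    return hmac.new(SECRET_KEY, message.encode(), hashlib.sha256).hexdigest()


def _alias(real_name: str, seed: str) -> str:
    # Random-looking name that only the server can recompute from the seed.
    return "f" + _mac(f"alias:{seed}:{real_name}")[:10]


def build_form(now=None):
    """Return (field names in render order, signed token for a hidden input)."""
    issued_at = int(time.time() if now is None else now)
    seed = f"{issued_at}:{secrets.token_hex(8)}"
    names = [_alias(real, seed) for real in REAL_FIELDS]
    # Drop the decoy at a random position among the real fields.
    names.insert(random.randint(0, len(names)), HONEYPOT_NAME)
    return names, f"{seed}:{_mac('token:' + seed)}"


def parse_submission(posted, token, now=None):
    """Return data keyed by real field names, or None if it looks like spam."""
    now = time.time() if now is None else now
    seed, _, signature = token.rpartition(":")
    expected = _mac("token:" + seed)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None  # forged or corrupted token
    issued_at = int(seed.split(":")[0])
    if now - issued_at > MAX_AGE_SECONDS:
        return None  # the form expired
    if posted.get(HONEYPOT_NAME, "").strip():
        return None  # the decoy was filled in
    # Convert the random names back to their proper names.
    return {real: posted.get(_alias(real, seed), "") for real in REAL_FIELDS}
```

A page would render one input per name returned by `build_form`, plus a hidden input carrying the token. On submit, `parse_submission` gives back the data under the real field names, or `None` when the token is invalid, the form has expired, or the decoy was filled in.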

Testing Honeypot Theories

I wrote a WordPress plugin that uses these techniques to test all the above concepts. The result? Spambots fill out the honeypot every time! This is great because the comments are now automatically marked as spam, which saves me from having to click the spam button every day.

I think the first point, giving the honeypot the same name as a default field, is key because WordPress spambots are definitely going to be looking for the four common fields that are on every WordPress comment form. This is a common footprint used by comment spammers and spambots. I guess the good news is that you know your SEO is starting to work when the spambots find the forms on your site.

Do you have any additional techniques or ideas on how to get around the above techniques? How would you use a honeypot to thwart spambots? Let me know in the comments.

Update (February 05, 2014): This smarter honeypot is available for Django. Ben Timby originally authored it, and I recently had to make a quick code change. Check out django-secureform on GitHub.

Update (March 06, 2014): Let’s test another theory. Will the public release of the WordPress plugin reduce its effectiveness? Check out wp-smart-honeypot on GitHub.

Update (September 29, 2014): There is now a fork of wp-smart-honeypot called tarpit: https://github.com/cferdinandi/tarpit

Image Credit: http://www.jongales.com/
