Completely Automated Public Turing test to tell Computers and Humans Apart: the CAPTCHA. The scrambled, distorted letters that one must decipher and enter into a form before posting a comment, joining a forum, or signing up for a mailing list. The test that usually takes me between two and five attempts to pass. A test I really do not like and never wanted to implement.
Apparently, the spammers don't care about my feelings, and it looks like designing a CAPTCHA is heading my way. First, there are a few requirements it needs to meet.
- No twisty letters
- No multi-colored letters
- No hard to read letters
- (hopefully) Easy to solve
- No hidden POST fields containing the answer
The solution for me was to implement a math CAPTCHA: a simple addition or subtraction problem that the visitor has to answer. The first step was to get two numbers, reasonably specific to the visitor, on which I could base a math problem. Although I originally wrote the CAPTCHA code in PHP, there is no reason it can't be implemented in just about any server-side language, so I am posting it as pseudocode. The code works in the following way:
//get the visitor's IP address
ip_address = VISITORS_IP
//append the numeric day of the year to the ip_address
appended_ip = ip_address + string(day_of_year)
//add some salt to the appended_ip
salt = "go away spammers"
salted_string = appended_ip+salt
//get the md5sum of the salted string
md5_string = md5sum( salted_string)
//get the first and last decimal digit (0-9) of the md5_string, skipping the hex letters a-f
first_int = get_first_int(md5_string)
last_int = get_last_int(md5_string)
//if the first int is greater than the last,
//this is a subtraction problem
if first_int > last_int
    problem = string(first_int) + " minus " + string(last_int)
    answer = first_int - last_int
//otherwise, this is an addition problem
else
    problem = string(first_int) + " plus " + string(last_int)
    answer = first_int + last_int
The code runs when a blog comment form is generated, and "problem" is displayed to the user. After the user submits the form, the code runs again and the user's submitted answer is checked against the CAPTCHA "answer".
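For anyone who wants something runnable, here is a rough Python sketch of the steps above. It is not my original PHP. The function names make_problem and check_answer, the example IP address, and the fallback for a digest that contains no decimal digits are all assumptions made for illustration.

```python
import hashlib
import re
from datetime import date

SALT = "go away spammers"  # the salt string from the post


def make_problem(ip_address, day_of_year):
    """Return (problem_text, answer) derived from the visitor IP and day of year."""
    salted_string = f"{ip_address}{day_of_year}{SALT}"
    md5_string = hashlib.md5(salted_string.encode("utf-8")).hexdigest()

    # The hex digest mixes digits 0-9 with letters a-f; keep only the digits.
    digits = re.findall(r"\d", md5_string)
    if digits:
        first_int, last_int = int(digits[0]), int(digits[-1])
    else:
        # Assumption: arbitrary fallback for the very unlikely all-letter digest.
        first_int, last_int = 1, 1

    if first_int > last_int:
        return f"{first_int} minus {last_int}", first_int - last_int
    return f"{first_int} plus {last_int}", first_int + last_int


def check_answer(ip_address, day_of_year, submitted):
    """Re-run the calculation and compare it with the submitted form value."""
    _, answer = make_problem(ip_address, day_of_year)
    try:
        return int(submitted.strip()) == answer
    except ValueError:
        return False


if __name__ == "__main__":
    today = date.today().timetuple().tm_yday
    problem, _ = make_problem("203.0.113.7", today)  # example IP, not a real visitor
    print("What is " + problem + "?")
```

Because the first number is only subtracted from when it is the larger one, the answer is never negative, which keeps the problem easy for a human to solve.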
There are two immediate problems with this CAPTCHA:
1. It would be really easy to scrape the initial form and parse the problem into a submittable answer.
2. If the form is created at 11:55 PM and the user submits the form after 12 AM, the problem answer will have changed.
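Both problems are easy to demonstrate. The Python sketch below is only an illustration: the function names, the sample HTML, and the dates are made up, and the regular expression assumes the problem appears in the page as plain "N plus M" or "N minus M" text.

```python
import re
from datetime import datetime, timedelta


def solve_scraped_problem(html):
    """Problem 1: pull 'N plus M' or 'N minus M' out of a page and solve it."""
    match = re.search(r"(\d+)\s+(plus|minus)\s+(\d+)", html)
    if match is None:
        return None
    left, operator, right = int(match.group(1)), match.group(2), int(match.group(3))
    return left + right if operator == "plus" else left - right


def day_of_year(moment):
    """Numeric day of the year, the value that gets appended to the IP address."""
    return moment.timetuple().tm_yday


def crosses_midnight(form_created, form_submitted):
    """Problem 2: True when the day (and so the hash input) changed in between."""
    return day_of_year(form_created) != day_of_year(form_submitted)


if __name__ == "__main__":
    # Hypothetical form markup; a bot needs nothing more than this.
    print(solve_scraped_problem("<label>What is 7 minus 3?</label>"))

    created = datetime(2024, 3, 1, 23, 55)
    submitted = created + timedelta(minutes=10)
    print(crosses_midnight(created, submitted))
```

A handful of lines is all a spammer needs for the first problem, and the second one will bite any visitor who happens to be writing a comment around midnight.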
Storing the problem and answer in a SESSION variable may be a better way to go. Actually, that sounds like a much better way to implement this CAPTCHA.
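A session-based version might look something like the Python sketch below. I have not built it, so treat it as a guess at the shape. The plain dict stands in for whatever session store the server provides (for example $_SESSION in PHP), the key name captcha_answer is made up, and picking random digits instead of hashing the IP address is my assumption, since the answer no longer has to be recomputed from anything.

```python
import secrets


def new_problem(session):
    """Create a problem, remember its answer in the session, return the text."""
    first_int = secrets.randbelow(10)  # random digit 0-9
    last_int = secrets.randbelow(10)
    if first_int > last_int:
        problem, answer = f"{first_int} minus {last_int}", first_int - last_int
    else:
        problem, answer = f"{first_int} plus {last_int}", first_int + last_int
    session["captcha_answer"] = answer  # hypothetical key name
    return problem


def check_answer(session, submitted):
    """Compare the submission with the stored answer; each answer works once."""
    expected = session.pop("captcha_answer", None)
    if expected is None:
        return False
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False


if __name__ == "__main__":
    session = {}  # stand-in for the server-side session store
    print("What is " + new_problem(session) + "?")
```

Since the answer lives on the server with the visitor's session, it no longer changes at midnight, and removing it after the first check means a stale answer can't be replayed. It does nothing about a bot that scrapes and parses the problem, though.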