Tease rating and web crawlers
Posted: Sat Sep 27, 2008 2:21 pm
We were just discussing how some teases get rated within a few minutes of being posted. This seems to happen quite often. It just hit me that this could be web crawlers doing the voting. Bots from Google, Yahoo, MSN/Live.com and such will follow every link they find, trying to index all pages (we can see them quite often in the forum's list of online users). And since each rating is just a regular link, a bot that follows it casts a vote. It just might be that some bots opt for the last link (5) first, while others opt for the first one (1).
This could be fixed a few different ways.
- Change the voting into a form post instead of regular links. Web crawlers don't post forms as they peek around.
- Add rel="nofollow" to the <a> tag for the rating links. This should stop most bots (if not all) from following the links.
- Add vote.php to robots.txt (this site doesn't appear to have one at all). As far as I know, all the major web crawlers honor the robots.txt file, but there may be some smaller ones that don't.
The first and last are probably the best (as in most effective) solutions, but the middle one might be easiest to implement.
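For the robots.txt option, here is a rough sketch of what the file could contain, plus a quick way to check that the rule does what we want. It uses Python's standard urllib.robotparser. I'm assuming vote.php sits in the site root, and the example.com address and the id/vote query parameters are made up for illustration, so adjust them to match the real links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /vote.php path is an assumption
# about where the voting script lives on the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /vote.php
"""


def can_crawl(url, agent="Googlebot"):
    """Return True if a crawler that honors ROBOTS_TXT may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


# Made-up example URLs; the id and vote parameters are placeholders.
VOTE_URL = "http://example.com/vote.php?id=123&vote=5"
PAGE_URL = "http://example.com/index.php"
```

With this, can_crawl(VOTE_URL) should come back False and can_crawl(PAGE_URL) should come back True, meaning well-behaved bots would skip the voting links but still index everything else. It only helps against crawlers that actually read robots.txt, of course.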
If you have any questions about this, like how robots.txt works, feel free to shoot me a PM or ask here. I'll be happy to help!