Block Spam Bots with Lighttpd
January 11th, 2006
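To block spam bots from harvesting email addresses displayed on your website, drop the following rules into your Lighttpd configuration file. Each rule matches the start of a client's User-Agent header and denies that client every URL. The url.access-deny option is provided by mod_access, so that module must be loaded.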
$HTTP["useragent"] =~ "^AdultGods" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^AIRF" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^BlackWidow" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^CherryPicker" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^ChinaClaw" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Custo" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^DISCo" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Download Demon" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^eCatch" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^EirGrabber" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^EmailCollector" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^EmailSiphon" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^EmailWolf" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Express WebPictures" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^ExtractorPro" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^EyeNetIE" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^FlashGet" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^GetRight" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^GetWeb!" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Go!Zilla" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Go-Ahead-Got-It" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^GrabNet" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Grafula" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^HMView" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^HTTrack" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Image Stripper" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Image Sucker" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Indy Library" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^InterGET" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Internet Ninja" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^iRider" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^JetCar" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^JOC Web Spider" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^KITV4\.7 Wanadoo" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^larbin" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^LeechFTP" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Mass Downloader" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Microsoft URL Control" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^MIDown tool" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Miragorobot" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Mister PiX" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Navroad" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^NearSite" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^NetAnts" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^NetSpider" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Net Vampire" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^NetZIP" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^N_o_k_i_a" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^NICErsPRO" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Octopus" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Offline Explorer" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Offline Navigator" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^PageGrabber" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Papa Foto" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^pavuk" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^pcBrowser" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Popdexter" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^RealDownload" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^ReGet" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SAFEXPLORER TL" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SiteSnagger" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SmartDownload" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SuperBot" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SuperHTTP" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Surfbot" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^SV1" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^tAkeOut" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Teleport Pro" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^VoidEYE" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^W3CRobot" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Web Image Collector" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Web Sucker" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebAuto" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebCopier" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebEMailExtrac" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebFetch" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebGo" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebLeacher" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebReaper" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebSauger" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Website eXtractor" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Website Quester" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebStripper" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WebWhacker" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Widow" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^WWWOFFLE" { url.access-deny = ( "" ) }
$HTTP["useragent"] =~ "^Xaldon WebSpider" { url.access-deny = ( "" ) }
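As a possible tidier alternative, the per-bot rules above could be collapsed into a single condition using a regex alternation. The following is a minimal sketch rather than a tested configuration: it includes only a handful of the names from the list, and it assumes mod_access is not already present in your server.modules (drop that line if it is).

```
# Assumption: mod_access is not yet loaded; it provides url.access-deny.
server.modules += ( "mod_access" )

# One condition covering several bots. The names are a subset of the list
# above; extend the alternation with the remaining ones as needed.
$HTTP["useragent"] =~ "^(EmailCollector|EmailSiphon|EmailWolf|ExtractorPro|WebEMailExtrac)" {
    # The empty suffix matches every URL, so matching clients are denied everything.
    url.access-deny = ( "" )
}
```

A matching client should receive a 403 Forbidden response for any request. Because the pattern is anchored with ^, a bot that places its name anywhere other than the start of its User-Agent header is not caught.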
2 Responses to “Block Spam Bots with Lighttpd”
January 11th, 2006 at 04:35 PM Note: I have not extensively tested this code, so use it at your own risk. Also, please be aware that this does NOT, I repeat NOT, immunize you from spam harvesters. If you post your email address on your website, there are myriad ways to harvest it. This method blocks only those spiders that identify themselves with one of the User-Agent strings on the list. The only surefire way to safeguard your email address is to *not post it online* at all.
January 14th, 2006 at 09:03 AM http://www.comicstripgenerator.com has a fun way to mask your email address from evil spam-harvesting bots.