Benjamin Juang (ibneko) wrote,

Anyone know how those e-mail collecting bots work?

, , , ,

Playing around, and wondering if the above would fool them?

Source be:
<table><tr><td>ben</td><font color=white>, </font><td>.juang</td><font color=white>, </font><td>@</td><font color=white>, </font><td>comcast</td><font color=white>, </font><td>.net</td></tr></table>

Granted, it leaves a row of little commas above the e-mail, but I'm trying to reproduce the way LiveJournal did their e-mail thing on the userinfo page. Will look into it a bit later....

Maybe the same thing, with mmm... this:
<table><tr><td>ben</td><font color=white>E-</font><td>.juang</td><font color=white>ma</font><td>@</td><font color=white>il</font><td>comcast</td><font color=white>:</font><td>.net</td></tr></table>

resulting in:

[ edit ]
Granted, the color isn't necessary for the second modification... I just didn't remove it.
Can I remove the spacing in between the table cells? I ought to be able to.....

And the benefit of this is that if you just copy and paste, the spaces go away, even though it looks as if there are spaces. Whee~

[ edit 2 ]
Hm, yeah. cellspacing=0 and cellpadding=0 don't seem to do much. Grr..? My assumption is that those e-mail collecting bots ("spiders" that harvest e-mail addresses to spam) work by loading a page, finding @ and .net/.com/.org things, and matching them together? Limited to... 30 characters or so? I forget the e-mail address specifications. And they'd get past protection (such as separating the pieces with random HTML tags) by stripping the page of HTML tags? I suppose one could use HTML::Parser, although I haven't looked into how that works, and I dunno if one could use it to find e-mails anyway...? You know, some sort of library that reduces the page to roughly what a human might see?
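
To make that guess concrete, here's a rough sketch in Python (not Perl, and not HTML::Parser) of the kind of bot I'm imagining: strip the tags, then regex for anything address-shaped. This is only my assumption about how a harvester might work, not how any real one does. The function names (strip_tags, find_emails, harvest) and the simplified pattern are made up for illustration, and the pattern is nowhere near the real address spec.

```python
import re
from html.parser import HTMLParser

# Simplified, made-up pattern: something@something.tld (not the full address spec).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class TextOnly(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Return the page text with every tag removed and nothing put in its place."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()  # flush any text still buffered
    return "".join(parser.chunks)


def find_emails(text):
    """Return every substring that looks like an address to the simple pattern."""
    return EMAIL_RE.findall(text)


def harvest(html):
    """The hypothetical bot: strip tags first, then pattern-match."""
    return find_emails(strip_tags(html))


# Address split across table cells only (tags are the only "protection").
PLAIN = ("<table><tr><td>ben</td><td>.juang</td><td>@</td>"
         "<td>comcast</td><td>.net</td></tr></table>")

# The first version from this post, with white-on-white ", " between the cells.
OBFUSCATED = ("<table><tr><td>ben</td><font color=white>, </font><td>.juang</td>"
              "<font color=white>, </font><td>@</td><font color=white>, </font>"
              "<td>comcast</td><font color=white>, </font><td>.net</td></tr></table>")

if __name__ == "__main__":
    print(harvest(PLAIN))
    print(harvest(OBFUSCATED))
```

If a bot really works like this, splitting the address across tags alone does nothing, since the pieces glue right back together once the tags are gone. The hidden ", " bits do break this particular pattern, because they survive as text between the pieces. A smarter bot could obviously strip those too.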
