I've been toying with implementing an external_acl module
that checks URLs against phishtank.org's database. The problem is
comparing URLs so that minor, semantically equivalent variations of a
malware URL (/./, host capitalisation, user:pass@, %-escapes, etc.)
still match the listed entry.
I then stumbled across the Google Safe Browsing API, which has
a section on URL canonicalization that covers pretty much
all the cases I was thinking about:
http://code.google.com/apis/safebrowsing/developers_guide.html
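To make the variations listed above concrete, here is a rough Python sketch of that kind of normalisation. It is an illustration only, not the Safe Browsing algorithm and not Squid code: the function name canonicalize() is hypothetical, the set of characters that gets re-escaped is whatever urllib.parse.quote chooses (which is consistent, but not necessarily identical to Google's rules), and IP-address host forms are not handled at all.

```python
from urllib.parse import urlsplit, unquote, quote

# Default ports that can be dropped without changing the URL's meaning.
DEFAULT_PORTS = {"http": 80, "https": 443}

def canonicalize(url):
    """Reduce a URL to one comparable form (hypothetical helper, sketch only)."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url          # assumption: bare hosts mean http
    parts = urlsplit(url)              # urlsplit lowercases the scheme

    # .hostname drops user:pass@ and the port, and lowercases the host.
    host = (parts.hostname or "").strip(".")
    port = parts.port
    port_part = "" if port in (None, DEFAULT_PORTS.get(parts.scheme)) else ":%d" % port

    # Undo %-escapes repeatedly until nothing changes (handles %2541 -> %41 -> A).
    path = parts.path or "/"
    while True:
        decoded = unquote(path)
        if decoded == path:
            break
        path = decoded

    # Resolve "/./" and "/../", and collapse runs of slashes.
    trailing = path.rsplit("/", 1)[-1] in ("", ".", "..")
    segments = []
    for seg in path.split("/"):
        if seg in ("", "."):
            continue
        if seg == "..":
            if segments:
                segments.pop()
            continue
        segments.append(seg)
    path = "/" + "/".join(segments)
    if trailing and segments:
        path += "/"

    # Re-escape in one consistent way; the query is kept, the fragment dropped.
    path = quote(path, safe="/")
    query = "?" + parts.query if parts.query else ""
    return "%s://%s%s%s%s" % (parts.scheme, host, port_part, path, query)
```

An external_acl helper could then run both the requested URL and each database entry through the same function and compare the results as plain strings (or hashes of them), so the comparison itself stays trivial.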
Comments?
Adrian
Received on Fri Jun 22 2007 - 03:13:37 MDT