On Fri, 2008-05-23 at 15:47 +0200, Cezary Rzewuski wrote:
> I've looked into squid's code and I've got an idea how to do this. The
> best place to scan downloaded site seems to be storeSwapOutFileClosed
> function in store_swapout.cc file. After closing file clamav could scan
> this file and log if the site is malicious.
There is a simpler way. Use Squid-3 and the c-icap project. This lets
you plug ClamAV into Squid quite seamlessly, and it works for all
content, even content that is not cached.
And if your crawler offloads SSL to the proxy, it will happily scan
even https content for you. (Such offloading is done by sending
https:// URLs to the proxy in plain HTTP requests, without wrapping
them in SSL.)
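To illustrate the offloading described above, here is a minimal Python sketch of a crawler that sends an https:// URL to the proxy over a plain, unencrypted connection. The helper names (build_proxy_request, fetch_via_proxy) and the proxy address 127.0.0.1:3128 are assumptions made for the example, and the proxy must be configured to accept such requests.

```python
import http.client
from urllib.parse import urlsplit


def build_proxy_request(url: str) -> bytes:
    """Return the raw request a client would send to the proxy for `url`.

    The full URL, including the https:// scheme, goes in the request line.
    The client performs no SSL handshake itself.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"unsupported URL: {url!r}")
    lines = [
        f"GET {url} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_via_proxy(url, proxy_host="127.0.0.1", proxy_port=3128, timeout=30):
    """Fetch `url` through the proxy over plain HTTP (assumed proxy address)."""
    # HTTPConnection (not HTTPSConnection): the hop to the proxy is unencrypted.
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # Passing the absolute URL puts it in the request line unchanged.
        conn.request("GET", url)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()
```

Calling fetch_via_proxy("https://example.com/") would then return the status code and body as delivered by the proxy. Note that the hop between crawler and proxy is unencrypted, so this only makes sense when both sit on a trusted network.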
Squid-3:
http://www.squid-cache.org/Versions/v3/3.0/
C-ICAP:
http://c-icap.sourceforge.net/
Install instructions including Squid-3 configuration details:
http://c-icap.sourceforge.net/install.html
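Once c-icap is installed, one way to check that it answers before pointing Squid at it is to send an ICAP OPTIONS request. The Python sketch below is only an illustration: the function names are made up for the example, port 1344 is the standard ICAP port, and the service name srv_clamav is an assumption that should be replaced with whatever name your c-icap configuration uses.

```python
import socket


def build_icap_options(host: str, port: int = 1344,
                       service: str = "srv_clamav") -> bytes:
    """Build a minimal ICAP OPTIONS request (service name is an assumption)."""
    return (
        f"OPTIONS icap://{host}:{port}/{service} ICAP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def parse_icap_status(status_line: bytes) -> int:
    """Extract the numeric code from a line such as b'ICAP/1.0 200 OK'."""
    fields = status_line.decode("ascii", errors="replace").split()
    if len(fields) < 2 or not fields[0].startswith("ICAP/"):
        raise ValueError(f"not an ICAP status line: {status_line!r}")
    return int(fields[1])


def probe_icap(host: str = "127.0.0.1", port: int = 1344,
               service: str = "srv_clamav", timeout: float = 5.0) -> int:
    """Send OPTIONS to the ICAP server and return its status code."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_icap_options(host, port, service))
        # Read until the end of the first line; that is all we need here.
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_icap_status(data.split(b"\r\n", 1)[0])
```

A status code of 200 from probe_icap() would suggest the service is reachable; any other code or a connection error means the c-icap setup needs a closer look first.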
Regards
Henrik