squid modification for antivirus use

From: Cezary Rzewuski <crzewusk_at_nask.pl>
Date: Fri, 23 May 2008 15:47:37 +0200 (CEST)

Hello,
I've got a question concerning a slight modification of squid
(for the purposes of another GNU project). I don't think I could get an answer
anywhere else, so I hope my message won't be filtered by the moderator :)

We want to use squid as a proxy for a crawler that visits sites
suspected to be malicious, and we'd like the proxy to check the downloaded
content with the ClamAV engine. There is an existing solution that combines
squid and ClamAV, but there the ClamAV scanner works as a redirector, which
means that each site is downloaded twice. That approach is unacceptable for
our project, because some sites behave differently depending on whether or
not it is the first connection from a given IP address.

I've looked into squid's code and I have an idea of how to do this. The
best place to scan a downloaded site seems to be the storeSwapOutFileClosed
function in store_swapout.cc. After the file is closed, ClamAV could scan
it and log whether the site is malicious. The only glitch is that not all
sites are cached; however, that doesn't seem difficult to solve (it just
needs tracking of the places where the decision is made whether a site is
cachable). The second thing is that I never want squid to return a cached
site: even if a site has already been cached, it should be downloaded
again. I haven't investigated this yet, but I don't think it would be very
complicated to change either.

So, my question is: is the modification of squid I've described a good
idea, and will it lead to the desired results? Perhaps you have other
suggestions? Thank you for any help.

Kind regards,
Cezary Rzewuski
Received on Fri May 23 2008 - 23:30:44 MDT

This archive was generated by hypermail 2.2.0 : Tue Aug 05 2008 - 01:06:35 MDT