Re: [squid-users] what are the Pros and cons filtering urls using squid.conf?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 11 Jun 2013 15:26:38 +1200

On 11/06/2013 9:03 a.m., Jose-Marcio Martins wrote:
>
> Welllll... sorry for the top post...
>
> If the "filter" is an external handler process... it should be able to
> do the whole job of updating its database, in memory or file based,
> without bothering Squid, and with no (or at most a minimal)
> interruption of service.
>
> There are some solutions out there...
>
> Or maybe I didn't understand what you're talking about.
>

When Squid reloads it pauses *everything* it is doing while the reload
is happening.
* 100% of resources get dedicated to the reload, ensuring the fastest
possible recovery; then everything resumes exactly as before.

When a Squid helper reloads it pauses *just* the transactions which
depend on it; other transactions continue processing.
* Some small % of resources get dedicated to the reload.
* each helper instance of that type must do its own reload, multiplying
the work performed during reload by M times.
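
As a rough illustration of a helper that refreshes its own data without
involving Squid, here is a minimal Python sketch. It assumes the helper
is wired in as an external ACL helper without concurrency (one lookup key
per input line, one `OK` or `ERR` per output line, where `OK` means the
ACL matched) and that the blocklist is a plain-text file with one domain
per line. The names `ReloadingBlocklist` and `serve`, and the file layout,
are hypothetical and not part of Squid.

```python
import os


class ReloadingBlocklist:
    """Set of blocked domains that re-reads its file when the file changes."""

    def __init__(self, path):
        self.path = path
        self.stamp = None      # (mtime_ns, size) of the version currently loaded
        self.domains = set()
        self.reloads = 0       # how many times the file has been (re)loaded

    def _refresh(self):
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp == self.stamp:
            return             # file unchanged, keep the in-memory set
        with open(self.path, encoding="utf-8") as fh:
            self.domains = {line.strip().lower() for line in fh if line.strip()}
        self.stamp = stamp
        self.reloads += 1

    def blocked(self, domain):
        self._refresh()
        return domain.strip().lower() in self.domains


def serve(blocklist, requests, replies):
    """Answer one lookup per input line: OK if the ACL matches, ERR otherwise."""
    for line in requests:
        key = line.strip()
        replies.write("OK\n" if key and blocklist.blocked(key) else "ERR\n")
        replies.flush()        # Squid waits for each answer, so never buffer
```

A real helper would call something like `serve(ReloadingBlocklist(path),
sys.stdin, sys.stdout)` with its own blocklist path. Only lookups waiting
on this process are delayed while the file is re-read, and every helper
process holds its own copy, which is where the "M times" cost comes from.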

When ICAP reloads it has the option of signalling Squid to send no more
transactions and completing the existing ones first, or spawning a new
service instance with the new config and then swapping over seamlessly.
* the resources of some other server are usually being applied to the
problem - leaving Squid to run happily
* Squid can failover to other instances of that ICAP service for
handling new transactions.
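
The "build a new instance, then swap over" idea can be sketched in a few
lines of Python. This is only a toy model of the general pattern, not
ICAP itself nor the code of any ICAP server; `SwappableService`,
`classify` and the config layout are hypothetical names chosen for the
illustration.

```python
import threading


class SwappableService:
    """Holds the active config; new transactions see a swap, old ones do not."""

    def __init__(self, config):
        self._lock = threading.Lock()
        self._active = config

    def begin(self):
        # A transaction takes a reference to whatever config is active now
        # and keeps using that reference until it completes.
        with self._lock:
            return self._active

    def swap(self, new_config):
        # new_config must be fully built *before* this call; the swap itself
        # is only a reference change, so nothing has to pause.
        with self._lock:
            old, self._active = self._active, new_config
        return old


def classify(config, url):
    """Toy filtering decision: block if any listed substring is in the URL."""
    return "block" if any(s in url for s in config["blocked"]) else "allow"
```

A transaction that called `begin()` before the swap finishes under the
old rules, while one that starts afterwards gets the new rules, so the
expensive part of the reload happens off to the side.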

No matter how you slice it, Squid will eventually need reconfiguring for
something and we come back to Squid needing to accept new configuration
without pausing at all.
There is the "HotConf" project
(http://wiki.squid-cache.org/Features/HotConf) which 3.x releases are
being prepared for through the code cleanup we are doing in the
background on each successive release. There is also the CacheMgr.JS
project Kinkie and I have underway to polish up the manager API, which
will eventually result in some configuration options being configurable
via the web API.

Amos

> On 06/10/2013 05:43 PM, Squidblacklist wrote:
>> On Mon, 10 Jun 2013 12:16:40 -0300
>> Marcus Kool wrote:
>>
>>> [discussion about proposal 1 deleted]
>>>
>>>>> About solution 2:
>>>>> Consider the following scenario:
>>>>> Suppose the parent proxy configuration must be reloaded.
>>>>> What mechanism will be used to signal the child proxy to ignore
>>>>> the parent?
>>>>
>>>> Squid does this on its own. That's what I have been trying to tell
>>>> you. The child proxy knows to bypass the parent when it is
>>>> unavailable (i.e. during reload or restart).
>>>
>>> The child knows how to deal with a non-responsive parent. Correct.
>>> But in the process of recovering from a parent that suddenly does not
>>> respond any more, CONNECT tunnels break, and HTTP object retrievals
>>> and uploads in progress break. The client has no way of redoing or
>>> repairing this.
>>>
>>>>> - reload its configuration? No, reconfiguration of the client
>>>>> stops all traffic.
>>>>
>>>> Not if you're directing your traffic to a child proxy, and reloading
>>>> on the parent proxy.
>>>
>>> The question was: how is the child signalled that the parent is
>>> reconfiguring, with the intent to stop using the parent neatly and to
>>> ensure that HTTP traffic in progress is processed without
>>> interruption of service. The option to reload the configuration of
>>> the client proxy does not work, since reconfiguration of a squid
>>> proxy causes interruption of service. Especially when all traffic is
>>> redirected to the client proxy.
>>>
>>>>> - simply let the connection to the parent fail? this will lead to
>>>>> timeouts and everything in progress fails.
>>>>
>>>> Nothing fails in this configuration.
>>>
>>> Have you tested this? In a live situation where applications use
>>> CONNECT tunnels, HTTP POST with a large body, chat applications which
>>> use a protocol where an HTTP GET may get a very late answer? And
>>> what about applications that rely on persistent HTTP connections?
>>
>> Yes this would seem to be a problem. I just confirmed.
>>
>>>
>>>>> - use more than 1 parent? can be done but is not cost effective
>>>>> since one needs an extra Squid server and still everything in
>>>>> progress fails. If I am missing something, please explain how the
>>>>> child ignores the parent without interruption of service.
>>>>
>>>> There is no added cost, you can run multiple instances of squid on
>>>> the same machine, by using a different conf and cache dirs for each
>>>> instance.
>>>
>>> Squid is used in many institutions with a large configuration: large
>>> memory and large caches. It is not obvious that institutions which
>>> sized their environment for a particular task can run two Squid
>>> proxies (parent and child) on the same hardware.
>>>
>>
>> Well, I would argue that if you in fact set up a child and parent
>> proxy, resource requirements would be minimal for the child, as it
>> likely wouldn't require any filtering or much in the way of resources.
>>
>>> Marcus
>>>
>>> PS: what is your name? Is it Ben or Fix?
>>>
>>
>> My name is Ben. If calling me Fix seems
>> silly, just use Ben.
>>
>> My purpose in interjecting into your thread was not to disrupt or
>> dissuade discussion about improvements to squid proxy, I merely was
>> explaining the workarounds I see. And yes, it is not
>> perfect. At every price point there is an appropriate solution. And
>> while I admit this economical solution may work for those who have no
>> alternative, it might not be acceptable for some.
>>
>> In conclusion, I too would like to see a
>> true fix for squid that allows a reload without interrupting traffic
>> and without any sort of "workaround".
>>
>> Also, URLfilterDB looks like an excellent product.
>>
>> -
>> Signed,
>>
>> Fix Nichols
>>
>> http://www.squidblacklist.org
>>
>
>
Received on Tue Jun 11 2013 - 03:26:53 MDT
