Re: [squid-users] Re: can squid load data into cache faster than sending it out?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 12 May 2011 13:37:13 +1200

On 12/05/11 08:18, Dave Dykstra wrote:
> On Wed, May 11, 2011 at 09:05:08PM +1200, Amos Jeffries wrote:
>> On 11/05/11 04:34, Dave Dykstra wrote:
>>> On Sat, May 07, 2011 at 02:32:22PM +1200, Amos Jeffries wrote:
>>>> On 07/05/11 08:54, Dave Dykstra wrote:
>>>>> Ah, but as explained here
>>>>> http://www.squid-cache.org/mail-archive/squid-users/200903/0509.html
>>>>> this does risk using up a lot of memory because squid keeps all of the
>>>>> read-ahead data in memory. I don't see a reason why it couldn't instead
>>>>> write it all out to the disk cache as normal and then read it back from
>>>>> there as needed. Is there some way to do that currently? If not,
>>>>
>>>> Squid should be writing to the cache in parallel to the data
>>>> arrival, the only bit required in memory being the bit queued for
>>>> sending to the client. Which gets bigger, and bigger... up to the
>>>> read_ahead_gap limit.
>>>
>>> Amos,
>>>
>>> Yes, it makes sense that it's writing to the disk cache in parallel, but
>>> what I'm asking for is a way to get squid to keep reading from the
>>> origin server as fast as it can without reserving all that memory. I'm
>>> asking for an option to not block the reading from the origin server &
>>> writing to the cache when the read_ahead_gap is full, and instead read
>>> data back from the cache to write it out when the client is ready for
>>> more. Most likely the data will still be in the filesystem cache so it
>>> will be fast.
>>
>> That will have to be a configuration option. We had a LOT of
>> complaints when we accidentally made several 3.0 releases act that way.
>
> That's interesting. I'm curious about what people didn't like about it,
> do you remember details?
>

The bandwidth overflow mentioned below.

>
> ...
>>>>> perhaps I'll just submit a ticket as a feature request. I *think* that
>>>>> under normal circumstances in my application squid won't run out of
>>>>> memory, but I'll see after running it in production for a while.
>>>
>>> So far I haven't seen a problem but I can imagine ways that it could
>>> cause too much growth so I'm worried that one day it will.
>>
>> Yes, both approaches lead to problems. The trickle-feed approach
>> used now leads to resource holding on the Server. Not doing it leads
>> to bandwidth overload as Squid downloads N objects for N clients and
>> only has to send back one packet to each client.
>> So it's a choice of being partially vulnerable to "slow loris" style
>> attacks (timeouts etc prevent full vulnerability) or packet
>> amplification on a massive scale.
>
> Just to make sure I understand you, in both cases you're talking about
> attacks, not normal operation, right? And are you saying that it is
> easier to mitigate the trickle-feed attack than the packet-amplification
> attack, so trickle-feed is less bad? I'm not so worried about attacks
> as normal operation.
>

Both are real traffic types; the attack form is just artificially
induced to make it worse. Like ping-flooding in the '90s it happens
under normal operation, just not often. All it takes is a large number
of slow clients requesting non-identical URLs.

IIRC it was noticed worst by cellphone networks with very large numbers
of very slow GSM clients.
  A client connects and sends a request; Squid reads back N bytes from
the server and sends N-M to the client. Repeat until all FDs available
in Squid are consumed. During that time M bytes of packets are
overflowing the server link for each 2 FDs used. If the total of all
the M values is greater than the server link capacity...
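
To put rough numbers on that (these are purely illustrative figures,
not measurements from this thread): suppose each client/server FD pair
is allowed to read ahead by M = 64 KB before the client catches up.
With 10,000 such pairs active (20,000 FDs), Squid has pulled roughly
10,000 x 64 KB = ~640 MB from origin servers beyond what the clients
have actually consumed. If those clients are slow GSM handsets draining
only a few KB/s each, most of that 640 MB crosses the server-side link
in a short burst, which is how a large population of slow clients on
distinct URLs can saturate the upstream link without any single request
looking abnormal.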

Under the current design the worst case is the server running out of
FDs first and rejecting new connections, or TCP protections dropping
connections and Squid aborting the clients early. The overflow factor
is 32K or 64K, linear with the number of FDs, and can't happen
naturally where the client does read the data, just slowly.
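
For reference, the knob involved here is read_ahead_gap in squid.conf,
which bounds how far ahead of the client Squid will read on each server
connection. A minimal sketch (the 64 KB value is only illustrative, not
a recommendation for any particular workload):

  # Cap per-connection read-ahead buffering. Smaller values reduce
  # memory use and upstream overflow per client; larger values let
  # fast origin transfers complete sooner.
  read_ahead_gap 64 KB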

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.12
   Beta testers wanted for 3.2.0.7 and 3.1.12.1