RE: [squid-users] Squid getting too big? from BAARDA, Don on 2001-08-09 (squid-users)

From: BAARDA, Don <don.baarda@dont-contact.us>
Date: Fri, 10 Aug 2001 09:14:45 +0930

G'day,

> -----Original Message-----
> From: Adrian Chadd [mailto:adrian@squid-cache.org]
[...]
> On Thu, Aug 09, 2001, BAARDA, Don wrote:
> > played with the memory_pools option? The whole concept of an application
> > doing a better job of managing memory than the OS it runs on seems flawed
> > to me, so I turn memory_pools off.
[...]
> Right. It sounds crazy, but it's not.
> See, UNIX is designed to be generally good at everything.
> malloc() implementations are a good example - it would be rather
> challenging to implement a malloc library that would service
> all access types. A few very large blocks of RAM? What about
> a lot of small blocks of RAM?
>
> squid does the latter. Unfortunately, most malloc implementations
> don't deal with it very well, and some just leak RAM.
>
> the memory pools implementation in squid was designed to combat
> this. It keeps a "cache" of allocated RAM, knowing that it's going
> to be reused at some later date. Since the system malloc hasn't
> got any way of knowing (without statistical sampling) that
> the application is doing lots of fixed-sized allocations, it has
> to use a "generic" algorithm which is going to be comparatively
> slow.
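
For anyone following along, the kind of fixed-size pool described above can be
sketched roughly as follows. This is only a minimal illustration of the
free-list idea, not Squid's actual memory_pools code; the names pool_t,
pool_init, pool_alloc, pool_free and pool_drain are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object pool; NOT Squid's real implementation. */
typedef struct pool_node {
    struct pool_node *next;   /* link stored inside an idle object */
} pool_node;

typedef struct {
    size_t obj_size;          /* size of every object this pool hands out */
    pool_node *free_list;     /* released objects, kept for reuse */
    size_t idle;              /* number of objects on the free list */
} pool_t;

static void pool_init(pool_t *p, size_t obj_size)
{
    /* An idle object must be large enough to hold the free-list link. */
    p->obj_size = obj_size < sizeof(pool_node) ? sizeof(pool_node) : obj_size;
    p->free_list = NULL;
    p->idle = 0;
}

static void *pool_alloc(pool_t *p)
{
    if (p->free_list != NULL) {
        /* Fast path: reuse a previously released object, no malloc() call. */
        pool_node *n = p->free_list;
        p->free_list = n->next;
        p->idle--;
        return n;
    }
    /* Slow path: fall back to the system allocator. */
    return malloc(p->obj_size);
}

static void pool_free(pool_t *p, void *obj)
{
    assert(obj != NULL);
    /* Keep the memory on the free list instead of returning it. */
    pool_node *n = obj;
    n->next = p->free_list;
    p->free_list = n;
    p->idle++;
}

static void pool_drain(pool_t *p)
{
    /* Hand all idle objects back to the system allocator. */
    while (p->free_list != NULL) {
        pool_node *n = p->free_list;
        p->free_list = n->next;
        free(n);
    }
    p->idle = 0;
}
```

Note that anything parked on the free list stays charged to the process until
something like pool_drain() runs, so the speed of the fast path is paid for
with memory that nothing else on the system can use in the meantime.
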

Yes, but there is more to consider than just how well the memory allocator
works for Squid in isolation.

An OS knows the whole context of the system, and can take that into account.
An application that just consumes a larger and larger chunk of memory for
its own internal allocator can screw up the OS's attempts to juggle RAM,
swap, and disk cache between _all_ the processes running on the system. In
the end your highly tuned memory allocator can be bottlenecked by the OS's
generic virtual memory management.

Implementing your own memory manager duplicates functionality. This means
two chunks of code running in memory, two lots of source to maintain, and
two places to introduce bugs. I know that the memory allocation stuff is
small in the context of everything, but I'm reminded of the saying "take
care of the cents, and the dollars will take care of themselves". Code bloat
is rarely caused by duplication of massive components; usually it's
duplication of heaps of tiny pieces, which also makes it much harder to fix.

> What is being worked on right now is a block allocator to replace
> the memory pools implementation - this is an allocator tailored
> towards squid's memory access patterns. Hopefully it'll cut down
> on squid's memory footprint by quite a bit on large caches
> (as malloc alignment can waste a _lot_ of RAM with the small
> StoreEntry allocations)
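
The alignment point can be put into rough numbers. The sketch below compares
the per-object cost under a generic malloc, which is assumed here to prepend a
small header and round each request up to an alignment boundary, with the cost
of packing objects back to back in one large block. The header size, the
alignment and the function names are illustrative assumptions, not
measurements of any real malloc or of Squid's StoreEntry.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Bytes used per object by a generic malloc that, BY ASSUMPTION, adds a
 * header of `header` bytes and rounds the total up to `align`. */
static size_t malloc_cost(size_t obj, size_t header, size_t align)
{
    return round_up(obj + header, align);
}

/* Bytes used per object when objects are packed into one large block and
 * padded only to the object's own alignment requirement. */
static size_t block_cost(size_t obj, size_t obj_align)
{
    return round_up(obj, obj_align);
}

/* Total bytes saved across `count` objects by packing them in blocks. */
static size_t overhead_saved(size_t count, size_t obj, size_t header,
                             size_t align, size_t obj_align)
{
    return count * (malloc_cost(obj, header, align) - block_cost(obj, obj_align));
}
```

With a made-up 56-byte object, an 8-byte header and 16-byte alignment,
malloc_cost() gives 64 bytes against 56 for block_cost(), so a million such
objects would waste about 8 MB under these assumptions.
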

I'm always wary of any efforts to duplicate and tailor basic OS
functionality for a particular application. By just leveraging the OS, you
save yourself heaps of work, and you get free upgrades as the OS is
improved. If the OS's implementation is flawed, I figure it's better value
to fix the OS than re-invent it. I think the reason the NOVM version of
squid was so effective (at least on Linux) was because it tossed the whole
idea of managing its own memory cache and just relied on the OS's disk
caching to do the work for it.

However, I'm not a Squid developer. I am speaking generally, and I do know
that in specific cases, there can be significant benefits from re-inventing
a special wheel. I trust the squid developers know what they are doing :-)

ABO
Received on Thu Aug 09 2001 - 17:43:40 MDT
