RE: [squid-users] Squid to cache a DB? from sean.upton@dont-contact.us on 2001-08-17 (squid-users)

From: <sean.upton@dont-contact.us>
Date: Fri, 17 Aug 2001 12:05:43 -0700

Robert Collins wrote:
>The _only_ content worth accelerating is static
>content. Dynamic content - changing content - will never have the same
>hit ratio in a http-accelerator, and thus does not make as effective use
>of the acceleration. There is a class of semi-dynamic data that is also
>worth accelerating, but that is a different discussion.

I'm not sure if I completely agree with this... I get the feeling that for
purely static content, unless you can benefit from a cache hierarchy, an
accelerated http server like TUX w/ Zero Copy kernel patches is going to
serve those static files quicker (or for that matter a farm of nodes like
this). Or for static text/html, Apache with mod_gzip. There must be a
reason some of us running caching accelerators are doing just that, given
all the other options available out there: that reason is "predictable"
dynamic content, of which much is, in fact, cacheable. Perhaps this is
what you mean by 'semi-dynamic' data? IMHO, using Squid as an accelerator
provides the best balance for accelerating the widest range of content for
many applications, including static and dynamic content.

My company, for example, uses app servers that dynamically publish content,
which generally is the same for all users who browse or search the site.
Everything, for example, in one of our newest applications is
cache-friendly: search results and browsing are all dynamic, CPU-intensive
database driven events, and we use GET requests for everything, which means
nearly everything is cacheable.

The difficulty, of course, is that a certain _class_ of dynamic data is not
cacheable: anything heavily personalized. Some of this limitation can be
overcome, though. Small amounts of personalization can (in a limited sense)
be done on the client side with JavaScript and cookies. For example, in e-commerce,
someone's shopping cart view page is NOT cached, and it sets a cookie for
the number of items in the cart every time it is refreshed. Other 'catalog
viewing' pages (e.g. looking at an entry for a book on Amazon) on the site
can be cached, but a message at the top of the page saying 'you have 7 items
in your cart' could be done from the client side (via scripting) from a
cached page because of a previously set cookie... I guess what I am saying
is that caching requires app design considerations in dynamic content, but
that this is a very appropriate use-case for a proxy cache as an http
accelerator.
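
To make the cart example a bit more concrete, here is a rough sketch in Python of the split I have in mind. It is not code from our site: the cookie name "cart_items", the function names, and the 300-second lifetime are all made up for illustration. The cart view is marked uncacheable and refreshes the cookie; catalog pages are marked cacheable and carry no per-user data. In a real page the cookie would be read by JavaScript in the browser; the last function only mirrors that logic so it can be run here.

```python
from http.cookies import SimpleCookie
from typing import List, Tuple

COOKIE_NAME = "cart_items"  # hypothetical cookie name, for illustration only


def cart_page_headers(item_count: int) -> List[Tuple[str, str]]:
    """Headers for the personalized cart view: not cacheable, refreshes the cookie."""
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = str(item_count)
    cookie[COOKIE_NAME]["path"] = "/"
    return [
        ("Cache-Control", "private, no-cache"),
        ("Set-Cookie", cookie[COOKIE_NAME].OutputString()),
    ]


def catalog_page_headers(max_age: int = 300) -> List[Tuple[str, str]]:
    """Headers for a shared catalog page: cacheable, no per-user content."""
    # max_age is an assumed lifetime; pick whatever staleness the site tolerates.
    return [("Cache-Control", "public, max-age=%d" % max_age)]


def cart_count_from_cookie(cookie_header: str) -> int:
    """Mirror of the client-side step: read the item count from the Cookie header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value.isdigit():
        return int(morsel.value)
    return 0  # no cookie yet: show an empty cart message
```

The point of the design is that the cached catalog page is byte-for-byte the same for everyone; only the cookie differs per user, and it never has to pass through the app server to be displayed.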

And the HIT ratios are good: we have an online newspaper classified ad
search system that searches about 18-20k ads at any given time... that setup
behind Squid as an accelerator averages about an 88% HIT ratio (including,
of course, images); I would estimate that at least 80% of the most popular
'entry' and 'browse' page views are cached, and 30% of search result lists
are cached. I don't think this is too bad (especially since most page views
in our application are search/browse result lists involving catalog queries
/ BTree traversals in an object database), because the ones that do get HITs
are the most demanded by our users: the most popular content will also be
the fastest.
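
As a back-of-the-envelope illustration of what that ratio means for the app server, here is a tiny Python sketch. Only the 88% figure comes from our setup; the request count in the usage note is an arbitrary round number and the function name is hypothetical.

```python
def origin_requests(total_requests: int, hit_ratio: float) -> int:
    """Requests that still reach the app server after the accelerator takes its HITs."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return round(total_requests * (1.0 - hit_ratio))
```

With an 88% HIT ratio, origin_requests(100000, 0.88) leaves roughly 12000 requests for the back end. That is an overall figure, images included, so the share of expensive search pages that get through is higher than 12%.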

One might say that caching like this should be done within the app server
you are using. Sure, but why not cache at the proxy too? The app server we
use (Zope) has cache managers both for internal RAM-based caching of
executed code and for setting the HTTP headers used by an http accelerator
like Squid.
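
For anyone not using Zope, the header side of this amounts to something like the following Python sketch. To be clear, this is not Zope's cache manager API; the function name and parameters are invented, and it only shows the kind of freshness headers an app server could emit so that an accelerator in front of it may reuse the response.

```python
from email.utils import formatdate
from typing import Dict


def accelerator_headers(last_modified: float, now: float, ttl_seconds: int) -> Dict[str, str]:
    """Build freshness headers for a dynamic page that is the same for all users.

    last_modified and now are Unix timestamps; ttl_seconds is how long a
    cache may serve the page without going back to the app server.
    """
    return {
        # HTTP dates must be in GMT, hence usegmt=True.
        "Last-Modified": formatdate(last_modified, usegmt=True),
        "Expires": formatdate(now + ttl_seconds, usegmt=True),
        "Cache-Control": "public, max-age=%d" % ttl_seconds,
    }
```

How long a lifetime is acceptable depends entirely on how stale a result list your users will tolerate; that is an application decision, not something the proxy can guess.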

I guess I see a lot of value in using Squid as an accelerator for dynamic
content. I'm sure others' mileage varies...

Sean

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
sean.upton@uniontrib.com
=========================
Received on Fri Aug 17 2001 - 13:02:21 MDT
