[squid-users] Architecture advice

From: Rob Williams <rob.williams_at_gmail.com>
Date: Fri, 8 Aug 2008 13:45:45 -0700

After doing some more research, I wonder whether I'm approaching the
problem I want to solve in the right way.

I am building a large-scale website with a large amount (terabytes) of
static content, and I need a way to scale access to that content.

I could do this many ways, but what seems efficient to me is to create
an array of reverse-proxy cache servers that work together to serve the
static content.

The architecture would work like this:

1) User goes to www.mysite.com
2) The user's web browser wants to retrieve static.mysite.com/image1.jpg
3) DNS resolves static.mysite.com to a load balancer listening on port 80
4) The load balancer forwards the request to a random internal
reverse-proxy cache server in the cache array
5) The cache server returns image1.jpg to the end user.

In my example, there would be only one static Apache server holding the
terabytes of data. All proxies in the array would retrieve and cache
files from that single web server as needed.
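For concreteness, each proxy in the array could be a Squid box
configured as an accelerator for static.mysite.com, with the single
Apache server as its origin parent. A minimal squid.conf sketch along
those lines (Squid 2.6 or later syntax; the origin address 192.168.1.10,
the ACL name, and the cache_dir size are just placeholders):

  # Accept requests for the static site on port 80
  http_port 80 accel defaultsite=static.mysite.com

  # The single Apache server holding the terabytes of files
  cache_peer 192.168.1.10 parent 80 0 no-query originserver name=staticOrigin

  # Only accelerate requests for static.mysite.com
  acl static_site dstdomain static.mysite.com
  cache_peer_access staticOrigin allow static_site
  http_access allow static_site
  http_access deny all

  # Local disk cache on this proxy (first number is size in MB)
  cache_dir ufs /var/spool/squid 100000 16 256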

Now, if no peering protocol is implemented (ICP, HTCP, Cache Digest,
CARP), then eventually every proxy would end up holding the same files,
wasting a lot of cache space. Instead, some kind of protocol should be
implemented so that the cache servers in the array act together like
one big cache. That way, as more servers are added to the array, the
system scales to hold more cached content.
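One common way to get that behaviour with Squid (2.6 or later) is a
two-tier layout: the load balancer still spreads requests across a thin
front line of proxies, and each front-line proxy uses CARP to hash every
URL to the same back-end cache in the array, so any given object is
stored on only one member. A rough sketch of the front-line side, with
cache1/cache2/cache3.internal as placeholder host names:

  # Hash each URL consistently to one member of the cache array (CARP)
  cache_peer cache1.internal parent 80 0 carp no-query
  cache_peer cache2.internal parent 80 0 carp no-query
  cache_peer cache3.internal parent 80 0 carp no-query

  # Never go straight to the origin from this tier
  never_direct allow all

Alternatively, the proxies behind the random load balancer could be made
ICP siblings of one another with the proxy-only option on cache_peer, so
a proxy that misses fetches the object from a sibling that already holds
it instead of keeping yet another copy.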

Any suggestions thus far? Has anyone done this?

-Robert