Re: [squid-users] Newbie question on Squid Reverse Proxy configuration

From: Michael Alger <squid_at_mm.quex.org>
Date: Sat, 19 Jul 2008 12:58:47 +0800

On Fri, Jul 18, 2008 at 05:04:00PM -0700, aspasia wrote:
> Apologize, am a newbie to web2.0 architecture and squid cache (am
> storage and infrastructure not architecture); I was asked to
> prepare a test environment to just validate ability of SQUID
> application to "connect" to a type of storage.
>
> I read through a plethora of SQUID FAQs and Apache as well as
> concepts on Cache, proxy, web cache, etc.....
>
> I would like to validate my assumptions and questions since I have
> no one else to discuss this who is familiar with the application
> architecture:
>
> 1. in reverse proxy configuration - if we think of it in simple
> terms, would it be safe to say, usually the architecture of
> Reverse Proxy config is:
>
> hostA - 192.168.17.1 -- Squid Reverse server
> hostB - 192.168.17.2 -- http server --> /vol0/www-> storage
>
> so that if:
> a. Client from outside accesses a URL - http://192.168.17.2 (which
> is the http server's URL)

This isn't quite right; the URL the client accesses would be that of
the reverse proxy, i.e. http://192.168.17.1 (or more likely, a
hostname which resolves to the reverse proxy's IP address). You don't
do transparent "interception" in a reverse proxy setup; you just make
the reverse proxy the "public face" of your web site, and it fetches
content from the appropriate backend / origin servers.
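
As a rough illustration only, a minimal squid.conf sketch for hostA
might look something like the following. It assumes Squid 2.6 or
later (older releases use different accelerator directives), and
www.example.com is a placeholder hostname, not something from your
setup:

```
# squid.conf on hostA (192.168.17.1), the reverse proxy.
# www.example.com is a placeholder for the site's public hostname.
http_port 80 accel defaultsite=www.example.com

# hostB (192.168.17.2) is the origin server squid fetches from.
cache_peer 192.168.17.2 parent 80 0 no-query originserver name=hostB

# Only accept requests for our own site, and send them to hostB.
acl our_site dstdomain www.example.com
http_access allow our_site
http_access deny all
cache_peer_access hostB allow our_site
cache_peer_access hostB deny all
```

Clients would then point at hostA (or the hostname resolving to it),
never at hostB directly.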

> b. goes into internet cloud
> 3. hostB - Proxy/Cache server - intercepts and checks - if in
> cache, if in cache responds back to client

I assume the above is a typo, and you meant "3. hostA - Proxy/Cache
server - intercepts and checks...". Again, it wouldn't be
"intercepting" it per se.

> 4. If not in cache, proceeds to contact hostB (192.168.17.2)
> which fetches data from storage and returns to hostA
> 5. hostA responds to client and updates cache.
>
> The above is simplistic, but that is how I envision reverse proxy.

It's about right, except the reverse proxy is explicitly acting as
the web server, rather than "intercepting" requests to it.

> Here's my question - I have been tasked to test the ability of the
> Proxy/Cache server - to BYPASS the http server, and have the
> ability to directly connect to storage that stores web
> content/data - is this configuration possible?

No, this isn't possible. squid is a proxy cache, not a content
server. It has some *very* limited ability to serve content itself
(e.g. icons for file types in FTP directory listings), but it's not
something it's designed to do.

> Would it be possible to configure Squid Reverse Proxy's squid.conf
> file - so that it directly accesses
>
> a- cache_dir - to store its cache
> b- storage repository - that stores web contents

Absolutely not, and it might be helpful if you could clarify exactly
why you think you want it to do so. The idea is that the reverse
proxy reduces load on your backend server by caching objects: your
backend serves each object once, and squid then serves it from its
own cache for subsequent requests until it becomes stale.

If your backend server is a resource-hungry web application and
you're trying to reduce the load on it as much as possible, you'd
probably be best off setting up another simple web server to serve
your static files or whatever you envisioned squid would be serving
from the storage repository. squid can be configured to use several
different origin servers to serve a single site, by configuring
access lists which define request paths, and permitting or denying
particular origin servers the ability to serve that content.
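
A sketch of what that could look like, assuming Squid 2.6 or later.
The second server (192.168.17.3), the peer names and the /static/
path are all hypothetical, and the usual http_port and http_access
lines of an accelerator setup are omitted:

```
# Two origin servers behind one site. 192.168.17.3 and the
# /static/ prefix are made-up values for illustration.
cache_peer 192.168.17.2 parent 80 0 no-query originserver name=app
cache_peer 192.168.17.3 parent 80 0 no-query originserver name=static

# Requests whose path starts with /static/ ...
acl static_paths urlpath_regex ^/static/

# ... go only to the static file server,
cache_peer_access static allow static_paths
cache_peer_access static deny all

# and everything else goes only to the application server.
cache_peer_access app deny static_paths
cache_peer_access app allow all
```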

If you really can't add an extra server, you could run an HTTP
server on your squid box on another port, and use it to serve the
content from your storage repository.
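
For example, assuming a small web server on the squid box is bound
to 127.0.0.1 port 8080 and serves the repository under /files/ (the
port, path and peer name are my assumptions, not anything from your
description), the relevant squid.conf fragment might be roughly:

```
# Hypothetical local HTTP server on the squid box itself,
# listening on 127.0.0.1:8080 and serving the storage repository.
cache_peer 127.0.0.1 parent 8080 0 no-query originserver name=local_files

# Send only requests under /files/ to that local server.
acl repo_paths urlpath_regex ^/files/
cache_peer_access local_files allow repo_paths
cache_peer_access local_files deny all
```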

If you're mostly worried about duplicating your data (e.g. if you're
serving very large files), you might be best served by setting up a
different hostname for these large files and bypassing squid
altogether. Alternatively, you can define ACLs matching the files
you don't want squid to cache and tell it not to cache them; or
simply set limits on the maximum size of objects squid is allowed to
cache.
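
A sketch of both approaches, assuming Squid 2.6 or later (older
versions spell the first directive "no_cache"). The file extensions
and the 32 MB limit are arbitrary values chosen for illustration:

```
# Never cache responses for URLs ending in these (example) extensions.
acl big_files urlpath_regex -i \.(iso|zip|tar\.gz)$
cache deny big_files

# Independently of the ACL, refuse to cache any object larger than
# this size.
maximum_object_size 32 MB
```

Either one alone may be enough; squid still proxies the requests, it
just doesn't keep a copy of the responses.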

> Would there be a comprehensive URL or link that shows the various
> possible ways of configuring Squid proxy - cache, transparent, etc
> ... and details a flow of how http requests and content are
> processed?

Have you seen this page?

http://www.visolve.com/squid/whitepapers/reverseproxy.php

> I apologize for the naive question, but I am a bit lost ..
Received on Sat Jul 19 2008 - 04:58:52 MDT
