Re: [squid-users] Compression from Robert Collins on 2001-08-23 (squid-users)

From: Robert Collins <robert.collins@dont-contact.us>
Date: 24 Aug 2001 11:38:40 +1000

On 23 Aug 2001 20:57:17 -0400, Brian wrote:
> On Thursday 23 August 2001 07:07 pm, Robert Collins wrote:
> > I picked up some previous work done on squid which had client-side
> > transfer-encoding (TE - see RFC 2616), and generalised it to allow
> > server-side and client-side TE.
> >
> > The basic concept is that squid has an access list that controls what
> > requests squid is willing to compress, and with what compression
> > algorithm. For example you might make sure that squid only ever
> > compresses html or text files. The result is an ordered access list for
> > a given request - ie chunked, gzip, gz
>
> Is there (or will there be) an acl for reply mime-type? I'd like to
> compress php and cgi replies, but some return exotic things like gifs and
> zips.

Yes - it's in squid-HEAD at the moment, which means you will see it in
Squid 2.5. There are two components to it:
1) a new acl type - all that's needed for the compression angle, and
2) a new access list entry - http_reply_access - that is used to allow
or deny replies - for example to block all mp3s by mime type, or by
regex etc.
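
To make the idea of deciding on the reply mime type concrete, here is a minimal Python sketch of an ordered allow/deny list that is matched against the Content-Type of a reply. It is only a model of the concept: the rule list, the patterns and the function name should_compress are hypothetical, and none of this is Squid's configuration syntax or its code. The sketch assumes the first matching rule wins and that nothing is compressed when no rule matches.

```python
import re

# Hypothetical ordered rule list: (action, pattern matched against the mime type).
# The first matching rule decides; these patterns are examples only.
RULES = [
    ("deny", re.compile(r"^image/", re.I)),            # gifs and other images
    ("deny", re.compile(r"^application/zip$", re.I)),  # already compressed
    ("allow", re.compile(r"^text/", re.I)),            # html and plain text
]


def should_compress(content_type, rules=RULES):
    """Return True if a reply with this Content-Type may be compressed."""
    # Drop parameters such as "; charset=utf-8" before matching.
    mime = content_type.split(";", 1)[0].strip()
    for action, pattern in rules:
        if pattern.search(mime):
            return action == "allow"
    return False  # assumption: no matching rule means "do not compress"
```

With these rules a php script that returns "text/html; charset=utf-8" would be compressed, while one that returns "image/gif" or "application/zip" would be passed through untouched.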

> > Then as data flows into squid, all transfer encodings are unwrapped,
> > giving squid the native body (which might itself be content-encoded).
> > This native body gets saved to the store if it is cachable - and the
>
> If the native (I assume you mean fully decoded) version is cached, where
> does Vary play a role in this? Is that only for storing compressed
> versions as mentioned in the future enhancements?

Vary plays a large role - once the transfer encodings are unwrapped,
the body that is left (the identity transfer encoding) can still differ
from request to request because of content encoding.

Note that Apache's mod_gzip does content encoding, not transfer
encoding. Content transcoding can be a neat thing, but it interferes
with the end to end model for HTTP - transfer encoding was introduced to
allow proxies in the transmission path to still encode things
appropriately without affecting that end to end model...

Transfer encoding is a hop by hop encoding. That is, if you have the
following:

client --link1-- proxy a --link2-- proxy b --link3-- web server

The compression on link2 does not imply or prevent compression on link1
or link3.
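
A small Python sketch of that hop by hop behaviour, using the standard gzip module to stand in for a gzip transfer coding. The function names and the sample body are made up for illustration, and this models only the idea, not how squid implements it. Each link decides on its own whether to compress, and the receiver of each link strips the coding before passing the body on.

```python
import gzip


def send_over_link(body, use_gzip):
    """Sender side of one hop: optionally apply a gzip transfer coding."""
    return gzip.compress(body) if use_gzip else body


def receive_from_link(wire, used_gzip):
    """Receiver side of the same hop: strip the transfer coding again."""
    return gzip.decompress(wire) if used_gzip else wire


def relay(body, links):
    """Carry body across each link in turn.

    links is a list of booleans, one per link, saying whether that link
    uses gzip. Returns the body that arrives at the far end and the
    number of bytes that crossed each link.
    """
    wire_sizes = []
    for use_gzip in links:
        wire = send_over_link(body, use_gzip)
        wire_sizes.append(len(wire))
        body = receive_from_link(wire, use_gzip)
    return body, wire_sizes


original = b"<html>" + b"hello squid " * 200 + b"</html>"
# Order follows the reply path: link3 (plain), link2 (gzip), link1 (plain).
final_body, wire_sizes = relay(original, [False, True, False])
```

Only link2 carries fewer bytes; link1 and link3 carry the body unchanged, and the client ends up with exactly what the web server sent.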

Content encoding is an end to end encoding. That is, a gzip content
coding on the data travelling across link2 implies that the body is
compressed in the same fashion all the way through, given that all the
proxies are _transparent_. (Transparent here does not mean wccp style
transparency, but rather that the proxies do not change the semantic
meaning of the http message.)

Example:
Change client in the above diagram to client1, and add client2 on the
same proxy...
client1 wgets the file www.test.com/index.html from the web server.
The body is sent with the "identity" encoding - meaning it is in the
native encoding for that content.
Proxy b uses gzip transfer encoding when sending the document to proxy
a. This means that proxy a will receive a gzipped html document,
ungzip it to get the identity body, cache that, and transmit it as is to
client1.

Now client2 requests the same file - www.test.com/index.html - using an
http/1.1 browser that supports gzip content encoding.
Proxy a will see that the current object in cache has a Vary header
covering the Accept-Encoding request header, so this will be a cache miss.
www.test.com sends the same object as the response, this time with a gzip
content encoding. Proxy b again compresses the response to proxy a using
gzip transfer encoding.

Proxy a will receive a doubly gzipped html file. The outer gzip - the
transfer encoding - will be stripped off, leaving a gzip content coded
html file. That will be stored in the cache and sent to client2.
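
The two layers in the client2 case can be reproduced with Python's gzip module. This is just an illustration of the layering under the assumption that both codings are plain gzip; the variable names and the sample html are invented, and it is not squid code.

```python
import gzip

html = b"<html><body>" + b"index page " * 100 + b"</body></html>"

# Origin server: gzip content coding (end to end).
content_coded = gzip.compress(html)

# proxy b -> proxy a: gzip transfer coding on top (this hop only).
on_link2 = gzip.compress(content_coded)

# proxy a: strip only the outer transfer coding.
stored = gzip.decompress(on_link2)

# client2: undoes the content coding itself to get the html.
seen_by_client2 = gzip.decompress(stored)
```

What proxy a stores is byte for byte the content coded entity from the origin server, not the raw html; only client2 removes the inner gzip.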

This is _today_ without looking at the future enhancements. Now the
point is that content-encoding is determined by the origin web server
and applies all the way to the end user, whereas transfer encoding only
applies link to link.
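
The cache miss for client2 in the example can be sketched as a cache that keeps one variant per combination of the request headers named in Vary. This is a toy model in Python: the class VaryCache and its methods are hypothetical, request header names are assumed to be lower-cased already, and it assumes the origin server sent "Vary: Accept-Encoding" and that client1's wget sent no Accept-Encoding header.

```python
class VaryCache:
    """Toy cache that stores one variant per set of selecting header values."""

    def __init__(self):
        self._vary = {}      # url -> request header names listed in Vary
        self._variants = {}  # (url, selecting header values) -> stored body

    def _key(self, url, request_headers):
        names = self._vary.get(url, ())
        return (url, tuple(request_headers.get(name, "") for name in names))

    def put(self, url, request_headers, vary_names, body):
        self._vary[url] = tuple(name.lower() for name in vary_names)
        self._variants[self._key(url, request_headers)] = body

    def get(self, url, request_headers):
        """Return the stored body for this request, or None on a miss."""
        return self._variants.get(self._key(url, request_headers))


cache = VaryCache()
url = "http://www.test.com/index.html"
client1 = {}                            # assumption: no Accept-Encoding sent
client2 = {"accept-encoding": "gzip"}

cache.put(url, client1, ["Accept-Encoding"], b"<html>identity body</html>")
first_lookup = cache.get(url, client2)  # miss: different Accept-Encoding

cache.put(url, client2, ["Accept-Encoding"], b"gzip content coded bytes")
```

After the second put, both variants sit in the cache side by side, and each client gets the one that matches its own Accept-Encoding.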

Imagine snail mail -
If you compress your data before you put it in the envelope - say you
fold it really tight - then you are performing content encoding.
When the post office folds the envelope up really tight and squeezes it
into your PO box, they are performing transfer encoding :].

For more detail, and probably a better explanation, please see RFC 2616
:}.

> Overall, this looks really nice.
> -- Brian
Received on Thu Aug 23 2001 - 19:51:17 MDT
