Re: Efficient way of enumerating squid contents?

From: Chris Wedgwood <chris@dont-contact.us>
Date: Sat, 10 Oct 1998 13:53:58 +1300

On Fri, Oct 09, 1998 at 08:31:11AM +0200, Henrik Nordstrom wrote:

> You have to walk the hash chains and swap in the headers of every
> object to do such an operation.

Is there already code in there to do this - the "dump all objects in
cache" operation, for example?

Ideally, such an operation should be as quick as possible, but since
all URLs are stored by hash reference, a slow process is better than
nothing.
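
To make the cost concrete, here is a toy model of why the walk is slow.
It is not Squid's code: the class ToyStore, its fields, and the use of
an MD5 digest as the key are all assumptions made for illustration. The
point is only that when the in-memory index holds hashes rather than
URLs, a lookup by URL is one hash probe, but listing URLs means one
swap-in per object.

```python
import hashlib


class ToyStore:
    """Toy cache index (hypothetical, not Squid's data structures)."""

    def __init__(self):
        self.index = {}    # in memory: digest of URL -> swap file number
        self.disk = {}     # stands in for on-disk object headers: number -> URL
        self.swap_ins = 0  # counts simulated disk reads

    @staticmethod
    def key(url):
        # Assumption: the store key is a digest of the URL.
        return hashlib.md5(url.encode("utf-8")).digest()

    def add(self, url):
        k = self.key(url)
        if k not in self.index:
            fileno = len(self.disk)
            self.index[k] = fileno
            self.disk[fileno] = url

    def lookup(self, url):
        # Fast path: a single hash probe, no disk access.
        return self.key(url) in self.index

    def enumerate_urls(self):
        # Slow path: the index cannot give URLs back, so every
        # entry's header has to be swapped in to recover its URL.
        for fileno in self.index.values():
            self.swap_ins += 1
            yield self.disk[fileno]
```

With N objects in the toy store, enumerate_urls() performs N simulated
swap-ins, whereas lookup() performs none.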

> Very hard to tell. There are strong reasons why HTTP 1.1 does not
> allow a proxy to change the URL in any way.

I can see why in most cases this makes sense... but I can also see
why there are exceptions. These exceptions are mostly for MacOS and
Win32 based web servers, where the URL (as on most web servers) maps
to a file path, which is case-insensitive because of the underlying
filesystem.

> Why do you want to change case of URLs or header information?

Because

        http://www.microsoft.com/msdownload/

and

        http://www.microsoft.com/MSDOWNLOAD/

are really the same object - even though the URLs are different.

Perhaps I should be looking more at the ETag header here?

> If you want to clean up URLs to gain a higher hit ratio then this
> can easily be done in a redirector, but you should take extreme
> caution when doing things like this as different case is different
> on many servers (all UNIX based for example) so there is no
> guarantee that .asp and .ASP is the same thing.

I'm aware UNIX is sane and knows .ASP != .asp, but sadly some lesser
OSs don't make this distinction. It also appears that some genetic
flaw afflicts certain people to make them more likely to choose such
products and to produce content with MiXEd case URLs, so that
references to the same object result in multiple objects being
stored.
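
For what it's worth, the redirector approach could look roughly like
the sketch below. It assumes the usual redirector convention of one
request per line on stdin with the URL as the first whitespace-separated
field, and one (possibly rewritten) URL per line on stdout. The host
list CASE_INSENSITIVE_HOSTS and the function names are made up for
illustration. Only the path is lowercased, and only for hosts listed
explicitly, since on other servers .asp and .ASP may well differ; the
query string is left alone.

```python
import sys
from urllib.parse import urlsplit, urlunsplit

# Hypothetical allow-list: hosts believed to serve from a
# case-insensitive filesystem. Anything else passes through untouched.
CASE_INSENSITIVE_HOSTS = {"www.microsoft.com"}


def normalize(url):
    """Lowercase the path of url if its host is on the allow-list."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in CASE_INSENSITIVE_HOSTS:
        return url
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path.lower(),
         parts.query, parts.fragment)
    )


def rewrite_line(line):
    """Take one request line and return the URL to hand back."""
    fields = line.split()
    if not fields:
        return ""  # blank reply for an empty or malformed line
    return normalize(fields[0])


def main(stdin=sys.stdin, stdout=sys.stdout):
    for line in stdin:
        stdout.write(rewrite_line(line) + "\n")
        stdout.flush()  # the cache waits on each reply, so do not buffer
```

To try it as a redirector program, call main() at the bottom of the
script. Both example URLs above would then come back as
http://www.microsoft.com/msdownload/ and share one cache entry.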

-cw
Received on Tue Jul 29 2003 - 13:15:54 MDT
