A couple URL problems...

From: Clifton Royston <cliftonr@dont-contact.us>
Date: Wed, 16 Jun 1999 18:47:19 -1000 (HST)

Hi, all:

  I just looked over the last month or two of posts to the list, and
didn't see anything about this subject.

  I'm running Squid 2.2.STABLE3 in an alpha-test mode on our in-house
network, using it as an explicitly configured proxy. (OS and hardware
are probably irrelevant for now.) If this test goes well, I'm planning
to do a brief test with transparent proxy on our in-house network,
using a borrowed ServerIron, then install a dedicated Squid server - a
suitably hefty BSD/OS 4.0 box, probably with RAID for uptime - do a
formal customer beta-test, and finally start announcing the explicit
proxy for customer use. Eventually, I would like us to go over to
full-scale transparent caching via L4 switch redirection, but I feel we
need to test this configuration fanatically first and make sure it is
genuinely transparent in all cases.

  However... I have run into several sites now, mostly e-commerce
sites, which are accessible directly via Netscape but not via Squid
proxy. This is the kind of nagging "gotcha" that could become a
serious problem in a transparent caching environment, where there's no
way for the end-user to just turn it off. Here's one:

<http://www2.tesys.com/telenetparts.store/284979100/Product/View/HD&20&20SSA18BLW/HD SSA18BLW>

  (Note the space in the last portion of the URL - blecch!!)

  Squid rejects this with:

    "Invalid URL
Some aspect of the requested URL is incorrect. Possible problems:
    Missing or incorrect access protocol (should be `http://'' or similar)
    Missing hostname
    Illegal double-escape in the URL-Path
    Illegal character in hostname; underscores are not allowed "
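
  One plausible reason a raw space is fatal, sketched here under stated
assumptions: an HTTP request line has three fields separated by single
spaces (method, request URI, HTTP version), so an unescaped space inside
the URI makes the line ambiguous. The function below is a hypothetical
strict splitter written for illustration only; it is not Squid's parser
and does not claim to reproduce Squid's behavior.

```python
def split_request_line(line):
    """Split an HTTP request line into (method, uri, version).

    Hypothetical strict parser: it requires exactly three
    space-separated fields and refuses anything else.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}")
    method, uri, version = parts
    return method, uri, version


# A well-formed request line splits cleanly into three fields.
print(split_request_line("GET http://example.com/a HTTP/1.0"))

# A URI containing a raw space yields four fields and is refused.
try:
    split_request_line("GET http://example.com/HD SSA18BLW HTTP/1.0")
except ValueError as err:
    print("rejected:", err)
```

  A more lenient parser could instead treat the first field as the method
and the last as the version, keeping everything in between as the URI,
which may be roughly what the origin servers above end up tolerating.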

  Is there a ready-made Squid patch to work around this?

  A similar problem occurs if you enter more than one search term on
the store search page at:

<http://store.knifecenter.com/knifecenter/search.html>

  A different but related problem occurs at the same site with their
keyword search if you put in a single term and get a multi-page list.
In this case, the URL looks something like this:

<http://store.knifecenter.com/pgi-KeywordSearch3?katana,Contains>

and it presents a form with buttons; clicking the button sends one to
the horrendous URL:

<http://store.knifecenter.com/pgi-ContinueList1?Search For Items Containing ALL Of These Keywords:<BR>>

  (Yes, that's really what it apparently sends as the URL.)

  If I bypass the proxy, this actually works. I'm assuming that
unescaped spaces in a URL are not valid HTTP, though I haven't checked
the RFCs, but it looks like I may need to work around this if multiple
web sites are doing the same thing, broken or not.
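
  As a rough illustration of one possible workaround, the sketch below
percent-encodes spaces and other characters that are not allowed in a
URL, while leaving reserved delimiters and any existing "%" escapes
alone. It uses Python's standard urllib.parse.quote; the helper name
escape_unsafe_chars is made up for this example, and whether a proxy
should rewrite URLs this way at all is an open question, not something
Squid is known to do here.

```python
from urllib.parse import quote

# Reserved URL delimiters plus "%", so existing escapes are not
# double-encoded. Letters, digits and "_.-~" are always left alone.
SAFE_CHARS = ":/?#[]@!$&'()*+,;=%"


def escape_unsafe_chars(url):
    """Percent-encode characters such as spaces, "<" and ">" in a URL.

    Hypothetical helper for illustration; it does not validate the URL.
    """
    return quote(url, safe=SAFE_CHARS)


print(escape_unsafe_chars(
    "http://www2.tesys.com/telenetparts.store/284979100/"
    "Product/View/HD&20&20SSA18BLW/HD SSA18BLW"
))
print(escape_unsafe_chars(
    "http://store.knifecenter.com/pgi-ContinueList1?"
    "Search For Items Containing ALL Of These Keywords:<BR>"
))
```

  Each space becomes %20, and "<" and ">" become %3C and %3E. There is
no guarantee that the origin server treats the escaped form the same as
the raw one, so this would need testing against each affected site.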

  Comments or suggestions?
  -- Clifton

-- 
 Clifton Royston  --  LavaNet Systems Architect --  cliftonr@lava.net
        "An absolute monarch would be absolutely wise and good.  
           But no man is strong enough to have no interest.  
             Therefore the best king would be Pure Chance.  
              It is Pure Chance that rules the Universe; 
          therefore, and only therefore, life is good." - AC
Received on Wed Jun 16 1999 - 22:32:57 MDT