%20 in URLs (was Re: Squid and spaces in URLs)

From: Marc van Selm <marc.van.selm@dont-contact.us>
Date: Fri, 16 Oct 1998 09:11:17 +0200

At 10:27 PM 10/13/98 +0200, Miquel van Smoorenburg wrote:

>>> I know spaces in URLs are not really valid, but most webservers
>>> accept them, and squid does not.
>>They are valid. The webbrowser is supposed to encode that as %20 though, so
>>their webbrowser is royally broken. Let me guess, it wouldn't be MSIE?
>No, in fact I can't see the page with netscape but can with MSIE.
>BTW one of these sites is http://www.ptt-telecom.nl/

In url.c we convert each %xx escape into the character it encodes.
http://www.kpn-telecom.nl uses %20 in their URLs (stupid?). Here that %20 is
translated into a literal space, which triggers a protocol error.

/* convert %xx in url string to a character
 * Allocate a new string and return a pointer to converted string */
char *
url_convert_hex(char *org_url, int allocate)
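
To illustrate what this kind of conversion does to such a URL, here is a
minimal sketch of a %xx decoder. It is not the code from url.c; the names
sketch_unescape and hex_val are invented for the example, and it always
allocates a new string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from Squid): numeric value of one hex digit.
 * The caller must already have checked the character with isxdigit(). */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char) c) - 'a' + 10;
}

/* Decode every well-formed %xx escape in src.
 * Returns a newly malloc'd string (caller frees), or NULL on failure. */
char *sketch_unescape(const char *src)
{
    char *out, *d;

    assert(src != NULL);
    out = malloc(strlen(src) + 1);  /* decoded text is never longer */
    if (out == NULL)
        return NULL;
    d = out;
    while (*src != '\0') {
        if (src[0] == '%'
            && isxdigit((unsigned char) src[1])
            && isxdigit((unsigned char) src[2])) {
            *d++ = (char) (hex_val(src[1]) * 16 + hex_val(src[2]));
            src += 3;
        } else {
            *d++ = *src++;          /* ordinary character, copy as-is */
        }
    }
    *d = '\0';
    return out;
}
```

With this sketch, "/my%20file.gif" comes out as "/my file.gif", so the
request line ends up containing a bare space.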

In client_side.c this triggers an error message. I created a temporary fix by
compiling the source with -DRELAXED_HTTP_PARSER and adding a redirector
(jesred-1.2.1) that replaces this "%20" URL with a local image. Together these
two hacks get the page working.

But now the question: what is the rationale behind the function
url_convert_hex, and is %20 in a URL really illegal? If not, we could decide to
patch this function to leave these types of URL (or just the %20) alone.
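
A rough sketch of what such a patch could look like, written as a standalone
function rather than as a change to url_convert_hex itself. The names
sketch_unescape_safe and hexdigit are invented for the example, and the rule
"keep any escape that decodes to a space or control character" is my
assumption about what "leave alone" should cover, not existing Squid
behaviour.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: numeric value of one hex digit (already validated). */
static int hexdigit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char) c) - 'a' + 10;
}

/* Decode %xx escapes, but keep an escape verbatim when it would decode to
 * a space or control character (assumed policy, see text above).
 * Returns a newly malloc'd string (caller frees), or NULL on failure. */
char *sketch_unescape_safe(const char *src)
{
    char *out, *d;

    assert(src != NULL);
    out = malloc(strlen(src) + 1);
    if (out == NULL)
        return NULL;
    d = out;
    while (*src != '\0') {
        if (src[0] == '%'
            && isxdigit((unsigned char) src[1])
            && isxdigit((unsigned char) src[2])) {
            int v = hexdigit(src[1]) * 16 + hexdigit(src[2]);
            if (v <= 0x20 || v == 0x7f) {
                /* would become whitespace/control: leave "%xx" untouched */
                *d++ = src[0];
                *d++ = src[1];
                *d++ = src[2];
            } else {
                *d++ = (char) v;
            }
            src += 3;
        } else {
            *d++ = *src++;
        }
    }
    *d = '\0';
    return out;
}
```

Here "/a%20b%41.gif" becomes "/a%20bA.gif": the %41 is decoded, while the
%20 stays encoded and never reaches the parser as a space.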

PS I've asked the designer of this site to change it but if they do so I don't

Marc van Selm
NATO C3 Agency
Communication Systems Division, A-Branch
Tel: +31 70 3142454
Private: selm@cistron.nl, selm@het.net, http://www.cistron.nl/~selm

"In a world without walls and borders, who needs windows and gates ??"
Received on Fri Oct 16 1998 - 01:08:30 MDT
