Re: same URL, different encoding

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 15 Jul 1999 01:45:54 +0200

Duane Wessels wrote:

> I believe the '~' is illegal.

Yes and no. It used to be illegal, but it is now legal.

RFC 1738 (old URL specification): Unsafe, must not be used unencoded
RFC 2396 (URI): Perfectly legal, part of the unreserved set

RFC 2616 (HTTP/1.1) refers to RFC 2396 for URL syntax and semantics.

The HTTP/1.0 specification refers to older URI specifications where ~
was unsafe.

> I could instead argue that your origin server should send HTTP redirects
> for the equivalent forms. So '~' and '%7e' get redirected to a URL with '%7E'.

Why should it bother to redirect %7e to %7E? They are semantically
equivalent in all specifications (even the really old ones).
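
To illustrate the case-insensitivity of the hex digits, here is a minimal
Python sketch. The helper name uppercase_escapes and the sample path are
made up for this example; it only shows that the two spellings compare
equal once the escape is case-folded or decoded.

```python
import re
from urllib.parse import unquote

# Matches one percent-escape: '%' followed by exactly two hex digits.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")

def uppercase_escapes(url: str) -> str:
    """Uppercase the hex digits of every %XX escape; leave the rest untouched."""
    return _ESCAPE.sub(lambda m: m.group(0).upper(), url)

# Hypothetical sample paths that differ only in the case of the escape.
a = "/%7ehno/index.html"
b = "/%7Ehno/index.html"

print(uppercase_escapes(a) == uppercase_escapes(b))  # same after case-folding
print(unquote(a) == unquote(b))                      # same after decoding
```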

The question of ~ vs %7e is, however, an entirely different issue, as
seen above. It is safe to assume that %7e (or %7E) can be used in place
of ~, but not the reverse. The tricky part is that an HTTP proxy is not
allowed to rewrite the URL, even to a semantically equivalent form,
since there may be noncompliant servers that do not follow the defined
semantic equivalence rules.

For a proxy, my vote is to treat them all as different by default. Not
doing so should be optional and tagged with a warning that it may
violate some aspects of the HTTP specifications.
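
A rough Python sketch of what such an option could look like, not Squid's
actual code: the function cache_key and the flag merge_tilde_forms are
hypothetical names. The key is only used for cache lookup; the URL sent
to the origin server is assumed to be forwarded unchanged. The merged
form uses %7E, since the escaped form is the one that is safe under both
the old and new specifications.

```python
def cache_key(url: str, merge_tilde_forms: bool = False) -> str:
    """Return the key under which a proxy cache would store this URL.

    Default: the URL is used verbatim, so '~', '%7e' and '%7E' are three
    distinct objects. With merge_tilde_forms=True (optional, and possibly
    in violation of the HTTP specifications) all three map to '%7E'.
    """
    if not merge_tilde_forms:
        return url
    # Fold both the raw tilde and the lowercase escape into '%7E'.
    return url.replace("~", "%7E").replace("%7e", "%7E")

# Hypothetical sample URLs, one per spelling.
forms = [
    "http://example.com/~hno/",
    "http://example.com/%7ehno/",
    "http://example.com/%7Ehno/",
]

print(len({cache_key(u) for u in forms}))                          # default
print(len({cache_key(u, merge_tilde_forms=True) for u in forms}))  # merged
```

With the default, the three spellings yield three separate keys; with the
option enabled, they collapse into one.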

--
Henrik Nordstrom
Spare time Squid hacker
Received on Wed Jul 14 1999 - 17:39:35 MDT