Squid 1.NOVM.16 oddity fetching gopher document

From: WWW server manager <webadm@dont-contact.us>
Date: Thu, 18 Sep 1997 01:44:57 +0100 (BST)

I've just encountered an oddity fetching a gopher document via Squid
1.NOVM.16 (can't say if it was happening with earlier versions).

The URL is gopher://gopher.cam.ac.uk:7000/11/CambUniv/Events/Upcoming ,
and while fetching it direct (no cache) seems to work fine, consistently,
via Squid it mostly seems to mangle the end of the document (a gopher
menu with a lot of long descriptions comprising the links).
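
For anyone wanting to reproduce the direct (no cache) fetch without a browser, the following is a minimal sketch, not a definitive tool. It assumes the usual gopher URL layout, where the first character of the path is the item type and the rest is the selector sent to the server; the function names are illustrative only.

```python
import socket
from urllib.parse import urlsplit, unquote


def split_gopher_url(url):
    """Split a gopher URL into (host, port, item_type, selector)."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: %r" % url)
    path = unquote(parts.path)
    # The first character after the leading "/" is the gopher item type;
    # the remainder is the selector string sent to the server.
    item_type = path[1:2] or "1"
    selector = path[2:]
    return parts.hostname, parts.port or 70, item_type, selector


def fetch_direct(url, timeout=30):
    """Fetch a gopher item straight from the server, bypassing any cache."""
    host, port, _item_type, selector = split_gopher_url(url)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # A gopher request is just the selector followed by CRLF.
        sock.sendall(selector.encode("latin-1") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch_direct() on the URL above should return the raw gopher menu, which can then be compared against what the cache hands back.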

The valid version (as displayed by lynx, fetching direct) ends like

(FILE) 1997-12-05 (Fri): Computer course: Managing Small WWW Servers (Windows NT)*
(FILE) 1997-12-05 (Fri): Computer course: Word for Windows 6.0 on IBM PC: Advanced*
(FILE) 1997-12-06 (Sat): Computer course: World Wide Web: Further Exploration* [NB 09.30 start]
(FILE) 1997-12-08 (Mon): Computer course: Oracle (Relational Database Management System): Introduction

but via Squid the end almost always (I *think* forcing a reload got the full
version once or twice, but rarely) looks like

 [INLINE] 1997-11-27 (Thu): Computer course: Scripting on UNIX with PERL: Introduction*
 [INLINE] 1997-11-28 (Fri): Computer course: Word for Windows 6.0 on IBM PC: Intermediate*
 [INLINE] 1997-12-01 (Mon): Computer course: World Wide Web for Beginners: Introductory Practical [NB 09.30 start]

i.e. it's lost a substantial number of items from the end of the document.
However, with Netscape 3.0 the effect seems to be slightly different, since (a)
it typically shows the very last real entry (1997-12-08), but with a mangled
link description and preceded by the 1997-11-24 entry, and (b) at least one of
the earlier entries has a mangled description and (on examining the
Squid-generated HTML) a mangled link, e.g.

<IMG BORDER=0 SRC="internal-gopher-unknown"> <A

(wrapped here, one line in the original). Link URL and description corrupted.

Checking Squid's cache/log, the size for successive retrievals of the same
gopher URL (which would vary on a timescale of days, but not of minutes or
seconds) in response to reload requests varies widely - e.g. 8006, 9194, 10245,
and 12737 bytes (random order) for what should be the same data each time.
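
The size comparison can be automated along these lines. This is only a sketch: the proxy host and port are placeholders for your own cache, the "Pragma: no-cache" header is my assumption about what a browser reload sends, and the helper names are made up for illustration.

```python
import hashlib
import http.client


def fetch_via_proxy(url, proxy_host, proxy_port, timeout=30):
    """Request a URL through an HTTP proxy, asking it not to use its cache."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # The full URL goes in the request line, as for any proxy request.
        conn.request("GET", url, headers={"Pragma": "no-cache"})
        return conn.getresponse().read()
    finally:
        conn.close()


def summarize(bodies):
    """Report sizes and whether all fetched bodies are byte-identical."""
    sizes = [len(body) for body in bodies]
    digests = {hashlib.sha256(body).hexdigest() for body in bodies}
    return {
        "sizes": sizes,
        "distinct": len(digests),
        "consistent": len(digests) == 1,
    }
```

Fetching the same URL several times with fetch_via_proxy() and passing the results to summarize() should report "consistent" as true if the cache is behaving; differing sizes within a few seconds would match what I am seeing.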

Looking at a sample cache file, the generated HTML has lines with
<IMG ... SRC="internal-gopher-text"> except for the "mangled" lines,
which typically have SRC="internal-gopher-unknown" and a description which
is either as intended but with the start missing, or with the gopher descriptor
repeated as the link description. So it's *not* the browser misinterpreting
what Squid is sending - the gopher document is being converted to HTML
incorrectly before it reaches the browser.
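
To pick out the suspect lines in a cache file without reading it by eye, something like the following could be used. It is a rough sketch that assumes one menu entry per line of generated HTML and treats SRC="internal-gopher-unknown" as the marker of a mangled entry, which is only what I have observed here, not a documented rule.

```python
import re

# Matches the icon name inside the IMG tag that starts each menu entry.
ICON_RE = re.compile(
    r'<IMG\b[^>]*\bSRC="(internal-gopher-[a-z]+)"', re.IGNORECASE)


def find_suspect_lines(html, suspect_icon="internal-gopher-unknown"):
    """Return (line_number, line) pairs whose icon is the suspect one."""
    suspects = []
    for number, line in enumerate(html.splitlines(), start=1):
        match = ICON_RE.search(line)
        if match and match.group(1).lower() == suspect_icon:
            suspects.append((number, line))
    return suspects
```

Running find_suspect_lines() over the text of a cached object gives the line numbers to inspect; an empty list would suggest that particular copy was converted cleanly.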

Does anyone else see this sort of problem viewing the gopher URL referenced
above (or is it just my cache misbehaving)? Any ideas on *why* it's happening?

                                John Line

University of Cambridge WWW manager account (usually John Line)
Send general WWW-related enquiries to webmaster@ucs.cam.ac.uk
Received on Wed Sep 17 1997 - 18:09:12 MDT