Re: [squid-users] Incorrect HTTP GET request from squid

From: Eric Lawson <E.Lawson@dont-contact.us>
Date: Tue, 16 Jul 2002 17:12:18 +1000

>>>> Colin Campbell <sgcccdc@citec.qld.gov.au> 07/16/02 04:54pm >>>
>Hi,
>
>On Tue, 16 Jul 2002, Eric Lawson wrote:
>
>> OK here's the deal. I have 5 brand new Sun Cobalt CacheRaq4's straight
>> out of the box using the default configuration. When I was doing some
>> testing before putting the boxes into production I was encountering
>> some strange problems with certain websites. Prime example is
>> http://www.smh.com.au. If I go to this site bypassing the cache server
>> I get the proper site. If I configure my browser to go through the
>> cache server I get http://www.f2.com.au returned to my browser. Now
>> both of these sites have the same IP address returned from a DNS
>> query. I put a sniffer on my network to capture traffic to/from the
>> cache server to see what was happening and the following is an extract
>> from these captures. This is the packet from my browser to the cache
>> server with the HTTP GET request:
>>
>> Ethernet II
>> Internet Protocol, Src Addr: (192.168.21.111), Dst Addr: (10.65.9.132)
>> Transmission Control Protocol, Src Port: 1112 (1112),
>>     Dst Port: 3128 (3128), Seq: 5
>> Hypertext Transfer Protocol
>>     GET http://www.smh.com.au/ HTTP/1.0\r\n
>>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms
>>     Accept-Language: en-au\r\n
>>     User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0)\r\n
>>     Host: www.smh.com.au\r\n
>>     Proxy-Connection: Keep-Alive\r\n
>>     \r\n
>>
>> This is the packet from the cache server to the www.smh.com.au server:
>>
>> Ethernet II
>> Internet Protocol, Src Addr: (10.65.9.132), Dst Addr: (203.26.51.42)
>> Transmission Control Protocol, Src Port: 3100 (3100),
>>     Dst Port: 80 (80), Seq: 30173
>> Hypertext Transfer Protocol
>>     GET / HTTP/1.0\r\n
>>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms
>>     Accept-Language: en-au\r\n
>>     Via: 1.0 sydcache:3128 (Squid/2.3.STABLE4)\r\n
>>     X-Forwarded-For: 192.168.21.111\r\n
>>     Host: www.smh.com.au\r\n
>>     Cache-Control: max-age=172800\r\n
>>     Connection: keep-alive\r\n
>>     \r\n
>>
>> Now, as you can see, the HTTP GET request has been changed: it is not
>> passing the URL that was typed in (http://www.smh.com.au/); it has
>> replaced it with a /
>>
>> My question is: why is it doing this, and what can I do to fix it?
>
>This is perfectly normal behaviour. The browser puts the full URL in
>because that's how it tells the cache server where to go. The cache server
>pulls the URL apart into host:port and path. It connects to the host:port
>and tells the web server the path component (/). Just in case there's
>multiple web servers on the same IP, there's a "Host: www.smh.com.au"
>header.
>
>If you sniffed your browser bypassing the cache you'd see the same
>behaviour.
>
>I tried www.smh.com.au and had no problem and I go:
>
> browser->squid->plug-gw->squid->world
>
>Maybe you need to sniff the response from the web server to your cache.
>
>Colin
>--
>Colin Campbell
>Unix Support/Postmaster/Hostmaster
>CITEC
>+61 7 3227 6334

I have sniffed my browser bypassing the cache and I see exactly the same thing as the packet from my browser to the cache server. The only difference is the Destination Address.

I also have the responses from the web server. The web server at the other end is doing exactly what it is supposed to: it replies with the smh.com.au website if I bypass the cache server, and with the default site for that IP address (f2.com.au) if I go through the cache server. Why doesn't the cache server just send the same GET request that my browser is sending (GET http://www.smh.com.au/ HTTP/1.0) instead of GET / HTTP/1.0?
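
For reference, the rewrite Colin describes (full URL in, path plus Host header out) can be sketched roughly in Python. This is only an illustration of that general proxy behaviour, not Squid's actual code; the function name to_origin_request and the choice to rebuild the Host header from the URL are assumptions made for the example.

```python
from urllib.parse import urlsplit


def to_origin_request(request_line, headers):
    """Split a proxy-style request into the pieces sent to the origin server.

    Hypothetical helper: returns (host, port, new_request_line, new_headers).
    """
    method, url, version = request_line.split()
    parts = urlsplit(url)

    host = parts.hostname
    port = parts.port or 80          # no explicit port in the URL: assume 80
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    # Drop the proxy-only header and any old Host header, then add a Host
    # header derived from the URL so a name-based virtual host can still
    # pick the right site. (A real cache also adds headers such as Via and
    # X-Forwarded-For; that is left out of this sketch.)
    skipped = ("proxy-connection", "host")
    new_headers = [(k, v) for k, v in headers if k.lower() not in skipped]
    new_headers.append(("Host", host if port == 80 else f"{host}:{port}"))

    return host, port, f"{method} {path} {version}", new_headers


host, port, line, hdrs = to_origin_request(
    "GET http://www.smh.com.au/ HTTP/1.0",
    [("Accept-Language", "en-au"),
     ("Host", "www.smh.com.au"),
     ("Proxy-Connection", "Keep-Alive")],
)
print(line)  # GET / HTTP/1.0
```

In this sketch the only thing that tells the origin server which site is wanted is the Host header, which is why the header matters when two sites share one IP address.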

By the way, this box is running Squid 2.3.STABLE4

Eric

Received on Tue Jul 16 2002 - 01:16:27 MDT
