Re: [squid-users] Cannot get videos from msnbc that have # in URL - Found it!

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 12 Nov 2008 18:53:52 +1300

Nicole wrote:
> On 12-Nov-08 My Secret NSA Wiretap Overheard Amos Jeffries Saying :
>>>
>>> Hello all
>>>
>>> I have started to receive complaints from people trying to get videos from
>>> msnbc.com that use a # character in the URL.
>>>
>>> Such as:
>>>
>>> http://www.msnbc.msn.com/id/22425001/vp/27657223#27657223
>>> http://www.msnbc.msn.com/id/22425001/vp/27652443#27652443
>>>
>>>
>>> The access log shows that it is removing the pound sign and everything
>>> after.
>>>
>>> 7 TCP_MISS:DIRECT
>>> 9.2.2.7 - - [11/Nov/2008:09:59:30 -0800] "GET
>>> http://www.msnbc.msn.com/id/22425001/vp/27657223 HTTP/1.1" 200 477
>>> TCP_MISS:DIRECT
>>> 9.2.2.7 - - [11/Nov/2008:10:00:18 -0800] "GET
>>> http://www.msnbc.msn.com/id/22425001/vp/27652443 HTTP/1.1" 200 477
>>> TCP_MISS:DIRECT
>>>
>>>
>>> I cannot see in my config why it would be truncating the URL at the pound
>>> character.
>>>
>>> Any assistance greatly appreciated.
>> The # symbol is part of the private URL format for certain objects such as
>> HTML pages and has meaning only in the receiving browser software. Client
>> software is expected to strip it before sending the request.
>>
>> Amos
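
To illustrate the point about the fragment: a minimal sketch using Python's
standard urllib.parse, showing the split a client makes before it sends
anything. The helper name request_url is made up for this example; the URL is
the one from the original report.

```python
from urllib.parse import urldefrag

def request_url(url):
    """Split a URL into the part sent on the wire and the fragment kept by the client."""
    wire_url, fragment = urldefrag(url)
    return wire_url, fragment

wire_url, fragment = request_url(
    "http://www.msnbc.msn.com/id/22425001/vp/27657223#27657223"
)
print(wire_url)   # what the proxy and the origin server see
print(fragment)   # what stays inside the browser
```

So an access log without the # part is what a proxy should expect to see; the
truncation happens in the client, not in Squid.
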
>
>
>
> I found it!
>
> I have been hiding the fact that I use a Squid cache for some time, using the
> config below. However, this line - header_access Via deny all - stops those
> videos from loading.
>
>
> #### Blocks info that says Hey I'm Squid
> header_access Via deny all
> header_access X-Forwarded-For deny all
> header_access X-Cache deny all
> ####
>
> Thus
>
> #### Blocks info that says Hey I'm Squid
> ##header_access Via deny all
> header_access X-Forwarded-For deny all
> header_access X-Cache deny all
> ####
>
> Allows me to see the videos. How weird is that!
> I do not yet know what effect it will have on my obfuscation, however.
>

People who really care will be able to see your Squid in the middle and
what version it is. No private details are lost.
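
As a rough sketch of what a Via entry carries: the sample value below is
invented for illustration (proxy.example.com is a placeholder hostname), and
the parser only handles this simple single-hop shape, not the full header
grammar. The names SAMPLE_VIA, VIA_ENTRY and parse_via_entry are made up for
this example.

```python
import re

# Hypothetical Via value; the hostname and port are placeholders.
SAMPLE_VIA = "1.0 proxy.example.com:3128 (squid/2.7.STABLE5)"

# protocol version, then the receiving host, then an optional comment in parentheses
VIA_ENTRY = re.compile(
    r"^\s*(?P<protocol>\S+)\s+(?P<received_by>\S+)(?:\s+\((?P<comment>[^)]*)\))?\s*$"
)

def parse_via_entry(entry):
    """Return the protocol, host and comment parts of one simple Via entry."""
    match = VIA_ENTRY.match(entry)
    if match is None:
        raise ValueError("unrecognised Via entry: %r" % entry)
    return match.groupdict()

info = parse_via_entry(SAMPLE_VIA)
print(info["received_by"])  # the proxy host
print(info["comment"])      # the software name and version
```
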

> Any ideas on why their URLs (and as far as I know only their URLs) would break
> with this?
>

'Smart' software at their end is provisioning the stream's HTTP encoding
based on the client User-Agent, but it is smart enough to notice and adapt
for middleware like Squid. With Via stripped, it has nothing to notice.
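
A purely hypothetical sketch of that kind of server-side decision, to show why
hiding Via could break things. This is not their actual logic; the function
name choose_delivery and the mode names "proxy-safe" and "direct-optimised"
are invented for illustration.

```python
def choose_delivery(headers):
    """Pick a delivery mode from request headers (hypothetical policy)."""
    names = {name.lower() for name in headers}
    if "via" in names:
        # A proxy announced itself, so fall back to something any proxy can relay.
        return "proxy-safe"
    if "user-agent" in names:
        # No proxy visible: assume the browser is talking to the server directly.
        return "direct-optimised"
    return "proxy-safe"

# Same browser, same proxy in the path; only the Via header differs.
print(choose_delivery({"User-Agent": "ExampleBrowser", "Via": "1.0 proxy.example.com"}))
print(choose_delivery({"User-Agent": "ExampleBrowser"}))
```

Under a policy like this, a request that passed through Squid but arrives
without Via gets the direct treatment, which the proxy may not be able to
relay.
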

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE5 or 3.0.STABLE10
   Current Beta Squid 3.1.0.1
Received on Wed Nov 12 2008 - 05:53:58 MST
