Re: [squid-users] Caching Pandora

From: Adrian Chadd <adrian_at_squid-cache.org>
Date: Sun, 26 Jul 2009 13:24:35 +0800

This doesn't surprise me. They may be trying to maximise outbound
bits, or trying to retain control over content, or they may not
understand caching, or some combination of the above.

I'd suggest contacting them and asking.

adrian

2009/7/26 Jason Spegal <jspegal_at_comcast.net>:
> A little bit messy but here are some snippets.
>
> ###Access.log
>
> 1248572380.275    178 10.10.122.248 TCP_REFRESH_UNMODIFIED/304 232 GET
> http://images-sjl-1.pandora.com/images/public/amz/1/2/0/4/727361124021_500W_495H.jpg
> - DIRECT/208.85.40.13 -
> 1248572409.144   8472 10.10.122.241 TCP_MISS/200 1581181 GET
> http://audio-sjl-t3-2.pandora.com/access/7008639604707703825.mp4? -
> DIRECT/208.85.41.38 application/octet-stream
> 1248572439.512     94 10.10.122.241 TCP_MEM_HIT/200 55396 GET
> http://images-sjl-2.pandora.com/images/public/amz/3/0/2/3/602498413203_500W_499H.jpg
> - NONE/- image/jpeg
> 1248572570.898    300 10.10.122.248 TCP_MISS/200 6521 GET
> http://images-sjl-3.pandora.com/images/public/amz/2/2/4/4/039841434422_130W_130H.jpg
> - DIRECT/208.85.41.23 image/jpeg
> 1248572600.538  29937 10.10.122.248 TCP_MISS/200 7704188 GET
> http://audio-sjl-t3-2.pandora.com/access/3642267922875646389.mp3? -
> DIRECT/208.85.41.38 application/octet-stream
> 1248572615.735  11507 10.10.122.241 TCP_MISS/200 2109481 GET
> http://audio-sjl-t2-2.pandora.com/access/5722981497105294607.mp4? -
> DIRECT/208.85.41.36 application/octet-stream
> 1248572635.903    179 10.10.122.248 TCP_REFRESH_UNMODIFIED/304 232 GET
> http://images-sjl-3.pandora.com/images/public/amz/2/2/4/4/039841434422_130W_130H.jpg
> - DIRECT/208.85.41.23 -
> 1248572641.444     40 10.10.122.241 TCP_HIT/200 21616 GET
> http://images-sjl-2.pandora.com/images/public/amz/8/7/6/1/602498611678_300W_273H.jpg
> - NONE/- image/jpeg
>
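To see where the bytes are going, the access.log entries can be tallied by host type and result code. The following is only a rough sketch: it assumes Squid's native access.log format with each entry on a single line (the mail client wrapped them above), and the function name bytes_by_result and the "hostname prefix before the first dash" grouping are made up for illustration. The sample entries are copied from the excerpt above.

```python
from collections import Counter
from urllib.parse import urlsplit

# Three entries from the access.log excerpt, with the mail wrapping undone.
SAMPLE = """\
1248572409.144   8472 10.10.122.241 TCP_MISS/200 1581181 GET http://audio-sjl-t3-2.pandora.com/access/7008639604707703825.mp4? - DIRECT/208.85.41.38 application/octet-stream
1248572439.512     94 10.10.122.241 TCP_MEM_HIT/200 55396 GET http://images-sjl-2.pandora.com/images/public/amz/3/0/2/3/602498413203_500W_499H.jpg - NONE/- image/jpeg
1248572600.538  29937 10.10.122.248 TCP_MISS/200 7704188 GET http://audio-sjl-t3-2.pandora.com/access/3642267922875646389.mp3? - DIRECT/208.85.41.38 application/octet-stream
"""


def bytes_by_result(log_text):
    """Sum reply bytes per (host kind, Squid result code).

    "Host kind" is a hypothetical grouping: the part of the hostname
    before the first '-', e.g. "audio" or "images".
    """
    totals = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip wrapped or truncated entries
        result = fields[3].split("/")[0]   # e.g. TCP_MISS from TCP_MISS/200
        size = int(fields[4])              # bytes sent to the client
        host = urlsplit(fields[6]).hostname or ""
        totals[(host.split("-")[0], result)] += size
    return totals


if __name__ == "__main__":
    for (kind, result), size in sorted(bytes_by_result(SAMPLE).items()):
        print(f"{kind:8s} {result:24s} {size:>10d} bytes")
```

On this small sample the audio hosts account for all of the TCP_MISS bytes, while the image host is served from cache.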
> ###Store.log
>
> 1248572380.275 RELEASE -1 FFFFFFFF 097EAE1108DCEF192ED1C3BFF1F6C1B5  304
> 1248572380        -1        -1 unknown -1/0 GET
> http://images-sjl-1.pandora.com/images/public/amz/1/2/0/4/727361124021_500W_495H.jpg
> 1248572409.144 RELEASE -1 FFFFFFFF 6B93B1BF958703B3FC3CD1ADDD515695  200
> 1248572400        -1 1248572400 application/octet-stream 1580815/1580815 GET
> http://audio-sjl-t3-2.pandora.com/access/7008639604707703825.mp4?
> 1248572570.897 SWAPOUT 00 0004CF23 BEEE111A39B596B14903743011AF2C36  200
> 1248572570 1248490006        -1 image/jpeg 6181/6181 GET
> http://images-sjl-3.pandora.com/images/public/amz/2/2/4/4/039841434422_130W_130H.jpg
> 1248572600.538 RELEASE -1 FFFFFFFF 070416ED935AD18DCA793569D2C6A652  200
> 1248572570        -1 1248572570 application/octet-stream 7703822/7703822 GET
> http://audio-sjl-t3-2.pandora.com/access/3642267922875646389.mp3?
> 1248572615.735 RELEASE -1 FFFFFFFF B0EB42B39131DF028BA3BE9A39CC24E4  200
> 1248572604        -1 1248572604 application/octet-stream 2109115/2109115 GET
> http://audio-sjl-t2-2.pandora.com/access/5722981497105294607.mp4?
> 1248572635.903 RELEASE -1 FFFFFFFF CDCA0D3510080D121E5578310976676E  304
> 1248572635        -1        -1 unknown -1/0 GET
> http://images-sjl-3.pandora.com/images/public/amz/2/2/4/4/039841434422_130W_130H.jpg
> 1248572886.822 RELEASE -1 FFFFFFFF A95C86074129546301911C2FC251071D  200
> 1248572872        -1 1248572872 application/octet-stream 2086824/2086824 GET
> http://audio-sjl-t1-1.pandora.com/access/5188159311574708305.mp4?
>
> ###Wireshark
>
> Hypertext Transfer Protocol
> HTTP/1.0 200 OK\r\n
> Date: Sun, 26 Jul 2009 05:12:58 GMT\r\n
> Server: Apache\r\n
> Content-Length: 6137729\r\n
> Cache-Control: no-cache, no-store, must-revalidate, max-age=-1\r\n
> Pragma: no-cache, no-store\r\n
> Expires: -1\r\n
> Content-Type: application/octet-stream\r\n
> X-Cache: MISS from ichiban\r\n
> X-Cache-Lookup: MISS from ichiban:3128\r\n
> Via: 1.0 ichiban (squid)\r\n
> Proxy-Connection: keep-alive\r\n
> \r\n
>
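The reply headers in that capture explain the TCP_MISS entries on their own. Below is a simplified sketch that lists which headers would stop a shared cache from storing or reusing a reply. It is not Squid's actual decision logic, and the helper names parse_cache_control and reasons_not_cacheable are made up for illustration. The header values are the ones from the Wireshark capture.

```python
from email.utils import parsedate

# Header values taken from the Wireshark capture above.
HEADERS = {
    "Cache-Control": "no-cache, no-store, must-revalidate, max-age=-1",
    "Pragma": "no-cache, no-store",
    "Expires": "-1",
}


def parse_cache_control(value):
    """Split a Cache-Control value into {directive: argument-or-None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def reasons_not_cacheable(headers):
    """Return the headers/directives that argue against caching the reply."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    reasons = [name for name in ("no-store", "no-cache", "private") if name in cc]

    # A max-age that is not a non-negative integer gives no usable lifetime.
    max_age = cc.get("max-age")
    if max_age is not None and not max_age.isdigit():
        reasons.append("max-age=" + max_age)

    # An Expires value that is not a valid date is treated as already expired.
    expires = headers.get("Expires")
    if expires is not None and parsedate(expires) is None:
        reasons.append("Expires: " + expires)
    return reasons


if __name__ == "__main__":
    for reason in reasons_not_cacheable(HEADERS):
        print(reason)
```

Every freshness-related header in the capture says "do not cache", so any override would have to ignore all of them at once.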
> Amos Jeffries wrote:
>>
>> Jason Spegal wrote:
>>>
>>> I was able to cache Pandora by compiling with --enable-http-violations
>>> and using a refresh_pattern to cache everything regardless. This, however,
>>> broke everything by preventing proper refreshing of any site. If the
>>> violations could be limited to what is directly specified in the
>>> configuration, it would be a workable solution. I did some testing and
>>> could not confirm that anything in the configuration file itself was
>>> causing the issue. I wouldn't recommend using this as such.
>>>
>>
>> Which indicates that some fine tuning is possible to cache just Pandora.
>> Find yourself one of the Pandora URLs in your access.log and run it through
>> www.redbot.org or the ircache.org cacheability engine.
>>
>>
>> Amos
>>
>>
>>
>>>
>>> Henrik Nordstrom wrote:
>>>>
>>>> Sat 2009-07-25 at 12:05 -0600, Brett Glass wrote:
>>>>
>>>>>
>>>>> One of the largest consumers of our HTTP bandwidth is Pandora, the free
>>>>> music service. Unfortunately, Pandora marks its streams as non-cacheable and
>>>>> also puts question marks in the URLs, which is a huge waste of bandwidth.
>>>>> How can this be overridden?
>>>>>
>>>>
>>>> The question mark can be ignored. See the "cache" directive. But if there
>>>> are other parameters after it (normally not logged), that just may not
>>>> help.
>>>>
>>>> Regarding non-cacheable.. most crap can be overridden by
>>>> refresh_pattern.
>>>>
>>>> But, if it's a streaming service (I know nothing about Pandora) then you
>>>> are quite likely out of luck.
>>>>
>>>> Regards
>>>> Henrik
>>>>
>>>>
>>>
>>
>>
>
>
Received on Sun Jul 26 2009 - 05:24:46 MDT