Re: [squid-users] Request processing question

From: David Lawson <david@dont-contact.us>
Date: Sun, 6 Apr 2008 15:19:15 -0400

On Apr 6, 2008, at 4:59 AM, Henrik Nordstrom wrote:
> Sat 2008-04-05 at 23:26 -0400, David Lawson wrote:
>> I've got a couple questions about how Squid chooses to fulfill a
>> request. Basically, I've got a cache with a number of sibling peers
>> defined. Some of the time it makes an ICP query to those peers and
>> then does everything it should do, takes the first hit, makes the
>> HTTP request for the object via that peer, etc. Some, perhaps most,
>> of the time, it doesn't even make an ICP query for the object, it
>> just goes direct to the origin server.
>
> The primary distinction is hierarchical/nonhierarchical requests.
> Siblings are only queried on hierarchical requests.
>
> non-hierarchical:
> - reload requests
> - cache validations if you have non-Squid ICP peers
> - non-GET/HEAD/TRACE requests
> - authenticated requests
> - matching hierarchy_stoplist
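
For reference, the list above can be sketched as a simple predicate. This is a rough illustration of the rules as Henrik states them, not Squid's actual code: the function name, its parameters, the way a "reload" is detected (a no-cache directive in Cache-Control or Pragma), and the example stoplist words are all assumptions made for the sketch.

```python
def is_hierarchical(method, url, headers, stoplist=("cgi-bin", "?"),
                    is_validation=False, has_non_squid_icp_peers=False):
    """Return True if a request looks hierarchical under the listed rules.

    Hypothetical helper: it mirrors the bullet list in the message,
    not the real Squid implementation.
    """
    h = {k.lower(): v.lower() for k, v in headers.items()}

    # Only GET, HEAD and TRACE are candidates for sibling queries.
    if method.upper() not in ("GET", "HEAD", "TRACE"):
        return False
    # Authenticated requests are non-hierarchical.
    if "authorization" in h:
        return False
    # Reload requests (assumed here to mean a no-cache directive).
    if "no-cache" in h.get("cache-control", "") or "no-cache" in h.get("pragma", ""):
        return False
    # Cache validations, but only when non-Squid ICP peers are configured.
    if is_validation and has_non_squid_icp_peers:
        return False
    # URLs containing any hierarchy_stoplist word (assumed substring match).
    if any(word in url for word in stoplist):
        return False
    return True
```

Under these assumptions, a plain GET with only `Cache-Control: max-age=86400` returns True, while a POST, a request carrying an Authorization header, or a URL containing `?` returns False.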

Hmmm, okay, that was more or less the assumption I was working under,
but the behavior I'm seeing doesn't seem to match that. One of my
coworkers did a packet capture of two requests, one of which resulted
in an ICP query, the other of which bypassed the ICP query process
entirely and went direct to the origin.

ICP:

    GET http://www.foo.com:8881/towns/baz/x1151547945 HTTP/1.0\r\n
        Request Method: GET
        Request URI: http://www.foo.com:8881/towns/baz/x1151547945
        Request Version: HTTP/1.0
    Host: www.foo.com:8881\r\n
    Accept: text/html,text/plain,application/*\r\n
    From: user@google.com\r\n
    User-Agent: gsa-crawler (Enterprise; GIX-01642; user@google.com)\r\n
    Accept-Encoding: gzip\r\n
    If-Modified-Since: Sun, 16 Mar 2008 22:22:39 GMT\r\n
    Via: 1.0 cache2.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
    X-Forwarded-For: 64.233.190.112\r\n
    Cache-Control: max-age=86400\r\n
    \r\n

Non-ICP:

Hypertext Transfer Protocol
    GET http://www.bar.com:8881/baz/news/rss HTTP/1.0\r\n
        Request Method: GET
        Request URI: http://www.bar.com:8881/baz/news/rss
        Request Version: HTTP/1.0
    Host: www.wickedlocal.com:8881\r\n
    User-Agent: Yahoo-Newscrawler/3.9 (news-search-crawler at yahoo-inc dot com)\r\n
    Via: 1.0 cache4.ghm.zope.net:80 (squid/2.5.STABLE12)\r\n
    X-Forwarded-For: 69.147.86.154\r\n
    Cache-Control: max-age=86400\r\n
    \r\n

Any ideas about why those requests were processed differently?

>> I've also got a broader, more general question of how a request flows
>> through the Squid process: when ACLs are processed, whether that is
>> before or after any rewriting is done to the URLs, etc. But that's a
>> really secondary thing; right now I'm just concerned with the ICP
>> question.
>
> Depends on which access directive you look at. Generally speaking
> http_access is before url rewrites, the rest after.
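
The ordering described above can be pictured with a toy pipeline. This is only a sketch of the stated order (http_access on the original URL, everything else on the rewritten one); `process_request` and its callback parameters are hypothetical names, and nothing here reflects Squid's internal structure.

```python
def process_request(url, http_access, rewrite, later_acls):
    """Toy model of the ordering: http_access, then rewrite, then the rest.

    All names are illustrative assumptions, not Squid internals.
    """
    # http_access is evaluated against the URL as the client sent it.
    if not http_access(url):
        return None  # request denied before any rewriting happens

    rewritten = rewrite(url)

    # Every later access check sees the rewritten URL instead.
    results = {name: acl(rewritten) for name, acl in later_acls.items()}
    return rewritten, results
```

In this model, an ACL that matches the original hostname passes at the http_access stage, but the same test applied in a later check no longer matches once the rewriter has changed the hostname.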

Ah, okay. Thanks Henrik, I appreciate the info.

--Dave
Systems Administrator
Zope Corp.
540-361-1722
david@zope.com
Received on Sun Apr 06 2008 - 13:19:18 MDT
