2008 Australian Developer Meetup: Day 1 notes
Attendees
- Mark Nottingham - Yahoo
- Tim Starling - Wikimedia
- Adam Carter - Optus
- John Ferlito
- Amos Jeffries - Treenet / Squid
- Ben Seberry
- Benno Rice - Squid
- Robert Collins - Canonical / Squid
- Adrian Chadd - Xenion / Squid
User Presentation: Mark Nottingham, Yahoo!
Overview
- started with multiple somewhat-synched frontend boxes
- migrated to front-end apache servers pulling content from back-end application servers (tech.u.com?)
- migrated to mid-tier Squid between front-end and back-end
- saturated front-end to mid-tier Squid network hardware
- added a small Squid on each front-end box, with no disk cache, to take the heat off the mid-tier Squid
- Peering for synchronisation; -not- just caching!
- Quick-abort hackery so content is pulled into the cache even if the front-end request dies
- stale-if-error and revalidation handling, so stale content can be served while fresh content is being fetched
- upcoming work - invalidate cached responses for any method? (Benno!); invalidate stuff via HTCP between peers; related responses for invalidation?
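The stale-serving behaviour described in the overview can be pictured as a small decision function. This is a minimal sketch, not Squid's implementation: the `CachedEntry` type, its field names and the `choose_response` function are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    """Hypothetical cache entry; all times are in seconds."""
    stored_at: float       # when the response was stored
    max_age: float         # freshness lifetime
    stale_if_error: float  # extra time the entry may be served if the origin fails


def choose_response(entry: CachedEntry, now: float, origin_ok: bool) -> str:
    """Return which kind of response a cache would serve for this entry."""
    age = now - entry.stored_at
    if age <= entry.max_age:
        return "cached"       # still fresh, no origin contact needed
    if origin_ok:
        return "revalidated"  # stale, but the origin is reachable
    if age <= entry.max_age + entry.stale_if_error:
        return "stale"        # origin failed, still inside the allowance
    return "error"            # too old to serve at all
```

The point of the allowance is that a brief back-end outage degrades to slightly old content rather than to errors on the front end.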
Cache Channels
- The problem:
- a large number of infrequently changing objects (compared to the hot set) w/ short expiry times enter the cache
- these objects are infrequently accessed but are accessed frequently enough to require revalidation every few minutes
- revalidating tens of thousands of stale objects causes significant load on the backend server
- The solution:
- "Cache channels" - the cache subscribes to a feed on an origin server (Atom feed, Rendezvous, etc.) and receives incremental freshness updates
- An external process is queried about the freshness of an object and can return a variety of statuses
- STALE
- FRESH freshness=<time> res{<Header>}=<value>
- External process subscribes to the feed and keeps an in-memory invalidation history
- The history generally covers n days, depending on the object invalidation rate
- This significantly reduces back-end load
- The backend no longer has to answer revalidation requests for thousands of objects every few minutes
- The "side band" revalidation can occur in bulk without impacting general Apache backend performance
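The external freshness process described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the helper presented at the meetup: the `InvalidationHistory` class, its method names, and the meaning of the `freshness=` value (taken here as seconds by which freshness is extended) are all hypothetical. Only the `STALE` and `FRESH freshness=<time>` reply shapes come from the notes.

```python
from typing import Dict


class InvalidationHistory:
    """In-memory record of invalidations learned from a feed (hypothetical)."""

    def __init__(self, window: float, started: float) -> None:
        self.window = window                # how many seconds of history are kept
        self.started = started              # when the helper began following the feed
        self.events: Dict[str, float] = {}  # URL -> time of its latest invalidation

    def record(self, url: str, when: float) -> None:
        """Store an invalidation event read from the feed."""
        self.events[url] = when

    def prune(self, now: float) -> None:
        """Forget events that have fallen out of the history window."""
        cutoff = now - self.window
        self.events = {u: t for u, t in self.events.items() if t >= cutoff}

    def check(self, url: str, cached_at: float, now: float, extend: int = 60) -> str:
        """Answer a freshness query for an object cached at `cached_at`."""
        invalidated = self.events.get(url)
        if invalidated is not None and invalidated >= cached_at:
            return "STALE"  # the object changed after it was cached
        # The helper can only vouch for objects cached within its history.
        covered_since = max(self.started, now - self.window)
        if cached_at < covered_since:
            return "STALE"
        return f"FRESH freshness={extend}"
```

Because a query is answered from memory, a stale object that has not been invalidated costs the back-end nothing; only the feed subscription touches the origin.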
Upcoming Squid-2 contributions
- method_t generalisation, a la Squid-3, implemented by Benno Rice
- invalidation of variants and methods
- HTCP improvements
Flickr Architecture
[TBD]
User Presentation: Tim Starling, Wikimedia
Overview
- Wikimedia has a large number of Squids in Amsterdam, Korea and Florida
- Databases and backends live in Tampa, Florida
- Predominantly Squid-2.6
- (from Adrian: Wikimedia helped test Squid-2.7 and COSS in particular)
- Peaks at ~ 50,000 req/sec across the entire Squid cluster
Architecture
- AMS, KR Squids use Florida Squids as parent
- Florida Squids talk to the backend
- Backend exports cachable stuff
- Edge Squids rewrite the Cache-Control headers in replies to mark them non-cachable
- (via a patch so it happens on replies only - it's not needed for Squid-3, and it should probably be merged into Squid-2.8)
- X-Vary-Options patch allows Squid to "normalise" Vary-related header options (e.g. Accept-Encoding arguments) before they are turned into Vary tags
- Invalidation information uses HTCP
- Multicast HTCP on each LAN
- Python helper turns multicast HTCP into unicast HTCP to other areas, then turns a unicasted HTCP stream into LAN multicast
- Logging happens over UDP
- UDP logging to concentrator
- Real-time statistics are calculated
- Samples of the logging data are written to disk
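The multicast-to-unicast HTCP relay mentioned above might look something like the sketch below. It is not Wikimedia's helper: the function names are hypothetical, packets are copied unmodified, and loop prevention (a relay hearing its own re-multicast packets) is deliberately left out.

```python
import socket
import struct
from typing import Iterable, Tuple

Address = Tuple[str, int]


def open_multicast_listener(group: str, port: int) -> socket.socket:
    """Join a multicast group and return a socket that receives its datagrams."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # ip_mreq: group address followed by the local interface (any).
    membership = struct.pack(
        "4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0")
    )
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)
    return sock


def fan_out(packet: bytes, out_sock, destinations: Iterable[Address]) -> int:
    """Send one datagram unchanged to every destination; return the count sent."""
    sent = 0
    for dest in destinations:
        out_sock.sendto(packet, dest)
        sent += 1
    return sent


def relay_forever(in_sock, out_sock, destinations: Iterable[Address]) -> None:
    """Copy every datagram arriving on in_sock to all destinations."""
    targets = list(destinations)
    while True:
        packet, _sender = in_sock.recvfrom(65535)
        fan_out(packet, out_sock, targets)
```

For the LAN-to-remote direction, `in_sock` would come from `open_multicast_listener` and the destinations would be the relays at the other sites. For the reverse direction, `in_sock` would be an ordinary bound UDP socket and the single destination would be the local multicast group.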
Issues
- Why was the X-Vary-Options patch not commented on by the Squid team?
- Adrian: We're busy and that's not really our area of expertise
- Adrian: If someone is willing to be responsible for certain code areas and is able to debug problems as they come up (oh, and do quality work) then really, they should be brought on as a developer.
- In general: X-Vary-Options is a good idea; the current implementation may be too specific but it does highlight a current issue with the Vary headers in general
- Mark: there's some discussion about killing Vary from the HTTP protocol.
- Adrian would like to bring Tim onboard as a Squid committer if he's interested in developing and maintaining the Vary code and potentially merging back Wikimedia changes into Squid
- Adrian: Why not document the HTCP multicast-unicast helper and check it into the Squid repository? That way the software stays "current"
- Others: Squid's "core" shouldn't include this; it's simple enough to keep external
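The Vary issue raised in the discussion can be illustrated with a small normalisation example. This does not reproduce the X-Vary-Options patch or its syntax; the two functions and the policy of collapsing Accept-Encoding to either `gzip` or nothing are assumptions chosen to show why normalising before building the variant key reduces duplicate cache entries.

```python
from typing import Dict, Tuple


def normalise_accept_encoding(value: str) -> str:
    """Collapse an Accept-Encoding header to 'gzip' or '' (assumed policy)."""
    for item in value.split(","):
        parts = [p.strip().lower() for p in item.split(";")]
        if parts[0] != "gzip":
            continue
        q = 1.0  # default quality when no q= parameter is present
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unparsable q-value as "not acceptable"
        if q > 0:
            return "gzip"
    return ""


def variant_key(url: str, request_headers: Dict[str, str]) -> Tuple[str, str]:
    """Build a cache key from the URL and the normalised header value."""
    encoding = normalise_accept_encoding(request_headers.get("Accept-Encoding", ""))
    return (url, encoding)
```

Without normalisation, `gzip,deflate` and `gzip, deflate, sdch` would produce separate variants of the same object even though the server would send identical bytes for both.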
Developer Discussions (Afternoon)
[TBD]