Introduction / accelerator feature ideas

From: Flemming Frandsen <ff@dont-contact.us>
Date: Thu, 20 Feb 2003 20:03:17 +0100

Hi y'all, I've been using squid in accelerator mode for quite some time
and I'm generally happy with it, but recently we had a huge spike in
demand (6000 stampeding customers wanted to buy 4300 tickets at the
same time) and a number of very serious problems became visible.

The problems aren't in squid as such, but when I sat down and analyzed
the nature of the different bugs I realized that squid could solve
them. I've written the ideas up here:
http://dion.swamp.dk/apache_scheduler.html

A short recap of the problems:
A) Race conditions exist in the web application (not that uncommon, I
guess), which means that two identical requests running at the same
time in different Apache processes will result in one of them either
blowing up or simply returning the wrong result.

B) When a client hits a webserver it's more or less random which
server process he hits. My application does a lot of caching, so the
first time a client hits a new Apache process the request is much more
expensive than if the client had hit a recently used process.

C) When the backlog is long enough, clients get impatient and abort
the connection, but squid seems more than happy to keep serving the
request (I don't know for certain that this is true; it may be that
clients only give up while the request is being run).

D) Almost 100% of the content on the site is dynamically generated;
the only static bits are CSS files and a tiny bit of graphics on very
few pages. Very few distinct requests will ever be cache hits, so all
this writing-everything-to-disk business seems a little wasted.

The solutions (if I can wrap my mind around squid's guts):
A) Add a lock on the request path so a user can only have one request
running in the webserver at any one time (users are identified by a
session ID in a cookie). This should take care of the race conditions.
Doing it in Apache is not an option, as you have already lost if you
tie up an Apache process.
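
Something like this minimal sketch is what I have in mind (in Python
rather than squid's C, just to show the idea; the session ID would be
pulled from the cookie, and the function names are made up):

    import threading

    inflight = {}                 # session IDs with a request running
    table_lock = threading.Lock()

    def acquire_session(session_id):
        # Block until no other request for this session is in flight.
        while True:
            with table_lock:
                if session_id not in inflight:
                    inflight[session_id] = threading.Event()
                    return
                event = inflight[session_id]
            event.wait()          # sleep until the running request ends

    def release_session(session_id):
        # Wake anyone queued behind this session's request.
        with table_lock:
            event = inflight.pop(session_id)
        event.set()

The proxy would call acquire_session before forwarding to Apache and
release_session once the response is done, so at most one request per
session ever ties up a backend process.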

B) Since users are identified by their session ID it's relatively easy
to maintain a list of the 5-10 most recent server processes that the
client has talked to (this calls for the server connections to be kept
alive, but squid already does this, right?). The number of open server
connections will need to be limited; I haven't found that option
anywhere.
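
As a sketch of the bookkeeping I mean (the backend pool and the cap of
5 recent servers are made up for illustration):

    import random
    from collections import OrderedDict

    MAX_RECENT = 5                        # remember the 5 latest backends
    backends = ["web1:80", "web2:80", "web3:80"]   # example pool

    recent = {}   # session ID -> OrderedDict of recently used backends

    def pick_backend(session_id):
        # Prefer a backend this session talked to recently, so its
        # per-process caches are already warm.
        servers = recent.setdefault(session_id, OrderedDict())
        if servers:
            choice = next(reversed(servers))       # most recently used
        else:
            choice = random.choice(backends)
        servers[choice] = True
        servers.move_to_end(choice)       # mark as most recently used
        while len(servers) > MAX_RECENT:
            servers.popitem(last=False)   # forget the oldest
        return choice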

C) Maybe it would be possible to keep the client from disconnecting
while waiting for the request to complete. The only way I can think of
is to send the HTTP header early and keep appending a character to a
special header (like X-calm-down-beavis).
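
Roughly like this (a sketch only; the one-second interval is arbitrary,
and the obvious catch is that the status line has to be committed
before the backend has actually answered):

    def trickle(client, request_done):
        # client: the connected client socket; request_done: a
        # threading.Event set by whoever runs the backend request.
        client.sendall(b"HTTP/1.1 200 OK\r\n")
        client.sendall(b"X-calm-down-beavis: ")
        while not request_done.wait(timeout=1.0):
            client.sendall(b".")   # one byte/second keeps the client busy
        client.sendall(b"\r\n")    # close the header line, then send the
                                   # real headers and body of the response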

D) Writing every response to disk seems like a big waste of time and
file descriptors when almost none of the responses are ever going to
be needed again.
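
The decision I'd want made per response is roughly this (a sketch; the
header names are the standard HTTP ones, the function is made up):

    def should_store(method, response_headers):
        # Only write a response to the disk cache if it could ever be
        # a shared hit; dynamic content stays in memory, or nowhere.
        if method != "GET":
            return False
        cc = response_headers.get("Cache-Control", "")
        if "no-store" in cc or "private" in cc:
            return False
        if "Set-Cookie" in response_headers:
            return False          # per-user content, never a shared hit
        return True

(If I read squid.conf right, the no_cache ACLs go some of the way here
already, but the disk write still seems to happen for plain misses.)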

Anyway, those are the problems that I'm trying to solve, but I might
need some help finding the right bits to tweak...

-- 
  Regards Flemming Frandsen - http://dion.swamp.dk
  PartyTicket.Net co founder & Yet Another Perl Hacker