Re: Slow responses

From: Martin Hamilton <martin@dont-contact.us>
Date: Thu, 29 Jan 1998 17:21:05 +0000

-----BEGIN PGP SIGNED MESSAGE-----

Bertold Kolics writes:

| Martin Hamilton could tell us a bit more about JANET top-level cache
| setup. (Martin, are you there? ;-))

Now I am :-))

As people have already mentioned, you can find info about configs and
also some stats on our WWW server. We decided to "design" it, so
you'll have to battle through the FRAMEs interface to get to the
actual content. Sorry about that!

I should say that we took over running the JANET caches in the Summer
of 1997, during which we gradually migrated each of the existing
machines over to running Squid 1.NOVM. We've also bought several more
machines - these are split physically into two clusters right now, but
will probably be further subdivided over more sites as the next
iteration of the JANET backbone is put into place over the next few
months. Over the next 2.5 years we will have money for additional
machines, and it looks as though these will mostly be based on PC
hardware running Linux or a BSD derivative - and Squid.

Squid's CPU requirements are (currently :-) so low that a P166 would
be overkill for a top level cache. So, the theory is that we should
be able to pick up truckloads of obsolete 486s, Pentiums, Pentium
Pros, and now of course Pentium MMXes. Slap in an Ethernet card, a
large disk or two and a bit more memory, and off you go... Cheers,
Intel!

Initially we had all of the machines configured as peers of each
other. This worked OK during the Summer, but broke down with the
added load of the student traffic in the Autumn. Even with multicast
ICP the load created by the peering was too great - though this didn't
manifest itself in terms of things like increased CPU usage by Squid.
So, we decided to run the machines for a while with some of the fancy
options turned off - so no ICP query logging, and no peering between
top level caches at all. Actually, that's not completely true - we do
have a test peering with a Mirror Image "Terabyte" server for
long-term evaluation purposes.
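
A rough way to see why full peering can hurt even with multicast: the
multicast only saves on queries, while every peer still sends its own
reply. The sketch below is back-of-envelope arithmetic only; the
function name and every number in it are assumptions for illustration,
not JANET measurements.

```python
# Back-of-envelope ICP load for a cluster of caches that all peer
# with each other. Every number passed in below is an illustrative
# assumption, not a JANET measurement.

def icp_messages_per_second(n_caches, requests_per_cache,
                            local_hit_ratio, multicast=False):
    """Estimate total ICP packets per second sent across the mesh."""
    peers = n_caches - 1
    # Only requests that miss in the local cache trigger an ICP query.
    misses = requests_per_cache * (1.0 - local_hit_ratio)
    # Unicast: one query per peer. Multicast: one query reaches all peers.
    queries = misses * (1 if multicast else peers)
    # Either way, every peer answers each query with its own reply.
    replies = misses * peers
    return n_caches * (queries + replies)


# Assumed: 8 caches, 50 requests/s each, 30% local hit ratio.
unicast = icp_messages_per_second(8, 50, 0.3)
mcast = icp_messages_per_second(8, 50, 0.3, multicast=True)
print(f"unicast: {unicast:.0f} msgs/s, multicast: {mcast:.0f} msgs/s")
```

The reply term grows with the number of peers whichever way the
queries are sent, which is one plausible reading of why multicast
alone was not enough.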

This might sound bad, but each machine typically has 16GB or 18GB of
cache disk :-)

Apart from the odd blip, especially at the start, we seem to be doing
OK with this approach. So, the plan is to gradually reintroduce
peering until we start to see a degradation in performance - so that
we have an idea of what sort of peering arrangements will be feasible
for a given volume of traffic. At the same time, we have more and
more sites using us as parents for their site caches.

We also have finally managed to get access to an 8Mbit/s line to the
States which had formerly been used by JANET for its main service, and
then as a fallback in the event of our 45Mbit/s US line going down.
Unfortunately this is set to disappear in a few weeks' time, but we
may still be able to get some results out of it which should help us
to develop some plans for dedicated international bandwidth in the
future. For instance - how much of 8Mbit/s does a single busy cache
server need? What if we peer it with an NLANR cache at the other
end? What difference do persistent connections make?
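
Before any measurements are in, the first question can at least be
framed as arithmetic: the upstream demand is roughly the miss traffic.
This is a minimal sketch which assumes that every miss is fetched over
the international line and ignores protocol overhead; the request
rate, object size and byte hit ratio are made-up inputs, not observed
figures.

```python
def upstream_mbit_per_s(requests_per_s, mean_object_bytes, byte_hit_ratio):
    """Bandwidth the cache must pull over the international line."""
    # Only the bytes that miss in the cache have to be fetched upstream.
    miss_bytes_per_s = (requests_per_s * mean_object_bytes
                        * (1.0 - byte_hit_ratio))
    return miss_bytes_per_s * 8 / 1_000_000


LINE_MBIT = 8.0  # the 8Mbit/s line mentioned above

# Assumed load: 60 requests/s, 12,000-byte mean object, 25% byte hit ratio.
need = upstream_mbit_per_s(requests_per_s=60,
                           mean_object_bytes=12_000,
                           byte_hit_ratio=0.25)
print(f"{need:.2f} Mbit/s, {100 * need / LINE_MBIT:.0f}% of the line")
```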

Ultimately we want to encourage more and more sites to run their own
cache servers. We're planning to do this in two ways (in addition to
the general peer pressure effect, and the impetus added by the
imminent imposition of usage based charging) - one is to contact
technical people and computer centre managers at sites which don't run
caches at the moment. The other is to develop a very simple Squid
installation for the small sites which we could remotely manage - e.g.
on a subscription basis. For more information on this, check out
<URL:http://wwwcache.ja.net/dev/lsd/>.

Since it's not at all clear that the best plan in the long term is for
individual site caches to use us as parents, we're also investigating
some alternatives. Of course this is really research, but we have to
call it "service development" since nobody wanted to fund WWW caching
research...

These ongoing things are :-

  * adding support for an ICP request header option which
      lets an ICP client indicate that it's not interested
      in ICP_MISS replies, only in ICP_HITs - requires the
      Squid timing mechanism to be hacked over :-(

  * building on the concept implemented by Mirror Image of a
      machine with lots of disks which harvests newly cached
      resources from site servers, but looks like a regular
      sibling for Squid peerings

  * implementing a registry of URL -> cache server mappings,
      based on ICP announcements and an ICP "referral"
      response, so that we act as a sort of Grand Central
      for WWW caching. Policy is left to the site cache
      admin - e.g. which referrals to trust.

  * tools for profiling network usage, nominally on cache
      LANs, but hopefully generalised to the point that they
      can be used in a variety of situations
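
To make the first item concrete, here is a minimal sketch of an ICP
query carrying a "hits only" option bit, and of the peer-side decision
it implies. The 20-byte header layout and the opcode values follow
ICP version 2 as documented in RFC 2186; the ICP_FLAG_HITS_ONLY bit
and both function names are hypothetical, invented for this
illustration, and are not part of the RFC or of Squid.

```python
import struct

ICP_VERSION = 2
ICP_OP_QUERY, ICP_OP_HIT, ICP_OP_MISS = 1, 2, 3
# HYPOTHETICAL option bit, invented for this sketch only.
ICP_FLAG_HITS_ONLY = 0x20000000

# opcode, version, message length, request number,
# options, option data, sender host address (network byte order)
HEADER = struct.Struct("!BBHIII4s")


def build_query(url, reqnum, hits_only=False):
    """Build an ICP_OP_QUERY packet for the given URL."""
    # Query payload: 4-byte requester address, then NUL-terminated URL.
    payload = b"\x00" * 4 + url.encode("ascii") + b"\x00"
    options = ICP_FLAG_HITS_ONLY if hits_only else 0
    length = HEADER.size + len(payload)
    header = HEADER.pack(ICP_OP_QUERY, ICP_VERSION, length,
                         reqnum, options, 0, b"\x00" * 4)
    return header + payload


def should_reply(query, have_object):
    """Return the opcode a peer would send back, or None to stay silent."""
    opcode, _ver, _len, _req, options, _data, _addr = HEADER.unpack_from(query)
    if opcode != ICP_OP_QUERY:
        return None
    if have_object:
        return ICP_OP_HIT
    # On a miss, stay silent if the client asked for hits only.
    return None if options & ICP_FLAG_HITS_ONLY else ICP_OP_MISS
```

With misses suppressed, the querying cache can presumably no longer
count replies to decide that nobody has the object and has to fall
back on a timeout, which may be why the timing mechanism needs work.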

Just in case the parenting concept really does turn out to scale, e.g.
with the introduction of persistent connections, we're also looking at
the feasibility of something a bit more like proper routing for
connections between peering caches - kind of like an MBONE/6BONE style
overlay network for WWW caching.

Some of these activities have funding from TERENA and are being done
in partnership with other national research network cache operators.
For more info, see <URL:http://www.terena.nl/projects/approved/>.

Any questions/comments/contributions welcome :-)

I'll try to post news here of any interesting developments.

Ciao!

Martin

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3i
Charset: noconv

iQCVAwUBNNC6fNZdpXZXTSjhAQFjWwP/XnAQyTlLw66IqcdMEer3Al1l94Zkfq6B
xGUXhM4ebhkEZhFaPk/SuZwGaLtesfaJ8TraE+x/T5NbUssXl4MRqZH+zpB4r1uS
7q0eMO3QBcZVV6fGN3lCQLd0soWDms27uqkzk900sxL6MILL6IfJssirA6pFW+lZ
oOkwg8pyMl0=
=2ka1
-----END PGP SIGNATURE-----
Received on Thu Jan 29 1998 - 11:30:20 MST
