Re: Ethernet traffic from David J N Begley on 1998-08-12 (squid-users)

From: David J N Begley <david@dont-contact.us>
Date: Wed, 12 Aug 1998 19:22:54 +1000 (EST)

On Wed, 12 Aug 1998, John Cougar wrote:

> On Tue, 11 Aug 1998, David J N Begley wrote:
> > On Mon, 10 Aug 1998, Ethy H Brito wrote:
> > > the inbound traffic is equal to the outbound traffic. The differences
> > > are neglectable. The hit ratio reports 40%.
> > > Shouldn't the outbound be at least 40% greater than the inbound?
[...]
> > I can't account for the exact behaviour you're seeing, but I can offer
> > this advice; whenever you start measuring raw Ethernet or IP traffic on
> > interfaces, you can forget about getting all the numbers to exactly match
> > what Squid reports as the amount of traffic it has sent/received.
[...]
> One of the "bench-nicks" (as opposed to benchmarks - don't want to cause
> confusion here ... ;-) of caching is that over some (hopefully short)
> period of time, you _should_ see a reduction in the amount of
> inbound-traffic on an interface vs the out-bound stuff as a simple way of
> gauging the effectiveness of a cache device. It _has_ to be!
>
> Sorry, Dave, I agree with Ethy on this matter.

Well, as I said above, I can only offer advice, suggestions for where to look
for the answer; the *real* answer can only come from Ethy by examining
traffic on the segment to see *what* traffic is being measured. The point I'm
trying to make is, "There's no simple 'it's xyz - tweak option def' answer".

As mentioned to Ethy off-list, there are any number of reasons why this may be
so, including (but not limited to):

- whether or not there are other devices on the same subnet transmitting
any form of broadcast or multicast traffic (eg., ARP, BOOTP/DHCP, RIP, &c.);

- regular traffic to/from the machine that does not carry proxy object data
(eg., DNS requests/replies, SNMP management traffic, PINGs, TELNET for
remote login/admin, FTP if moving files around, &c.);

- whether or not the interface being measured is the only one in the machine
(ie., all traffic between proxy and clients travels across the same
interface that proxy to remote server traffic uses);

- whether or not the LAN segment is shared or a dedicated switch port; and,

- the usual TCP/IP issues such as fragmentation and retransmission.

To interpret any statistic, you have to first know what you are measuring;
for example, an interface counts inbound and outbound bytes - that's it. The
interface doesn't distinguish between inbound bytes from external Internet
sites or from internal (to the organisation) clients (making requests). If
you want that, try looking at RMON2.
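
As a rough illustration of why a single pair of counters is hard to read, here
is a minimal sketch of a proxy with one interface carrying both the client side
and the server side. The function name, parameters and numbers are all
hypothetical; it ignores request bytes, headers and retransmissions, and it is
not a description of how Squid accounts for traffic.

```python
def interface_counters(client_bytes, cached_fraction, other_in=0, other_out=0):
    """Estimate inbound/outbound byte counters on a single shared interface.

    client_bytes    -- object bytes delivered to clients (assumed figure)
    cached_fraction -- fraction of those BYTES served from cache, 0.0 to 1.0
    other_in/out    -- non-proxy traffic seen by the counters (DNS, SNMP, ...)
    """
    # Only bytes that were not served from cache have to be fetched upstream.
    fetched_from_servers = client_bytes * (1.0 - cached_fraction)
    inbound = fetched_from_servers + other_in    # servers -> proxy
    outbound = client_bytes + other_out          # proxy -> clients
    return inbound, outbound


for fraction in (0.40, 0.02):
    inbound, outbound = interface_counters(1_000_000, fraction)
    print(f"cached fraction {fraction:.0%}: in/out = {inbound / outbound:.2f}")
```

Under these assumptions the counters only diverge noticeably when a large
fraction of the bytes (not the requests) comes from cache, and any non-proxy
traffic on the same interface moves the two figures further from what the
proxy itself reports.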

Also, if you're talking about a 40% hit rate in terms of object
requests/counts then there's absolutely no reason why you should expect a 40%
reduction in traffic; after all, not every request/object is going to be the
same size so you may be saving a lot in terms of requests but very little in
terms of actual traffic.
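
A small sketch with made-up numbers shows how far the two figures can drift
apart. The sample below is hypothetical (it is not Ethy's data): four small
objects are hits and six larger ones are misses.

```python
# Hypothetical sample: (object size in bytes, served from cache?)
requests = [
    (2_000, True), (3_000, True), (1_500, True), (2_500, True),  # small hits
    (40_000, False), (250_000, False), (8_000, False),           # larger misses
    (12_000, False), (90_000, False), (30_000, False),
]


def hit_ratios(requests):
    """Return (request hit ratio, byte hit ratio) for a list of requests."""
    total_requests = len(requests)
    total_bytes = sum(size for size, _ in requests)
    hit_requests = sum(1 for _, hit in requests if hit)
    hit_bytes = sum(size for size, hit in requests if hit)
    return hit_requests / total_requests, hit_bytes / total_bytes


request_ratio, byte_ratio = hit_ratios(requests)
print(f"request hit ratio: {request_ratio:.0%}")
print(f"byte hit ratio:    {byte_ratio:.1%}")
```

With this sample the request hit ratio is 40% while the byte hit ratio is only
about 2%, so a byte counter would barely register the cache at all.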

> If you have set up a cache device and have (like I have) stripped most
> other services off the said box, then surely the argument for "other"
> traffic types being present in the byte counts on an interface doesn't
> hold!

Sure it does - it's just what *type* of traffic is being counted that may
change. After all, if I say "It's because your internal and external traffic
traverse the same interface" then I may be flat out wrong because you may have
two interfaces with only proxy traffic on one and only internal management
traffic on another; ergo, I can only remind people of the issues - it's up to
them to investigate their own circumstances and answer the question "Why?".

> Otherwise, what are we doing here? The whole idea of Squid & caching is
> to reduce overall amounts of traffic on the 'Net ...

Of course it is - but is measuring the Ethernet interface on your proxy server
guaranteed to tell you (even close to) what amount of traffic you're saving?
Absolutely not. Could it possibly tell you, depending on network/host
configuration? Sure it can. Is every instance going to be the same and have
the same solution/quick-fix? No way.

Short of examining every single frame to enter/leave our F/Ethernet interfaces
(thus only counting the specific type of traffic we wanted - measuring traffic
savings) we resorted to post-processing Squid's proxy logs, plus looking at
the overall influx of identifiable Web traffic on our Internet link (rather
than the interface on the proxy). These are still just approximations, but
they're sure to be more accurate than just looking at raw byte counts on an
interface with no regard for *what* is being measured.
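
For the log post-processing part, a minimal sketch might look like the
following. It assumes Squid's native access.log layout (timestamp, elapsed,
client, code/status, bytes, method, URL, ...) and, as a simplification, treats
any result code containing "HIT" as served from cache; the function name and
the two sample lines are hypothetical, and this is not the script we actually
used.

```python
def summarize(lines):
    """Compute request and byte hit ratios from native-format log lines."""
    requests = hits = total_bytes = hit_bytes = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        result_code = fields[3].split("/")[0]  # e.g. TCP_HIT from TCP_HIT/200
        try:
            size = int(fields[4])              # bytes delivered to the client
        except ValueError:
            continue                           # skip lines with a bad size
        requests += 1
        total_bytes += size
        if "HIT" in result_code:               # assumption: any *HIT* counts
            hits += 1
            hit_bytes += size
    return {
        "requests": requests,
        "request_hit_ratio": hits / requests if requests else 0.0,
        "total_bytes": total_bytes,
        "byte_hit_ratio": hit_bytes / total_bytes if total_bytes else 0.0,
    }


# Hypothetical sample lines; in practice, pass an open access.log file.
sample = [
    "902908974.123 120 10.0.0.5 TCP_HIT/200 1500 GET http://example.com/a.gif - NONE/- image/gif",
    "902908975.456 980 10.0.0.6 TCP_MISS/200 48500 GET http://example.com/big.zip - DIRECT/example.com application/zip",
]
print(summarize(sample))
```

The byte hit ratio from the logs is the figure to compare against link
utilisation; the request hit ratio says little about bytes saved.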

> I would also be asking questions .... as I (and others) did when
> deploying early versions of the Cisco Cache Engines, which, incidentally,
> _never_ portrayed this phenomenon, and frequently caused an increase in
> the load on a network. Go figure.

Precisely - every circumstance is different.

"Network" as in internally, or externally? If you're looking to see how much
you're "saving" through the use of a proxy, surely the idea is to measure how
much traffic between your network and the external Internet is being saved?

> Ethy, you may need to get a bit more focussed on the issue and try to
> determine what traffic is present on the interface you are measuring, and
> then what proportion is HTTP, and so on ... to actually determine the
> cause of the effect you are observing.

Egads, sounds familiar. :-)

dave
Received on Wed Aug 12 1998 - 02:24:10 MDT