RE: Help in using logs for simulator from Nottingham, Mark (Australia) on 1998-09-28 (squid-users)

From: Nottingham, Mark (Australia) <mark_nottingham@dont-contact.us>
Date: Tue, 29 Sep 1998 09:28:44 +1000

You should be aware of the proceedings for USITS '97; there was a lot of
relevant work.

http://www.usenix.org/

> -----Original Message-----
> From: G.C.Jawaheer@city.ac.uk [mailto:G.C.Jawaheer@city.ac.uk]
> Sent: Tuesday, September 29, 1998 2:11 AM
> To: squid-users@ircache.net
> Subject: Help in using logs for simulator
>
>
> Hello,
>
> I am new to Squid. I am developing a program in order to simulate
> the performance of different cache replacement policies of proxies.
> However, being a newbie to the field, I need to check the correctness
> of my reasoning and assumptions. Consequently, in the following
> paragraphs, I set out my objectives and my line of thought, and
> I'll be most grateful to anybody who can answer my queries and
> correct my mistakes.
>
> I have at my disposal the access logs of a Squid 1.1 in native
> format, i.e., "time elapsed remotehost code/status bytes method URL
> rfc931 peerstatus/peerhost". The proxy from which I obtained these
> logs was configured as a peer cache. My aim is to use these logs in
> order to obtain the following:
>
> 1] the requested URL
> 2] the time/date of the request
> 3] the size of the document returned to the client
>
> I am not interested in knowing anything about the clients making the
> requests. I am also not taking into consideration the consistency of
> documents.
>
> Is it possible to extract the above mentioned data from the native
> format of access logs?
>
> Does Squid make real time decisions about which document to cache,
> i.e., is it correct to assume that, to every requested document,
> there is, or eventually will be, a copy in a cache somewhere,
> whether in the local proxy cache or a parent cache?
>
> Going back to the native format of the access log of Squid 1.1,
> "time elapsed remotehost code/status bytes method URL rfc931
> peerstatus/peerhost", the "time" field will give me the "time/date of
> the request" (albeit as a UNIX timestamp). I can get the "requested
> URL" from the "URL" field. But what about the "size of the document
> returned to the client"? The "bytes" field IS NOT the size of the
> returned document, at least not under all circumstances. However,
> from what I understand, there are situations where the "bytes" field
> will be the "size of the returned document" (perhaps when I have a
> TCP_HIT). Am I right? For example, when I have an ICP_QUERY with a
> UDP_MISS, the "bytes" field must be interpreted differently.
>
> Thus, in order to retrieve those records where the "bytes" field
> represents the size of the document returned to the client, I need to
> look for particular code/status combinations. Is that correct? Do I
> need to look for particular peerstatus/peerhost combinations
> also? Assuming that my above reasoning is correct, I need help to
> interpret the code/status combinations, i.e., I don't know which
> code/status combinations to look for.
>
> Last but not least, I thank anybody who can pull me out of this pool
> of ignorance.
>
> Regards,
>
> Gawesh
> +++++++++++++++++++++++++++++++++++++++++++++++++++
> Gawesh C JAWAHEER
> MSc Information Systems and Technology, 1997/98
> City University, UK
> +++++++++++++++++++++++++++++++++++++++++++++++++++
>
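
For readers of the archive, the extraction asked about above can be
sketched roughly as follows. This is only an illustration, not something
from the original thread: it assumes whitespace-separated fields in the
order quoted in the message ("time elapsed remotehost code/status bytes
method URL rfc931 peerstatus/peerhost"). The names LogRecord, parse_line
and is_client_delivery are hypothetical, the two sample lines are made up,
and the rule "TCP_ code with HTTP status 200" is an assumption about which
records carry a usable document size, not a statement of what Squid
guarantees. Note also that the "bytes" value may include response headers.

```python
from typing import NamedTuple, Optional


class LogRecord(NamedTuple):
    """One line of a native-format access log (field order as quoted above)."""
    timestamp: float   # UNIX time of the log entry
    elapsed: int       # "elapsed" field, kept as logged
    remotehost: str
    code: str          # e.g. TCP_HIT, TCP_MISS, UDP_MISS
    status: int        # HTTP status; ICP entries typically log 000
    nbytes: int        # "bytes" field
    method: str
    url: str


def parse_line(line: str) -> Optional[LogRecord]:
    """Split one log line into fields; return None if it does not fit."""
    fields = line.split()
    if len(fields) < 9:
        return None
    code, _, status = fields[3].partition("/")
    try:
        return LogRecord(
            timestamp=float(fields[0]),
            elapsed=int(fields[1]),
            remotehost=fields[2],
            code=code,
            status=int(status),
            nbytes=int(fields[4]),
            method=fields[5],
            url=fields[6],
        )
    except ValueError:
        return None


def is_client_delivery(rec: LogRecord) -> bool:
    """Assumed filter: keep HTTP (TCP_*) entries answered with status 200.

    UDP_* entries are ICP exchanges between caches, so their byte counts
    are not document sizes.
    """
    return rec.code.startswith("TCP_") and rec.status == 200


# Made-up sample lines, for illustration only.
SAMPLE_TCP = ("906969524.123 345 192.0.2.10 TCP_MISS/200 4512 GET "
              "http://www.example.com/index.html - DIRECT/www.example.com")
SAMPLE_UDP = ("906969525.000 0 192.0.2.20 UDP_MISS/000 62 ICP_QUERY "
              "http://www.example.com/index.html - NONE/-")

if __name__ == "__main__":
    for raw in (SAMPLE_TCP, SAMPLE_UDP):
        rec = parse_line(raw)
        if rec is not None and is_client_delivery(rec):
            # The three items requested: URL, time of request, size
            print(rec.url, rec.timestamp, rec.nbytes)
```

A simulator would then consume the (url, timestamp, nbytes) triples in
timestamp order; whether to also keep other status codes (for example
304 responses) depends on what the simulation is meant to model.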
Received on Mon Sep 28 1998 - 16:30:09 MDT
