Re: read HTML text

From: wellington ricardo gasparin <gasparin@dont-contact.us>
Date: Mon, 24 Apr 2006 11:04:32 -0300

2006/4/23, Henrik Nordstrom <henrik@henriknordstrom.net>:
> Thu 2006-04-20 at 15:28 -0300, wellington ricardo gasparin wrote:
>
> > Given the StoreEntry pointer for a cached object, how can I read its
> > contents (HTML text)?
>
> Depends on "who" you are and why.
>
> Squid never (or almost never) keeps entire objects in memory. Instead
> it holds only what is currently needed to be sent to the clients. Also,
> how this is done differs significantly between on-disk cache hits and
> other requests (misses and memory hits have a lot in common, however).
>
> The official API for getting content out of a StoreEntry is the
> undocumented storeclient API which primarily consists of
>
> storeClientRegister to register a new client of StoreEntry.
>
> storeClientCopy to request some data from the object
>
> storeUnregister to unregister the client from the StoreEntry.
>
> "Client" here means "an internal reader of the StoreEntry", not necessarily
> a client of Squid.
>
>
>
> But depending on "who" you are and why, maybe this is not the interface
> you are looking for. If you could tell us a little more about what it is
> you need to solve, perhaps we can guide you to a better place to get access
> to the data you need.
>
> Regards
> Henrik
>

I want to read the body of a web page so that I can build a semantic
vector model of it. A semantic distance computed over that model will
then decide which objects stay in the cache and which are evicted.
The goal is to implement a new replacement policy.
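
A minimal sketch of the kind of distance computation described above,
assuming a plain bag-of-words term-count vector and cosine distance. The
message does not say which vector model is intended, so this choice is an
assumption. The names term_vector, cosine_distance and eviction_order are
hypothetical and are not part of Squid; the sketch also assumes the page
body has already been read out of the cache and stripped of HTML tags.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text, split it into alphabetic words, and count them."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two term-count vectors."""
    # Counter returns 0 for terms missing from b, so only shared terms add up.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty document is treated as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def eviction_order(bodies, profile):
    """Sort object keys so the body farthest from the profile text comes first."""
    target = term_vector(profile)
    return sorted(
        bodies,
        key=lambda key: cosine_distance(term_vector(bodies[key]), target),
        reverse=True,
    )
```

Here bodies maps a cache key to the page text and profile is some reference
text describing what is worth keeping; the first key returned would be the
first eviction candidate under this hypothetical policy.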

Thanks
Received on Mon Apr 24 2006 - 08:04:34 MDT
