Re: [RFC] Have-Digest and duplicate transfer suppression

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 16 Aug 2011 11:27:23 +1200

 On Mon, 15 Aug 2011 23:17:55 +0200, Henrik Nordström wrote:
> Mon 2011-08-15 at 09:50 -0600, Alex Rousskov wrote:
>
>> I do not like aborted retrievals as the default method of handling a
>> digest-based hit. Aborted transactions have negative side-effects and
>> some of those effects are not controlled by Squid (e.g., monitoring
>> software may trigger an alert if too many requests are aborted).
>>
>> I agree that we can switch from entities to instances, provided we are
>> OK with excluding 206, 302, and similar non-200 responses from the
>> optimization. By instance definition, Squid would not be able to
>> compute or use an instance digest if the response is not 200 OK. We
>> can hope that the vast majority of non-200 responses are either not
>> cacheable or are very small and not worth optimizing.
>
> The bulk bandwidth where you would find duplicates is in positive GET
> responses.
>
> Not being able to support 206 duplicate detection without caching the
> full 200 in the "topmost" cache is a little annoying, however.
>
>> > In requests you can optionally add a digest-based condition similar
>> > to If-None-Match, but here If-None-Match already serves the purpose
>> > quite well, so use of the digest condition should probably be
>> > limited to cases where there is no ETag.
>>
>> Or to cases where ETag lies about response content changes.
>
> True, but I kind of doubt there is much bandwidth to be found in
> those cases.
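 For concreteness, a no-ETag revalidation along the lines Henrik
 describes might look like this. The Want-Digest/Digest headers are
 from RFC 3230; the conditional header name below is hypothetical, as
 no standard defines a digest-based equivalent of If-None-Match:

```python
import base64
import hashlib

def instance_digest(body: bytes) -> str:
    # RFC 3230 "Digest" value: <algorithm>=<base64 of the hash over the
    # full instance (the complete 200 entity, not a 206 range)>.
    return "SHA-256=" + base64.b64encode(
        hashlib.sha256(body).digest()).decode()

# Hypothetical request headers for a cached copy that carries no ETag:
cached_body = b"cached instance bytes"
request_headers = {
    "Want-Digest": "SHA-256",                              # RFC 3230
    "If-Digest-None-Match": instance_digest(cached_body),  # made-up name
}
```

 The server side would recompute the digest over its current instance
 and answer 304-style if it matches, instead of relying on an ETag.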
>
>> > To optimize bandwidth loss due to unneeded transmission, a
>> > slow-start mechanism can be used where the sending part waits a
>> > couple of RTTs before starting to transmit the body of a large
>> > response where an instance digest is presented. This allows the
>> > receiving end to check the received instance digest and abort the
>> > request if not interested in receiving the body.
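 Roughly, the hold-off Henrik proposes could be sketched like this;
 names and numbers are illustrative assumptions, not Squid code:

```python
import time

RTT_ESTIMATE = 0.05   # seconds; assumed link RTT (tiny, for illustration)
HOLDOFF_RTTS = 2      # "a couple of RTTs" from the proposal

def transmit(head: bytes, body: bytes, peer_aborted, large: bool) -> bytes:
    """Return the bytes that actually go on the wire.

    The headers (carrying the instance digest) are sent at once; for a
    large response the body is held back for a couple of RTTs, giving
    the receiver time to compare digests and abort the request.
    peer_aborted is a callable polled after the hold-off."""
    wire = [head]
    if large:
        time.sleep(HOLDOFF_RTTS * RTT_ESTIMATE)
    if not peer_aborted():      # abort arrived during the hold-off?
        wire.append(body)
    return b"".join(wire)
```

 If the digest matches the receiver's cached copy, only the head
 crosses the link; otherwise the full body follows after the delay.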
>>
>> Besides my general dislike for aborted transactions becoming the
>> norm (see above), a "couple RTT" delay is a high price to pay
>> because each RTT is already a few seconds.
>
> Seconds? What kind of network is this?

 Satellite (long distance), submarine radio (long wave, low bitrate), or
 ad-hoc ground relay (multiple long distance IP hops).

 The RTT details on the latter two are mostly classified, but
 geosynchronous satellites are publicly documented. A single
 ground-satellite-ground loop can have close to 1 second of RTT at the
 IP level. Complications such as triangular routing over a
 ground-to-ground uplink only make that worse.
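 The physics alone accounts for roughly half of that second; the rest
 is coding, queuing, and processing. A quick back-of-the-envelope
 check:

```python
C_KM_S = 299_792.458    # speed of light in vacuum, km/s
GEO_ALT_KM = 35_786.0   # geostationary orbit altitude above the equator

one_way = 2 * GEO_ALT_KM / C_KM_S   # ground -> satellite -> ground, s
rtt = 2 * one_way                   # request plus response, ~0.48 s
```

 That is propagation only, assuming the satellite is directly overhead;
 slant paths, link-layer coding, and queuing push the observed IP-level
 RTT toward the 1 second figure.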

 Also, satellites with routers aboard were due to go up sometime over
 the last year, so there may now be ground-satellite-satellite-ground
 loops as well. I'm not sure what the real numbers are there, but in
 the early days figures like 2-3 seconds of RTT were discussed, mostly
 due to low-power requirements, send/receive context switching (!!),
 or buffer bloat from queuing to cope with the bitrates. So it is
 reasonable to expect that at least some have really poor performance.

 Amos
Received on Mon Aug 15 2011 - 23:27:28 MDT

This archive was generated by hypermail 2.2.0 : Tue Aug 16 2011 - 12:00:03 MDT