[PATCH] ICAP service chains and sets

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 26 Jun 2009 17:53:55 -0600

Hello,

    Please consider the following changes for Squid3 trunk inclusion.
They have been tested in the lab and will be put in production. The
ICAP chains feature (i.e., a "pipeline" of ICAP services) has been on
many wish lists. ICAP service sets allow for backup ICAP servers which
are typical in busy/critical adaptation environments.

I compressed the 90KB patch (am I being too paranoid?), but the attached
text file shows squid.conf documentation describing the features listed
below. I have also quoted relevant commit messages that may help with
code understanding. The patch includes the src/adaptation/notes.dox file
with developer-centric documentation.

If approved, I will try to do a "bzr merge" instead of a raw patch to
preserve commit messages, using a Robert-provided trick (thanks, Robert!).

This work depends on the enhanced logging features submitted previously.

Thank you,

Alex.
bb:approve

--------------------------------
Support adaptation sets and chains, including dynamic ICAP chains.

  - Support adaptation service sets and chains
    (adaptation_service_set and adaptation_service_chain)

  - Dynamically form chains based on ICAP X-Next-Services header
    (icap_service routing=on)
------------------------------------

> Support adaptation service sets and chains.
>
> An adaptation service set contains similar, interchangeable services. No more
> than one service is successfully applied. If one service is down or fails,
> Squid can use another service. Think "hot standby" or "spare" ICAP servers.
>
> Sets may seem similar to the existing "service bypass" feature, but they allow
> the failed adaptation to be retried and succeed if a replacement service is
> available. The services in a set may be all optional or all essential,
> depending on whether ignoring the entire set is acceptable. The mixture of
> optional and essential services in a set is supported, but yields results that
> may be difficult for a human to anticipate or interpret. Squid warns when it
> detects such a mixture.
>
> When performing adaptations with a set, failures at a service (optional or
> essential, does not matter) are retried with a different service if possible.
> If there are no more replacement services left to try, the failure is treated
> depending on whether the last service tried was optional or essential: Squid
> either tries to ignore the failure and proceed or terminates the master
> transaction.
>
>
> An adaptation chain is a list of different services applied one after another,
> forming an adaptation pipeline. Services in a chain may be optional or
> essential. When performing adaptations, failures at an optional service are
> ignored as if the service did not exist in the chain.
>
> Request satisfaction terminates the adaptation chain.
>
>
> When forming a set or chain for a given transaction, optional down services
> are ignored as if they did not exist.
>
> ICAP and eCAP services can be mixed and matched in an adaptation set or chain.
>
>
>
> * Implementation notes
>
> The notes below focus on _changes_. Adaptation terminology and current layers
> are now being documented in src/adaptation/notes.dox
>
> Service sets and chains are implemented as ServiceGroup class kids. They are
> very similar in most code aspects. The primary external difference is that
> ServiceSet can "replace" a service and ServiceChain can find the "next"
> service. The internal search code is implemented in ServiceGroup parent and
> is parametrized by the kids.
>
> Before the adaptation starts, Squid calculates the adaptation "plan", which is
> just an iterator into the ServiceGroup. The client- and server-side adaptation
> initiators used to deal with Service pointers. They now deal with ServiceGroup
> pointers. The only interesting difference is that a ServiceGroup does not have
> a notion of being optional or essential. Thus, if adaptation start fails, we
> do not know whether the failure can be bypassed. Fortunately, starting an
> adaptation does not require anything that depends on the adaptation services,
> so we now simply assert that the start succeeds.
>
> If the entire adaptation fails, the callers are notified as before. They are
> told whether they can ignore the failure as before. No changes there.
>
> A new Adaptation::Iterator class has been added to execute the adaptation
> plan. That class is responsible for iterating the services in a service group
> until the plan is exhausted or cannot progress due to a final failure.
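
To illustrate the set and chain semantics quoted above, here is a minimal sketch. It is not the code from the patch: Service, Outcome, formPlan, attempt, runSet, and runChain are hypothetical names invented for this example, and the real logic lives in ServiceGroup and Adaptation::Iterator. The sketch assumes that an empty plan is treated as a bypass and ignores the "message is still intact" retry condition and request satisfaction.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a configured service; not Squid's real class.
struct Service {
    std::string name;
    bool essential; // false means optional (bypass=on)
    bool up;        // false means known to be down when the plan is formed
    bool ok;        // whether a transaction with this service would succeed
};

enum class Outcome { Adapted, Bypassed, Failed };

// A down service cannot adapt anything, whatever its "ok" flag says.
bool attempt(const Service &s) { return s.up && s.ok; }

// Optional down services are ignored as if they did not exist.
std::vector<Service> formPlan(const std::vector<Service> &group) {
    std::vector<Service> plan;
    for (const Service &s : group)
        if (s.up || s.essential)
            plan.push_back(s);
    return plan;
}

// Set: at most one service is applied successfully. Failures are retried
// with the next service; the last service tried decides the final outcome.
Outcome runSet(const std::vector<Service> &group) {
    const std::vector<Service> plan = formPlan(group);
    if (plan.empty())
        return Outcome::Bypassed; // assumption: nothing to apply
    for (const Service &s : plan)
        if (attempt(s))
            return Outcome::Adapted;
    assert(!plan.empty());
    return plan.back().essential ? Outcome::Failed : Outcome::Bypassed;
}

// Chain: every service is applied in order. Optional failures are skipped;
// an essential failure terminates the master transaction.
Outcome runChain(const std::vector<Service> &group) {
    bool adapted = false;
    for (const Service &s : formPlan(group)) {
        if (attempt(s))
            adapted = true;
        else if (s.essential)
            return Outcome::Failed;
    }
    return adapted ? Outcome::Adapted : Outcome::Bypassed;
}
```

In this model, a set holding a failing essential service followed by a down optional service ends in a failure, because the optional service is dropped from the plan and the essential one becomes the last service tried. That is the kind of surprise the mixture warning is about.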

> Dynamically form adaptation chains based on the ICAP X-Next-Services header.
>
> If an ICAP service with the routing=1 option in squid.conf returns an ICAP
> X-Next-Services response header during a successful REQMOD or RESPMOD
> transaction, Squid abandons the original adaptation plan and forms a new
> adaptation chain consisting of services identified in the X-Next-Services
> header value (using a comma-separated list of adaptation service names from
> squid.conf). The dynamically created chain is destroyed once the new plan is
> completed or replaced.
>
> This feature is useful when a custom adaptation service knows which other
> services are applicable to the message being adapted.
>
> Limit adaptation iterations to adaptation_service_iteration_limit to protect
> Squid from infinite adaptation loops caused by ICAP services constantly
> including themselves in the dynamic adaptation chain they request. When the
> limit is exceeded, the master transaction fails. The default limit of 16
> should be large enough to not require an explicit configuration in most
> environments yet may be small enough to limit side-effects of loops.
>
> TODO: Add metadata support to eCAP API and honor X-Next-Services there as
> well. Currently, only ICAP services can form dynamic chains but the formed
> chains may contain eCAP services.
>
>
> Other improvements:
>
> Polished adaptation service configuration in squid.conf. Old format with an
> anonymous bypass option is deprecated but still supported. Quit with a fatal
> message if an adaptation service is misconfigured (debugging level-0 messages
> do not seem to work at that stage, but that is probably another, general bug).
>
>
> Polished HttpRequest::adaptHistory() interface so that the code that knows the
> history is needed can force history creation without complex
> configuration-time preparations and state. Currently, all adaptation history
> users but the logging-related ones know at runtime whether the history must
> be created (e.g., when a certain ICAP header is received).
>
>
> Fixed "canonical" Request URL maintenance when ICAP clones requests.
> TODO: The urlCanonical() must become HttpRequest::canonical(), hiding the
> often out-of-sync canonical data member.
>
>
> Fixed ICAP request parsing (for ICAP logging). We used to parse Request-Line
> as if it were the first header. TODO: optimize by parsing only when needed.
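
As a rough illustration of the dynamic chain formation described in the second commit message, the sketch below turns an X-Next-Services value into a list of service names. It is not the parser from the patch. The function name parseNextServices is hypothetical, and the "configured" set is an assumed stand-in for "configured in squid.conf with the same method and vectoring point as the current ICAP transaction".

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Splits a comma-separated X-Next-Services value into service names.
// Names that are not in "configured" are ignored, and so are empty items.
// An empty value yields an empty plan, which ends the current adaptation.
std::vector<std::string> parseNextServices(const std::string &value,
                                           const std::set<std::string> &configured) {
    std::vector<std::string> chain;
    std::string::size_type pos = 0;
    while (pos <= value.size()) {
        std::string::size_type comma = value.find(',', pos);
        if (comma == std::string::npos)
            comma = value.size();
        std::string name = value.substr(pos, comma - pos);

        // Trim surrounding white space around the name.
        const std::string::size_type first = name.find_first_not_of(" \t");
        if (first != std::string::npos) {
            const std::string::size_type last = name.find_last_not_of(" \t");
            assert(last != std::string::npos && last >= first);
            name = name.substr(first, last - first + 1);
            if (configured.count(name))
                chain.push_back(name);
        }
        pos = comma + 1;
    }
    return chain;
}
```

The returned list keeps the order given by the ICAP service, since the new chain is applied in that order.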

This is not a true patch. Just a summary of config changes.

NAME: icap_service
TYPE: icap_service_type
IFDEF: ICAP_CLIENT
LOC: Adaptation::Icap::TheConfig
DEFAULT: none
DOC_START
+ Defines a single ICAP service using the following format:
+
+ icap_service service_name vectoring_point [options] service_url
+
+ service_name: ID
+ an opaque identifier which must be unique in squid.conf
+
+ vectoring_point: reqmod_precache|reqmod_postcache|respmod_precache|respmod_postcache
                 This specifies at which point of transaction processing the
                 ICAP service should be activated. *_postcache vectoring points
                 are not yet supported.
+
+ service_url: icap://servername:port/servicepath
+ ICAP server and service location.
+
+ ICAP does not allow a single service to handle both REQMOD and RESPMOD
+ transactions. Squid does not enforce that requirement. You can specify
+ services with the same service_url and different vectoring_points. You
+ can even specify multiple identical services as long as their
+ service_names differ.
+
+
+ Service options are separated by white space. ICAP services support
+ the following name=value options:
+
+ bypass=on|off|1|0
+ If set to 'on' or '1', the ICAP service is treated as
+ optional. If the service cannot be reached or malfunctions,
+ Squid will try to ignore any errors and process the message as
+ if the service was not enabled. Not all ICAP errors can be
+ bypassed. If set to 'off' or '0', the ICAP service is treated as
+ essential and all ICAP errors will result in an error page
+ returned to the HTTP client.
+
+ Bypass is off by default: services are treated as essential.
+
+ routing=on|off|1|0
+ If set to 'on' or '1', the ICAP service is allowed to
+ dynamically change the current message adaptation plan by
+ returning a chain of services to be used next. The services
+ are specified using the X-Next-Services ICAP response header
+ value, formatted as a comma-separated list of service names.
+ Each named service should be configured in squid.conf and
+ should have the same method and vectoring point as the current
+ ICAP transaction. Services violating these rules are ignored.
+ An empty X-Next-Services value results in an empty plan which
+ ends the current adaptation.
+
+ Routing is not allowed by default: the ICAP X-Next-Services
+ response header is ignored.
+
+ Older icap_service format without optional named parameters is
+ deprecated but supported for backward compatibility.
 
 Example:
+icap_service svcBlocker reqmod_precache bypass=0 icap://icap1.mydomain.net:1344/reqmod
+icap_service svcLogger reqmod_precache routing=on icap://icap2.mydomain.net:1344/respmod
 DOC_END
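
The name=value option handling documented above could be sketched as follows. This is only an illustration and not the configuration parser from the patch: ServiceOptions, parseOnOff, and applyOption are hypothetical names, and the exception stands in for the fatal message Squid prints when a service is misconfigured.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical holder for the two documented options. Both default to off:
// services are essential and X-Next-Services is ignored.
struct ServiceOptions {
    bool bypass = false;
    bool routing = false;
};

// Accepts the documented on|off|1|0 values and rejects everything else.
bool parseOnOff(const std::string &value) {
    if (value == "on" || value == "1")
        return true;
    if (value == "off" || value == "0")
        return false;
    throw std::invalid_argument("expected on|off|1|0, got: " + value);
}

// Applies one white space separated token to opts. Returns false if the
// token is not a recognized option (for example, the service URL).
bool applyOption(const std::string &token, ServiceOptions &opts) {
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos)
        return false;
    assert(eq < token.size());
    const std::string name = token.substr(0, eq);
    const std::string value = token.substr(eq + 1);
    if (name == "bypass")
        opts.bypass = parseOnOff(value);
    else if (name == "routing")
        opts.routing = parseOnOff(value);
    else
        return false;
    return true;
}
```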
 
NAME: adaptation_service_set
TYPE: adaptation_service_set_type
IFDEF: USE_ADAPTATION
LOC: none
DEFAULT: none
DOC_START

        Configures an ordered set of similar, redundant services. This is
        useful when hot standby or backup adaptation servers are available.

            adaptation_service_set set_name service_name1 service_name2 ...

        The named services are used in the set declaration order. The first
        applicable adaptation service from the set is used first. The next
        applicable service is tried if and only if the transaction with the
        previous service fails and the message waiting to be adapted is still
        intact.

        When adaptation starts, broken services are ignored as if they were
        not a part of the set. A broken service is a down optional service.

        The services in a set must be attached to the same vectoring point
        (e.g., pre-cache) and use the same adaptation method (e.g., REQMOD).

        If all services in a set are optional then adaptation failures are
        bypassable. If all services in the set are essential, then a
        transaction failure with one service may still be retried using
        another service from the set, but when all services fail, the master
        transaction fails as well.

        A set may contain a mix of optional and essential services, but that
        is likely to lead to surprising results because broken services become
        ignored (see above), making previously bypassable failures fatal.
        Technically, it is the bypassability of the last failed service that
        matters.

        See also: adaptation_access adaptation_service_chain

Example:
adaptation_service_set svcBlocker urlFilterPrimary urlFilterBackup
adaptation_service_set svcLogger loggerLocal loggerRemote
DOC_END

+NAME: adaptation_service_chain
+TYPE: adaptation_service_chain_type
+IFDEF: USE_ADAPTATION
+LOC: none
+DEFAULT: none
+DOC_START
+
+ Configures a list of complementary services that will be applied
+ one-by-one, forming an adaptation chain or pipeline. This is useful
+ when Squid must perform different adaptations on the same message.
+
+ adaptation_service_chain chain_name service_name1 service_name2 ...
+
+ The named services are used in the chain declaration order. The first
+ applicable adaptation service from the chain is used first. The next
+ applicable service is applied to the successful adaptation results of
+ the previous service in the chain.
+
+ When adaptation starts, broken services are ignored as if they were
+ not a part of the chain. A broken service is a down optional service.
+
+ Request satisfaction terminates the adaptation chain because Squid
+ does not currently allow declaration of RESPMOD services at the
+ "reqmod_precache" vectoring point (see icap_service or ecap_service).
+
+ The services in a chain must be attached to the same vectoring point
+ (e.g., pre-cache) and use the same adaptation method (e.g., REQMOD).
+
+ A chain may contain a mix of optional and essential services. If an
+ essential adaptation fails (or the failure cannot be bypassed for
+ other reasons), the master transaction fails. Otherwise, the failure
+ is bypassed as if the failed adaptation service was not in the chain.
+
+ See also: adaptation_access adaptation_service_set
+
+Example:
+adaptation_service_chain svcRequest requestLogger urlFilter leakDetector
+DOC_END

+NAME: adaptation_service_iteration_limit
+TYPE: int
+IFDEF: USE_ADAPTATION
+LOC: Adaptation::Config::service_iteration_limit
+DEFAULT: 16
+DOC_START
+ Limits the number of iterations allowed when applying adaptation
+ services to a message. If your longest adaptation set or chain
+ may have more than 16 services, increase the limit beyond its
+ default value of 16. If detecting infinite iteration loops sooner
+ is critical, make the iteration limit match the actual number
+ of services in your longest adaptation set or chain.
+
+ Infinite adaptation loops are most likely with routing services.
+
+ See also: icap_service routing=1
+DOC_END
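
The loop protection can be pictured with the sketch below. It is not the Adaptation::Iterator code from the patch. Plan, Result, and runPlan are hypothetical names, and the "routes" map is an assumed stand-in for routing services: a key is the name of a service that returns X-Next-Services, and its value is the replacement plan. The sketch also assumes that each applied service counts as one iteration, which may not match how Squid counts exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

typedef std::vector<std::string> Plan;

struct Result {
    bool failed;    // true if the iteration limit stopped the adaptation
    int iterations; // number of services applied
};

// Applies the services in plan one by one. A service listed in routes
// abandons the current plan and starts the one it names. The master
// transaction fails once the limit would be exceeded.
Result runPlan(Plan plan, const std::map<std::string, Plan> &routes, int limit = 16) {
    assert(limit > 0);
    int iterations = 0;
    std::size_t pos = 0;
    while (pos < plan.size()) {
        if (iterations >= limit)
            return Result{true, iterations};
        ++iterations;
        const std::map<std::string, Plan>::const_iterator r = routes.find(plan[pos]);
        if (r != routes.end()) {
            plan = r->second; // dynamic chain replaces the old plan
            pos = 0;
        } else {
            ++pos;
        }
    }
    return Result{false, iterations};
}
```

With the default limit, a routing service that keeps naming itself is stopped after 16 iterations, while a plain three-service chain finishes after 3.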

Received on Fri Jun 26 2009 - 23:54:00 MDT
