ICAP / eCAP server loadbalancing

From: Goran Slavić <gslavic_at_gmail.com>
Date: Thu, 5 Dec 2013 21:13:00 +0100

* I seem to recall that load balancing other than connection-based has some
issues which have not been adequately resolved yet (although solving these
directly is more of a PhD-level problem IMHO, so you may want to discuss
the choice and scope of the project with your advisor in light of these and
Measurement Factory feedback).
   
My master's thesis is only the part of the project that is performed at the
School of Electrical Engineering of the University of Belgrade. The scope of
my part of the project is limited to the software implementation of concepts
defined by the many people involved in the project. In other words, I am not
alone in this, and there are a lot of people who can help me (especially
with theoretical/mathematical/implementation problems).

* How are you defining "load"? CPU? memory? traffic bytes? request count?
The ICAP service we are developing handles many different data types: some
of them are processed within the ICAP service itself and others by external
programs. This is a significant problem for load balancing because the
"number of connections" is not proportional to the actual "load" that the
adaptation server is experiencing. A couple of examples:
 - a large number of "text connections" can be handled with a small
processor load,
 - resizing a single "large picture connection" can put a significant load
on a processor thread,
 - adaptation of a video stream has a large "byte count" BUT is handled by a
dedicated video streaming server, which means that although the byte count
is large, the adaptation server has no processor load other than handling
and redirecting the data flow to and from the video server.
The solution we are proposing will enable the ICAP server to update its load
parameters (for example CPU load) and send them to Squid (either as an
ICAP/HTTP header or in the response to an ICAP OPTIONS request). We also
have a couple of ideas for other load metrics that we will try in order to
see which one works best.
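
As a rough illustration of the header variant, here is a minimal Python
sketch. Everything in it is an assumption made for the example: the header
name "X-Server-Load" is the working name used later in this message, and the
"current=...; max=...; relative=..." field layout and both function names
are hypothetical, since no wire format has been defined yet.

```python
def format_load_header(current, maximum):
    """Build the header line an ICAP server might attach to a response.

    Hypothetical format: absolute load, its upper bound, and their ratio.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    relative = current / maximum
    return (f"X-Server-Load: current={current}; max={maximum}; "
            f"relative={relative:.2f}")


def parse_load_header(line):
    """Parse the hypothetical header on the proxy side.

    Returns a dict of float fields, or None if the line is a different
    header (a proxy is free to ignore load data it does not understand).
    """
    name, _, value = line.partition(":")
    if name.strip().lower() != "x-server-load":
        return None
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.partition("=")
        if sep:  # skip fragments that are not key=value pairs
            fields[key.strip()] = float(val)
    return fields
```

A proxy that does not recognise the header simply gets None back and can
fall back to whatever balancing it already uses.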

* One ICAP service may be shared with multiple proxies. How do you intend to
communicate the load metrics to all of them to prevent making overloading
worse?

The solution is similar to the way X-Next-Services adjusts the service plan.
Every ICAP response will carry a header (let us call it "X-Server-Load")
that tells the proxy server the absolute and relative parameters of the
current load of the ICAP server. Every proxy server will be able either to
ignore this data or to use it to build a "load ranking list", so that the
proxy knows which service is under the least load. In addition, every
request sent will increment the load_variable associated with the target
service in the ranking list by a "quant of load" (predetermined or defined
by the user in squid.conf). If the ICAP server is able to return its actual
load, that value will be used to update the load_variable of the service on
the Squid side. If the ICAP server is not able to return its actual load,
then every response will decrement the load_variable of the service that
responded. This also gives the developer of the ICAP service and the Squid
administrator a way to define load and load balancing in terms of the
specific parameters of the services they are designing and implementing
(one service can have a max_load of 1000 and a load_quant of 5, another a
max_load of 100 and a load_quant of 1).
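
The bookkeeping described above could look roughly like the Python sketch
below. It is only an illustration under stated assumptions, not the planned
Squid code: the class and method names are hypothetical, services are ranked
by load_variable / max_load (the text does not say how services with
different max_load values are compared), the decrement on a response is
taken to be one load_quant, and the value is clamped to [0, max_load].

```python
from dataclasses import dataclass


@dataclass
class ServiceLoad:
    name: str
    max_load: float
    load_quant: float
    load: float = 0.0  # the "load_variable" kept on the proxy side

    def relative(self):
        return self.load / self.max_load


class LoadRanking:
    def __init__(self, services):
        self.services = {s.name: s for s in services}

    def pick(self):
        """Return the service with the lowest relative load (assumed rule)."""
        return min(self.services.values(), key=lambda s: s.relative())

    def request_sent(self, name):
        """Every request sent adds one load_quant to the target service."""
        s = self.services[name]
        s.load = min(s.max_load, s.load + s.load_quant)

    def response_received(self, name, reported_load=None):
        """Use the server-reported load if present, else subtract a quant."""
        s = self.services[name]
        if reported_load is not None:
            s.load = reported_load
        else:
            s.load = max(0.0, s.load - s.load_quant)


ranking = LoadRanking([ServiceLoad("a", max_load=1000, load_quant=5),
                       ServiceLoad("b", max_load=100, load_quant=1)])
ranking.request_sent("a")       # "a" now has a load_variable of 5
least_loaded = ranking.pick()   # "b" is still idle, so it is chosen
```

With the two example services from the text, a single request counts for
0.5% of max_load on the first and 1% on the second, which is why the sketch
compares relative rather than absolute values.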

* How do you propose to solve the problem of connection timeout and
starvation?
 - As I understand it ICAP traffic is off-balanced towards a
primary-secondary service design rather than round-robin or such in order to
keep as many connections active as required. Thus avoiding idle timeouts,
connection closure races and TCP setup costs renewing connections.

To be defined. We plan to do further research regarding this issue.

* How do you propose to solve the problem of ICAP server context switching?
 - The primary-secondary design also causes the primary server software and
memory to remain active as constantly as possible in the ICAP server. Thus
preventing the servers OS inadvertently swapping inactive memory out and
reduces delays re-loading it.

To be defined. We plan to do further research regarding this issue.

* True ICAP server load-balancing would be a useful addition -- it has
been requested a few times before. Defining "load" in a flexible and
extendible way would be the challenging part of this problem, as Amos has
already discussed. Please detail how you want to solve that problem here,
before implementing your changes. Reviewing how other applications balance
services (in general, not just ICAP) is probably a good idea.
On the configuration side, consider adding option(s) to the existing
adaptation_service_set directive. It is meant for identical services and so
would work well for load balancing applications IMO.

That goes without saying. Both for backward compatibility and as a matter of
good design, we should extend the existing adaptation service handling
mechanism rather than build things from scratch.

* Please note that Squid already supports a primitive form of ICAP load
balancing using a combination of "adaptation_access" and "random" ACL.
It is possible to balance accesses to ICAP services by using appropriate
adaptation_access match probability. For example, for even balancing of
accesses to 3 services, the random match probabilities should be 1/3, 1/2,
and 1.
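
The 1/3, 1/2, 1 figures can be checked with a short Python sketch. It
assumes the rules are tried in order, each random ACL matches independently
with its stated probability, and the first match wins; the function names
are made up for this example.

```python
from fractions import Fraction


def chained_shares(match_probabilities):
    """Share of all requests that ends up at each rule in the chain."""
    shares = []
    remaining = Fraction(1)  # fraction of requests not yet matched
    for p in match_probabilities:
        shares.append(remaining * p)
        remaining *= 1 - p
    return shares


def even_probabilities(n):
    """Match probabilities that give n services equal shares."""
    return [Fraction(1, n - i) for i in range(n)]


shares = chained_shares([Fraction(1, 3), Fraction(1, 2), Fraction(1)])
# Each of the three services receives exactly one third of the requests.
```

The same reasoning gives 1/n, 1/(n-1), ..., 1/2, 1 for n services: each rule
takes an equal slice of whatever the earlier rules let through.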

As I already answered Amos, that kind of load balancing (based on random
selection) only works if the number of connections is proportional to the
workload of the adaptation service/server. I also tried this primitive form
of load balancing and it had some rather unfortunate consequences (some
services went "down" and were not recovered, others were underused).

Finally, please make sure your implementation works with eCAP if at all
possible.

That also goes without saying, because otherwise the solution would be only
partial and would solve only part of the problem.

Hoping for further cooperation,
G.Slavic
Received on Thu Dec 05 2013 - 20:13:07 MST
