squid-ng : increasing squid's flexibility and performance

Adrian Chadd
$adrian$

This document outlines some ideas of mine on how to improve squid's
flexibility and performance. It is a work in progress, so please check
the document regularly for changes.

* I'd like to change how the client->server path works.

Currently, all data transfer happens through the storage manager, ie
(for a MISS, roughly, as I don't remember the full details ...):

    client request made
    client request parsed
    store entry created
    server connection opened
    client registers as a store client on the above store entry
    client requests the beginning of the object through storeClientCopy()
    ..
    ..
    server gets data, appends it to the store entry
    storeClientCopy() returns the data portion to the client
    client handles the data portion, then requests the next portion
      through storeClientCopy()

The server connection throttling is done through the deferred read
handler.

Instead, I'd like to abstract the client and server setup a little. A
'request' is passed to a 'server' which handles the request, eg:

    client --> clientlet --request--> servlet --> server

This would be a straight proxy with no caching. One client request
would correspond to one server request.

In order to do ACL checking, you would simply insert an ACL module in
the path:

    client -> clientlet -> client ACL check -> servlet -> server

A similar thing could be done for redirectors, eg:

    client -> clientlet -> client ACL check -> redirect -> servlet -> server

If a module isn't required, it simply isn't called. This would reduce
the stack depth you can end up with when processing a request.

To make this a multiprotocol server proxy, you would do the following:

    client -> clientlet -> .. -> connection manager

The connection manager takes the request and does pretty much what
fwdDispatch() does. Say it's an FTP request:

    client -> clientlet -> .. -FTP URL-> connection manager
    client -> clientlet -> .. -FTP URL-> connection manager -> ftpservlet -> server
    client -> clientlet -> .. -FTP URL-> ftpservlet -> server

The connection manager creates an ftpservlet object and then attaches
it to the clientlet in its place. All communication can now happen
directly between the servlet and the clientlet.

This way, you can attach various modules between the clientlet and the
connection manager/servlet when required, and remove them when not.
Extra features (hierarchy types, redirectors, ACLs) are then only
invoked when required, and don't affect performance when they're not.
It should also make the code a lot easier to manage and extend, and
would allow new features such as content filtering to be implemented.

A 'cache' could then be implemented by creating a cache-aware
connection manager, which would probably be the storage manager.

* After thinking about it a little more, I think that comm_read() and
comm_write() in their present state might not be the most efficient
way of doing things.

comm_write() schedules a write select callback via commSetSelect(),
and then writes the data to the client at another point through the
comm loop. Instead, when data is to be written to a client, as much
data as possible should be written immediately, and a write select
callback scheduled only for whatever is left over. A similar thing
should be done for reads (which is in a way done now, and would be
defeated through use of comm_read()). If the OS buffers happen to be
filled, this should minimise the number of loops through comm_select()
and improve throughput. A sketch of this idea follows.
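To make the idea concrete, here is a minimal sketch of such a write
path. comm_write_now(), write_leftover_handler() and struct write_state
are hypothetical names, not existing squid code; the commSetSelect()
prototype matches squid's, though the COMM_SELECT_WRITE value here is
only illustrative:

    /*
     * A minimal sketch of "write as much as possible first":
     * only fall back to a write select callback for leftovers.
     */
    #include <errno.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <time.h>
    #include <unistd.h>

    #define COMM_SELECT_WRITE 0x2       /* illustrative; see comm.h */
    typedef void PF(int fd, void *data);
    extern void commSetSelect(int fd, unsigned int type, PF *handler,
        void *client_data, time_t timeout);

    struct write_state {
        const char *buf;                /* data still to be written */
        size_t len;                     /* bytes remaining */
    };

    static void write_leftover_handler(int fd, void *data);

    /* Try to push the whole buffer out now; only schedule a write
     * select callback if the OS buffer fills (EWOULDBLOCK/EAGAIN). */
    static void
    comm_write_now(int fd, struct write_state *ws)
    {
        while (ws->len > 0) {
            ssize_t n = write(fd, ws->buf, ws->len);
            if (n > 0) {
                ws->buf += n;
                ws->len -= (size_t) n;
                continue;
            }
            if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
                /* OS buffer full: now we really need comm_select() */
                commSetSelect(fd, COMM_SELECT_WRITE,
                    write_leftover_handler, ws, 0);
                return;
            }
            return;                     /* real error: fd error path */
        }
        /* everything written without a trip through comm_select() */
    }

    static void
    write_leftover_handler(int fd, void *data)
    {
        comm_write_now(fd, (struct write_state *) data);
    }

The point being that, when the OS buffers have room, most writes
complete in the first loop and never touch the select machinery at
all.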
* Deferred reads do not work well inside an event-driven environment.

The basic problem is this:

    commSetSelect(fd, READ, handler)   register READ event on a file descriptor
    ..
    ..
    comm_select()
      fd:READ is ready
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..
    ..
    comm_select()
      fd:READ is ready (because it was previously ready)
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..
    ..
    comm_select()
      fd:READ is ready (because it was previously ready)
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..

There is no easy way in an event-driven IO system to tell whether an
fd should be deferred at event register time. Therefore the deferred
read mechanism must be changed to schedule IO events when data is
actually required, rather than relying on the deferred read handler to
indicate at fd read time whether the data is needed. Eg:

    commSetSelect(sfd, READ, server_read)  register READ event on a file descriptor
    ..
    comm_select()
      sfd:READ is ready
      server_read() - pipe data to client
        client_write() - all data was written, return ok
      all data was written, so commSetSelect(sfd, READ, server_read)
    ..
    comm_select()
      sfd:READ is ready
      server_read() - pipe data to client
        client_write() - not all data was written, so
          commSetSelect(cfd, WRITE, client_write)
      not all data was written, so don't reregister a read from the server
    ..
    comm_select()
      cfd:WRITE is ready
      client_write() - rest of data was written, notify server we want more
        server_more() - client wants more data, so
          commSetSelect(sfd, READ, server_read)

This suits the above servlet/clientlet model well, since there are
notification channels between the servlet and clientlet to perform
data flow control.

* SMP support?

One thread per connection will not be a really useful way of doing
things. I have an idea which involves one worker thread per CPU. Each
worker thread is like a mini-squid, receiving IO events and handling
them through callbacks. The storage manager would need to be
SMP-aware, along with most of the library functions, but I'm sure it
could be done. I'd like some comments from SMP/thread-aware people.

* Format of various structures

    clientlet_t {
        int type;
        void *clientlet_data;
        func *clientlet_havedata(clientlet_t *, servlet_t *, data ..);
        func *clientlet_wantdata(clientlet_t *, servlet_t *, data ..);
        func *clientlet_error(clientlet_t *, servlet_t *, error ..);
        func *clientlet_servletclose(clientlet_t *, servlet_t *, reason ..);
    };

type: the type of clientlet.

clientlet_havedata: this is called by the servlet whenever it has data
for the clientlet. The clientlet returns how much data was used; -1
indicates that the clientlet can not deal with any more data.

clientlet_wantdata: this is called by the servlet when it wants more
data from the clientlet.

clientlet_error: this is called by the servlet to indicate to the
clientlet that the servlet has experienced an error. The servlet is no
longer valid after the function call.

clientlet_servletclose: this is called by the servlet to indicate that
it is closing. The servlet will then become invalid.

    servlet_t {
        int type;
        void *servlet_data;
        func *servlet_havedata(servlet_t *, clientlet_t *, data ..);
        func *servlet_wantdata(servlet_t *, clientlet_t *, data ..);
        func *servlet_error(servlet_t *, clientlet_t *, error ..);
        func *servlet_clientletclose(servlet_t *, clientlet_t *, reason ..);
    };

type: the type of servlet. Examples are SERVLET_HTTP, SERVLET_FTP,
SERVLET_NNTP, SERVLET_RA.

servlet_havedata: this is called by the clientlet whenever it has data
for the servlet. The servlet returns how much data was used. If -1 is
returned, this indicates that the servlet can not deal with any more
data. (I'm guessing that 0 also means this ..)

servlet_wantdata: this is called by the clientlet when it wants more
data from the servlet. This typically happens after the
clientlet->clientlet_havedata routine has been called and returned -1
(which halts the data flow), to restart the data flow.

servlet_error: this is called by the clientlet to indicate to the
servlet that the clientlet has experienced an error. The clientlet is
no longer valid after the function call.

servlet_clientletclose: this is called by the clientlet to indicate
that it is closing. The clientlet will then become invalid.
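To make the callback shapes concrete, here is a minimal C sketch of
the two structures and of the havedata/wantdata flow control. The
exact parameter lists (buf/len, error and reason codes) and the
server_read_cb() helper are assumptions; the member names and the
"-1 halts the flow" convention come from the descriptions above:

    /*
     * A sketch of clientlet_t/servlet_t as concrete C. Parameter
     * lists are assumptions; member names match the design above.
     */
    #include <stddef.h>
    #include <sys/types.h>

    typedef struct clientlet clientlet_t;
    typedef struct servlet servlet_t;

    struct clientlet {
        int type;                  /* type of clientlet */
        void *clientlet_data;      /* module-private state */

        /* servlet has data for us; return bytes used, -1 to halt */
        ssize_t (*clientlet_havedata)(clientlet_t *, servlet_t *,
                                      const char *buf, size_t len);
        /* servlet wants more data from us (restarts a halted flow) */
        void (*clientlet_wantdata)(clientlet_t *, servlet_t *);
        /* servlet hit an error; the servlet is invalid after this */
        void (*clientlet_error)(clientlet_t *, servlet_t *, int error);
        /* servlet is closing; the servlet is invalid after this */
        void (*clientlet_servletclose)(clientlet_t *, servlet_t *,
                                       int reason);
    };

    struct servlet {
        int type;                  /* SERVLET_HTTP, SERVLET_FTP, .. */
        void *servlet_data;

        ssize_t (*servlet_havedata)(servlet_t *, clientlet_t *,
                                    const char *buf, size_t len);
        void (*servlet_wantdata)(servlet_t *, clientlet_t *);
        void (*servlet_error)(servlet_t *, clientlet_t *, int error);
        void (*servlet_clientletclose)(servlet_t *, clientlet_t *,
                                       int reason);
    };

    /* Flow control as in the deferred-read replacement above: push
     * data until the clientlet returns -1, then wait for it to call
     * servlet_wantdata(). server_read_cb() is a hypothetical helper
     * on the server side of an HTTP servlet. */
    static void
    server_read_cb(servlet_t *s, clientlet_t *c,
                   const char *buf, size_t len)
    {
        ssize_t used = c->clientlet_havedata(c, s, buf, len);

        if (used < 0) {
            /* Clientlet can't take more: do NOT reregister the
             * server-side read; the clientlet calls our
             * servlet_wantdata() later to restart the flow. */
            return;
        }
        /* Data consumed: reregister the server-side read, eg
         * commSetSelect(sfd, READ, ...). (Partial consumption
         * handling is omitted in this sketch.) */
    }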
    request_t {
        int type;
        void *request_data;
        func *request_close(request_t *);
    };

type: the type of request. This indicates which connection protocol
module gets this request. For example, the type could be REQUEST_HTTP,
which means that it is an HTTP request. Whether the request inside
request_data is an HTTP request or an FTP request is irrelevant,
because that is a connection protocol issue (the connection module for
HTTP will handle creating the required servlets, which can then speak
REQUEST_HTTP back to the clientlet). Generally the request_t type will
be the same as the clientlet type, unless the clientlet supports
multiple request protocols.

request_close: called when the request_t is being deallocated, to
deallocate request_data and any other ancillary data relationships it
might have.

* Example request/reply flows

The following examples indicate the request flow, showing how the code
should hold together.
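As a rough illustration before the flows themselves, a connection
manager dispatch on the request_t type might look something like the
following sketch. REQUEST_FTP, the new_*_servlet() constructors and
clientlet_attach_servlet() are hypothetical names; only the "create a
servlet and attach it in the manager's place" flow comes from the
design above:

    /* A sketch of connection manager dispatch on request type. */
    #include <stddef.h>

    enum { REQUEST_HTTP = 1, REQUEST_FTP };

    typedef struct clientlet clientlet_t;   /* see sketch above */
    typedef struct servlet servlet_t;
    typedef struct request request_t;

    struct request {
        int type;                           /* REQUEST_HTTP, .. */
        void *request_data;                 /* protocol-specific */
        void (*request_close)(request_t *);
    };

    extern servlet_t *new_http_servlet(request_t *);
    extern servlet_t *new_ftp_servlet(request_t *);
    extern void clientlet_attach_servlet(clientlet_t *, servlet_t *);

    /* Roughly what fwdDispatch() does today: pick the connection
     * protocol module for the request type, create the servlet, and
     * splice it into the clientlet's path so all further data flows
     * directly between servlet and clientlet. */
    static int
    connection_manager_dispatch(clientlet_t *c, request_t *r)
    {
        servlet_t *s = NULL;

        switch (r->type) {
        case REQUEST_HTTP:
            s = new_http_servlet(r);
            break;
        case REQUEST_FTP:
            s = new_ftp_servlet(r);
            break;
        default:
            return -1;      /* no module for this request type */
        }
        if (s == NULL)
            return -1;      /* servlet creation failed */

        /* the connection manager now drops out of the path */
        clientlet_attach_servlet(c, s);
        return 0;
    }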