squid-ng : increasing squid's flexibility and performance

Adrian Chadd
$adrian$

This document outlines some ideas of mine on how to improve squid's
flexibility and performance. It is a work in progress, so please check
the document regularly for changes.

* I'd like to change how the client->server path works.

Currently, all data transfer happens through the storage manager, ie
(for a MISS, roughly, as I don't remember the full details ...):

    client request made
    client request parsed
    store entry created
    server connection opened
    client registers as a store client on the above store entry
    client requests the beginning of the object through storeClientCopy()
    ..
    ..
    server gets data, appends it to the store entry
    storeClientCopy() returns the data portion to the client
    client handles the data portion, then requests the next portion
      through storeClientCopy()

The server connection throttling is done through the deferred read
handler.

Instead, I'd like to abstract the client and server setup a little. A
'request' is passed to a 'server' which handles the request, eg:

    client --> clientlet --request--> servlet --> server

This would be a straight proxy with no caching. One client request
would correspond to one server request.

In order to do ACL checking, you would simply insert an ACL module in
the path:

    client -> clientlet -> client ACL check -> servlet -> server

A similar thing could be done for redirectors, eg:

    client -> clientlet -> client ACL check -> redirect -> servlet -> server

If a module isn't required, it simply isn't called. This would reduce
the stack depth you can end up with when processing a request.

To make this a multiprotocol server proxy, you would do the following:

    client -> clientlet -> .. -> connection manager

The connection manager takes the request and does pretty much what
fwdDispatch() does. Say it's an FTP request:

    client -> clientlet -> .. -FTP URL-> connection manager
    client -> clientlet -> .. -FTP URL-> connection manager -> ftpservlet -> server
    client -> clientlet -> .. -FTP URL-> ftpservlet -> server

The connection manager creates an ftpservlet object and then attaches
it to the clientlet in its place. All communication can now happen
directly between the servlet and the clientlet.

This way, you can attach various modules between the clientlet and the
connection manager/servlet when required, and remove them when not.
Extra features (hierarchy types, redirectors, ACLs) are then only
invoked when required, and don't affect performance when they're not.
It should also make the code a lot easier to manage and extend, and
would allow new features such as content filtering to be implemented.

A 'cache' could then be implemented by creating a cache-aware
connection manager, which would probably be the storage manager.

* After thinking about it a little more, I think that comm_read() and
comm_write() in their present state might not be the most efficient
way of doing things.

comm_write() schedules a write select callback via commSetSelect(),
and then writes the data to the client at another point through the
comm loop. Instead, when data is to be written to a client, as much
data as possible should be written immediately, and a write select
callback scheduled only for whatever is left over. A similar thing
should be done for reads (which is in a way done now, and would be
defeated through use of comm_read()). If the OS buffers happen to be
filled, this should minimise the number of loops through comm_select()
and improve throughput. A sketch of this idea follows.
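To make the idea concrete, here is a minimal sketch of such a write
path. comm_write_now(), write_leftover_handler() and struct write_state
are hypothetical names, not existing squid code; the commSetSelect()
prototype matches squid's, though the COMM_SELECT_WRITE value here is
only illustrative:

    /*
     * A minimal sketch of "write as much as possible first":
     * only fall back to a write select callback for leftovers.
     */
    #include <errno.h>
    #include <stddef.h>
    #include <sys/types.h>
    #include <time.h>
    #include <unistd.h>

    #define COMM_SELECT_WRITE 0x2       /* illustrative; see comm.h */
    typedef void PF(int fd, void *data);
    extern void commSetSelect(int fd, unsigned int type, PF *handler,
        void *client_data, time_t timeout);

    struct write_state {
        const char *buf;                /* data still to be written */
        size_t len;                     /* bytes remaining */
    };

    static void write_leftover_handler(int fd, void *data);

    /* Try to push the whole buffer out now; only schedule a write
     * select callback if the OS buffer fills (EWOULDBLOCK/EAGAIN). */
    static void
    comm_write_now(int fd, struct write_state *ws)
    {
        while (ws->len > 0) {
            ssize_t n = write(fd, ws->buf, ws->len);
            if (n > 0) {
                ws->buf += n;
                ws->len -= (size_t) n;
                continue;
            }
            if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)) {
                /* OS buffer full: now we really need comm_select() */
                commSetSelect(fd, COMM_SELECT_WRITE,
                    write_leftover_handler, ws, 0);
                return;
            }
            return;                     /* real error: fd error path */
        }
        /* everything written without a trip through comm_select() */
    }

    static void
    write_leftover_handler(int fd, void *data)
    {
        comm_write_now(fd, (struct write_state *) data);
    }

The point being that, when the OS buffers have room, most writes
complete in the first loop and never touch the select machinery at
all.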
* Deferred reads do not work well inside an event-driven environment.

The basic problem is this:

    commSetSelect(fd, READ, handler)   register READ event on a file descriptor
    ..
    ..
    comm_select()
      fd:READ is ready
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..
    ..
    comm_select()
      fd:READ is ready (because it was previously ready)
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..
    ..
    comm_select()
      fd:READ is ready (because it was previously ready)
      check defer handler
      handler indicates defer, so register another READ event on fd
    ..

There is no easy way in an event-driven IO system to tell whether an
fd should be deferred at event register time. Therefore the deferred
read mechanism must be changed to schedule IO events when data is
actually required, rather than relying on the deferred read handler to
indicate at fd read time whether the data is needed. Eg:

    commSetSelect(sfd, READ, server_read)  register READ event on a file descriptor
    ..
    comm_select()
      sfd:READ is ready
      server_read() - pipe data to client
        client_write() - all data was written, return ok
      all data was written, so commSetSelect(sfd, READ, server_read)
    ..
    comm_select()
      sfd:READ is ready
      server_read() - pipe data to client
        client_write() - not all data was written, so
          commSetSelect(cfd, WRITE, client_write)
      not all data was written, so don't reregister a read from the server
    ..
    comm_select()
      cfd:WRITE is ready
      client_write() - rest of data was written, notify server we want more
        server_more() - client wants more data, so
          commSetSelect(sfd, READ, server_read)

This suits the above servlet/clientlet model well, since there are
notification channels between the servlet and clientlet to perform
data flow control.

* SMP support?

One thread per connection will not be a really useful way of doing
things. I have an idea which involves one worker thread per CPU. Each
worker thread is like a mini-squid, receiving IO events and handling
them through callbacks. The storage manager would need to be
SMP-aware, along with most of the library functions, but I'm sure it
could be done. I'd like some comments from SMP/thread-aware people.

* Format of various structures

    clientlet_t {
        int type;
        void *clientlet_data;
        func *clientlet_havedata(clientlet_t *, servlet_t *, data ..);
        func *clientlet_wantdata(clientlet_t *, servlet_t *, data ..);
        func *clientlet_error(clientlet_t *, servlet_t *, error ..);
        func *clientlet_servletclose(clientlet_t *, servlet_t *, reason ..);
    };

type: the type of clientlet.

clientlet_havedata: this is called by the servlet whenever it has data
for the clientlet. The clientlet returns how much data was used; -1
indicates that the clientlet can not deal with any more data.

clientlet_wantdata: this is called by the servlet when it wants more
data from the clientlet.

clientlet_error: this is called by the servlet to indicate to the
clientlet that the servlet has experienced an error. The servlet is no
longer valid after the function call.

clientlet_servletclose: this is called by the servlet to indicate that
it is closing. The servlet will then become invalid.

    servlet_t {
        int type;
        void *servlet_data;
        func *servlet_havedata(servlet_t *, clientlet_t *, data ..);
        func *servlet_wantdata(servlet_t *, clientlet_t *, data ..);
        func *servlet_error(servlet_t *, clientlet_t *, error ..);
        func *servlet_clientletclose(servlet_t *, clientlet_t *, reason ..);
    };

type: the type of servlet. Examples are SERVLET_HTTP, SERVLET_FTP,
SERVLET_NNTP, SERVLET_RA.

servlet_havedata: this is called by the clientlet whenever it has data
for the servlet. The servlet returns how much data was used. If -1 is
returned, this indicates that the servlet can not deal with any more
data. (I'm guessing that 0 also means this ..)

servlet_wantdata: this is called by the clientlet when it wants more
data from the servlet. This typically happens after the
clientlet->clientlet_havedata routine has been called and returned -1
(which halts the data flow), to restart the data flow.

servlet_error: this is called by the clientlet to indicate to the
servlet that the clientlet has experienced an error. The clientlet is
no longer valid after the function call.

servlet_clientletclose: this is called by the clientlet to indicate
that it is closing. The clientlet will then become invalid.
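To make the callback shapes concrete, here is a minimal C sketch of
the two structures and of the havedata/wantdata flow control. The
exact parameter lists (buf/len, error and reason codes) and the
server_read_cb() helper are assumptions; the member names and the
"-1 halts the flow" convention come from the descriptions above:

    /*
     * A sketch of clientlet_t/servlet_t as concrete C. Parameter
     * lists are assumptions; member names match the design above.
     */
    #include <stddef.h>
    #include <sys/types.h>

    typedef struct clientlet clientlet_t;
    typedef struct servlet servlet_t;

    struct clientlet {
        int type;                  /* type of clientlet */
        void *clientlet_data;      /* module-private state */

        /* servlet has data for us; return bytes used, -1 to halt */
        ssize_t (*clientlet_havedata)(clientlet_t *, servlet_t *,
                                      const char *buf, size_t len);
        /* servlet wants more data from us (restarts a halted flow) */
        void (*clientlet_wantdata)(clientlet_t *, servlet_t *);
        /* servlet hit an error; the servlet is invalid after this */
        void (*clientlet_error)(clientlet_t *, servlet_t *, int error);
        /* servlet is closing; the servlet is invalid after this */
        void (*clientlet_servletclose)(clientlet_t *, servlet_t *,
                                       int reason);
    };

    struct servlet {
        int type;                  /* SERVLET_HTTP, SERVLET_FTP, .. */
        void *servlet_data;

        ssize_t (*servlet_havedata)(servlet_t *, clientlet_t *,
                                    const char *buf, size_t len);
        void (*servlet_wantdata)(servlet_t *, clientlet_t *);
        void (*servlet_error)(servlet_t *, clientlet_t *, int error);
        void (*servlet_clientletclose)(servlet_t *, clientlet_t *,
                                       int reason);
    };

    /* Flow control as in the deferred-read replacement above: push
     * data until the clientlet returns -1, then wait for it to call
     * servlet_wantdata(). server_read_cb() is a hypothetical helper
     * on the server side of an HTTP servlet. */
    static void
    server_read_cb(servlet_t *s, clientlet_t *c,
                   const char *buf, size_t len)
    {
        ssize_t used = c->clientlet_havedata(c, s, buf, len);

        if (used < 0) {
            /* Clientlet can't take more: do NOT reregister the
             * server-side read; the clientlet calls our
             * servlet_wantdata() later to restart the flow. */
            return;
        }
        /* Data consumed: reregister the server-side read, eg
         * commSetSelect(sfd, READ, ...). (Partial consumption
         * handling is omitted in this sketch.) */
    }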
    request_t {
        int type;
        void *request_data;
        func *request_close(request_t *);
    };

type: the type of request. This indicates which connection protocol
module gets this request. For example, the type could be REQUEST_HTTP,
which means that it is an HTTP request. Whether the request inside
request_data is an HTTP request or an FTP request is irrelevant,
because that is a connection protocol issue (the connection module for
HTTP will handle creating the required servlets, which can then speak
REQUEST_HTTP back to the clientlet). Generally the request_t type will
be the same as the clientlet type, unless the clientlet supports
multiple request protocols.

request_close: called when the request_t is being deallocated, to
deallocate request_data and any other ancillary data relationships it
might have.

* Example request/reply flows

The following examples indicate the request flow, showing how the code
should hold together.
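As a rough illustration before the flows themselves, a connection
manager dispatch on the request_t type might look something like the
following sketch. REQUEST_FTP, the new_*_servlet() constructors and
clientlet_attach_servlet() are hypothetical names; only the "create a
servlet and attach it in the manager's place" flow comes from the
design above:

    /* A sketch of connection manager dispatch on request type. */
    #include <stddef.h>

    enum { REQUEST_HTTP = 1, REQUEST_FTP };

    typedef struct clientlet clientlet_t;   /* see sketch above */
    typedef struct servlet servlet_t;
    typedef struct request request_t;

    struct request {
        int type;                           /* REQUEST_HTTP, .. */
        void *request_data;                 /* protocol-specific */
        void (*request_close)(request_t *);
    };

    extern servlet_t *new_http_servlet(request_t *);
    extern servlet_t *new_ftp_servlet(request_t *);
    extern void clientlet_attach_servlet(clientlet_t *, servlet_t *);

    /* Roughly what fwdDispatch() does today: pick the connection
     * protocol module for the request type, create the servlet, and
     * splice it into the clientlet's path so all further data flows
     * directly between servlet and clientlet. */
    static int
    connection_manager_dispatch(clientlet_t *c, request_t *r)
    {
        servlet_t *s = NULL;

        switch (r->type) {
        case REQUEST_HTTP:
            s = new_http_servlet(r);
            break;
        case REQUEST_FTP:
            s = new_ftp_servlet(r);
            break;
        default:
            return -1;      /* no module for this request type */
        }
        if (s == NULL)
            return -1;      /* servlet creation failed */

        /* the connection manager now drops out of the path */
        clientlet_attach_servlet(c, s);
        return 0;
    }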