Re: Explaining internal errors from Amos Jeffries on 2010-06-28 (squid-dev)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Mon, 28 Jun 2010 23:41:26 +0000

On Mon, 28 Jun 2010 13:12:27 -0600, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> Hello,
>
> Squid may respond with HTTP 500 Internal Server Error
> (HTTP_INTERNAL_SERVER_ERROR) when we hit a bug or when the communication
> with another server fails. The admin may need to distinguish these and
> other cases (and their sub-cases) because they may require different
> administration actions (e.g., fix a broken ICAP server or configure more
> ephemeral ports).

NP: some of these are not 500 errors. The correct HTTP/1.1 status changes
have not yet been ported from 2.7.

>
> Some of the corresponding errors are also reported in cache.log but many
> are not. Moreover, it is often difficult to correlate cache.log and
> access.log information, especially on a busy proxy. When we are asked to
> triage or explain various TCP_MISS/500 and NONE/500 log entries, we have
> very few usable sources of information.
>
>
> I suggest adding two new access.log format codes that will contain the
> error page ID returned by Squid and the corresponding "error details".
>
> The "error page ID" field will explain, in most cases, what happened. We
> will log such strings as ERR_SHUTTING_DOWN and ERR_UNSUP_REQ. These will
> come from the err_type enum and the error page ID already maintained by
> Squid.
>
> The "error detail" field will detail why the error happened, helping to
> distinguish different causes for the same error page ID. The exact
> interpretation of the details will depend on the error page ID. For
> example, it could be the OS error number for I/O-related errors.
>
> Any objections or better ideas?

+1 on the pageid being logged. It's fairly often asked for.
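
As a rough illustration of how a logged pageid could help with triage, here is a minimal Python sketch. It assumes a hypothetical access.log layout in which the error page ID is appended as the last whitespace-separated field of the native format, with "-" when no error page was generated. The field position, the function name and the sample lines are assumptions made up for this example, not anything Squid currently emits.

```python
from collections import Counter

def count_error_pages(lines):
    """Count 500 responses by error page ID (assumed to be the last field)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        result_code = fields[3]  # e.g. TCP_MISS/500 in the native log format
        page_id = fields[-1]     # hypothetical new field, "-" if no error page
        if result_code.endswith("/500") and page_id != "-":
            counts[page_id] += 1
    return counts

# Made-up sample entries, for illustration only.
SAMPLE = [
    "1277768486.123 12 10.0.0.1 TCP_MISS/500 1024 GET http://example.com/ - DIRECT/- text/html ERR_SHUTTING_DOWN",
    "1277768487.456 3 10.0.0.2 NONE/500 980 GET http://example.org/ - NONE/- text/html ERR_UNSUP_REQ",
    "1277768488.789 45 10.0.0.3 TCP_MISS/200 5120 GET http://example.net/ - DIRECT/192.0.2.1 text/html -",
    "1277768489.012 7 10.0.0.1 TCP_MISS/500 1024 GET http://example.com/a - DIRECT/- text/html ERR_SHUTTING_DOWN",
]

if __name__ == "__main__":
    for page_id, n in count_error_pages(SAMPLE).most_common():
        print(page_id, n)
```

With something like this, the TCP_MISS/500 and NONE/500 entries can be grouped by cause without going to cache.log at all.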

I'm not so sure about the reason text. Some of the pages are very fuzzy,
though I suppose that just means more specific pages are needed. If you can
find a consistent explanation of the reason for each case, it should be set
to go in the page %E code for the user as well.

With both tags, don't forget the %icap:: / %http:: scoping so we can have
multiple servers error on one request (particularly with icap failover
possible when working with a dead http origin).
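
To make the scoping point concrete, here is a small Python sketch of keeping one error record per server scope on a single request. The class names, the "icap"/"http" scope keys, the "pageid/detail" output form and the sample detail string are all assumptions for illustration. This is not how Squid stores or logs anything today.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class ErrorRecord:
    page_id: str                  # e.g. an err_type name
    detail: Optional[str] = None  # e.g. an OS error number, if known

@dataclass
class RequestErrors:
    """Errors seen while handling one request, keyed by server scope."""
    by_scope: Dict[str, ErrorRecord] = field(default_factory=dict)

    def record(self, scope, page_id, detail=None):
        # One record per scope; a later error in the same scope replaces it.
        self.by_scope[scope] = ErrorRecord(page_id, detail)

    def format(self, scope):
        # Mimic a scoped log code: "-" when that scope saw no error.
        rec = self.by_scope.get(scope)
        if rec is None:
            return "-"
        if rec.detail is None:
            return rec.page_id
        return f"{rec.page_id}/{rec.detail}"

if __name__ == "__main__":
    errors = RequestErrors()
    errors.record("icap", "ERR_ICAP_FAILURE", "errno=111")  # made-up detail
    errors.record("http", "ERR_CONNECT_FAIL")
    print(errors.format("icap"), errors.format("http"))
```

Keeping the records separate per scope means an ICAP failover and a dead origin on the same request can both show up in the log line instead of one overwriting the other.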

It would also help to be more consistent with the ERROR: and FATAL:
prefixes in cache.log (as you noted, many errors are not even logged in the
administrative cache.log!), with the error page details there to link
specific problems to the URL, error pageid and reason.

Amos
Received on Mon Jun 28 2010 - 23:41:32 MDT