Re: cachemgr output normalization

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 30 Oct 2011 00:25:52 +1300

On 29/10/11 20:21, Kinkie wrote:
> On Sat, Oct 29, 2011 at 3:41 AM, Amos Jeffries<squid3_at_treenet.co.nz> wrote:
>> On 29/10/11 08:39, Alex Rousskov wrote:
>>>
>>> On 10/28/2011 12:36 PM, Kinkie wrote:
>>>>
>>>> Hi all,
>>>> Now that cachemgr happily responds to proper http requests, I've
>>>> started toying with a browser-based, all-javascript/DHTML cachemgr.cgi
>>>> replacement.
>>>> I have a first beta available in launchpad at
>>>> lp:~kinkie/squid/cachemgr-js.
>>>
>>> Can you start it somewhere so that we can actually see it in action?
>
> Ok. Will do so in the next few days.
>
>>>> It currently requires help from squid
>>>> via a rproxy setup to serve the two HTML and one javascript file, but
>>>> if we can have squid serve them internally, it'll be a no-brainer for
>>>> single-instance monitoring.
>>>
>>> Will this be a part of Squid source tarball distribution? It may make
>>> sense to keep it separate because it may require separate translations,
>>> have a separate release cycle, etc.
>>>
>>
>> The bundle tarball seems to have turned into a 'suite' of things long ago.
>> It's up to the distros how they split and install the separate tools. Many
>> for example drop the cache utilities and announce tools (maybe not even
>> noticing they are packaged), bundle the squidclient and CGI separately for
>> install, and group the helpers and squid core binaries together as a set.
>>
>> This seems to work well.
>>
>> If we like, we can easily package a separate tarball in parallel, like the
>> langpack is done.
>>
>>>
>>>> There's a few open points which need a bit of thinking though:
>>>> 1- what is the best way to serve a few static html and javascript files?
>>>
>>> Same way we serve icons, I guess. IIRC, Henrik has summarized how
>>> Squid-as-web-server should [not] work some time ago, but those changes
>>> are probably outside this project scope so I would just reuse icons code
>>> for now.
>>
>> Yes. I still have the internal-server branch almost completed. Just needs to
>> be updated to the latest sources and re-tested, re-audited.
>>
>> If the JS files are installed and loaded from the /var/www/squid area same
>> as the icons it will be easier to both get out immediately and integrate
>> with internal-server when we get to finishing it.
>
> If the installation is connected to the Internet, we don't even need
> to bundle the .js files, all that would be needed is a 1Kb-ish html
> launcher.

Where do you propose newly opened browsers will get the JS from to
bootstrap the display?

>
>>> If this is difficult, for any reason, just put them somewhere on
>>> squid-cache.org and hard-code the URLs in your Javascript, for now.
>>>
>>
>> At first glance I think serving the JS page loader and a stub HTML page to
>> run it as the http://*/squid-internal-manager/ 'index' URL is probably the
>> user-friendliest way to roll it out.
>>
>>
>> I think I spoke to you a while ago about implementing the /menu action with
>> TXT + HTML format. But got stuck at the problem of generating right TXT in
>> the workers or the HTTP headers output. And report aggregating duplication
>> of the page.
>
> I'm aiming for something simpler now: rework the text output so that
> it is still human-readable but a bit more machine-parser-friendly so
> that transcoding (mostly transcoding tables) is simpler.
>
> The way I'd think to do the separation between data collection and
> rendering is not really to use ASN.1 but structs. If multiple types
> need to be supported it'll just have to be unions with a type
> identifier attached. But in order to get there we need to reorganize
> the actions' contents: most actions return two or more tables, this
> needs to be reduced to one table per action, or it won't be renderable
> by snmp.
>
>>>> 2- does cachemgr get engaged only via GET method or can we have it
>>>> also answer to POST requests? The reason is that GET requests in
>>>> javascript are subject to a same-origin policy, while POST are not. It
>>>> would allow for multi-server monitoring and it would make point 1 a
>>>> nice-to-have and not a requirement
>>>
>>> The difference is minor from Squid code point of view so we can support
>>> both, eventually. IIRC, we do that in Co-Advisor, with little code (that
>>> we would be happy to copy to Squid).
>>
>> I looked at this when advising Arthur on the shell backend handling. He
>> needs POST as well.
>>
>> Cachemgr takes any method, but filters to GET for all the actions
>> implemented so far. I think we can easily add any method we want for new actions. Or
>> adjust the existing actions to handle non-GET.
>
> The actions don't really need to know for now: we can safely ignore
> body-supplied parameters. If we don't, the adaptation should not be
> done in the actions, that's too much duplication.
>
>>>> 3- we need to make the output from cachemgr handlers follow some
>>>> common guidelines.
>>>
>>> Sure.
>>>
>>> How do you want to post-process that output in Javascript? Some
>>> find-and-replace commands using regular expressions? Is it very
>>> difficult to have action-specific post-processing?
>
> Find-and-replace, inspired by cachemgr.cgi.
>
>> The JS should make use of the format parameter to do format-specific
>> processing. Like CGI does action-specific processing right now.
>
> Current Js is limited to browser-based rendering. Think of it as an
> alternative to cachemgr.cgi, not as a self-contained tool.

I was imagining that you had the JS doing AJAX / XHR in the background
to load the reports, yes?
  So trying to load the report in format A. On success, passing it to the
format-A display function. On failure, trying with format B...

Instead of deciding up front based on the action name.

That way the JS would be truly independent of how far we had managed to
convert the reports. Like cachemgr is not.
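
A rough sketch of that try-format-A-then-B fallback, written in Python for brevity rather than browser JS. Everything named here is an assumption for illustration only: the `fetch_report` callable (standing in for the XHR call), the format names "csv" and "txt", and the two renderers are hypothetical, not existing Squid or cachemgr interfaces.

```python
import html
from typing import Callable, Optional, Sequence, Tuple


def render_csv(body: str) -> str:
    """Hypothetical renderer: show CSV rows as ' | '-separated lines."""
    rows = [line.split(",") for line in body.splitlines() if line]
    return "\n".join(" | ".join(cell.strip() for cell in row) for row in rows)


def render_txt(body: str) -> str:
    """Legacy text reports are dumped verbatim into a PRE block."""
    return "<pre>" + html.escape(body) + "</pre>"


# Assumed format names; the real set of formats is still under discussion.
RENDERERS = {"csv": render_csv, "txt": render_txt}


def load_report(
    action: str,
    fetch_report: Callable[[str, str], Optional[str]],
    formats: Sequence[str] = ("csv", "txt"),
) -> Tuple[str, str]:
    """Try each format in order; render the first one the server provides.

    fetch_report(action, fmt) is assumed to return the report body,
    or None when the action is not available in that format.
    """
    for fmt in formats:
        body = fetch_report(action, fmt)
        if body is not None:
            return fmt, RENDERERS[fmt](body)
    raise LookupError("no supported format for action " + action)
```

The point of the design is that the client never keeps a per-action table of formats: an action that has been converted is picked up automatically, and one that has not still displays as plain text.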

>
>> The existing TXT format can be dumped wholesale straight into PRE tags and
>> the actions updated to more useful formats one by one. This resolves the
>> back-compat and third-party script problems as text can be phased
>> out/upgraded over the long term.
>>
>> * For tabular data CSV, TSV, HTML, XML formats are all commonly supported
>> and useful under different scenarios.
>> * For list data XML or HTML is probably best.
>> * For key=value data XML, HTML, or TXT is probably best.
>
> That's the point: key = value data is really a two-column table.
> Which is rendered as key = value probably due to historic reasons.
> That's exactly what I'd like to change.

The k-v entries I've had the chance to look at were actually
a.b.c.key=value which is more tree structured than K-V pair.
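
To illustrate why dotted keys read as a tree rather than a flat two-column table, here is a minimal Python sketch that folds `a.b.c.key=value` lines into nested dictionaries. The sample key names used with it are invented for the example and are not taken from any real cachemgr report.

```python
def parse_dotted(report: str) -> dict:
    """Fold 'a.b.c.key = value' lines into a nested dict.

    Lines without '=' are skipped. This sketch assumes a name is never
    used both as a leaf and as a parent of other keys.
    """
    tree: dict = {}
    for line in report.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        path, value = line.split("=", 1)
        *parents, leaf = path.strip().split(".")
        node = tree
        for name in parents:
            # Walk down, creating intermediate levels on first use.
            node = node.setdefault(name, {})
        node[leaf] = value.strip()
    return tree
```

Rendered as a plain two-column table, the grouping implied by the shared prefixes would be lost; a tree-capable format such as XML or JSON keeps it.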

>
>>>> This poses a problem of
>>>> compatibility with third-party software. We can either have a
>>>> transition phase where we duplicate actions, or we can just decide
>>>> that we don't have the resources to care, and we just warn the authors
>>>> that we know of about our intentions so that they have time to adapt.
>>>
>>> Indeed. Perhaps this should be discussed on squid-users as well.
>>>
>>
>> This is where I greatly favour the format upgrade. We can both warn them the
>> TXT is going to become a human-only view, and that there are more efficient
>> machine-readable formats available for their immediate use.
>
> That's for phase-two. Phase-one would be to make txt more machine-friendly.

Altering TXT without some alternative and lead time already being
available for people to convert to could cause problems.
  I know it seems a bit messy now with each txt report having its own
standard of syntax. But the people depending on it don't really care
much; they already have code set up to handle those syntaxes.

PS. sorry if I confused you with talks earlier and/or with Alex, about
being free to update the reports. I meant the values and fields reported
not the deeper layouts or syntax of the report itself.
  If you get what I mean?

>
>>>> This is also in my opinion a prerequisite step to support multiple
>>>> output formats in cachemgr.
>>>
>>> Is there consensus that Squid itself should support multiple output
>>> formats? I kind of doubt it is the right thing to do in general. If
>>> Squid outputs easy-to-parse, consistent data, other applications can
>>> post-process and beautify it in many different ways.
>>>
>>
>> From me yes.
>>
>> I also see and agree that opening the door to anything at all is a bad idea.
>> We can and should provide a small restricted set of suitable formats for the
>> data based on its type table/list etc. Such as the ones I mention up above.
>
> I'd add JSON, but apart from that yes.

Yeah. I thought about that after sending.

>
>>>> I'm willing to spend the time to do this if we agree that it should be
>>>> done.
>>>
>>> Yes, the output should be standardized.
>>>
>>
>> Agreed.
>>
>> The first step though, is standardizing and checking that all actions are
>> upgraded to use a single internal-only format for data transfer from workers
>> to somewhere for assembly/aggregation and transformation to the other formats
>> as needed. ASN was proposed earlier. Yes?
>
> That's second step :)

I disagree. You would then be:
  1 altering the worker report format to X
  2 altering the worker report format to internal syntax
  3 replicating step (1) in the coordinator

Better to save work+time and just do the second two steps as one
auditable unit (per action if they get big).

>
> First step, standardize textual representation in output with current
> infrastructure
> Second step, restructure actions (one table per action)

Adding new actions with structured grouping/path syntax:
  .../menu
  .../menu/peers
  .../menu/kids
  .../cache/fqdn
  .../cache/ip
  .../cache/usernames
  ...
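
A small Python sketch of how such grouped paths could be indexed, so that a group like `cache` can list its sub-actions. Only the path suffixes shown above are used; the elided prefix is left out, and the function names are made up for this example rather than taken from Squid.

```python
from typing import Dict, Iterable, List

# Path suffixes from the list above; the leading ".../" prefix is omitted.
ACTIONS = [
    "menu",
    "menu/peers",
    "menu/kids",
    "cache/fqdn",
    "cache/ip",
    "cache/usernames",
]


def build_tree(paths: Iterable[str]) -> Dict[str, dict]:
    """Index slash-separated action paths as nested dicts."""
    tree: Dict[str, dict] = {}
    for path in paths:
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


def children(tree: Dict[str, dict], path: str) -> List[str]:
    """Return the sorted sub-actions directly under path.

    An empty path lists the top level; an unknown path raises KeyError.
    """
    node = tree
    for part in filter(None, path.split("/")):
        node = node[part]
    return sorted(node)
```

With this index, `children(build_tree(ACTIONS), "cache")` gives the three cache reports, which is the kind of listing a menu page could be generated from.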

> Third step, separate internal and external representation
> Fourth step, add more formats.
>

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.16
   Beta testers wanted for 3.2.0.13
Received on Sat Oct 29 2011 - 11:26:00 MDT
