Re: Helpers and Notes

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Sat, 26 Jan 2013 21:27:15 +0200

On 01/26/2013 06:05 AM, Amos Jeffries wrote:
> On 26/01/2013 10:39 a.m., Tsantilas Christos wrote:
>> Hi all,
>>
>> While we are working in a project we seen that the helpers do wrong
>> usage of the Notes interface.
>>
>> This is maybe it is our mistake (and mostly my mistake) because we do
>> not have make good documentation of this interface.
>> In any case I believe that we should fix some thinks.
>>
>> In the Notes.h file there are the following classes:
>>
>> class Note
>> -----------
>>
>> This is a class used to store a note configuration not a note it self.
>> It is a note name plus a list of possible values with an ACL list in
>> each value.
>> It is used to store configurations for the following cfg lines:
>> note key value acl1 acl2
>>
>>
>> class Notes
>> ------------
>> It is used to store a list of Notes. A list like the following:
>> note key valueA acl1a acl2a
>> note key valueB acl1b acl2b
>>
>> This is used to add custom notes (or metaHeaders) in an HTTP request
>
> With the ACL list being optional and the word "note" being dropped by
> the config parser what is actually stored is:
>
> NotesList{
> Note{key,value,NULL},
> Note{key,value,NULL}
> ...
> }

Well, not actually. What it is stored is the following:
 NoteList {
      Note {key, Values{{value1, NULL}, NULL} }
      Note {key, Values{{value1, NULL}, NULL} }
      ...
 }

A Note keep a note name plus a list of values/ACLs. The Acls used to
select the correct value for a Note.
It is designed to store configuration not a note pair.

Moreover to get the Value of a Note a new method added the
Notes::firstValue() to retrieve the first value of the list. It is not
so cool. It is a little confusing...

>
> ... which is clearly a list of kv-pairs without any specific information
> about delimiter, display semantics, or input source location. Nowhere
> in ths implementation or documentation is it clear that this list
> structure is *only* ever to be used for parsing the config file. Rather
> the simplicity of the information stored implies strongly that it is
> created as re-usable storage structure.
> ***As was requested when you were planning the Notes feature.***

Of course it can be used to store kv-pairs. What I am saying is that the
NotePairs class is better structure to keep this list.

>> class NotePairs
>> -----------------
>> This is an HttpHeader kid class used to store the notes (or
>> MetaHeaders). We select HttpHeader class because it has some interesting
>> features. For example assume that two different subsystems add the same
>> note:
>> X-SQUID-ProcessedBy: ReWriter
>> X-SQUID-ProcessedBy: Adaptation
>>
>> Then this is will be merged to
>> X-SQUID-ProcessedBy: ReWriter, Adaptation
>
> Yes NotePairs is clearly and specifically a Mime syntax version of
> kv-pairs.
>
> The interesting and useful fetures are *only* useful when outputting
> kv-pairs as ICAP or HTTP headers in Mime format. Those features are not
> useful in any other context Squid uses kv-pairs. ... which is why the
> helpers avoid that type until the final logging stage where it has no
> choice.

So we are agree that the NotePairs class is more useful to keep kv-pairs
at least for the ICAP/HTTP headers. But this is does not mean that it
can not be used for storing helpers kv-pairs.
It is just better to use the same structures for storing similar
information. It is better to just use NotePairs for storing notes,
because it can handle more cases (kv-pairs, ICAP/eCAP metaheaders and
maybe other type of notes in the future)

Also storing all notes in the same object maybe it is good policy
because they can be used in other than the logging cases.

Currently all notes merged just before logging because they are used
only here.

>> Squid helper Notes problems
>> -------------------------------
>> The changes made in trunk revno:12490 and revno:12495
>>
>> Problems I am seeing:
>> - it uses class Notes to store the notes/metaHeaders not the NotePairs.
>> It adds the "Notes HelperReply:notes" and the "Notes
>> HttpRequest::helperNotes" members.
>
> This is intentional. see above.
>
> The first implementation attempt used a NotePairs list for helpers. The
> exact same behaviour of merging keys together which make NotePair
> wonderful for ICAP header display and logging display make this NotePair
> type difficult to use as a kv-pair list for helpers and any other use
> where HTTP field-value syntax is needed.

Why?

> The other difficulty was the whole or at least a large segment of the
> HTTP parser becomes a dependency, which was not worth the cost and
> effort for the simplicity of NotePairs. This was probably not noticable
> when integrating with ICAP due to it already having that dependency in
> place for ICAP header parsing.

You do not have to use any HTTP parser. You can just add a kv-pair (a
key/value pair), simply using the NotePairs::putExt(key, value).
Why the Notes::add(key, value) is better?

>
> The need for multiple de-references ( A->value[X].value()->value ) when
> accessing NotesList is accepted as a cost (for now) in exchange for the
> benefit of *not* merging key values together and not having to link in
> the HTTP parser.

I did not found such cases in the existing code.
What I am seeing is the following usage:
 Note::Pointer statusNote = reply.notes.find("status");
  ...
 const char * result = statusNote->firstValue();
 status = (http_status) atoi(result)
 ...
 urlNote = reply.notes.find("rewrite-url");
 if (urlNote != NULL && strcmp(urlNote->firstValue(), http->uri)
   ....

The above if we use NotePairs instead of Notes can be easily written as
follows:
 String status = reply.notes.getByName("status");
 status = (http_status) atoi(status.termedBuf())
 ...
 status urlNote = reply.notes.getByName("rewrite-url");
 if (!urlNote.empty() && strcmp(urlNote->termedBuf(), http->uri)

Moreover we can add a member in NotePairs the
  const char *NotePairs::find(const char *key)
to simplify it more.

Also we have an interface in configured Notes to deny user to use some
names for notes which are reserved. We can add here the helper reserved
names like "status", "rewrite-url" etc.

Maybe we should require by the user to use always the X- prefix in his
kv-pairs keys. For example the "status" is a key which can be selected
by anyone. The user is better to use the "X-Status" as kv-pair key.

> If you are solving problems in the Notes structure systems please
> resolve that dereference issue.
>
>>
>> - It adds some extra methods which does not make sense and confuse more
>> the Notes interface:
>> * NotePairs::append(const Notes::NotesList &src)
>> * NotePairs::append(const NotePairs *src)
>> These functions used to add a Notes list to a NotePairs List. The
>> HttpRequest::helperNotes to AccessLogEntry::notes.
>
> By 'add' you mean 'append list S to this list'. Which is why it is
> called 'append()'. Where is the problem? naming, semantics and operation
> conform to standard conventions.

The problem is the semantics here. A Notes list holds many informations
which there is not any way to add them in a NotePairs list. It holds a
list of "key plus a value/ACL list", not just a key/value pair list.
Only that.

> The first above converts a NotesList into a NotesPairs list.
> Specifically because the NotePairs list is hard-coded to utilize the
> MIME format key-value pairs with colon (:) delimiter - which in Squid
> are stored in HttpHeader class types.

The HttpHeader holds a list of HttpHeaderEntry which consist my a name
and a value. It is a key/value pair.

>
> The second is added such that any existing AccessLogEntry list can be
> appended to by other sources producing NotePairs lists (ie RESPMOD).
>
>> But It is easier to use NotePairs for HttpRequest::helperNotes member
>> and just use the (pre-)existing NotePairs::append (the
>> HttpHeaders::append)
>
> Causing the need to covert a key=value string pair into an HttpHeader
> object. Causing the need to define said HttpHeader as a custom header
> and fill out its value set. All of which adds HTTP protocol operations
> (and validity checks) for the sake of parsing the very non-HTTP input
> "OK foo=hello".

If the HttpHeader is not a good structure to base the NotePairs we
should use a different structure. But I still believe that the cost
adding a kv-pair is not so huge...

Also HttpHeader/HttpHeaderEntry uses mempools, which is a plus.

>
> Not to mention that "foo_=hello" (_ being defined for use by custom key
> names) is an invalid HTTP header name rejected by the HttpHeader parser
> infrastructure, and I have no intention of altering that HTTP-specific
> parser (potentially affecting on-wire HTTP validity checks) for the sake
> of unrelated code that does not even use HTTP or Mime syntax.

The HTTP headers parser will not be used. The foo_ will be used as key
name without any problem.
But even if there is problem using "_" in the kv-pair keys and it is so
important to use it in key names we should rewrite the NotePairs class
to support it.

>
>> * Notes::add(const String &noteKey, const String &noteValue);
>> * Notes::add(const Notes &src);
>> * Notes::find(const String &noteKey)
>> To add and search for a note in Notes List. But it is wrong. We do
>> not store such Notes here. We are storing Notes configuration, while
>> parsing the cfg file.
>
> So we have a linked list of key-value pairs writen with an generic
> interface which is not allowed to be used anywhere outside of Parsing?
> That goes against the design requirements I requested back at the
> beginning. That Notes and NotesList be a generic list of kv-pairs we
> could add to arbitrarily and output with various format delimiters.

Again read my comments above. The Note does not keep only a key/value
pair. Of course it can be used but it is not so normal, it is designed
to store other information ...

>
> We have four desired uses for NotesList in Squid:
>
> 1) [A-Z]key ':' ' ' value - I/O for ICAP in Mime header format
> 2) key '=' value - I/O for ICAP or HTTP in HTTP header field-value syntax
> 3) key '=' value - helper I/O syntax
> 4) key ' ' value - squid.conf 'notes' directive syntax
>
> NP: (1) has limitations on key character set, and optional (desired 1
> SP) whitespace after the ':' delimiter. The others do not.

Nor the ":" nor the space stored in HttpHeader list, only the Header
name and the header value.

>
>
>> We should use NotePairs for HttpRequest::helperNotes and
>> HelperReply::notes members and use the NotePairs::putExt or
>> NotePairs::addEntry(new HttpHeaderEntry(HDR_OTHER, name, value)); to
>> adding note/value pairs.
>>
>> - Initially we had select the AccessLogEntry to store notes. The
>> AccessLogEntry selected because it considered as good structure to hold
>> HTTP request X-data. (But we had problem on this).
>> I am not sure if the HttpRequest::helperNotes is the correct position to
>> add Notes. We should consider if we can add notes to
>> AccessLogEntry::notes, or select an other X-data structure.
>
> HttpRequest is currently abused as a transport mechanism to relay the
> notes through to AccessLogEntry, which is not directly accessible from
> the helper handlers.
> If ALE can be made accessible to both helpers and ACL match system in a
> useful way the HttpRequest member can be removed.

We need to select which structure will/can play the role of the X-data.
Of course I agree we can hold all required information in HttpRequest
object.

>
> NP: adding straight to the ALE list of NotePairs adds an extra problem,
> that the kv-pair is now *copied* into Mime syntax. Once the note has
> been converted to Mime header syntax it becomes much more difficult to
> process the list and match a single kv-pair out of a set of identical
> ones. ACLs for example do not have access to the ALE object to test
> particular key values and even if they did they can only retrieve the
> entire list of all values associated with each key name.
>
>> I am working on changing the above problems. Also maybe we need to
>> change the names of some classes to avoid confusion in the future, for
>> example rename Note and Notes to NoteRule and NoteRules or similar.
>>
>> Any opinions on the above?
>
> When writing the helper interface use of Notes the problem was only that
> Notes* classes were hard-coded for a specific use-case output.
>
> We had long discussions during the design phase and auditing about using
> Notes as both header format "Key:value" and field format "key=value" - I
> was under the impression that you had taken those usage requirements on
> board and designed Note class which clearly takes a Key and Value
> separately (without any additional details like ACLs!).

We may had select wrong name for Note and Notes class. But this is the
only problem. Please see my comments above, and my first mail of this
thread which explains the classes.

>
>
> Other than the link with ACLs, class Note is fine for use as a generic
> kv-pair node.
> Other than your documentation of it, class NotesList is reasonably okay
> for use as a generic list of Note kv-pair. One problem is the use of
> std::list<Node::Pointer> which causes a double-indirection layer when
> trying to access any node position of the list.
> For example: list->values[0]->value().keyValue() instead of just
> list[0]->keyValue().
>
>
> To resolve what you see as an API abuse "problem" I think all that is
> needed:
> 1) make Note the base type representing just a kv-pair
> 2) make a sub-class associating ACLs as a config specific sub-type of it.
> 2) make a "Packer" for NotesList that outputs in each of the earlier
> mentioned syntax formats.
>
>
>
> To resolve the indirection problems in NotesList I propose adding a
> template to Squid sources which presents the *next pointer and
> linked-list accessors and iterators. Such that any class can inherit
> from it to be a node in a linked-list.
> We can then unify all the classes and structures which are performing
> their own linked-list implementations into inheritence from this template.
> - NotesList would then be a class inheriting from ListNode<> and Note.
>
> Sound reasonable?

Nope.
I still believe that the best is something like the following:
 - Rename Note and Notes to NoteRule and NoteRules, to avoid confusion
in the future.
 - Maybe rename NotePairs to Notes, for the same reason.
 - Use NotePairs for HttpReply::notes and HttpRequest::helperNotes members
 - If the HttpHeaders is not a good choice as base class of NotePairs we
should re-implement the NotePairs to fit our needs. But I still believe
that it is not so bad ...

>
> Amos
>
Received on Sat Jan 26 2013 - 19:27:27 MST

This archive was generated by hypermail 2.2.0 : Sun Jan 27 2013 - 12:00:14 MST