Re: parsing quoted-string HTTP header fields

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Fri, 27 May 2011 14:20:40 +0300

On 05/26/2011 11:13 PM, Alex Rousskov wrote:
> On 05/26/2011 10:20 AM, Amos Jeffries wrote:
>>>
>>> Implementation:
>>>
>>>> while (*pos != '"'&& len> (pos-start)) {
>>>> +
>>>> + if (*pos =='\r') {
>>>> + pos ++;
>>>> + if (*(pos++) != '\n' || (*pos != ' '&& *pos != '\t')) {
>>>
>>> Can the above double increment lead to *pos pointing beyond the string
>>> boundaries?
>>
>> Yes. A dangerous only to the debugs(). Which needs a --pos--
>>
>>>
>>> Will the above incorrectly accept CR x HT sequence, where "x" is any
>>> character other than LF?
>>
>> Exactly how dangerous are solo-CR floating around in quoted-string?
>
> I do not know, but my point is that we are going to "remove" or "delete"
> valid characters. For example,
>
> foo: "1\r2\t3"
>
> will put "13" into "val" instead of either rejecting the value or
> putting "1\r2\t3" or at least "1 2\t3" into "val".

The above part of code has problems (for example check for the end of
string), but does not have the problem you are describing here..

>
> The culprit is the "*(pos++) != LF || ... *pos != HT" expression, which
> means we compare different characters inside one expression. I am not
> sure that is intentional, and it seems to result in the first character
> after CR ("2" in my example) being replaced with a space.

Comparing different characters inside one expression was intentional, it
has some problems, but will not replace the first character after CR
with a space...

while (*pos != '"' && len > (pos-start)) {
+
+ if (*pos =='\r') {

//here we are going to skip the "\r" character

+ pos ++;
+ if (*(pos++) != '\n'
// here if the character after "\r" is not a "\n" the parsing will fail,
  else will check if the char after "\r\n" is a space.

                  || ( *pos != ' ' && *pos != '\t')) {
// if there is not a space after "\r\n" the parsing will fail.

+ debugs(66, 2, "failed to parse multiline quoted-string
header field near '" << pos << "'");
+ val->clean();
+ return 0;
+ }
+ pos++; // (pos-start) here may exceed len but we do not care...

  // we are here because we had a "\r\n " string so just replace this
string with a single space in val:
+ val->append(" "); //replace "\r\n " or "\r\n\t" with single
space
+ }
+

This part of code maybe can be written as:

if (*pos =='\r')
{
      if ( (pos-start) > len-3 || //not enough space
          *(pos+1) != '\n' || // not '\n' after '\r'
          (*(pos+2) != ' '&& *(pos+2) != '\t')){//not space after "\r\n"
         debugs(66, 2, "failed to parse multiline quoted-string header
field '" << start << "'");
         val->clean();
         return 0;
      }
      pos+=3;
      val->append(" ");
}

>
> HTH,
>
> Alex.
>
>
>
Received on Fri May 27 2011 - 11:21:04 MDT

This archive was generated by hypermail 2.2.0 : Fri May 27 2011 - 12:00:05 MDT