Re: parsing quoted-string HTTP header fields

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 26 May 2011 14:13:38 -0600

On 05/26/2011 10:20 AM, Amos Jeffries wrote:
>>
>> Implementation:
>>
>>> while (*pos != '"' && len > (pos-start)) {
>>> +
>>> + if (*pos == '\r') {
>>> + pos++;
>>> + if (*(pos++) != '\n' || (*pos != ' ' && *pos != '\t')) {
>>
>> Can the above double increment lead to *pos pointing beyond the string
>> boundaries?
>
> Yes. A dangerous only to the debugs(). Which needs a --pos--
>
>>
>> Will the above incorrectly accept CR x HT sequence, where "x" is any
>> character other than LF?
>
> Exactly how dangerous are solo-CR floating around in quoted-string?

I do not know, but my point is that we are going to "remove" or "delete"
valid characters. For example,

  foo: "1\r2\t3"

will put "13" into "val" instead of either rejecting the value or
putting "1\r2\t3" or at least "1 2\t3" into "val".

The culprit is the "*(pos++) != LF || ... *pos != HT" expression: because
of the post-increment, its two halves test different characters within a
single expression. I am not sure that is intentional, and it seems to
result in the first character after CR ("2" in my example) being replaced
with a space.
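
To illustrate what I mean, here is a rough sketch (not a patch) of a
check that looks at all three characters of a fold without moving pos and
without reading past the end of the buffer. The is_fold_at() name and the
"end" argument are made up for this example; the real code tracks "start"
and "len" instead, and I am assuming "end" would be start + len.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch under review.
 * Returns 1 if a line fold (CR LF followed by SP or HT) starts at pos,
 * and 0 otherwise. end points one past the last valid byte, and the
 * caller must guarantee pos <= end. pos is never modified here. */
static int
is_fold_at(const char *pos, const char *end)
{
    if (end - pos < 3)
        return 0; /* fewer than three bytes left: cannot be a fold */

    return pos[0] == '\r' &&
           pos[1] == '\n' &&
           (pos[2] == ' ' || pos[2] == '\t');
}
```

With something like this, the caller decides what to do when it sees a CR
that does not start a fold (reject the value or copy the CR as is), and it
advances pos by a known amount only after the check succeeds.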

HTH,

Alex.
Received on Thu May 26 2011 - 20:14:12 MDT