Re: POST procedure (squid 2.7 stable9) from Henrik Nordström on 2012-03-20 (squid-dev)

From: Henrik Nordström <henrik_at_henriknordstrom.net>
Date: Tue, 20 Mar 2012 12:24:54 +0100

Please keep discussions cc'd to the squid-dev list.

I have some issues with this:

1. What should the response to the POST look like? Normally the response
to a POST is generated by the requested web server and may contain pretty
much anything. Usual responses are an HTML web page or a redirect to some
page.

2. How would Squid know it's allowed to do this? It's a major deviation
from normal HTTP flow.

3. What does the POST request finally forwarded to the web server look
like?

4. What do you mean in 5? What is preserved and how, for what purpose?

5. In 2 you say the third request gets forwarded. In 3 you say it
receives a response immediately. Which is it?

When I asked you to describe the operation in HTTP terms, I meant HTTP
terms at the HTTP protocol level: the actual contents of requests and
responses in an imagined implementation of what you want to accomplish.

Regards
Henrik

On Tue, 2012-03-20 at 11:27 +0800, Schulz Xu wrote:
> I use the attached picture as an example to illustrate my idea at the
> HTTP level.
> 1. Three clients send POSTs to Squid. POST(1) and POST(2) have the
> same keyword in their HTTP request bodies. POST(3) doesn't have any
> keyword in its body.
> 2. Squid receives the requests. After examining the bodies, POST(1)
> and POST(2) are packed and then forwarded. POST(3) is forwarded
> directly as usual.
> 3. Each client, whether or not its request contains a keyword,
> receives the reply immediately as before.
> 4. When Squid receives a POST with a keyword, it saves the request in
> its own file system. After a certain time or once a certain capacity
> is reached, Squid packs the saved POSTs and forwards them to the web
> server.
> 5. Keywords are preserved in Squid.
>
>
> I hope I have made my point clear, and sincere thanks for your
> suggestion on illustrating ideas.
>
> On 19 March 2012 at 11:27 AM, Henrik Nordström
> <henrik_at_henriknordstrom.net> wrote:
> Please describe your goal in terms of HTTP methods and results, not
> abstract "twitter like". It will be very hard to help you if not
> understanding what you want to accomplish at the HTTP level.
>
> Discussing this based on twitter is pretty meaningless. It's not
> twitter (at least not today), and it also does not say anything about
> what the sequence of HTTP operations and their expected results may
> look like.
>
> You mention that the requests may be sent to the web server later. So
> what you want to implement is a POST queue?
>
> When will the POST queue be run?
>
> What response should be sent to the requesting clients? Normally they
> expect a response from the requested web server.
>
> What to do if the web server transaction then fails?
>
> Regards
> Henrik
>
> On Mon, 2012-03-19 at 08:54 +0800, Schulz Xu wrote:
> > My goal is simply to save the content of HTTP POSTs.
> > For example, when a user posts a tweet, the client sends a POST
> > request to Twitter.
> > If Twitter uses Squid as its proxy servers, Squid will usually
> > forward this HTTP request to the web server. The web server then
> > saves the content, which indicates a successful upload.
> > But what I want is to save the content of the HTTP POST (i.e. the
> > tweet) on Squid.
> > On receiving the POST, first analyze the request, then call
> > StoreIo... (I am trying to figure it out) to store the content in
> > the Squid file system.
> >
> >
> > I am doing this because I figure that many Twitter-like websites
> > are often confronted with a high influx of POST requests at certain
> > special moments (Super Bowl, etc.). If we manage to use the proxy
> > servers to preserve the POSTs rather than forward them to the web
> > server, we can largely reduce the pressure on the web servers, since
> > we distribute the requests across many proxy servers. The content
> > saved on Squid can be packed and then sent to the web server later.
> > I hope I have made my point clear.
> >
> > On 18 March 2012 at 9:59 PM, Henrik Nordström
> > <henrik_at_henriknordstrom.net> wrote:
> > On Sun, 2012-03-18 at 21:19 +0800, Schulz Xu wrote:
> > > Thanks for the tips. I traced back from
> > > ProcessRequest->clientAccessCheck->ClientFinishRewrite->ClientStoreURLRewrite->clientRedirect->clientAccCheckDone
> > > On the other hand, I followed the path from tryParse to
> > > clientCheckFollowXForwardFor. But other problems came up.
> > > 1. In clientCheckFollowXForwardFor, what is
> > > FOLLOW_X_FORWARDED_FOR?
> >
> >
> > It's a define set by configure, enabled by the
> > --enable-follow-x-forwarded-for configure option. If enabled, then
> > FOLLOW_X_FORWARDED_FOR is true and the related code gets compiled. If
> > not enabled, then it's false/undefined and the related code does not
> > get compiled.
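
A minimal sketch of how a configure-time define of this kind gates code at
compile time. This is not Squid's source: effective_client_addr() and its
parameters are hypothetical names, and the fallback #define only stands in
for the value a configure run would normally supply.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a real build gets this value from configure. The fallback
 * below models a build without --enable-follow-x-forwarded-for. */
#ifndef FOLLOW_X_FORWARDED_FOR
#define FOLLOW_X_FORWARDED_FOR 0
#endif

/* Hypothetical helper: picks the address to treat as the client.
 * The X-Forwarded-For branch exists in the binary only when the
 * define is true; otherwise the preprocessor drops it entirely. */
const char *effective_client_addr(const char *peer_addr, const char *xff_header)
{
    assert(peer_addr != NULL);
#if FOLLOW_X_FORWARDED_FOR
    if (xff_header != NULL && xff_header[0] != '\0')
        return xff_header;
#else
    (void)xff_header; /* unused when the feature is compiled out */
#endif
    return peer_addr;
}
```

Building the same file with -DFOLLOW_X_FORWARDED_FOR=1 compiles the header
branch in, which mirrors the enabled/disabled distinction described above.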
> >
> > > 2. No matter how, the procedure finally leads to
> > > clientAccessCheck-->aclNBCheck-->aclCheck--> and then I am kind of
> > > lost in the code: I couldn't find the way from function aclCheck to
> > > function clientAccessCheckDone (which I think is the process).
> >
> >
> > aclNBCheck calls the given callback on completion, i.e.
> > clientAccessCheckDone in the case of clientAccessCheck().
> >
> >     aclNBCheck(http->acl_checklist, clientAccessCheckDone, http);
> >
> > processes the given checklist, then calls clientAccessCheckDone with
> > the result and http as arguments.
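
The callback hand-off described above can be sketched as follows. This is a
toy model, not Squid's implementation: nb_check(), checklist, request_state
and access_check_done() are hypothetical stand-ins, and the sketch invokes
the callback synchronously, whereas a non-blocking check may complete later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives the check result and the opaque
 * pointer that was passed in when the check was started. */
typedef void check_done_cb(int answer, void *data);

/* Hypothetical checklist: the "evaluation" is just a stored verdict. */
typedef struct {
    int allow;
} checklist;

/* Hypothetical per-request state, playing the role of "http". */
typedef struct {
    int allowed;
    int done;
} request_state;

/* Stand-in for a non-blocking check: process the checklist, then call
 * the given callback with the result and the caller's data pointer. */
void nb_check(checklist *cl, check_done_cb *callback, void *data)
{
    assert(cl != NULL && callback != NULL);
    callback(cl->allow, data);
}

/* Completion handler: this is where processing continues, which is why
 * no direct call to it appears in the checking code itself. */
void access_check_done(int answer, void *data)
{
    request_state *req = data;
    req->allowed = answer;
    req->done = 1;
}
```

The point of the pattern is that the caller of nb_check() never names the
next step directly; control reaches access_check_done() only through the
function pointer, so following the code by call graph alone misses it.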
> >
> > > 3. If I am about to INTERCEPT the POST request and manage to store
> > > the body content in the file system (I'll start with ufs but I hope
> > > that I can utilize COSS),
> >
> >
> > ufs and coss use the same interface; only the implementations differ.
> >
> > > what procedure do you think I should follow? At the moment, I'd like
> > > to examine the code in processReq, because usually when tagged as
> > > MISS, Squid will be asked to forward the POST. However, can you
> > > tell me a simple procedure to store the content in Squid?
> >
> >
> > I am not entirely sure what you want to do with stored POST data.
> >
> > What is it from an HTTP layer point of view that you want to
> > accomplish? Please describe your goal in terms of HTTP requests and
> > their responses.
> >
> > Regards
> > Henrik
> >
> >
> >
> > >
> > >
> > > Thank you (and for your time)
> > >
> > > Schulz
> > >
> > >
> > > On 16 March 2012 at 9:33 PM, Henrik Nordström
> > > <henrik_at_henriknordstrom.net> wrote:
> > > On Thu, 2012-03-15 at 18:40 -0700, Schulz wrote:
> > >
> > >
> > > > Simply speaking, when dealing with a POST request, Squid mainly
> > > > uses readRequest-->tryParseHttp-->parseHttpReq (in which
> > > > urlParse is used to find the method).
> > >
> > >
> > > That's the general request parsing path, parsing the request into
> > >
> > >     Request line
> > >     Request headers
> > >     Body pipe (not parsed, only transferred)
> > >
> > > This is the same for all requests, with POST no different from
> > > GET/HEAD/OPTIONS/PUT/DELETE/whatever. Even CONNECT gets parsed the
> > > same way.
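
A minimal sketch of the three-part split described above, assuming a
complete request already sits in one buffer. parsed_request and
parse_request() are hypothetical names, not Squid functions, and the
request-line handling is deliberately naive. The body is only located,
never inspected, to mirror "not parsed, only transferred".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type: the request line is split into fields, the
 * headers and body are only located inside the original buffer. */
typedef struct {
    char method[16];
    char target[256];
    char version[16];
    const char *headers; /* first byte after the request line */
    const char *body;    /* first byte after the blank line */
} parsed_request;

/* Returns 0 on success, -1 if the request line or header block is
 * incomplete. The same code path serves every method. */
int parse_request(const char *raw, parsed_request *out)
{
    const char *line_end;
    const char *hdr_end;

    assert(raw != NULL && out != NULL);
    line_end = strstr(raw, "\r\n");
    hdr_end = strstr(raw, "\r\n\r\n");
    if (line_end == NULL || hdr_end == NULL)
        return -1;
    if (sscanf(raw, "%15s %255s %15s",
               out->method, out->target, out->version) != 3)
        return -1;
    out->headers = line_end + 2;
    out->body = hdr_end + 4;
    return 0;
}
```

Nothing in parse_request() branches on the method, so a POST, a GET and a
CONNECT all take the same route through it.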
> > >
> > >
> > > > When parseHttpReq has retrieved all the useful info, tryParse
> > > > would then react according to the http. But tryParse only
> > > > returns the bytes it used to readRequest. Where is the following
> > > > procedure?
> > >
> > >
> > > When the request is parsed it's added to the list of pending
> > > requests on the connection and handed off to the chain of access
> > > controls.
> > >
> > > In Squid-2 the access control chain starts with
> > > clientCheckFollowXForwardedFor() and ends in clientProcessRequest(),
> > > where it checks if it's a cache hit etc.
> > >
> > > Then if the request needs to be forwarded (which is always the
> > > case for POST) it calls fwdStart(), which does a bit of a dance to
> > > figure out the right path for forwarding the request, opens a
> > > connection and hands off to the respective protocol handler, i.e.
> > > http.
> > >
> > > Up to this point the body of the POST request is not touched at
> > > all. Only the protocol handler accesses the body of the POST
> > > request.
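
The flow above can be modelled as a small pipeline. This is an illustrative
sketch only: every type and function name here is hypothetical, the stages
are reduced to trace markers, and the rule for non-POST methods (forward
only on a cache miss) is a simplifying assumption. What it is meant to show
is that the body is first touched in the protocol handler.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical request record for the toy pipeline. */
typedef struct {
    const char *method;
    const char *body;
    int cache_hit;   /* assumed outcome of the cache lookup */
    int body_reads;  /* how many stages read the body */
    char trace[128]; /* ordered list of stages visited */
} request;

static void note(request *r, const char *stage)
{
    strncat(r->trace, stage, sizeof(r->trace) - strlen(r->trace) - 1);
}

/* Only this stage looks at the body. */
static void protocol_handler(request *r)
{
    note(r, "http");
    if (r->body != NULL)
        r->body_reads++;
}

/* Stand-in for the forwarding step: pick a path, then hand off. */
static void forward_start(request *r)
{
    note(r, "fwd>");
    protocol_handler(r);
}

/* POST is always forwarded; other methods only on a cache miss. */
static int needs_forwarding(const request *r)
{
    return strcmp(r->method, "POST") == 0 || !r->cache_hit;
}

/* Parse, run access controls, decide, and forward if needed. */
void process_request(request *r)
{
    assert(r != NULL);
    note(r, "parse>");
    note(r, "acl>");
    note(r, "process>");
    if (needs_forwarding(r))
        forward_start(r);
}
```

In this model, code that wants to see a POST body before it is forwarded
has to hook in at or before the protocol handler stage, since nothing
earlier in the chain reads it.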
> > >
> > > Regards
> > > Henrik
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
>
>
>
>
>
Received on Tue Mar 20 2012 - 11:25:01 MDT
