Re: "concurrency" attribute and questions.

From: louis gonzales <linuxlouis_at_gmail.com>
Date: Sun, 5 Apr 2009 08:27:55 -0400

Amos,
Thanks for your responses, they make things clearer for me, so now I
can ask better questions :) What I'd like to do is have my Perl
helper fork as necessary, rather than starting a fixed number of
external_acl_type helper instances (children=50, children=100, or
some other N). A fixed number is inefficient: the number of users is
indeterminate, so 50 or 100 may be too few at some times and too many
at others.

What settings in the squid.conf line tell Squid the external helper
will fork to handle subsequent objects? Below is my current line:
external_acl_type eXhelperI children=1 %LOGIN %METHOD %{Host} /usr/lib/squid/eXhelper.pl

Since I set "children=1", only one "eXhelper.pl" starts up with Squid,
with the idea that "eXhelper.pl" forks child processes as
necessary. I'm still trying to determine what state information Squid
passes to the external helper besides %LOGIN/%METHOD. [ Below
you mentioned an ID token: are you referring to the %LOGIN ID token,
or something else? ]

I understand that Squid forks eXhelper.pl, which means Squid is the
parent process (ppid) of "eXhelper.pl". Ideally I'd like this single
child to fork subprocesses too. Currently I'm uncertain what input
trigger (or signal), if any, exists to have the single external helper
fork a subprocess to check the object, and how to ensure the "OK" or
"ERR" goes back to the correct calling "ID token".

Thank you again - I did write some comments below.

On Sun, Apr 5, 2009 at 6:05 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
> louis gonzales wrote:
>>
>> List,
>> 1) for the "concurrency" attribute does this simply indicate how many
>> items in a batch will be sent to the external helper?
>
> Sort of. The strict definition of 'batch' does not apply. Better to think of
> it as a window max-size.
Louis: So should I have the Perl helper "buffer" the data passed to
it, rather than reading line by line? If it should buffer, what are
the "start and end" identifiers?

>
> So from 0 to N-concurrency items will be passed straight to the helper
> before squid starts waiting for their replies to free-up the slots.
Louis: If objects 0 through j belong to one user's request and objects
j+1 through N belong to a different user's request (assuming
concurrency is set to N), how does the external helper differentiate
which set belongs to whom? I'm using the %LOGIN parameter, so I know
which userID, as authenticated by LDAP, is making the request. In
other words, after I've determined "OK" for 0 through j and "ERR" for
j+1 through N, that instance of the helper will need to return two
different values.

Basically, my helper checks each of the (URL/%LOGIN) pairs that Squid
passes against the ACLs in the PostgreSQL database. My use case
guarantees that the only end-user application will be a web browser.
With that assumption, when the end user opens www.foxnews.com, for
instance, there are a multitude of objects. So my specific question
is: when Squid goes to retrieve all of these objects for the
requesting user, does Squid a) with concurrency set high enough, send
all of these objects to the same external helper instance and await a
single "OK" or "ERR"? and b) with concurrency off, hand each object
one-to-one to an external helper instance and await "OK" or "ERR" for
each?

>
>>
>> 1.1) assuming concurrency is set to "6" for example, and let's assume
>> a user's browser session sends out "7" actual URL's through the proxy
>> request - does this mean "6" will go to the first instance of the
>> external helper, and the "7th" will go to a second instance of the
>> helper?
>
> 1-6 will go straight through probably with the IDs 1->6.
> #7 may or may not go straight through, depending if one of the first 6 was
> finished at that time.
Louis: is it ever possible with concurrency enabled, that objects from
two different users will enter into a single external helper instance?
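
The window behaviour Amos describes (requests 1-6 go straight through,
request 7 waits until a slot frees up) can be modelled roughly as
below. This is only a toy sketch for a single helper process, not
Squid's actual scheduler; the function name simulate_window and the
event tuples are made up for illustration.

```python
from collections import deque


def simulate_window(concurrency, events):
    """Toy model of a per-helper request window (not Squid's real code).

    events is a sequence of ("request", name) or ("reply", name) tuples.
    Returns a list of (step, name) pairs recording the step at which each
    request was actually handed to the helper.
    """
    in_flight = set()   # requests sent to the helper, reply still pending
    waiting = deque()   # requests held back because the window is full
    sent = []
    for step, (kind, name) in enumerate(events):
        if kind == "request":
            if len(in_flight) < concurrency:
                in_flight.add(name)
                sent.append((step, name))
            else:
                waiting.append(name)
        elif kind == "reply":
            in_flight.discard(name)
            # A reply frees one slot, so the oldest waiting request can go.
            if waiting:
                queued = waiting.popleft()
                in_flight.add(queued)
                sent.append((step, queued))
    return sent
```

With concurrency=6 and seven back-to-back requests, the seventh is
only handed over at the step where the first reply arrives. Nothing in
this model ties a slot to a particular user.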

>
>>
>> 1.1.1) Assuming the 6 from the first part of the batch return "OK" and
>> the 7th returns "ERR", will the user's browser session, render the 6
>> and not render the 7th?
>
> Depends entirely on how the ERR/OK results are used in squid.conf.
>
> (you might be denying on OK or allowing on ERR).

>
>>  More importantly, how does Squid know that
>> the two batches - one of 6, and one with 1, for the 7 total, know that
>> all 7 came from the same browser session?
>
> There is no such thing as a browser session to Squid.
>
> Each is a separate object. These 7 MAY happen to be coming from the same IP,
> but may be different software for all squid cares, or may come from more
> than one IP completely.
Louis: Right, but Squid obviously has to know which IP the request
came from in order to serve the page(s). So when the external helper
returns "OK" or "ERR", those results must surely trace back along the
path they came from to the correct requesting application (browser or
other).

>
>>
>> What I have currently:
>> - openldap with postgresql, used for my "user database", which permits
>> me to use the "auth_param squid_ldap_auth" module to authenticate my
>> users with.
>> - a postgresql database storing my acl's for the given user database
>>
>> Process:
>> Step1: user authenticates through squid_ldap_auth
>> Step2: the user requested URL(and obviously all images, content, ...)
>> get passed to the external helper
>> Step3: external helper checks those URL's against the database for the
>> specific user and then determines "OK" or "ERR"
>>
>> Issue1:
>> How to have the user requested URL(and all images, content, ...) get
>> passed as a batch/bundle, to a single external helper instance, so I
>> can collectively determine "OK" or "ERR"
>>
>> Any ideas?  Is the "concurrency" attribute to declare a maximum number
>> of "requests" that go to a single external helper instance?
>
> number of *parallel* requests the helper can process. Most helpers shipped
> with Squid are non-parallel (concurrency=1).

>
>>  So if I
>> set concurrency to 15, should I have the external helper read count++
>> while STDIN lines come in, until no more, then I know I have X number
>> in a batch/bundle?
>
> Depends on the language your helper is coded in. As long as it can process
> 15 lines of input in parallel without mixing anything up.
Louis: Perl. As I asked above, should I be "buffering" the
objects/data sent from Squid to the external helper, or reading line
by line? If buffering, how do I identify the "start and end" of each
unique object request's data?

>
> Looks like a perl helper, they can do parallel just fine with no special
> reads needed. But it must handle the extra ID token at the start of the line
> properly.
Louis: What is the ID token, if not the %LOGIN and %{Host} information?
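
To make the "extra ID token at the start of the line" concrete, here
is a minimal sketch of splitting one input line and building the
matching reply. It is written in Python for brevity, but the same few
steps carry over directly to a Perl helper. It assumes the format from
the squid.conf line above (%LOGIN %METHOD %{Host}) with concurrency
enabled, and it assumes field values arrive URL-escaped; parse_request
and format_reply are illustrative names, not part of Squid.

```python
from urllib.parse import unquote


def parse_request(line):
    """Split one helper input line into (channel_id, fields).

    With concurrency enabled, the first whitespace-separated token is the
    ID that Squid expects to see echoed back; the remaining tokens are
    the format fields (%LOGIN %METHOD %{Host} in this thread).
    Assumption: field values are URL-escaped, so they are unquoted here.
    """
    parts = line.rstrip("\r\n").split(" ")
    channel_id = parts[0]
    fields = [unquote(part) for part in parts[1:]]
    return channel_id, fields


def format_reply(channel_id, allowed):
    """Build the reply line: the same ID, then OK or ERR."""
    return f"{channel_id} {'OK' if allowed else 'ERR'}\n"
```

Each line is one complete request, so there is nothing to buffer and
no start/end marker beyond the newline. The ID is kept as a string and
echoed back untouched.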

>
>>
>> Obviously there is no way to predetermine how many URL's/URI's will
>> need to be checked against the database, so if I set concurrency to
>> 1024, "presuming to be high enough" that no single request will max it
>> out, then I can just count++ and when the external helper is done
>> counting STDIN readlines, I can process to determine "OK" or "ERR" for
>> that specific request?
>
> additional point to this:
>  the ttl=N option will cache the OK/ERR result for that lookup for N
> seconds. This can greatly reduce the number of tests passed back even
> further.
Louis: Thanks! This is good to know!!! :)

>
>>
>> Issue2:
>> I'd like to just have a single external helper instance start up, that
>> can fork() and deal with each URL/URI request, however, I'm not sure
>> Squid in its current incarnation passes enough information OR doesn't
>> permit specific enough passback (from the helper) information, to make
>> this happen.
>
> Squid passes an ID for each line of input. As long as the result goes back
> out the stdout of the helper that Squid itself forked, with that ID at the
> front, Squid does not care about the order of responses.
Louis: Does this mean Squid is in a "waitpid()" mode for the pending
external helper that it forked? Is it using a named pipe?
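
Putting Amos's points together (one ID per input line, replies in any
order, everything written to the stdout of the one helper Squid
started), a single concurrent helper could look roughly like the
sketch below. It is in Python and uses a thread pool in place of
fork(), which is a substitution and not what the Perl helper in this
thread does. check_access is a hypothetical stand-in for the
PostgreSQL ACL lookup, and the three-field format is assumed from the
squid.conf line above.

```python
import sys
import threading
from concurrent.futures import ThreadPoolExecutor


def check_access(login, method, host):
    """Hypothetical stand-in for the PostgreSQL ACL lookup."""
    return host != "blocked.example.com"


def serve(infile=sys.stdin, outfile=sys.stdout, workers=8, check=check_access):
    """Read 'ID login method host' lines and answer 'ID OK' or 'ID ERR'.

    Replies may be written in any order; the leading ID is what lets
    the caller match each reply to its request.
    """
    write_lock = threading.Lock()  # keep each reply line intact

    def handle(line):
        parts = line.split()
        if not parts:
            return  # blank line: nothing to answer
        channel_id, fields = parts[0], parts[1:]
        try:
            allowed = len(fields) == 3 and check(*fields)
        except Exception:
            allowed = False  # treat a failed lookup as a denial
        with write_lock:
            outfile.write(f"{channel_id} {'OK' if allowed else 'ERR'}\n")
            outfile.flush()  # unflushed replies would stall the caller

    # Leaving the with-block waits for all submitted lookups to finish.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in infile:
            pool.submit(handle, line)
```

A real helper script would end with a call to serve(). The lock and
the flush matter in any language: a forking Perl version would
likewise need every child to write a whole, flushed line to the
parent's stdout.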

>
> You will need to make sure your parallel children's stdout/stderr write to
> your parent helper's stdout/stderr. But it should be possible.
>
>
> Amos
> --
> Please be using
>  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE13
>  Current Beta Squid 3.1.0.6
>

-- 
Louis Gonzales
BSCS EMU 2003
HP Certified Professional
louis.gonzales_at_linuxlouis.net
Received on Sun Apr 05 2009 - 12:34:52 MDT
