Re: [squid-users] Jobs

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 30 Jun 2011 00:25:37 +1200

On 30/06/11 00:12, Mohsen Pahlevanzadeh wrote:
> On Wed, 2011-06-29 at 21:04 +1200, Amos Jeffries wrote:
>> On 29/06/11 19:28, Mohsen Pahlevanzadeh wrote:
>>> On Wed, 2011-06-29 at 02:32 +1200, Amos Jeffries wrote:
>>>> On 29/06/11 01:37, Mohsen Pahlevanzadeh wrote:
>>>>> On Tue, 2011-06-28 at 06:01 -0700, John Doe wrote:
>>>>>> From: Mohsen Pahlevanzadeh <mohsen_at_pahlevanzadeh.org>
>>>>>>
>>>>>>>>> We must write a program that, along with its normal tasks, does a
>>>>>>>>> variety of jobs. But I need to PURGE from and insert into the cache.
>>>>>>> I must write a web application that manages squid, with many extra jobs.
>>>>>>> Normal jobs: everything that squid can do.
>>>>>>> Variety of jobs: a range of tasks that my boss ordered.
>>>>>>
>>>>>> Amos question: What are these "normal tasks" and "variety of jobs"?
>>>>>> Your answer: extra job, normal job, a variety of job, a range of tasks...
>>>>>> Which does not answer the question at all...
>>>>>> Can you name the main tasks/jobs you need to do?
>>>>>> For example: start/stop/restart/reload squid, reset cache, purge/cache a url?
>>>>>> Graph statistics, etc...
>>>>>> I believe that for most of these, you do not need to play with the squid code...
>>>>>>
>>>>>> JD
>>>>> 1. PURGE from my program; but I can't call squidclient -m PURGE
>>>>> "blahblah" from my code.
>>>>
>>>> Sure you can. Several prefetchers just run exec("squidclient blah")
>>>>
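A minimal sketch of that exec()-style approach, assuming C and a Squid
listening on the default 127.0.0.1:3128 (the URL below is only a
placeholder):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Shell out to squidclient to send a PURGE for one URL through
         * the proxy on 127.0.0.1:3128. */
        int rc = system("squidclient -h 127.0.0.1 -p 3128"
                        " -m PURGE http://example.com/page.html");
        if (rc != 0)
            fprintf(stderr, "squidclient exited with status %d\n", rc);
        return rc == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
    }

Fine for one-off purges; forking a process per URL does not scale,
which is where the HTTP library mentioned below comes in.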
>>>>
>>>> But now that it is clear you are building a whole management app, not
>>>> just a prefetcher, an HTTP library would probably be the better way to
>>>> go: libcurl, or whatever the equivalent is for your chosen language.
>>> That has a problem. Suppose the server has 2000 or more concurrent
>>> requests; then squidclient would have to run 2000 times, which is wrong.
>>> So I must write a function that does the same as PURGE and call it from code.
>>>>
>>>>> 2. Insert into the cache.
>>>>
>>>> An HTTP GET request. Tricky, since you will have to figure out whether
>>>> the clients will be asking for plain or compressed copies.
>>>>
>>>> Far and away the best way to do this is simply not to bother doing it
>>>> at all. Squid is designed to do the work of figuring out where objects
>>>> are and how to get them to the client fastest.
>>>>
>>>> Inserting objects into the cache may _seem_ to be a good idea. But HTTP
>>>> is very complicated and there is a very good chance you will push the
>>>> wrong variants of each object into the cache.
>>>>
>>> Your suggestion is very nice, and I'll start on O'Reilly's "HTTP: The
>>> Definitive Guide". I know libcurl; it is a good idea that we make the
>>> request to squid and let squid itself do the fetching.
>>
>> My point earlier was to use libcurl for PURGE as well. No difference to
>> you between PURGE and GET other than the name.
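A rough libcurl sketch of that, assuming C; the host, port and URL are
placeholders, and PURGE must be permitted by your http_access rules:

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        CURL *curl;
        CURLcode res;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        curl = curl_easy_init();
        if (!curl)
            return 1;

        /* The object to evict from the cache. */
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/page.html");
        /* Send the request through Squid so the proxy sees the PURGE. */
        curl_easy_setopt(curl, CURLOPT_PROXY, "http://127.0.0.1:3128");
        /* The only real difference from a GET: the method name. */
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PURGE");

        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "PURGE failed: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res == CURLE_OK ? 0 : 1;
    }

One easy handle can be reused for many URLs, or driven through the
multi interface, so there is no process fork per request.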
>>
>>>
>>>>> 3. Concurrent retrieval of sites (minimum 100 sites).
>>>>
>>>> If by "site" you mean website. Squid is used by ISP. They have
>>>> accessible site numbers ranging in the high millions or billions. These
>>>> are all concurrently available to an ISP situation, so safe bet on that
>>>> requirement.
>>>>
>>>> If by "site" you mean visitor. One Squid routinely handles hundreds or
>>>> thousands of clients depending on your hardware specs. Or it may
>>>> overload the network on _one_ client requesting TB sized objects.
>>>>
>>>> You need to figure out a request/time-unit metric or a concurrent
>>>> connections metric and test that it is achievable with the desired
>>>> configuration. The squid config file is a mix of simple
>>>> on/off/value settings and a big script which tells Squid how to operate
>>>> on a request. Seemingly simple changes can easily raise or lower the
>>>> response speed by whole orders of magnitude.
>>>>
>>> We must do the following tasks:
>>> 1. Ability to download 100 sites (websites) concurrently from the
>>> Internet.
>>> 2. Ability to filter them; I know squid uses url_regex for this.
>>> 3. Ability to answer, as a proxy, a minimum of 1000 concurrent requests.
>>> 4. Response time for each request must be 100 milliseconds.
>>
>> Possible trouble.
>>
>> If you were setting up a reverse-proxy situation with a controlled
>> network between the proxy and web servers, that could be a realistic
>> goal. In those cases it is relatively easy to get a lot of requests
>> answered in the low dozens of milliseconds.
>>
>> However, your earlier requirement (1) indicates that your traffic will
>> be from the Internet. You will be 100% at the mercy of external
>> administrators when it comes to timing.
>>
>>> 5. Storage space and algorithm should be chosen so that they can expand
>>> to store 10,000,000 pages.
>>
>> Sounds like you are tasked with creating a whole new proxy system. Those
>> were all tasks _your app_ has to do, were they not?
>>
>> Amos
> Suppose I have a server A, which is a logger, and a server B, which is
> squid. Server A is connected to the Internet. Server A downloads pages
> and wants to deliver the pages to squid:3128, but I don't know whether
> squid can accept bare pages. How do I instruct server B to take them?
> --mohsen

There is no "push" operation in HTTP caching. Only "pull".

serverA needs to create a GET request and send it to serverB (squid).
serverB locates the real source and relays the GET on to it, and the
reply is cached by serverB on its way back through to serverA.
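A minimal sketch of that pull from serverA's side, assuming C with
libcurl; "serverB" and the URL are placeholders:

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        CURL *curl;
        CURLcode res;

        curl_global_init(CURL_GLOBAL_DEFAULT);
        curl = curl_easy_init();
        if (!curl)
            return 1;

        /* The page serverA wants Squid to fetch and cache. */
        curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/page.html");
        /* Route the GET through serverB; Squid does the real fetch from
         * the origin and stores the response as it relays it back. */
        curl_easy_setopt(curl, CURLOPT_PROXY, "http://serverB:3128");

        res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "GET failed: %s\n", curl_easy_strerror(res));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return res == CURLE_OK ? 0 : 1;
    }

Once that GET completes, later clients asking serverB for the same URL
get the cached copy, subject to the reply's cacheability headers.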

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.12
   Beta testers wanted for 3.2.0.9 and 3.1.12.3