Re: [squid-users] Squid with PHP & Apache

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 26 Nov 2013 16:30:01 +1300

On 26/11/2013 10:13 a.m., Ghassan Gharabli wrote:
> Hi,
>
> I have built a PHP script to cache HTTP 1.X 206 Partial Content like
> "WindowsUpdates" & Allow seeking through Youtube & many websites .
>

Ah. So you have written your own HTTP caching proxy in PHP. Well done.
Did you read RFC 2616 several times? Your script is expected to obey
all the MUST conditions and clauses in there that discuss a "proxy" or "cache".

NOTE: the easy way to do this is to upgrade your Squid to the current
series and use ACLs on the range_offset_limit directive. That way Squid
will convert Range requests to normal fetch requests and cache the
object before sending the requested pieces of it back to the client.
http://www.squid-cache.org/Doc/config/range_offset_limit/
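
As a rough illustration of that "fetch everything, then hand back the
requested piece" idea (this is not Squid's code), the sketch below assumes
the whole object is already in the cache and answers a single-range request
from the stored bytes. The name serve_range and the restriction to one
"bytes=" range are assumptions made for the example.

```python
def serve_range(body: bytes, range_header: str):
    """Answer a single 'bytes=' Range request from a fully cached object.

    Returns (status, headers, payload). Hypothetical helper, for illustration.
    """
    total = len(body)
    whole = (200, {"Content-Length": str(total)}, body)

    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec or "-" not in spec:
        # Unknown unit or multiple ranges: just send the whole object.
        return whole

    first, _, last = spec.strip().partition("-")
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the object.
            start, end = max(total - int(last), 0), total - 1
        else:
            start = int(first)
            end = min(int(last), total - 1) if last else total - 1
    except ValueError:
        # Malformed numbers: ignore the Range header.
        return whole

    if start >= total or start > end:
        # Nothing in the object matches the requested range.
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    payload = body[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(payload)),
    }
    return 206, headers, payload
```

Note that the 206 reply carries the length of the piece in Content-Length
and the size of the whole object only in Content-Range.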

> I am willing to move from PHP to C++ hopefully after a while.
>
> The script is almost finished, but I have several questions. I have no
> idea if I should always grab the HTTP Response Headers and send them
> back to the browsers.

The response headers you get when receiving the object are metadata
describing that object AND the transaction used to fetch it AND the
network conditions/pathway used to fetch it. The cache's job is to store
those along with the object itself and deliver only the relevant headers
when delivering a HIT.

>
> 1) Does Squid still grab the "HTTP Response Headers", even if the
> object is already in cache or Squid has already a cached copy of the
> HTTP Response header . If Squid caches HTTP Response Headers then how
> do you deal with HTTP CODE 302 if the object is already cached . I am
> asking this question because I have already seen most websites use
> same extensions such as .FLV including Location Header.

Yes. All proxies on the path are expected to relay the end-to-end
headers, drop the hop-by-hop headers, and MUST update/generate the
feature negotiation and state information headers to match their own
capabilities in each direction.
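
A minimal sketch of the "drop the hop-by-hop headers" part, assuming the
headers arrive as a list of (name, value) pairs. The function name
end_to_end_headers is made up for this example, and the sketch does not
attempt the update/generate step (Via and so on).

```python
# Hop-by-hop headers as listed in RFC 2616 section 13.5.1, lower-cased.
# The RFC's list says "Trailers"; the real header name is "Trailer",
# so both spellings are included here.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "trailer", "transfer-encoding", "upgrade",
}


def end_to_end_headers(headers):
    """Return only the headers a proxy should relay onward.

    `headers` is a list of (name, value) pairs as received.
    """
    # Any header named inside Connection is hop-by-hop for this message too.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            for token in value.split(","):
                token = token.strip().lower()
                if token:
                    listed.add(token)

    drop = HOP_BY_HOP | listed
    return [(name, value) for name, value in headers if name.lower() not in drop]
```

A stored 302 keeps its Location header this way, because Location is an
end-to-end header.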

>
> 2) Do you also use mime.conf to send the Content-Type to the browser
> in case of FTP/HTTP or only FTP ?

Only FTP and Gopher *if* Squid is translating from the native FTP/Gopher
connection to HTTP. HTTP and protocols relayed using HTTP message format
are expected to supply the correct header.

>
> 3) Does squid compare the length of the local cached copy with the
> remote file if you already have the object file or you use
> refresh_pattern?.

Content-Length is a declaration of how many payload bytes follow
the response headers. It has no relation to the server's object except in
the special case where the entire object is being delivered as payload
without any encoding.
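
To make that concrete, here is a small sketch of how a cache might work out
the size of the whole object from one response. It assumes the headers are
in a dict with lower-cased names; full_object_length is a hypothetical
helper, not something Squid provides.

```python
import re


def full_object_length(status, headers):
    """Best guess at the size of the whole object, or None if unknown.

    `headers` is a dict keyed by lower-cased header name.
    """
    if status == 206:
        # Content-Length only covers the piece being sent. The object size,
        # if the server states it, is the part after "/" in Content-Range.
        value = headers.get("content-range", "").strip()
        match = re.fullmatch(r"bytes\s+\d+-\d+/(\d+|\*)", value)
        if match and match.group(1) != "*":
            return int(match.group(1))
        return None

    if status == 200:
        # Only trust Content-Length when no encoding has been applied.
        if "content-encoding" in headers or "transfer-encoding" in headers:
            return None
        length = headers.get("content-length", "").strip()
        if length.isdigit():
            return int(length)

    return None
```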

>
> 4) What happens if the user modifies a refresh_pattern to cache an
> object, for example .xml which does not have [Content-Length] header.
> Do you still save it, or would you search for the ignore-headers used
> to force caching the object and what happens if the cached copy
> expires, do you still refresh the copy even if there is no
> Content-Length header?

refresh_pattern does not cause caching of any objects. What it does is
tell Squid how long an object is valid for before it needs to be
revalidated or replaced. In some situations this can affect the caching
decision, but in most it only affects expiry.

Objects without content-length are handled differently by HTTP/1.0 and
HTTP/1.1 software.

When either end of the connection is advertising HTTP/1.0 the sending
software is expected to terminate the TCP connection on completion of
the payload block.

When both ends advertise HTTP/1.1 the sending software is expected to
use Transfer-Encoding: chunked in order to keep the connection alive,
unless the client sent Connection: close.
The HTTP/1.0 behaviour is also acceptable when both ends are HTTP/1.1,
but it costs performance due to TCP connection churn and setup overhead.
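
The rules above can be sketched roughly as follows. Both function names are
invented for the example, and versions are passed as plain strings such as
"1.0" and "1.1"; treat it as an outline, not a complete implementation.

```python
def framing_without_length(sender_version, receiver_version,
                           receiver_sent_close=False):
    """Pick how to mark the end of a body that has no Content-Length."""
    both_http11 = sender_version == "1.1" and receiver_version == "1.1"
    if both_http11 and not receiver_sent_close:
        return "chunked"   # connection can stay open afterwards
    return "close"         # end of body is signalled by closing TCP


def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            # A zero-length chunk would end the body early, so skip it.
            continue
        out += b"%x\r\n" % len(piece)   # chunk size in hexadecimal
        out += piece
        out += b"\r\n"
    out += b"0\r\n\r\n"                 # last-chunk marker, no trailers
    return bytes(out)
```

For example, encode_chunked([b"hello", b" world!"]) produces
b"5\r\nhello\r\n7\r\n world!\r\n0\r\n\r\n".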

>
> I am really confused with this issue , because I am always getting a
> headers list from the internet and I send them back to the browser
> (using PHP and Apache) even if the object is in cache.

I am really confused about what you are describing here. You should only
get a headers list from the upstream server if you have contacted one.

You say the script is sending to the browser. This is not true at the
HTTP transaction level. The script sends to Apache, Apache sends to
whichever software requested from it.

In what order did you chain the browser, Apache and Squid?

  Browser -> Squid -> Apache -> Script -> Origin server
or,
  Browser -> Apache -> Script -> Squid -> Origin server

Amos
Received on Tue Nov 26 2013 - 03:30:08 MST
