Re: [squid-users] Can url_rewrite_program determine the referer?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Mon, 17 Oct 2011 17:03:31 +1300

On 17/10/11 15:51, dustfinger x wrote:
> On Sun, Oct 9, 2011 at 4:43 PM, Amos Jeffries<squid3_at_treenet.co.nz> wrote:
>> On Sun, 9 Oct 2011 16:07:00 -0600, dustfinger x wrote:
>>>
>>> On Sun, Oct 9, 2011 at 12:19 PM, Diego Woitasen
>>> <diego_at_woitasen.com.ar> wrote:
>>>>
>>>> On Sun, Oct 9, 2011 at 11:10 AM, dustfinger x
>>>> <dustfinger_at_muddymukluk.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> The system is using Squid version 3.1.8.
>>>>>
>>>>> I have configured squid to use a url_rewrite_program that redirects
>>>>> users to the company portal sites under certain circumstances. The
>>>>> problem is that the portal sites reference external content and the
>>>>> external content URIs are also being rewritten. Is there any way for
>>>>> me to determine if a particular URI has a referer? It would be ideal if
>>>>> I could determine the referer's domain name, but even just knowing if
>>>>> the URI request has a referer would be helpful.
>>>>>
>>>>> Sincerely,
>>>>>
>>>>> dustfinger.
>>>>>
>>>>
>>>> Use "acl referer_regex" and url_rewrite_access. For example:
>>>>
>>>> acl intranet_ref referer_regex -i intranet\.com
>>>> url_rewrite_access deny intranet_ref
>>>> url_rewrite_access allow all
>>>>
>>>> Regards,
>>>> Diego
>>>>
>>>>
>>>> --
>>>> Diego Woitasen
>>>>
>>>
>>> Hi Diego,
>>>
>>> Thank you very much for your response. The challenge that I face is
>>> that there will probably be a lot of referred-to domains and these may
>>> potentially change over time. If it is possible to somehow determine
>>> the referrer from the url_rewrite_program, then that would be ideal.
>>> Your solution is not totally out of the question, but it would be a
>>> maintenance issue for me.
>>>
>>> I suspect that what I want to do is simply not supported, but I did
>>> read one user's post that he was able to pass the referrer to the
>>> redirector using user variables. The poster did not detail how he went
>>> about this though.
>>
>> Sounds way too complex. You can use an external ACL in url_rewrite_access
>> to make the ACL checks in real time against an arbitrary data source.
>> Usage is the same as in http_access.
>>
>> Amos
>>
>
> Hi,
>
> Correct me if I am wrong, but in your suggested solution I would still
> have to know in advance all of the domain names that I wanted to
> redirect, or all of the domain names that I do not want to redirect.
> Is it possible to use your suggested solution in the following
> scenario:
>
> Suppose that I have in a database a list of URIs that I would like to
> allow access to. Consider one of these URIs and let's refer to
> that URI as URI_A. It turns out that URI_A contains content that is
> hosted by a domain not contained in our database of URIs that we
> would like to allow access to. Let's refer to the referred-to URI as
> URI_UNLISTED. Now when a user requests content from URI_A, any of the
> content that is referred to by URI_A, but is hosted by URI_UNLISTED,
> is redirected. The result is that none of the referred-to content will
> be returned to the client and that is not the behavior that I am
> looking for.

If I understand that right you want:
  * unknown page 'URI_A' which is pointing *to* one of your acceptable
URLs to be automatically accepted.
  * AND you want all other references that page makes (URI_UNLISTED) to
be also automatically accepted.

Assuming that is correct, there are two major problems:
  1) what if URI_A refers to a URI_UNLISTED before it refers to the
acceptable URL?
  2) what if a malicious person adds a reference to one of your acceptable
URLs to a page you actually want blocked?

I hope I misunderstood your above paragraph. The one below is clearer
and seems to describe a safer set of requirements.

>
> This is what I am looking for. If a client makes a direct request to
> URI_UNLISTED, then I would like to redirect that request by rewriting
> the URI. If the client makes a request to URI_A, and URI_A refers to
> URI_UNLISTED, then I would like all of the content to be accessible,
> with no URI rewriting. That is, the request to URI_UNLISTED is
> accepted since it is being referred to by a URI that is in our
> database.
>
> I know that if I could gain access to the referrer in the
> url_rewrite_program, then I could achieve this behavior.
>
> Does anyone know how I could achieve the behavior that I have described?

You don't need a URL-change helper. You need an access control helper.

This is how I would code up the config and script to meet your scenario
requirements:

  # external ACL helper, to determine if this URL request is acceptable.
  # It may be a read-only lookup, or it may add things to the database live.
  # It can have whatever side-effects you desire.

  external_acl_type urlTest %URI %{Referer}>h /path/to/script

  acl urlIsOkay external urlTest

  # redirect to this URL if the requested URL is bad.
  deny_info http://example.com/badurl.html urlIsOkay

  # deny the bad URL requests.
  # NOTE: using allow here will not trigger the redirect above.
  http_access deny !urlIsOkay

  # the regular access rules
  http_access allow localnet
  http_access deny all

You risk malicious persons adding forged Referer: headers to their
requests in order to get past your access controls. This is a standing
risk of depending on the Referer in any security system, and you need to
be aware of it.

See http://www.squid-cache.org/Doc/config/external_acl_type for details
on the external ACL directive and its parameters. What I wrote above was
an example and you will need to tune things further.
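
As a rough illustration of what /path/to/script could look like, here is
a minimal Python sketch. It assumes the helper receives one line per
request holding the URL and then the Referer value (the order used in the
external_acl_type line above), that a missing Referer arrives as "-",
that values may be URL-escaped, and that helper concurrency is off so
there is no channel-ID prefix. ALLOWED_HOSTS and its host names are
hypothetical stand-ins for your database lookup; check the
external_acl_type documentation for the format codes and protocol
options that apply to your Squid version.

```python
#!/usr/bin/env python3
"""Sketch of an external ACL helper: OK if the URL or its Referer is listed."""
import sys
from urllib.parse import unquote, urlsplit

# Hypothetical allow-list; a real helper would query the database instead.
ALLOWED_HOSTS = {"portal.example.com", "intranet.example.com"}


def host_of(url):
    """Return the lower-cased host of a URL, or '' if none can be parsed."""
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""


def decide(line):
    """Return 'OK' if the requested URL or its Referer is on the allow-list."""
    fields = line.split()
    if not fields:
        return "ERR"
    url = unquote(fields[0])
    referer = unquote(fields[1]) if len(fields) > 1 else "-"
    if host_of(url) in ALLOWED_HOSTS:
        return "OK"
    # Assumption: "-" means the request carried no Referer header.
    if referer != "-" and host_of(referer) in ALLOWED_HOSTS:
        return "OK"
    return "ERR"


def main():
    # Answer one line per request; flush so Squid is not left waiting.
    for line in sys.stdin:
        sys.stdout.write(decide(line) + "\n")
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

With this sketch, a direct request to an unlisted URL gets ERR and is
redirected by deny_info, while the same URL fetched with a Referer from a
listed host gets OK. The forged-Referer risk described above applies to
it unchanged.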

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.16
   Beta testers wanted for 3.2.0.13
Received on Mon Oct 17 2011 - 04:03:39 MDT