On 7/19/08, André Warnier <aw@ice-sa.com> wrote:
> From a recent thread originally dedicated to finding out whether a proxy
> server can really be "transparent", I'll first quote a summary from
> "solprovider".
>
> quote
>
> I think the confusion is between a network proxy server and a Web
> "reverse" proxy server.
>
> A network proxy server handles NAT (Network Address Translation). A
> company internally uses private IP addresses (e.g. 10.*.*.*). All
> Internet traffic from these internal addresses uses a network proxy
> server to reach the Internet. The proxy server changes the
> originating IP Addresses on the outbound packets from the internal
> network IP address to the proxy's Internet IP address. Responses from
> the Internet server are received by the proxy server and changed again
> to be sent to the originating computer on the internal network. The
> browser uses the Internet domain name so Cookies are not affected.
>
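
A minimal sketch of the address translation described above, in Python. The class name, the port-allocation scheme and the example addresses are my own assumptions for illustration; the text only says that the source address is swapped on the way out and restored on the way back.

```python
class Nat:
    """Toy NAT table: maps internal (ip, port) pairs to ports on one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port      # assumption: ports are handed out sequentially
        self.out = {}                    # (internal_ip, internal_port) -> public_port
        self.back = {}                   # public_port -> (internal_ip, internal_port)

    def outbound(self, src_ip, src_port):
        """Return the (ip, port) the Internet server sees for an outgoing packet."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, public_port):
        """Return the internal (ip, port) a response should be forwarded to."""
        return self.back[public_port]


nat = Nat("203.0.113.5")
seen_by_server = nat.outbound("10.0.0.7", 51000)
deliver_to = nat.inbound(seen_by_server[1])
```

Nothing in this translation touches the HTTP payload, which is why Cookies are unaffected.
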
> A Web "reverse" proxy server handles multiple software applications
> appearing as a single server. The applications can be found on
> multiple ports on one server or on multiple hardware servers. Visitor
> traffic to several applications goes to one IP Address. The Web
> server at that IP Address decides where the request should be sent
> distinguishing based on the server name (using Virtual Servers) or the
> path (using Rewrites). If the applications use Cookies, the
> application Cookies must be rewritten by the Web proxy server because
> the browsers use the server name of the Web proxy server, not the
> application servers.
> 1. The browser requests http://myapp.example.com.
> 2. The Web proxy server myapp.example.com sends the request to
> myInternalApplicationServer.example.org.
> 3. The myInternalApplicationServer.example.org sends a response with a
> Cookie for myInternalApplicationServer.example.org to the Web proxy
> server.
> 4. The Web proxy server changes the Cookie from
> myInternalApplicationServer.example.org to myapp.example.com.
> 5. The browser receives the Cookie for myapp.example.com and sends the
> Cookie with future requests to the Web proxy server.
> 6. The Web proxy server sends the incoming Cookies with the request to
> the application server as in #2. (Depending on security, the incoming
> Cookies may need to be changed to match the receiving server.)
> 7. GOTO #3.
>
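
Step 4 above can be sketched roughly as follows, in Python. The function name is hypothetical, and this only models the Domain attribute of a single non-empty Set-Cookie value; it is not how any particular Web server implements the rewrite.

```python
def rewrite_cookie_domain(set_cookie, internal, public):
    """Replace the Domain attribute of one Set-Cookie value if it names `internal`."""
    # The first segment is the cookie's own name=value pair; leave it untouched.
    first, *attrs = [p.strip() for p in set_cookie.split(";") if p.strip()]
    rewritten = [first]
    for attr in attrs:
        name, _, value = attr.partition("=")
        is_domain = name.strip().lower() == "domain"
        # A leading dot on the domain is ignored for the comparison.
        if is_domain and value.strip().lstrip(".").lower() == internal.lower():
            attr = f"Domain={public}"
        rewritten.append(attr)
    return "; ".join(rewritten)


from_app = "SID=abc; Domain=myInternalApplicationServer.example.org; Path=/"
to_browser = rewrite_cookie_domain(
    from_app, "myInternalApplicationServer.example.org", "myapp.example.com"
)
```

A Cookie sent without a Domain attribute passes through unchanged; the browser then ties it to the host it requested, which is already the Web proxy server.
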
> Deciding the type of proxy server being used may be confusing. An
> Internet request for an internal server can be handled with either
> type depending on the gateway server.
> - Network proxy: The gateway uses firewall software for NAT -- all
> requests for the internal server are sent to the internal server. The
> internal server sends Cookies using its Internet name.
> - Web proxy: The gateway is a Web server. Internal application
> servers do not use Internet names so the gateway must translate URLs
> and Cookies.
>
> --
> The specification in the OP was how to Web proxy requests:
> 1. Server receives request for http://www.example.com/amazon/...
> 2. Server passes request to http://www.amazon.com/...
> 3. Server translates response from amazon so the visitor receives
> Cookies from .example.com.
> 4. Future requests are translated so the Web proxy server
> (www.example.com) sends the requests including Cookies to amazon.com.
>
> Read http://httpd.apache.org/docs/2.0/mod/mod_proxy.html
> Read the sections applying to "reverse" proxies. Ignore "forward"
> proxying because that process is not transparent -- the client
> computer must be configured to use a forward proxy.
>
> I once had difficulty with ProxyPass and switched to using Rewrites so
> I would handle this with something like:
> RewriteEngine On
> RewriteRule ^/amazon/(.*)$ http://www.amazon.com/$1 [P]
> ProxyPassReverseCookieDomain amazon.com example.com
> ProxyPassReverse /amazon/ http://www.amazon.com/
> This should handle Cookies and handle removing/adding "/amazon" in the
> path.
>
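
To make the path handling concrete, here is a simplified Python model of what the RewriteRule and ProxyPassReverse lines are meant to achieve. The function names are hypothetical, and the model returns a path-only Location as a simplification; it illustrates the mapping, not httpd's actual implementation.

```python
import re

PREFIX = "/amazon/"                    # public path prefix on the proxy
BACKEND = "http://www.amazon.com/"     # where proxied requests are sent


def map_request_path(path):
    """Model of the RewriteRule: strip "/amazon/" and prepend the backend URL."""
    m = re.match(r"^/amazon/(.*)$", path)
    return BACKEND + m.group(1) if m else None


def reverse_location(location):
    """Model of ProxyPassReverse: map a backend redirect back under "/amazon/"."""
    if location.startswith(BACKEND):
        return PREFIX + location[len(BACKEND):]
    return location


outgoing = map_request_path("/amazon/dp/123")
redirect = reverse_location("http://www.amazon.com/cart")
```

Without the reverse mapping, a redirect issued by the backend would send the browser straight to www.amazon.com and bypass the proxy.
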
> We have not discussed changing links in pages from amazon.com to use
> example.com. This simple often-needed functionality has been ignored
> by the Apache httpd project. (This functionality was included in a
> servlet I wrote in 1999.) Research "mod_proxy_html".
>
> unquote
>
> Now, I believe that there is still a third type of proxy, as follows :
>
> When I configure my browser to use "ourproxy.ourdomain.com:8000" as the
> HTTP proxy for my browser, it means that independently of whatever NAT may
> be effected by an internal router that connects my internal network to the
> internet, something else is going on :
> Whenever I type in my browser a URL like "http://www.amazon.com", my
> browser will not resolve "www.amazon.com" and send it a request like :
> GET / HTTP/1.1
> Host: www.amazon.com
>
> Instead, my browser will send a request to "ourproxy.ourdomain.com:8000",
> as follows :
> GET http://www.amazon.com HTTP/1.1
> Host: www.amazon.com
> ...
>
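
The difference between the two requests can be sketched in Python as below. The function name is hypothetical, and the sketch normalises an empty path to "/", so the proxied request line ends in "http://www.amazon.com/" rather than the slash-less form quoted above.

```python
from urllib.parse import urlsplit


def build_request(url, via_proxy):
    """Build a minimal GET request, either for the origin server or for a forward proxy."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Direct: only the path goes on the request line.
    # Via proxy: the full absolute URL goes on the request line.
    target = f"{parts.scheme}://{parts.netloc}{path}" if via_proxy else path
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


direct = build_request("http://www.amazon.com", via_proxy=False)
proxied = build_request("http://www.amazon.com", via_proxy=True)
```

In both cases the Host header names www.amazon.com; only the request line and the machine the TCP connection goes to are different.
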
> The server at "ourproxy.ourdomain.com:8000" will then look in its page
> cache to see if it already has this page from a previous access. Then it
> will either return this cached page, or retrieve the page anew from
> "www.amazon.com", cache it (maybe) and deliver the newly fetched page. (I am
> skipping a lot of details about freshness, no-cache etc.)
>
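
The lookup described above, reduced to a Python sketch. The class and the injected fetch function are hypothetical, and freshness, no-cache and the rest of the skipped details are deliberately left out.

```python
from typing import Callable, Dict


class PageCache:
    """Toy forward-proxy cache: return a stored page, or fetch and store it."""

    def __init__(self, fetch: Callable[[str], bytes]):
        self._fetch = fetch                  # retrieves a page from the origin server
        self._pages: Dict[str, bytes] = {}   # URL -> page body from a previous access

    def get(self, url: str) -> bytes:
        if url in self._pages:
            return self._pages[url]          # cache hit: the origin is not contacted
        page = self._fetch(url)              # cache miss: retrieve the page anew
        self._pages[url] = page              # assumption: every page is cached
        return page


calls = []


def fake_fetch(url):
    """Stand-in for a real HTTP fetch, so the sketch runs without a network."""
    calls.append(url)
    return b"<html>page</html>"


cache = PageCache(fake_fetch)
first = cache.get("http://www.amazon.com/")
second = cache.get("http://www.amazon.com/")
```

The second call is answered from the cache, so the origin server sees only one request.
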
> The main (original) question was however : what happens in this case to
> cookies possibly set by "www.amazon.com" ?
>
> I personally imagine that such a proxy server (which I guess is the
> "forward" kind) caches only page contents, not the HTTP headers returned
> with each page, or am I wrong ?
>
> And in any case, if a page was returned from "www.amazon.com" along with a
> "Set-Cookie" HTTP header, it should not be cache-able by the proxy server,
> or am I wrong again ?
>
> And, if such a proxy retrieves a new page from an external server, and the
> page comes back with a "Set-Cookie" header, this cookie header is then
> passed unchanged to the original browser requester, isn't it ?
>
> And the requesting browser should accept this cookie as originating from
> "www.amazon.com", even if technically this answer comes back from the proxy
> server, no ?
> André
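
On the last point, my understanding is that the browser associates a cookie with the host of the URL it requested, not with the machine the TCP connection went to. A simplified Python model follows; the jar layout and function name are hypothetical, and Domain and Path matching are ignored.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def store_cookies(jar, requested_url, set_cookie_value):
    """File cookies under the host of the requested URL; the proxy address is never consulted."""
    host = urlsplit(requested_url).hostname
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    for name, morsel in cookie.items():
        jar.setdefault(host, {})[name] = morsel.value


jar = {}
# The response arrived over a connection to ourproxy.ourdomain.com:8000,
# but the URL the browser asked for was on www.amazon.com.
store_cookies(jar, "http://www.amazon.com/", "session-id=123; Path=/")
```

In this model the jar ends up with an entry for www.amazon.com and none for the proxy, which matches the behaviour described in the question.
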