Saturday, March 26, 2011

Configuring a Transparent Proxy/Webcache in a Bridge using Squid and ebtables

A proxy/Webcache is a computer that sits between your LAN and your Internet connection, usually on the gateway. Its job is to capture and save the Web pages that the client machines in your LAN visit, so that the next time someone requests a page, the proxy/Webcache already has it and sends it straight to the client. This saves bandwidth and usually speeds up Web navigation. A bridge works exactly like a two-port switch: it passes everything from one port to the other. But if a Linux box is acting as the switch, we can do wonderful things, because we actually "see" the traffic.

Why would I need a bridge with Squid?

There are some cases in which you do not have access to the gateway, or your gateway is a piece of dedicated hardware. Furthermore, if a bridge is used, you do not have to change anything in your network, just plug in the bridge and start working. If the Linux box acting as a proxy/Webcache is eaten by a big green monster, you can just reconnect the cables, and everything goes back to normal until you replace it.
Remember to document where in your network the bridge is. Bridges do not appear in traceroutes, which can be confusing and makes them hard to locate in a big network.
Ok, let's start.

Setting up Squid

First, get squid running. There is a lot of documentation in the Squid distribution, so I won't cover basic configuration here. On my Fedora box, I just installed the rpm, and that was all.
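Check that the following lines are present in /etc/squid/squid.conf. These directives belong to Squid 2.5 and earlier; Squid 2.6 replaced them with the "transparent" option of http_port, which Squid 3.1 later renamed to "intercept":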
httpd_accel_host virtual
httpd_accel_port 80
httpd_accel_with_proxy on
httpd_accel_uses_host_header on
Also check that your network appears in the ACLs section. For example, if your network is 192.168.1.0 netmask 255.255.255.0, use:
acl our_networks src 192.168.1.0/24
For testing, you may omit the "acl" line and just comment out this line:
http_access deny all
and use this instead:
http_access allow all
Be careful if you don't want to allow everyone to use your Webcache. I recommend using this configuration only for testing.
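Before starting squid, it can help to confirm that the four directives listed above really are active in the config file. The function below is only a sketch: check_squid_accel is a name I made up, and it assumes the Squid 2.5-style directives shown earlier and the default path /etc/squid/squid.conf.

```shell
# Sketch: report which of the four interception directives are missing.
# check_squid_accel is a hypothetical helper name, not part of Squid.
check_squid_accel() {
    conf=${1:-/etc/squid/squid.conf}
    missing=0
    for directive in \
        "httpd_accel_host virtual" \
        "httpd_accel_port 80" \
        "httpd_accel_with_proxy on" \
        "httpd_accel_uses_host_header on"
    do
        # Turn the single space into "one or more blanks" and anchor the
        # match so that commented-out lines do not count.
        pattern="^[[:space:]]*$(printf '%s' "$directive" | sed 's/ /[[:space:]][[:space:]]*/')[[:space:]]*\$"
        if ! grep -q "$pattern" "$conf"; then
            echo "missing: $directive"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it as "check_squid_accel" for the default path, or pass another file as the first argument. It prints nothing and returns 0 when all four lines are present.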
Start squid. In Fedora, you can use:
bash# service squid start
Other distributions may use:
bash# /etc/init.d/squid start
or you can start it manually. The first time you run it, it will take a few moments to build its cache files. Be patient.
In Fedora, let's make sure squid starts automatically:
bash# chkconfig squid on

Configuring the bridge

This couldn't be easier:
ifconfig eth0 0.0.0.0 promisc up
ifconfig eth1 0.0.0.0 promisc up

brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1

ifconfig br0 200.1.2.3 netmask 255.255.255.0 up
route add default gw 200.1.2.254 dev br0
Potential Pitfall:
If your PC locks up or the kernel panics, the most likely culprit is a bad network adapter. Many cheap motherboards have a poor integrated NIC. Just get a better NIC; even an old Realtek should work fine.
In this example, I assume you are using eth0 and eth1. In the last ifconfig line, I assigned IP address 200.1.2.3 to the bridge so I can access it remotely. Use an IP address in your network. Don't forget it; you will need it later.
You may check that the bridge is working by using tcpdump:
bash# tcpdump -n -i eth0                         
                       ...
         (lots of funny stuff)
                       ...
bash# tcpdump -n -i eth1
                       ...
         (lots of funny stuff)
                       ...
Plug your machine into the network, and everything should work. Your Linux box is now a big, expensive two-port switch.
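Besides tcpdump, you can ask the kernel directly whether both interfaces joined the bridge, since each enslaved port shows up under /sys/class/net/br0/brif/. The helper below is a rough sketch: bridge_has_ports and the SYSFS_ROOT variable are names I invented, and SYSFS_ROOT is there only so the function can be tried against a fake directory tree.

```shell
# bridge_has_ports is a hypothetical helper. SYSFS_ROOT exists only so the
# function can be tried against a fake directory tree; it defaults to /sys.
bridge_has_ports() {
    bridge=$1
    shift
    root=${SYSFS_ROOT:-/sys}
    status=0
    for port in "$@"; do
        # The kernel lists each enslaved interface under <bridge>/brif/.
        if [ -e "$root/class/net/$bridge/brif/$port" ]; then
            echo "$port is attached to $bridge"
        else
            echo "$port is NOT attached to $bridge"
            status=1
        fi
    done
    return "$status"
}
```

With the setup from this article you would call it as "bridge_has_ports br0 eth0 eth1". It returns non-zero if any port is missing.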

Configuring transparent redirection

We're able to see all the traffic in our network, because we are in the middle. Now we want to catch Web traffic and redirect it directly into Squid.
First, let's see if squid is correctly configured.
Go to a PC in your LAN and manually configure a proxy. If you use Firefox, for example, go to the Edit menu and select Preferences. Select General and click "Connection Settings", choose "Manual Proxy Configuration", and enter the IP address of your bridge. The port is 3128, unless you have changed it.
Try surfing the Web. If it works, you have squid running and working as desired. Now we'll move on to the fun stuff and build a "brouter".
First, install ebtables on the bridge machine. Then, just run these two commands:
bash# ebtables -t broute -A BROUTING -p IPv4 --ip-protocol 6 \
        --ip-destination-port 80 -j redirect --redirect-target ACCEPT

bash# iptables -t nat -A PREROUTING -i br0 -p tcp --dport 80 \
        -j REDIRECT --to-port 3128
The first command tells the bridge that IPv4 TCP packets (IP protocol 6) headed for port 80 should be handed to the local machine instead of being bridged. The second uses iptables to redirect those packets to local port 3128, so squid can take care of them.
Check squid's log to see whether you're catching traffic:
bash# tail -f /var/log/squid/access.log
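You should see new entries appear as clients browse. Tags such as TCP_HIT and TCP_MEM_HIT mean the content was served from the cache; TCP_MISS means squid had to fetch it from the origin server. Either way, the traffic is being caught.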
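To put a number on how much is coming out of the cache, you can count the result codes in the log. This is a minimal sketch that assumes Squid's native access.log format, where the fourth field looks like TCP_MISS/200; summarize_hits is just a name I picked.

```shell
# summarize_hits is a hypothetical helper. It assumes Squid's native
# access.log format, where the fourth field looks like TCP_MISS/200.
summarize_hits() {
    awk '
        {
            split($4, code, "/")      # code[1] is e.g. TCP_HIT
            total++
            if (code[1] ~ /HIT/) hits++
        }
        END {
            printf "requests=%d hits=%d\n", total, hits
        }
    ' "$@"
}
```

Call it as "summarize_hits /var/log/squid/access.log", or pipe log lines into it. If you have customized the log format in squid.conf, the field number will need adjusting.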
Congratulations, you have a Transparent Proxy/Webcache!

Fine Tuning

You may want to fine-tune squid, adjusting how much memory or disk space it will use. Just edit /etc/squid/squid.conf.
Remember to create the ACLs (Access Control Lists) for your networks.
You may want to have a script to set up all of this at boot. Use something like this:
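#!/bin/sh
# Bring both NICs up without addresses, in promiscuous mode
ifconfig eth0 0.0.0.0 promisc up
ifconfig eth1 0.0.0.0 promisc up

# Create the bridge and attach both NICs
brctl addbr br0
brctl addif br0 eth0
brctl addif br0 eth1

# Give the bridge an address and a default route (use your own values)
ifconfig br0 200.1.2.3 netmask 255.255.255.0 up
route add default gw 200.1.2.254 dev br0

# Pull port-80 traffic off the bridge and hand it to squid on port 3128
ebtables -t broute -A BROUTING -p IPv4 --ip-protocol 6  \
 --ip-destination-port 80 -j redirect --redirect-target ACCEPT
iptables -t nat -A PREROUTING -i br0 -p tcp --dport 80  \
 -j REDIRECT --to-port 3128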
Save it and put it in /var/my-start-scripts/bridgeBrouter-up.sh. chmod it to 0755 and put a line in /etc/rc.local as follows:
/var/my-start-scripts/bridgeBrouter-up.sh
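If you would rather not hard-code interface names and addresses, the same steps can be wrapped in a function with variables and a dry-run switch. Treat this as a sketch: the variable names, the run helper, DRY_RUN and bridge_brouter_up are all my own inventions, and the defaults are simply the values used in this article.

```shell
#!/bin/sh
# Variable and function names are illustrative; the defaults are the
# values used in the article.
BRIDGE=${BRIDGE:-br0}
PORT_A=${PORT_A:-eth0}
PORT_B=${PORT_B:-eth1}
BRIDGE_IP=${BRIDGE_IP:-200.1.2.3}
NETMASK=${NETMASK:-255.255.255.0}
GATEWAY=${GATEWAY:-200.1.2.254}
SQUID_PORT=${SQUID_PORT:-3128}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bridge_brouter_up() {
    run ifconfig "$PORT_A" 0.0.0.0 promisc up
    run ifconfig "$PORT_B" 0.0.0.0 promisc up
    run brctl addbr "$BRIDGE"
    run brctl addif "$BRIDGE" "$PORT_A"
    run brctl addif "$BRIDGE" "$PORT_B"
    run ifconfig "$BRIDGE" "$BRIDGE_IP" netmask "$NETMASK" up
    run route add default gw "$GATEWAY" dev "$BRIDGE"
    run ebtables -t broute -A BROUTING -p IPv4 --ip-protocol 6 \
        --ip-destination-port 80 -j redirect --redirect-target ACCEPT
    run iptables -t nat -A PREROUTING -i "$BRIDGE" -p tcp --dport 80 \
        -j REDIRECT --to-port "$SQUID_PORT"
}
```

To use it at boot, add a final line that calls bridge_brouter_up. Setting DRY_RUN=1 first makes the script print the commands instead of running them, which is handy for checking the values before touching a live network.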
Have fun!