Tuesday, November 17, 2009

redundant load-balancing firewall system

A redundant load-balancing firewall system, using FreeBSD.


Goal: Configure two or more redundant IPF-based firewalls, which
will also act as load-balancers (henceforth referred to as
"FWLBs") for an internet services cluster. If one firewall
fails, the second will take over as the firewall/load
balancer. Barring any catastrophes, this should result in
a highly available environment with close to zero downtime.
It may also save you loads of money on expensive dedicated
hardware devices.

We'll end up with a configuration like the following:

                         ~~~
                       ( net )
                         ~~~
                          |
                          |
                          |
                      ----------
                     |  switch  |
                      ---------- 
                      /        \
                     /   VIP    \
             ---------         --------
            |  fwlb1  |       | fwlb2  |
             ---------         --------
                    \    VIP   /
                     \        /
                      ---------
                     | switch |
                      ---------
                      /      \ 
                     /        \
              ---------        ---------
             | server1 |      | server2 |
              ---------        ---------

Where the first VIP (Virtual IP) is the publicly known IP of
your site, and the second is an RFC 1918 private address that
your servers will use as a default route and NAT through.


Prerequisites:

- Two dual-homed FreeBSD-STABLE boxes
- Two switches/hubs
- Two or more real servers for a cluster
- /usr/ports/net/pen
- /usr/ports/net/freevrrpd
- /usr/ports/sysutils/daemontools or /usr/ports/sysutils/runit
- kernel compiled with IPFILTER and IPFILTER_LOG


Pen is a simple but flexible load balancer. FreeVRRPd will
handle failover for us. Daemontools (or Gerrit Pape's
excellent, more licensing-friendly and featureful workalike,
runit) is not strictly necessary, but it's a tool I find quite
useful for process control and supervision, so I've used it in
the examples here.

Procedure:

PART 1: The Firewall

Create an IPF (or IPFW, or PF, for that matter) ruleset, allowing
traffic in on the port of the service you'll be balancing.
Optionally, allow the cluster to NAT through this box, if
its servers will need to initiate outbound connections.
IPF/IPNAT configuration is beyond the scope of this document,
but there's been plenty written on the subject. The two
important points to keep in mind are to allow free multicast
communication between both firewalls (VRRP requires it), and
to make sure other hosts can't send VRRP traffic to them (or
you'll run into some unpleasant security possibilities).
For more IPF info, please refer to:

http://www.nwo.net/ipf/ipf-howto.html
http://coombs.anu.edu.au/~avalon/ip-filter.html

Short story - build a kernel with the options above, add the following
to /etc/rc.conf:

ipfilter_enable="YES"
ipfilter_rules="/etc/ipf.conf"
ipnat_enable="YES"	# only needed if the cluster NATs through this box
ipnat_rules="/etc/ipnat.conf"
ipnat_flags="-CF"
ipmon_enable="YES"

And put your rulesets in the proper places.
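
As a starting point for the two points above, here is a minimal
sketch of a script that writes a sample ruleset to a scratch file
for review. The interface names match the FreeVRRPd example later
in this document; the peer addresses 198.123.111.3 and 10.0.0.3
and the output path are assumptions made for illustration. VRRP is
assumed to use IP protocol 112 and multicast group 224.0.0.18, as
described in RFC 2338. This is not a complete ruleset (there is no
default block policy, for instance), so treat it as a fragment to
merge into your own.

```sh
#!/bin/sh
# Sketch only: writes a sample IPF rule fragment to a scratch file.
# Assumptions (not from the original article): peer addresses, output path.
EXT_IF=fxp0                 # public-facing interface
INT_IF=fxp1                 # cluster-facing interface
PEER_EXT=198.123.111.3      # hypothetical: other FWLB's public address
PEER_INT=10.0.0.3           # hypothetical: other FWLB's private address
RULES=${RULES:-./ipf.conf.sample}

cat > "$RULES" << _EOF_
# VRRP (IP protocol 112, multicast 224.0.0.18): accept only from the peer FWLB
pass in quick on $EXT_IF proto 112 from $PEER_EXT to 224.0.0.18
pass in quick on $INT_IF proto 112 from $PEER_INT to 224.0.0.18
block in quick on $EXT_IF proto 112 from any to any
block in quick on $INT_IF proto 112 from any to any
# Let our own VRRP advertisements out
pass out quick on $EXT_IF proto 112 from any to 224.0.0.18
pass out quick on $INT_IF proto 112 from any to 224.0.0.18
# The balanced service
pass in quick on $EXT_IF proto tcp from any to any port = 80 flags S keep state
_EOF_

echo "wrote $RULES"
```

On the second FWLB, swap the peer addresses for those of the first.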

PART 2: The balancer

1) Create the necessary users and directories.

mkdir -p /etc/supervise/pen/log
mkdir -p /var/chroot/pen
mkdir -p /var/log/pen
pw useradd pen -s /bin/false -d /var/chroot/pen
pw useradd penlog -s /bin/false -d /var/chroot/pen
chown penlog:pen /var/log/pen

2) Create the runfiles for pen.

cd /etc/supervise/pen
cat << _EOF_ > run
#!/bin/sh

exec 2>&1
exec pen -d -u pen -j /var/chroot/pen -C localhost:8888 -f -r 80 hostname1 hostname2

_EOF_
chmod 755 run
cd log

cat << _EOF_ > run
#!/bin/sh

exec /usr/local/bin/setuidgid penlog /usr/local/bin/multilog s999999 n20 /var/log/pen

_EOF_
chmod 755 run


This will configure pen to run chrooted in /var/chroot/pen,
with a control port of 8888. It will be balancing port 80
incoming to port 80 on hostname1 and hostname2. This is
configured for round-robin balancing - if you require sticky
sessions, remove the "-r" flag. This example has pen
logging somewhat verbosely, to aid in debugging. You may
wish to remove the "-d" in a production environment.

3) Start up the load-balancing services.

cd /service
ln -s /etc/supervise/pen
echo "csh -cf '/usr/local/bin/svscanboot &'" >> /etc/rc.local
csh -cf '/usr/local/bin/svscanboot &'
sleep 5 && svstat pen


You should now be able to point your browser, DNS resolver, etc.
at the IP of either of these machines, and see it balancing
out to your real servers. You can confirm this by tailing
/var/log/pen/current on both machines.

PART 3: Redundancy

1) First, configure syslog to log VRRP info to its own file.

touch /var/log/freevrrpd.log
cat << _EOF_ >> /etc/syslog.conf

!freevrrpd
*.* /var/log/freevrrpd.log

_EOF_


2) Configure FreeVRRPd

Until this point, both machines have been equal. Now, you
need to choose which FWLB is going to be your primary. On
this machine, copy /usr/local/etc/freevrrpd.conf.sample to
/usr/local/etc/freevrrpd.conf. Edit the file, and configure
it along the following lines:

# public-facing VRID
[VRID]
serverid = 1
interface = fxp0
priority = 255
addr = 198.123.111.1/32
password = vrid1
vridsdep = 2

# backend VRID
[VRID]
serverid = 2
interface = fxp1
priority = 255
addr = 10.0.0.1/32
password = vrid2
vridsdep = 1

This results in 2 VRIDs being created - one for the front-facing
network, and one for the rear-facing one that the cluster will
be using. In this example, both VRIDs are configured to consider
this host the master server during VRRP elections.

Note that both VRIDs depend on the other, specified by the
"vridsdep" field. This is important - it means that if one
of the interfaces in a FWLB fails, the other one will
automatically go into backup mode, failing both interfaces
to the slave FWLB. This avoids the backend servers trying
to route through a machine with a dead front-end connection.

You should now copy this file over to the slave FWLB, and
change both priority fields to be 100. Change the password
field on both to something more original, but certainly don't
rely on VRRP passwords as a security measure. If another box
outside of this cluster is in a position to communicate with
it over VRRP, you've got a problem.
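
The priority change can be scripted rather than hand-edited. A
minimal sketch, assuming the master's file contains the literal
"priority = 255" lines shown above; the function name
make_backup_conf and the output file name are hypothetical.

```sh
#!/bin/sh
# Sketch: derive the second FWLB's config from the master's by lowering
# every VRID priority from 255 to 100. Reads stdin, writes stdout.
make_backup_conf() {
    sed 's/^priority[[:space:]]*=[[:space:]]*255[[:space:]]*$/priority = 100/'
}

# Example run against a trimmed copy of the master config above.
make_backup_conf << _EOF_ > freevrrpd.conf.backup
[VRID]
serverid = 1
interface = fxp0
priority = 255
addr = 198.123.111.1/32
_EOF_

cat freevrrpd.conf.backup
```

On the real master you would feed it the live file instead, e.g.
"make_backup_conf < /usr/local/etc/freevrrpd.conf", and copy the
result to the second FWLB.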

3) Start FreeVRRPd

You can now start up freevrrpd on both boxes:

cp /usr/local/etc/rc.d/freevrrpd.sh.sample /usr/local/etc/rc.d/freevrrpd.sh
/usr/local/etc/rc.d/freevrrpd.sh start


PART 4: Failover testing

Now you just need to verify that this whole setup works in
the case of a failure. First, configure both FWLB boxes
to start an SSH daemon, so we have something to connect to
when checking that the interfaces fail over properly. Try the
following scenarios:

- From one of the machines in your cluster, ssh to 10.0.0.1,
and log in. Verify the hostname of the machine is the
hostname of the master FWLB.
- While watching /var/log/pen/current on FWLB1, connect to
198.123.111.1, port 80 from a machine on the front-end
network. Verify that you see the connection occur.

- Pull the front-end interface of FWLB1.
- Watch the logs on FWLB2. Connect to 198.123.111.1, port 80,
and verify that the connection occurred.
- SSH again to 10.0.0.1. You should see the hostname of FWLB2
in the SSH banner.

- Reconnect the front-end interface of FWLB1. Verify that
both interfaces on FWLB1 recover back to master state.

Now conduct the same tests when unplugging the backend interface.
For fun, you may want to just hit the reset button on FWLB1
while actively hitting the web servers.
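
To put a rough number on failover time during these tests, a small
polling loop can help. The following is a sketch, not a polished
tool: it assumes nc(1) with the -z and -w options is available on
the machine you test from, and it uses the VIP and port from the
example configuration. The function name watch_vip and the PROBE
override are hypothetical; the override exists only so the sketch
can be dry-run without a network.

```sh
#!/bin/sh
# Sketch: poll the public VIP and print a timestamped line whenever its
# reachability changes, to estimate how long a failover takes.
# Assumption: nc(1) supports -z (scan only) and -w (timeout in seconds).
VIP=${VIP:-198.123.111.1}
PORT=${PORT:-80}
PROBE=${PROBE:-nc -z -w 1}

watch_vip() {
    tries=$1            # number of probes to send
    last=unknown
    i=0
    while [ "$i" -lt "$tries" ]; do
        if $PROBE "$VIP" "$PORT" > /dev/null 2>&1; then
            state=up
        else
            state=down
        fi
        # Only report transitions, not every probe.
        if [ "$state" != "$last" ]; then
            echo "$(date '+%H:%M:%S') $VIP:$PORT is $state"
            last=$state
        fi
        i=$((i + 1))
        if [ "$i" -lt "$tries" ]; then
            sleep 1
        fi
    done
}

# Dry run in a subshell with a stand-in probe that always succeeds.
( PROBE=true; watch_vip 2 )
```

From a front-end machine, run something like "watch_vip 120" and
pull the cable on FWLB1. The gap between the "down" and "up" lines
approximates the failover time, at roughly one-second resolution.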


Notes:

Removing servers from a pool:

Pen cannot permanently remove servers from a pool, but if you need
to have it ignore a server while doing upgrades or such, you can
just do:

penctl localhost:8888 server $servername blacklist 99999

This will blacklist the server for 99999 seconds (a little under
28 hours), giving adequate time to work on the server. When the
server is back in service, just do:

penctl localhost:8888 server $servername blacklist 1

This will reset the blacklist timeout to 1 second, bringing it back
into the pool.
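
If you would rather think in hours than in seconds, a small wrapper
can do the conversion. This is a sketch: the function name
blacklist_cmd is hypothetical, and it only prints the penctl command
(in the form used above) instead of running it.

```sh
#!/bin/sh
# Sketch: print (not run) the penctl command that blacklists a server
# for a given number of whole hours.
PEN_CTL=${PEN_CTL:-localhost:8888}   # control port from the run file above

blacklist_cmd() {
    server=$1
    hours=$2
    seconds=$((hours * 3600))
    echo "penctl $PEN_CTL server $server blacklist $seconds"
}

blacklist_cmd hostname1 2
# prints: penctl localhost:8888 server hostname1 blacklist 7200
```

Pipe the output to sh once you are happy with what it prints.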

Permanently adding or removing servers to/from a pool:

In the case that servers need to be added or removed, the
/service/pen/run file will need to be edited. Simply add the
hostname to (or delete it from) the end of the pen command, and do:

svc -t /service/pen

This will TERM and restart pen. While this may not cause interruption
to the user, it's probably wise to do this while in maintenance
mode or off hours.
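
One way to make pool changes less error-prone is to keep the server
list in its own file and have the run script read it. The following
is a sketch of such a run script, not the original author's setup:
the file name "backends" and the function pen_backends are
hypothetical, and the pen flags are copied from the run file in
Part 2. As written it prints the command rather than executing it.

```sh
#!/bin/sh
# Sketch of an alternative /etc/supervise/pen/run that reads its server
# list from a file (hypothetical name: ./backends), one hostname per line.
# Blank lines and lines starting with '#' are ignored.
BACKENDS=${BACKENDS:-./backends}

pen_backends() {
    # Print the hostnames on one line, space-separated.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | tr '\n' ' '
}

# Demo data so the sketch can be run anywhere; drop this on a real FWLB.
[ -f "$BACKENDS" ] || printf '%s\n' '# web pool' hostname1 hostname2 > "$BACKENDS"

# Print the command instead of exec'ing it. On a real FWLB, replace
# "echo" with "exec" and keep the "exec 2>&1" line from the original.
echo pen -d -u pen -j /var/chroot/pen -C localhost:8888 -f -r 80 $(pen_backends "$BACKENDS")
```

Since supervise starts ./run from inside the service directory,
./backends would resolve to /etc/supervise/pen/backends. After
editing the list, "svc -t /service/pen" restarts pen as before.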


References:

http://www.faqs.org/rfcs/rfc2338.html
http://www.bsdshell.net/hut_fvrrpd.html
http://siag.nu/pen/
http://cr.yp.to/daemontools.html
http://smarden.org/runit


© 2004 David Thiel --- lx [@ at @] redundancy.redundancy.org

Updated Sun Mar 21 18:23:28 PST 2004
