Wednesday, November 25, 2009

High Availability FreeBSD www cluster

man carp

SG cluster

freevrrpd

wackamole

================================================

There's clustering and clustering.  Neither of the two applications
the OP mentioned needs anything like as tight a coupling as what many
commercial 'cluster' solutions provide, or that compute-cluster solutions
like Beowulf or Grid Engine[!] provide.

WWW clustering requires two things:

   * A means to detect failed / out of service machines and redirect
     traffic to alternative servers

   * A means to delocalize user sessions between servers

The first requirement can be handled with programs already mentioned
such as wackamole/spread or hacluster -- or another alternative is
hoststated(8)[*] on OpenBSD.  You can use mod_proxy_balancer[+] on
recent Apache 2.2.x to good effect.  Certain web technologies provide
this sort of capability directly: e.g. the mod_jk or the newer
mod_proxy_ajp modules for Apache can balance traffic across a number of
back-end Tomcat workers.  Of course this only applies to sites written
in Java.

If you're dealing with high traffic levels and have plenty of money to spend, then a hardware load balancer (Cisco Arrowpoint, Alteon Acedirector, Foundry ServerIron etc.) is a pretty standard choice.
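As a rough illustration of the first requirement (detect a failed machine, then send traffic elsewhere), here is a minimal Python sketch. It is not how wackamole, hoststated or a hardware load balancer is implemented; the function names, the plain TCP connect check and the example addresses are all assumptions made for the illustration.

```python
import socket


def is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: "the port accepts connections" is a good enough health
    check. Real tools usually also check the HTTP response.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(backends, check=is_up):
    """Return the first (host, port) that passes the check, or None."""
    for host, port in backends:
        if check(host, port):
            return (host, port)
    return None


if __name__ == "__main__":
    # Hypothetical back-end addresses, for illustration only.
    servers = [("10.0.0.1", 80), ("10.0.0.2", 80)]
    print(pick_backend(servers))
```

Passing the check in as a parameter keeps the selection logic testable without touching the network.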

The second requirement is more subtle.  Any reasonably complicated
web application nowadays is unlikely to be completely stateless.  Either
you have to recognise each session and direct the traffic back to the
same server each time, or you have to store the session state in a way
that is accessible to all servers -- typically in a back-end database.
Implementing 'sticky sessions' is generally slightly easier in terms of
application programming, but less resilient to machine failure.  There
are other alternatives: Java Servlet based applications running under
Apache Tomcat can cluster about 4 machines together so that session
state is replicated to all of them.  This solution is however not at
all scalable beyond 4 machines, as they'll quickly spend more time
passing state information between themselves than they do actually
serving incoming web queries.
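The two session strategies described above can be contrasted in a short Python sketch. Everything here is hypothetical: `sticky_backend` stands in for whatever a balancer does to pin a session to one server (hashing the session ID is just one way), and `SharedSessionStore` stands in for a back-end database reachable from every web server.

```python
import hashlib


def sticky_backend(session_id, backends):
    """Map a session ID to the same backend on every call.

    If that backend dies, its sessions are lost, which is why sticky
    sessions are less resilient to machine failure.
    """
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


class SharedSessionStore:
    """In-memory stand-in for a session table in a shared database."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, state):
        # Copy so later changes by the caller do not leak into the store.
        self._data[session_id] = dict(state)

    def load(self, session_id):
        # Any web server can call this, so any server can handle the request.
        return dict(self._data.get(session_id, {}))


if __name__ == "__main__":
    servers = ["web1", "web2", "web3"]
    print(sticky_backend("session-abc", servers))
    store = SharedSessionStore()
    store.save("session-abc", {"cart_items": 2})
    print(store.load("session-abc"))
```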

Mail clustering is an entirely different beast.  In fact, it's two
different beasts with entirely different characteristics.


The easy part with mail is the MTA -- SMTP has intrinsic concepts of fail-over and retrying with alternate servers. Just set up appropriate MX records in the DNS pointing at a selection of servers and it all should work pretty much straight away. You may need to share certain data between your SMTP servers (like greylisting status, Bayesian spam filtering, authentication databases) but the software is generally written with this capability built in.
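To make the MX fail-over idea concrete, here is a small Python sketch of what a sending MTA does in principle: try the MX hosts in order of preference (lowest number first) and move on when one is unreachable. The host names and the `send` callback are made up for the example, and a real MTA also queues and retries later, which is left out here.

```python
def delivery_order(mx_records):
    """Sort (preference, hostname) pairs; lower preference is tried first.

    Simplification: ties are broken by hostname here, whereas real MTAs
    typically pick among equal-preference hosts at random.
    """
    return [host for _pref, host in sorted(mx_records)]


def deliver(mx_records, send):
    """Try each MX host in order; return the host that accepted, or None."""
    for host in delivery_order(mx_records):
        try:
            send(host)
            return host
        except OSError:
            continue  # host unreachable: fall through to the next MX
    return None


if __name__ == "__main__":
    # Hypothetical MX records for illustration.
    records = [(20, "mx2.example.org"), (10, "mx1.example.org")]

    def fake_send(host):
        if host == "mx1.example.org":
            raise ConnectionRefusedError("primary is down")

    print(deliver(records, fake_send))
```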

The hard part with mail clustering is the mail store which provides the
IMAP or POP3 or WebMail interface to allow users to actually read their
mail.  To my knowledge there is no freely available opensource solution
that provides an entirely resilient IMAP/POP3 solution.  Cyrus Murder
comes close, in that it provides multiple back-end mail stores, easy
migration of mailboxes between stores and resilient front ends.  The
typical approach here is to use a high-spec server with RAIDed disk
systems, multiple PSUs etc. and to keep very good backups.

        Cheers,

        Matthew
==============================================
High Availability means that your cluster should keep working even if some system components fail.

http://en.wikipedia.org/wiki/High-availability_cluster

To build an HA cluster you need at least two machines: the first runs in master mode, the second in slave (standby) mode.

At any given time only one machine is active and provides the services (www, db, etc.).

A very good idea is to use a NAS or SAN (network-attached storage, http://en.wikipedia.org/wiki/Network-attached_storage ) with a shared disk. Both nodes of the HA cluster use this shared disk (but only one at a time). If one node fails, the second (standby) node becomes the master of the cluster and starts the services that the cluster provides.

But NAS systems are not cheap!!


Another way is to use software such as DRBD, NFS, chironfs, rsync etc. Most of these high-availability software solutions work by replicating a disk partition in master/slave mode.

Heartbeat + DRBD is one of the most popular redundancy solutions.

DRBD mirrors a partition between two machines allowing only one of them to mount it at a time. Heartbeat then monitors the machines, and if it detects that one of the machines has died, the surviving machine takes control by mounting the mirrored disk and starting all the services the other machine was running. Unfortunately DRBD runs only on Linux, but I recommend looking at how it works in order to understand this technology.
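The monitor-and-take-over idea can be sketched in a few lines of Python. This is only a toy model of the logic, not Heartbeat's actual protocol or configuration: the class name, the 10 second timeout and the `take_over` callback (which in real life would mount the mirrored disk and start the services) are all assumptions.

```python
class StandbyNode:
    """Standby side of a two-node master/standby pair."""

    def __init__(self, dead_after=10.0):
        self.dead_after = dead_after  # seconds of silence before takeover
        self.last_seen = None         # time of the last heartbeat received
        self.role = "standby"

    def on_heartbeat(self, now):
        """Record that the master was alive at time `now`."""
        self.last_seen = now

    def tick(self, now, take_over):
        """Promote this node if the master has been silent for too long."""
        silent_too_long = (
            self.last_seen is not None
            and now - self.last_seen > self.dead_after
        )
        if self.role == "standby" and silent_too_long:
            take_over()  # e.g. mount the mirrored disk, start services
            self.role = "master"
        return self.role


if __name__ == "__main__":
    node = StandbyNode(dead_after=10.0)
    node.on_heartbeat(0.0)
    print(node.tick(5.0, lambda: print("taking over")))
    print(node.tick(11.0, lambda: print("taking over")))
```

Time is passed in explicitly so the logic can be exercised without waiting. A real deployment also has to guard against both nodes believing they are master at once (split brain), which this sketch ignores.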

http://www.rhic.bnl.gov/hepix/talks/041020am/miers.pdf
http://www.linux-ha.org
http://www.linux-ha.org/DRBD/GettingStarted
http://www.linuxjournal.com/article/9074


On FreeBSD, to mirror content on both nodes you can use rsync as in this howto:

http://www.taygeta.com/ha-postgresql.html

Another way, similar to DRBD, is to use chironfs + nfs (sysutils/fusefs-chironfs/)

http://www.furquim.org/chironfs


Also look at CARP (Common Address Redundancy Protocol)

man carp
http://www.openbsd.org/faq/pf/carp.html


http://www.postgresql.org/docs/8.3/static/high-availability.html (for databases)

ps. sorry for my eng

======================================================
Hello,


I have been running freevrrpd and pen (http://siag.nu/pen/ or in ports) for HA web services.

My setup was a firewall/gateway consisting of more than one machine running freevrrpd, which enables failover for the firewall/gateway. I write firewall and not firewalls because freevrrpd creates a virtual IP that fails over between the machines.

Pen ran on the firewall/gateway and pointed at the web servers. Pen can point at as many web servers as you like and balances the load between them in a very simple way. If the web servers are identical in setup they become redundant. DNS load balancing is very similar.
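For a feel of what "balancing in a very simple way" can look like, here is a plain round-robin selector in Python. It is an illustration only and not a description of pen's actual algorithm; the class name and server names are invented.

```python
import itertools


class RoundRobin:
    """Hand out backends one after another, wrapping around at the end."""

    def __init__(self, backends):
        # Copy the list so later changes by the caller have no effect.
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobin(["web1", "web2"])
    for _ in range(4):
        print(balancer.next_backend())
```

Because every identical web server receives requests in turn, losing one only reduces capacity, provided something also stops sending traffic to the dead server.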

Good luck!

/Roger

====================================================
CARP does the job perfectly!

If you have to load-balance / reverse-proxy (LB/RP) from a front end
(the SPOF?), you can also take a quick look at lighttpd with its proxy
module (very simple & efficient).

In a heavier (but also quite simple) environment:

* Two (or more) LB/RP machines on the front with lighttpd + mod_proxy - HA with CARP
* Two (or more) load-balanced web "back end" servers

;)

=====================================================

Realtime File System Replication On FreeBSD
http://phaq.phunsites.net/2006/08/11/realtime-file-system-replication-on-freebsd/