e-maknyus: Understanding Server redundancy

Understanding the concept of Server Redundancy technology in Windows Server 2003 and later to provide business continuity and availability

When an organization depends on its computer infrastructure and the business cannot tolerate downtime, deploying a server redundancy technology such as clustering becomes a necessity. If the servers are down, the business stops. Clustering is therefore a solution worth adopting to keep the business running.

Clustering

Clustering is one of the server redundancy technologies offered by Windows Server 2003 (and later). It runs one or more applications on two or more application servers configured to provide fault tolerance and load balancing. If one server fails, another server automatically takes over its role and keeps the application running, ideally without the clients noticing. This is the concept of fault tolerance in server redundancy technology.

In a server cluster, each of the servers runs the same critical applications. When one of the servers fails, the other servers take over its role automatically, usually within a short time. This is the “failover” concept. When the failed server returns to normal, the other nodes recognize this and the cluster puts the server back into operation. This is the “failback” concept. Windows Server 2003 installs the clustering service by default when you install the server, whereas in Windows 2000 you need to install the component separately (Microsoft Cluster Service).
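The failover and failback behaviour described above can be sketched in a few lines of Python. This is only a toy model for illustration, not how the Windows cluster service is implemented; the `Cluster` class, its methods, and the node names are all hypothetical.

```python
class Cluster:
    """Toy model of failover and failback between cluster nodes."""

    def __init__(self, nodes, preferred):
        self.healthy = {name: True for name in nodes}  # node name -> is it up?
        self.preferred = preferred   # node that normally owns the application
        self.owner = preferred       # node currently running the application

    def fail(self, node):
        """Mark a node as down; fail over if it was running the application."""
        self.healthy[node] = False
        if node == self.owner:
            survivors = [n for n, up in self.healthy.items() if up]
            if not survivors:
                raise RuntimeError("no healthy node left to take over")
            self.owner = survivors[0]   # failover to the first healthy node

    def recover(self, node, failback=True):
        """Mark a node as up again; optionally hand the application back."""
        self.healthy[node] = True
        if failback and node == self.preferred:
            self.owner = node           # failback to the preferred node


cluster = Cluster(["A", "B"], preferred="A")
cluster.fail("A")
print(cluster.owner)   # prints B (failover)
cluster.recover("A")
print(cluster.owner)   # prints A (failback)
```

Calling `recover("A", failback=False)` leaves the application on node B, which mirrors a cluster where failback is left for the administrator to trigger by hand.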

Server Redundancy Clustering Technology

In Windows Server 2003 there are two types of server redundancy technology: Server Cluster and Network Load Balancing (NLB). The difference between them lies in the types of applications the servers must run and in the type and characteristics of the data those applications use.

Network Load Balancing (NLB)

Network Load Balancing (NLB) is a server redundancy technology from Microsoft that is easy to install, manage, and maintain. You can use the hardware and software already available in the servers; no additional software or hardware is needed.

You can use the “Network Load Balancing Manager” application included in Windows Server 2003 to create, manage, and monitor NLB clusters. NLB is mostly used for stateless applications, that is, applications whose data does not change all the time.

Supported by all editions of Windows Server 2003, including Standard, Enterprise, and Datacenter.
Supports NLB clusters of up to 32 nodes, where each server holds a duplicate copy of the application you want to provide to users.
Full load balancing for both TCP and UDP traffic.
Can be used for Web servers, ISA servers, VPN servers, media servers, and Terminal Servers.

Network Load Balancing works by creating a virtual network adapter on each node that represents the cluster as a single entity. The virtual adapter has its own IP address and MAC address, different from the addresses assigned to each server’s own interfaces. Clients access the virtual IP address instead of accessing the individual server nodes.

When a request comes from a client to the cluster IP address, all the nodes in the cluster receive the message. On each node, the NLB service acts as a filter between the cluster adapter and the computer’s TCP/IP stack. This filter lets NLB calculate which node in the cluster is responsible for responding to the client’s request. The nodes do not need to communicate with each other to do this. Each node makes the same calculation independently and decides for itself whether or not to respond to the client’s request. The calculation changes only when the number of server nodes changes.
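To make the idea of "same calculation on every node" concrete, here is a minimal Python sketch. It assumes a simple hash of the client address taken modulo the number of nodes; the real NLB filtering algorithm is different and is not reproduced here. The function names and the sample addresses are hypothetical.

```python
import hashlib


def owner_index(client_ip: str, node_count: int) -> int:
    """Map a client address to a node index; every node computes the same value."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % node_count


def should_respond(my_index: int, client_ip: str, node_count: int) -> bool:
    """Decision made locally by one node, without talking to the others."""
    return owner_index(client_ip, node_count) == my_index


nodes = 4
for ip in ["192.0.2.10", "192.0.2.11", "198.51.100.7"]:
    # Every node sees the request, but only one of them decides to answer it.
    responders = [i for i in range(nodes) if should_respond(i, ip, nodes)]
    print(ip, "->", responders)
```

Because the result depends only on the client address and the node count, the nodes agree without exchanging messages, and the mapping shifts only when `node_count` changes.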

Note: if you create several nodes to form an NLB cluster, you should register a resource record in DNS using the cluster name and the virtual adapter IP address.

Server Cluster

The other type of server redundancy technology is Server Cluster, which is designed for applications whose data is large and changes frequently. These are typically called stateful applications and include database-backed services such as Microsoft SQL Server and Exchange Server, as well as file and print servers. All the nodes in the cluster are connected to a shared set of data on a single SCSI bus or a SAN (storage area network). All nodes have the same access to the same application data, and any node can process client requests. You can configure each node to be active or passive. An active node receives client requests, while a passive node stays idle and serves as the failover target when the active node fails.

The following figure shows an example of two servers operating as a cluster, each running Windows Server 2003 and Microsoft SQL Server. Each server is connected to the same NAS device, which contains the databases. The two servers also share a dedicated connection that carries each server’s heartbeat, so that one can detect when the other fails.

The figure shows the server cluster redundancy diagram. Server A is the active node, while server B is the passive node. Normally server A runs the database application, receives and processes client requests, and accesses the database files on the NAS device. Suppose that server A fails for some reason. Server B, the passive node, detects the failure and becomes active in place of server A, processing client requests using the same database on the same NAS device.
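The way server B notices that server A has gone quiet can be sketched as a simple heartbeat timeout. This is an illustration only: the `PassiveNode` class is hypothetical, and the 5 second timeout is an arbitrary value chosen for the example, not a Windows default.

```python
class PassiveNode:
    """Standby node that watches the partner's heartbeat over the private link."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds   # hypothetical threshold, not a Windows default
        self.last_heartbeat = 0.0
        self.active = False

    def on_heartbeat(self, now):
        """Record the time at which the partner was last heard from."""
        self.last_heartbeat = now

    def check(self, now):
        """Become active if the partner has been silent for too long."""
        if not self.active and now - self.last_heartbeat > self.timeout:
            self.active = True   # take over the shared database
        return self.active


server_b = PassiveNode(timeout_seconds=5.0)
server_b.on_heartbeat(now=100.0)
print(server_b.check(now=103.0))   # False: server A was heard from 3 s ago
print(server_b.check(now=106.5))   # True: 6.5 s of silence exceeds the 5 s timeout
```

Passing the current time in as an argument keeps the sketch deterministic; a real monitor would read a clock and receive heartbeats over the dedicated connection.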

Like NLB, a server cluster has its own name and IP address, independent of the IP addresses of the individual nodes. Therefore, when the active node fails, clients do not need to know what is happening inside the system. They continue to access the same name and the same IP address, because the standby machine (server B) soon restores service. If there are many servers, as in X node clustering, a surviving node takes over from the failed server.

This type of server cluster runs only on Windows Server 2003 Enterprise Edition and Datacenter Edition. Standard Edition supports NLB but not Server Cluster. The nodes cannot mix editions; all of them should run the same edition, either Enterprise or Datacenter.
Supports a maximum of 8 server nodes, each able to take part in failover and failback. Failback is not configured by default; you can set it up to happen manually or automatically. Most systems engineers prefer manual failback so that they can investigate the node failure first.
Requires shared storage such as Fibre Channel, shared SCSI, or a SAN. Fibre Channel is a high-speed serial networking technology that runs at 100 MB/s (about 1 Gbps) or more with full-duplex communication. SCSI, on the other hand, uses parallel signaling.
Typically used for SQL databases, Microsoft Exchange, file and print servers, and so on.

X Node Clustering

A significant development in Windows Server 2003 clustering is support for up to 8 nodes in a cluster, each taking part in failover and failback. X node clustering lets you build a failover/failback cluster with a minimum of 2 and a maximum of 8 nodes.

The following figure shows a cluster configuration with 4 nodes, where each node is active and is also the survival node for a designated server in case that server fails. Each node has direct access to the shared storage. Each server has a primary role and a designated survival role. The servers are linked by a dedicated network connection that carries the heartbeat of all four nodes. Each node can detect the failure of another node, and the designated node then takes over the failed node’s role.

Server A is the database server group and is also the survival node for the print and file server group. If server D fails, it fails over to server A, so server A then has two roles (database and file server).
Server B is the messaging server group and is also the survival node for the database server. If the database server (A) fails, it fails over to server B.
Server C is the Web services server group and is also the survival node for server B (the messaging server group). If server B fails, it fails over to server C.
Server D is the print and file server group and is also the survival node for the Web services server. If server C fails, it fails over to server D.

All four nodes use the same shared storage, such as Fibre Channel, shared SCSI, a SAN, or any other type of shared disk that lets the nodes access the same data. In this 4-node cluster design, each node runs a different application that is always active and at the same time acts as the survival node for another designated application.
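The four-node arrangement forms a ring: A backs up D, B backs up A, C backs up B, and D backs up C. A small Python sketch of that mapping is shown here; the dictionary names and the `roles_after_failure` function are hypothetical and only model the role assignments described above.

```python
# Role owned by each server in the four-node example.
PRIMARY_ROLE = {"A": "database", "B": "messaging", "C": "web", "D": "file and print"}
# Designated survival node for each server: A -> B -> C -> D -> A.
SURVIVAL_NODE = {"A": "B", "B": "C", "C": "D", "D": "A"}


def roles_after_failure(failed):
    """Return the roles each remaining server runs after one server fails."""
    roles = {s: [r] for s, r in PRIMARY_ROLE.items() if s != failed}
    roles[SURVIVAL_NODE[failed]].append(PRIMARY_ROLE[failed])
    return roles


print(roles_after_failure("D"))
# {'A': ['database', 'file and print'], 'B': ['messaging'], 'C': ['web']}
```

With this design a single failure adds one extra role to exactly one surviving server, so no node sits idle during normal operation.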

Friday, July 22, 2011
