Wednesday, October 29, 2008

Upgrading an Entrepreneur ASP infrastructure - PART III

VMware is the industry's leading virtualization provider. Its server products come in two flavors: VMware ESX and VMware Server (formerly GSX). The former comes at a cost; the latter is free.

In a low-cost setup, VMware Server (the free version) is the recommended solution. I must mention, though, that ESX provides additional, much-wanted enterprise features such as support for SAN and iSCSI storage, clustering and resource pooling.

VMware Server is supported on many host operating systems, including Windows, Red Hat, CentOS and Ubuntu.

In this setup I recommend CentOS as the base operating system, as it is built from the stable Red Hat Enterprise Linux source and is free.

Below are the steps to follow to prepare for the VMware installation.

1. Prepare mirrored hardware RAID if available.
2. Partition the system as follows during OS installation:

- Boot (ext3, primary) = 100 MB
- LVM (LVM,primary) = All remaining space
- LVM name = RAID1
- SWAP = 2G (or 2 x RAM)
- TMP (/tmp) ext3 = 1G
- VAR_LOG (/var/log) ext3 = 2G
- ROOT (/) ext3 = 5G
- VAR_LIB_VMWARE (/var/lib/vmware) ext3 = 100G+
- Leave Free space or allocate all to VAR_LIB_VMWARE

In this setup there are two primary partitions, Boot and LVM. The LVM partition contains the logical volumes listed above. The key one is VAR_LIB_VMWARE, which is where all the guest OS images reside. This mount point should be on its own volume so that guest images that grow too large cannot consume space needed by the host OS.
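As a rough check on the layout above, the following sketch works out how much space is left for /var/lib/vmware once the fixed-size volumes are carved out. It assumes "G" means GiB and "MB" means MiB, ignores LVM metadata overhead, and uses a hypothetical 250 GiB mirrored array; the function and constant names are illustrative only.

```python
# Fixed-size pieces of the layout, in MiB, taken from the partition list above.
BOOT_MIB = 100                      # /boot, a primary partition outside LVM
FIXED_VOLUMES_MIB = {
    "swap": 2 * 1024,
    "/tmp": 1 * 1024,
    "/var/log": 2 * 1024,
    "/": 5 * 1024,
}
MIN_VMWARE_MIB = 100 * 1024         # the list asks for 100G or more


def vmware_volume_mib(disk_mib: int) -> int:
    """Return the space left for /var/lib/vmware on a disk of disk_mib MiB."""
    remaining = disk_mib - BOOT_MIB - sum(FIXED_VOLUMES_MIB.values())
    if remaining < MIN_VMWARE_MIB:
        raise ValueError(f"only {remaining} MiB left for /var/lib/vmware")
    return remaining


if __name__ == "__main__":
    disk_mib = 250 * 1024           # hypothetical 250 GiB mirrored array
    print(vmware_volume_mib(disk_mib), "MiB available for guest images")
```

If the function raises, the disk is too small to give the guest images the 100G the layout calls for.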

LVM stands for Logical Volume Manager. In current Linux releases, the installer uses LVM by default for partitioning. LVM allows an administrator to resize volumes dynamically, much like Partition Magic does on Windows. Conventional Linux partitions are hard to resize: using fdisk often requires booting the system in rescue mode and risks data loss. With LVM, the underlying storage is grouped into a single resource pool. The pool is a collection of extents, which are fixed in size and usually several megabytes each. Extents from the pool are allocated to each logical volume, which then acts as a partition. When more space is needed, free extents can be added to a volume while the system is running.
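The extent bookkeeping described above can be sketched as a toy model. This is not how LVM is implemented and it touches no real disks; the `VolumeGroup` class and its methods are hypothetical names, and the 4 MiB extent size is an assumption (it is the usual LVM2 default, but it is configurable).

```python
class VolumeGroup:
    """Toy model of a pool of fixed-size extents shared by logical volumes."""

    def __init__(self, total_mib: int, extent_mib: int = 4):
        self.extent_mib = extent_mib
        self.free_extents = total_mib // extent_mib
        self.volumes = {}               # volume name -> extents allocated

    def _extents_for(self, size_mib: int) -> int:
        # Round up: a volume always occupies whole extents.
        return -(-size_mib // self.extent_mib)

    def _take(self, name: str, count: int) -> None:
        if count > self.free_extents:
            raise ValueError("not enough free extents in the pool")
        self.free_extents -= count
        self.volumes[name] = self.volumes.get(name, 0) + count

    def create(self, name: str, size_mib: int) -> None:
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        self._take(name, self._extents_for(size_mib))

    def extend(self, name: str, extra_mib: int) -> None:
        if name not in self.volumes:
            raise KeyError(name)
        self._take(name, self._extents_for(extra_mib))

    def size_mib(self, name: str) -> int:
        return self.volumes[name] * self.extent_mib


if __name__ == "__main__":
    vg = VolumeGroup(total_mib=10 * 1024)   # hypothetical 10 GiB pool
    vg.create("root", 5 * 1024)
    vg.extend("root", 1024)                  # grow the volume from free extents
    print(vg.size_mib("root"), "MiB in root,", vg.free_extents, "extents free")
```

Growing a volume only moves extents from the free pool to that volume, which is why leaving unallocated space in the pool, as the partition list suggests, keeps later resizing simple.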

Upgrading an Entrepreneur ASP infrastructure - PART II

After analyzing the situation I have identified several key problems.

1. System Availability - Systems fail frequently due to hardware failures, DoS, and application failures.
2. The collocation facility is far away and administration requires frequent visits.
3. Mini tower servers consume space, and the half rack may be reaching its capacity limit.
4. Network lacks sufficient protection against malicious attacks.
5. Subnet is small and may reach IP assignment limits.

A) The culprit behind poor system availability is the use of low-cost hardware. Low-cost motherboards and network cards fail often. In addition, systems built at different times usually end up with a mix of components, some of which may not be supported by Enterprise Linux.

RAM, CPU, motherboard, power supply and hard disk failures occur at different intervals, with hard disk failures being the most frequent. Most of this is attributable to a combination of poor cooling and poor-quality parts. The power supply in particular cannot be neglected, as a low-quality power supply can lead to more frequent component failures.

Recommendation 1 - Use enterprise-grade servers such as Dell and HP rack-mountable servers. Such systems are built from much higher-quality components and provide N+1 redundancy for the components that fail most often. Dual power supplies and mirrored RAID hard drives are a necessity. It is important to use hardware RAID for added performance and to ease administration during a failure. Commercial servers come with enterprise-grade device driver support, so searching for and recompiling drivers is a thing of the past. Furthermore, Dell's DRAC and HP's iLO are remote access tools that allow a user to administer the system remotely at the BIOS level. Enterprise-grade servers also provide increased efficiency and speed, and they scale through additional RAM slots and multiple CPU cores that can be divided among workloads.

Recommendation 2 - Embrace virtualization. Virtualization allows multiple operating systems to run on a single system, taking advantage of the system's unused resources such as CPU, disk and RAM by sharing them across multiple virtual instances (VM guests). Combining VMware with an enterprise server builds on that server's stability and so increases availability.

B) VMware reduces the need for on-site administration. It allows an administrator to connect remotely to VMware Server to control the guests and perform operations such as rebooting or allocating additional network interfaces, RAM and hard disk space. All of this is drawn from a resource pool belonging to the underlying server. Other neat features include remotely mounting external devices and creating a template VM instance, which lets the administrator stamp out pre-configured OS installations in minimal time. Another great advantage of VMware is that it allows different operating systems, such as Windows and Linux, to coexist on a single host. However, there is one disadvantage: all the eggs are in one basket. A failure of the host could cause all of its virtual instances to fail. To minimize this risk, two or more hosts should be in place.

C) By employing VMware and Dell/HP rack-mountable servers, the rack space used should be reduced significantly, leaving more room for expansion.

D) As a secondary phase of the project, a robust firewall needs to be put in place to protect against outside DoS and hacking attempts. This is a vital piece of equipment that cannot be neglected, as it will reduce malicious attacks and in some cases block them outright. It also helps hide the underlying network, can map external IPs to internal IPs, and allows only the ports necessary for access. By using a hardware firewall, the OS firewall can be switched off. In addition, such appliances offer VPN capabilities for protected administrative access to the systems. Such a device is highly sophisticated, and I recommend Cisco for its reliability and feature set. The Cisco ASA 5505 with an unlimited-user license is a low-cost entry point for a setup of this scale. Given the price of even the lowest model, the second-hand market may need to be considered.

E) By employing a firewall, NAT overloading and static NAT can be used to let more than one system share a single WAN IP, reducing the need for a large address space.
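To make the idea concrete, here is a minimal sketch of the bookkeeping behind NAT overloading (port address translation) combined with static port mappings. It is a toy model, not firewall configuration: the `PatTable` class, the starting port of 20000 and all addresses are hypothetical (203.0.113.10 comes from a range reserved for documentation), and a real device also tracks protocols, timeouts and connection state.

```python
class PatTable:
    """Toy model: many inside hosts sharing one WAN IP via port translation."""

    def __init__(self, wan_ip: str, first_port: int = 20000):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.outbound = {}   # (inside_ip, inside_port) -> WAN port
        self.inbound = {}    # WAN port -> (inside_ip, inside_port)

    def add_static(self, wan_port: int, inside_ip: str, inside_port: int) -> None:
        """Static NAT: permanently forward a WAN port to an inside server."""
        if wan_port in self.inbound:
            raise ValueError(f"WAN port {wan_port} is already mapped")
        self.inbound[wan_port] = (inside_ip, inside_port)

    def translate_out(self, inside_ip: str, inside_port: int):
        """NAT overload: give each inside connection its own WAN port."""
        key = (inside_ip, inside_port)
        if key not in self.outbound:
            # Skip WAN ports already taken by static or dynamic mappings.
            while self.next_port in self.inbound:
                self.next_port += 1
            if self.next_port > 65535:
                raise RuntimeError("no WAN ports left")
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return (self.wan_ip, self.outbound[key])

    def translate_in(self, wan_port: int):
        """Return the inside (ip, port) for a WAN port, or None if unmapped."""
        return self.inbound.get(wan_port)


if __name__ == "__main__":
    nat = PatTable("203.0.113.10")
    nat.add_static(80, "192.168.1.10", 80)           # public web server
    print(nat.translate_out("192.168.1.11", 40000))  # first inside host
    print(nat.translate_out("192.168.1.12", 40000))  # second host, same WAN IP
    print(nat.translate_in(80))
```

Both inside hosts leave through the same WAN address and differ only in the translated port, which is why a single public IP can serve several systems.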

Note: Since the current administrator may not have sufficient knowledge to administer the device, my recommendation is to hold off on the purchase until the systems have reached a certain stability and scale. An experienced administrator will need to be hired to help configure and maintain the device.

Upgrading an Entrepreneur ASP infrastructure - PART I

Any startup hosting business usually runs into many technical challenges. Such businesses face difficult decisions and often have to trade off between stability, scalability and profit.

There is no win/win situation; however, I do know one thing - time is money. Frequent downtime, hardware failures, or even just frequent visits to the data center for administration can cost the business a great deal of time.

Here's a case study of a hosting business. Netdreamland is a service provider offering hosting services for various clients, from simple web hosting to sophisticated application services requiring administrative access to the systems. Currently Netdreamland rents half a rack at a remote collocation facility, holding seven low-budget mini tower servers. The systems are assigned individual public IPs, as they are fed off a 3Com unmanaged switch directly connected to the ISP. The business owner currently faces a dilemma: he is a one-man team who manages both the business side and the administration side, and he has no time to attend to the systems. Moreover, Netdreamland's systems often fail due to hardware failures, application failures and denial-of-service attacks, which render the systems unreachable. This consumes a lot of his time and energy, as he often finds himself driving to the data center for anything from simple reboots to hardware replacements, often in the middle of the night. The collocation facility is quite a drive from the office, and his last visit to the site was to install a new system for a new customer. He is now afraid to expand his customer base too aggressively, as it would increase his visits to the point where he cannot tend to other business matters. Netdreamland is profitable but is at a point where further expansion will jeopardize service availability.