A computer cluster is a group of tightly coupled computers that work together so closely that in many respects they can be viewed as a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
Cluster categorizations

High-availability (HA) clusters
High-availability clusters (also known as failover clusters) are implemented primarily to improve the availability of the services that the cluster provides. They operate by having redundant nodes, which are used to provide service when system components fail. The most common size for an HA cluster is two nodes, the minimum required to provide redundancy. HA cluster implementations attempt to manage the redundancy inherent in a cluster to eliminate single points of failure. There are many commercial implementations of high-availability clusters for many operating systems. The Linux-HA project is one commonly used free software HA package for the Linux operating system.
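As a rough sketch of the failover idea only (not the actual Linux-HA implementation), the following C program shows a hypothetical standby node that listens for UDP heartbeat datagrams from the primary node and takes over the service when no heartbeat arrives within a timeout; the port number, timeout, and start_service() placeholder are all invented for the example.

    /* Hypothetical standby-node watchdog: if the primary's UDP heartbeat
     * stops arriving for FAILOVER_TIMEOUT seconds, take over the service.
     * A sketch of the general failover idea, not any real HA package. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <unistd.h>

    #define HEARTBEAT_PORT 9999   /* example port the primary sends heartbeats to */
    #define FAILOVER_TIMEOUT 10   /* seconds of silence before the standby takes over */

    static void start_service(void) {
        /* A real HA package would acquire the service IP address and start
         * the protected daemon here; this is only a placeholder. */
        printf("primary appears dead: starting service on standby node\n");
    }

    int main(void) {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(HEARTBEAT_PORT);
        if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0)
            return 1;

        for (;;) {
            fd_set fds;
            struct timeval tv = { FAILOVER_TIMEOUT, 0 };
            FD_ZERO(&fds);
            FD_SET(sock, &fds);
            int ready = select(sock + 1, &fds, NULL, NULL, &tv);
            if (ready > 0) {
                char buf[64];
                recv(sock, buf, sizeof buf, 0);   /* heartbeat received: primary is alive */
            } else if (ready == 0) {
                start_service();                  /* timeout: assume the primary failed */
                break;
            }
        }
        close(sock);
        return 0;
    }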
Load-balancing clusters
Load-balancing clusters operate by having all workload come through one or more load-balancing front ends, which then distribute it to a collection of back-end servers. Although they are implemented primarily for improved performance, they commonly include high-availability features as well. Such a cluster of computers is sometimes referred to as a server farm. There are many commercial load balancers available, including Platform LSF HPC, Sun Grid Engine, Moab Cluster Suite and Maui Cluster Scheduler. The Linux Virtual Server project provides one commonly used free software package for the Linux OS.
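The distribution policy itself can be very simple. As a minimal illustration in C (with invented back-end addresses, and not tied to any particular load balancer such as Linux Virtual Server, which does this work in the kernel on real network connections), the sketch below hands successive requests to back-end servers in round-robin order:

    /* Round-robin selection such as a load-balancing front end might use
     * to spread incoming requests across back-end servers (illustrative only). */
    #include <stdio.h>

    static const char *backends[] = {
        "10.0.0.11", "10.0.0.12", "10.0.0.13"   /* example back-end servers */
    };
    static const int n_backends = sizeof backends / sizeof backends[0];

    /* Return the back end that should handle the next request. */
    static const char *pick_backend(void) {
        static int next = 0;                    /* state kept by the front end */
        const char *chosen = backends[next];
        next = (next + 1) % n_backends;
        return chosen;
    }

    int main(void) {
        for (int request = 1; request <= 7; request++)
            printf("request %d -> %s\n", request, pick_backend());
        return 0;
    }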
High-performance computing (HPC) clusters
High-performance computing (HPC) clusters are implemented primarily to provide increased performance by splitting a computational task across many different nodes in the cluster, and are most commonly used in scientific computing. Such clusters commonly run custom programs designed to exploit the parallelism available on HPC clusters. HPC clusters are optimized for workloads in which jobs or processes on the separate cluster nodes must communicate actively during the computation; these include computations where intermediate results from one node's calculations will affect future calculations on other nodes.
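To make that communication pattern concrete, here is a small illustrative MPI program in C (not drawn from any particular application): each rank holds one value and repeatedly averages it with the values held by its ring neighbours, so every iteration depends on intermediate results computed on other nodes.

    /* Each MPI rank repeatedly combines its value with its ring neighbours',
     * so each step uses intermediate results from other nodes. Illustrative only. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int left  = (rank - 1 + size) % size;    /* ring neighbours */
        int right = (rank + 1) % size;

        double value = (double)rank;             /* this node's intermediate result */

        for (int step = 0; step < 10; step++) {
            double from_left, from_right;
            /* Exchange current values with both neighbours. */
            MPI_Sendrecv(&value, 1, MPI_DOUBLE, right, 0,
                         &from_left, 1, MPI_DOUBLE, left, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&value, 1, MPI_DOUBLE, left, 1,
                         &from_right, 1, MPI_DOUBLE, right, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            value = (value + from_left + from_right) / 3.0;
        }

        printf("rank %d final value %f\n", rank, value);
        MPI_Finalize();
        return 0;
    }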
One of the most popular HPC implementations is a cluster with nodes running Linux as the OS and free software to implement the parallelism. This configuration is often referred to as a Beowulf cluster.
Microsoft offers Windows Compute Cluster Server as a high-performance computing platform to compete with Linux.
Many software programs running on high-performance computing (HPC) clusters use libraries such as MPI, which are specially designed for writing scientific applications for HPC computers.
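As a small, generic example of that programming style (written in C against the standard MPI interface, not taken from any particular application), the program below has each process sum part of the integers from 1 to N and uses MPI_Reduce to combine the partial sums on rank 0; it would typically be compiled with mpicc and launched across the cluster nodes with mpirun.

    /* Each process sums the integers assigned to it, and MPI_Reduce
     * combines the partial sums on rank 0. */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1000000LL   /* example problem size */

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Rank r sums the integers r+1, r+1+size, r+1+2*size, ... up to N. */
        long long local = 0;
        for (long long i = rank + 1; i <= N; i += size)
            local += i;

        long long total = 0;
        MPI_Reduce(&local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum of 1..%lld = %lld\n", (long long)N, total);

        MPI_Finalize();
        return 0;
    }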
Grid computing
Main article: Grid computing

History
The TOP500 organization's semiannual list of the 500 fastest computers usually includes many clusters. TOP500 is a collaboration between the University of Mannheim, the University of Tennessee, and the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory. As of November 2006, the top supercomputer is the Department of Energy's IBM BlueGene/L system, with a performance of 280.6 TFlops.
Clustering can provide significant performance benefits versus price. The System X supercomputer at Virginia Tech, the 28th most powerful supercomputer on Earth as of June 2006 [1], is a 12.25 TFlops computer cluster of 1100 Apple Xserve G5 2.3 GHz dual-processor machines (4 GB RAM, 80 GB SATA HD) running Mac OS X and using an InfiniBand interconnect. The cluster initially consisted of Power Mac G5s; the rack-mountable Xserves are denser than desktop Macs, reducing the aggregate size of the cluster. The total cost of the previous Power Mac system was $5.2 million, a tenth of the cost of slower mainframe supercomputers. (The Power Mac G5s were sold off.)
The central concept of a Beowulf cluster is the use of commercial off-the-shelf computers to produce a cost-effective alternative to a traditional supercomputer. One project that took this to an extreme was the Stone Soupercomputer.
However, it is worth noting that FLOPS (floating-point operations per second) are not always the best metric for supercomputer speed. Clusters can achieve very high FLOPS ratings, but a node cannot access all of the data held by the cluster as a whole at once. Clusters are therefore excellent for parallel computation, but much poorer than traditional supercomputers at non-parallel computation.
JavaSpaces is a specification from Sun Microsystems that enables clustering computers via a distributed shared memory.
MPI is a widely available communications library that enables parallel programs to be written in C, Fortran, Python, OCaml, and many other programming languages.
The GNU/Linux world sports various cluster software; for application clustering, there are Beowulf, distcc, and MPICH.
Linux Virtual Server and Linux-HA provide director-based clusters that allow incoming requests for services to be distributed across multiple cluster nodes.
MOSIX, openMosix, Kerrighed, and OpenSSI are full-blown clusters integrated into the kernel that provide automatic process migration among homogeneous nodes. OpenSSI, openMosix, and Kerrighed are single-system image implementations.
Microsoft Windows Compute Cluster Server 2003, based on the Windows Server platform, provides pieces for high-performance computing such as the Job Scheduler, the MSMPI library, and management tools. NCSA's recently installed Lincoln is a cluster of 450 Dell PowerEdge 1855 blade servers running Windows Compute Cluster Server 2003. This cluster debuted at #130 on the TOP500 list in June 2006.
DragonFly BSD, a recent fork of FreeBSD 4.8, is being redesigned at its core to enable native clustering capabilities. It also aims to achieve single-system image capabilities.