Many organizations are now looking for new ways to perform compute-intensive tasks at a lower cost. Cloud-based platforms can support high-performance grid computing through fast provisioning, minimal administration, and flexible instance sizing and capacity, along with innovative third-party support for grid coordination and data management.
You can allocate compute capacity on demand without upfront planning of data center, network, and server infrastructure, and you have access to a broad range of instance types to meet your demands for CPU, memory, local disk, and network connectivity. Infrastructure can be run in any of a large number of global regions, without the long lead times of contract negotiation and establishing a local presence, enabling faster delivery, especially in emerging markets. You can define a virtual network topology that closely resembles a traditional network that you might operate in your own data center, and you can build separate grids where isolation between business lines is required, or share infrastructure for cost optimization. The operational tasks of running a compute grid of multiple nodes can be fully automated, and elastic compute capacity can be combined with other services to minimize complexity.
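As a rough sketch of what on-demand capacity allocation can look like, the snippet below uses boto3, the AWS SDK for Python (the text above does not prescribe a provider, and the AMI ID, instance type, and instance counts are illustrative placeholders, not recommendations):

import boto3

# Assumed setup: an AWS account and an engine machine image already exist.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical grid-engine image
    InstanceType="c5.xlarge",         # sized for CPU-bound engine processes
    MinCount=50,                      # grow or shrink per workload
    MaxCount=50,
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Role", "Value": "grid-engine"}],
    }],
)
print(f"Launched {len(response['Instances'])} engine nodes")

The same call with different counts (or a matching terminate_instances call) is what "elastic" means in practice: capacity tracks the workload rather than a fixed data center footprint.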
High-performance computing (HPC) allows end users to solve complex science, engineering, and business problems using applications that require a large amount of computational resources, as well as high-throughput, predictable-latency networking. Most systems providing HPC platforms are shared among many users and represent a significant capital investment to build, tune, and maintain.
Many commercial and open-source compute grids use HTTP for communication and can tolerate relatively unreliable networks with variable throughput and latency.
However, for ticking risk applications, and in some proprietary compute grids,
network latency and bandwidth can be important factors in overall performance.
Compute grids typically have hundreds to thousands of individual processes
(engines) running on tens or hundreds of machines. For reliable results,
engines tend to be deployed in a fixed ratio to compute cores and memory (for
example, two virtual cores and 2 GB of memory per engine). The formation of a
compute cluster is controlled by a grid “director” or “controller,” and clients
of the compute grid submit tasks to engines via a job manager or “broker.” In
many grid architectures, sending data between the client and engines is done
directly, while in other architectures, data is sent via the grid broker. In
some architectures, known as two-tier grids, the director and broker
responsibilities are managed by a single component, while in larger three-tier
grids, a director may have many brokers, each responsible for a subset of
engines.
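As an illustrative model of the three-tier topology described above (not tied to any particular grid product; all class and method names are hypothetical), a director routes client submissions to brokers, each of which dispatches to its own subset of engines:

from dataclasses import dataclass
from itertools import cycle

@dataclass
class Engine:
    engine_id: int
    def run(self, task):
        # stand-in for executing one grid task on this engine process
        return f"engine-{self.engine_id} computed {task}"

@dataclass
class Broker:
    engines: list
    def __post_init__(self):
        self._next = cycle(self.engines)  # simple round-robin dispatch
    def submit(self, task):
        return next(self._next).run(task)

@dataclass
class Director:
    brokers: list
    def __post_init__(self):
        self._next = cycle(self.brokers)
    def submit(self, task):
        # route each client task to a broker, which picks an engine
        return next(self._next).submit(task)

# three-tier: one director, two brokers, each owning a subset of engines
director = Director(brokers=[
    Broker(engines=[Engine(i) for i in range(0, 4)]),
    Broker(engines=[Engine(i) for i in range(4, 8)]),
])
print(director.submit("price_portfolio_slice_17"))

A two-tier grid collapses Director and Broker into a single component; the three-tier split exists so that one director can scale out across many brokers.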
The time frame for the completion of grid calculations tends to be minutes or hours rather than milliseconds. Calculations are partitioned among engines and computed in parallel, and thus lend themselves to a shared-nothing architecture. Communications with the client of the computation tend to tolerate relatively high latency and can be retried on failure.
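A minimal sketch of that pattern using Python's standard concurrent.futures (price_partition is a hypothetical stand-in for the real calculation): partitions run in parallel with no shared state, and a failed partition is simply resubmitted, since clients tolerate the extra latency.

from concurrent.futures import ProcessPoolExecutor, as_completed

def price_partition(trades):
    # stand-in for a compute-intensive, side-effect-free calculation
    return sum(len(t) for t in trades)

def run_grid_job(trades, n_partitions=8, max_retries=2):
    # shared-nothing: each partition is computed independently, in parallel
    partitions = [trades[i::n_partitions] for i in range(n_partitions)]
    results = {}
    with ProcessPoolExecutor() as pool:
        pending = {pool.submit(price_partition, p): (i, 0)
                   for i, p in enumerate(partitions)}
        while pending:
            for fut in as_completed(list(pending)):
                idx, attempts = pending.pop(fut)
                try:
                    results[idx] = fut.result()
                except Exception:
                    if attempts >= max_retries:
                        raise  # give up after repeated failures
                    # retry just the failed partition, not the whole job
                    pending[pool.submit(price_partition,
                                        partitions[idx])] = (idx, attempts + 1)
    return sum(results.values())

if __name__ == "__main__":
    print(run_grid_job([f"trade-{i}" for i in range(1000)]))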