AMPP

A major part of the Netezza solution's performance advantage comes from its unique Asymmetric Massively Parallel Processing (AMPP)
architecture (shown in Figure 1), which combines an SMP front end with a shared-nothing
MPP back end for query processing. Each component of the architecture is carefully chosen
and integrated to yield a balanced overall system. Every processing element operates on
multiple data streams, filtering out extraneous data as early as possible. More than a
thousand of these customized MPP streams work together to divide and conquer the
workload.
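The divide-and-conquer pattern described above can be sketched in a few lines of Python. This is a conceptual illustration, not Netezza code: each "stream" scans its own slice of the data and filters out extraneous rows as early as possible, so only the survivors reach the merge step.

```python
# Illustrative sketch (not Netezza code): many parallel streams each scan
# a slice of the data, filter early, and the front end merges the results.
from concurrent.futures import ThreadPoolExecutor

def scan_slice(rows, predicate):
    """One 'stream': scan a data slice and discard non-matching rows early."""
    return [r for r in rows if predicate(r)]

def parallel_query(table, predicate, n_streams=8):
    """Divide the table across streams, conquer in parallel, merge results."""
    slices = [table[i::n_streams] for i in range(n_streams)]
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        partials = pool.map(lambda s: scan_slice(s, predicate), slices)
    return [row for part in partials for row in part]

rows = list(range(1000))
result = parallel_query(rows, lambda r: r % 97 == 0)
```

The key point the sketch captures is that filtering happens inside each stream, before any data crosses back to the coordinating front end.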


Let's examine the key building blocks of the appliance:

Netezza hosts
The SMP hosts are high-performance Linux servers set up in an active-passive
configuration for high availability. The active host presents a standardized interface to
external tools and applications. It creates optimized query plans, compiles SQL queries
into executable code segments called snippets, and distributes the snippets to the MPP
nodes for execution.
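The host's plan-then-distribute role can be sketched as follows. All names here (`Snippet`, `plan_to_snippets`, `dispatch`, the operation strings) are hypothetical, chosen only to illustrate the flow of compiling a plan into ordered snippets and broadcasting each snippet to every MPP node:

```python
# Hypothetical sketch of the host's role: compile an ordered query plan
# into "snippets" and hand each one to every MPP node for execution.
# These names are illustrative, not the Netezza API.
from dataclasses import dataclass

@dataclass
class Snippet:
    seq: int          # execution order within the plan
    operation: str    # e.g. a scan, join, or aggregate step

def plan_to_snippets(plan_ops):
    """Compile an ordered list of plan operations into snippets."""
    return [Snippet(i, op) for i, op in enumerate(plan_ops)]

def dispatch(snippets, nodes):
    """Broadcast every snippet to all MPP nodes; each node runs it on its data slice."""
    return {node: [s.seq for s in snippets] for node in nodes}

snippets = plan_to_snippets(["SCAN orders", "JOIN customers", "AGGREGATE sum"])
work = dispatch(snippets, ["sblade-1", "sblade-2"])
```

Every node receives the same snippets but executes them against only its own slice of the data, which is what makes the execution embarrassingly parallel.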

Snippet Blades (S-Blades)
S-Blades are intelligent processing nodes that make up the turbocharged MPP engine of
the appliance. Each S-Blade is an independent server containing powerful multi-core
CPUs, multi-engine FPGAs, and gigabytes of RAM, all balanced and working concurrently
to deliver peak performance. The CPU cores are designed with ample headroom to run
complex algorithms against large data volumes for advanced analytics applications.
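The division of labor on an S-Blade, where a fast filtering stage feeds a compute stage, can be sketched conceptually. In this hypothetical pipeline a generator plays the role of the FPGA, projecting and restricting rows early so the CPU stage touches far less data; the field names are invented for illustration:

```python
# Conceptual S-Blade pipeline: a fast filter stage (standing in for the
# FPGA) drops unwanted rows and columns before the expensive CPU stage.
# All table and column names here are illustrative.
def fpga_stage(rows, wanted_cols, predicate):
    """Project and restrict early so the CPU stage sees far less data."""
    for row in rows:
        if predicate(row):
            yield {c: row[c] for c in wanted_cols}

def cpu_stage(rows):
    """More expensive per-row work, applied only to surviving rows."""
    return sum(r["amount"] for r in rows)

table = [{"id": i, "amount": i * 10, "region": "EU" if i % 2 else "US"}
         for i in range(6)]
total = cpu_stage(fpga_stage(table, ["amount"], lambda r: r["region"] == "EU"))
```

Because the filter stage is a generator, rows stream through one at a time rather than being materialized, mirroring the streaming behavior the text describes.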

Disk enclosures
The disk enclosures' high-density, high-performance disks are RAID protected. Each disk
contains a slice of every database table's data. A high-speed network connects the disk
enclosures to the S-Blades, allowing all the disks in a Netezza appliance to stream data
to the S-Blades simultaneously at the maximum possible rate.
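The claim that each disk holds a slice of every table can be sketched as hash distribution: rows are assigned to disks by hashing a distribution key, so a full-table scan reads all disks in parallel. The hash function and key choice below are illustrative only:

```python
# Sketch of "each disk holds a slice of every table": rows are spread
# across disks by hashing a distribution key, so every scan can read
# all disks in parallel. Key and hash choice are illustrative.
def disk_for_row(key, n_disks):
    """Map a row's distribution key to one of the disks."""
    return hash(key) % n_disks

def distribute(rows, key_fn, n_disks):
    """Partition a table's rows into one slice per disk."""
    slices = {d: [] for d in range(n_disks)}
    for row in rows:
        slices[disk_for_row(key_fn(row), n_disks)].append(row)
    return slices

rows = [{"order_id": i} for i in range(100)]
slices = distribute(rows, lambda r: r["order_id"], n_disks=4)
```

A well-chosen distribution key keeps the slices roughly equal in size, which is what lets every disk contribute its full bandwidth to a scan.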

Network fabric
A high-speed network fabric connects all system components. The Netezza appliance
runs a customized IP-based protocol that fully utilizes the total cross-sectional bandwidth
of the fabric and eliminates congestion even under sustained, bursty network traffic. The
network is optimized to scale to more than a thousand nodes, while allowing each node to
initiate large data transfers to every other node simultaneously.
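The demanding traffic pattern the fabric must sustain is an all-to-all redistribution (a shuffle), where every node simultaneously sends part of its data to every other node, for example when a join requires rows to be co-located by key. A minimal sketch, with invented names and integer rows standing in for real data:

```python
# Sketch of the all-to-all redistribution (shuffle) the fabric supports:
# each node partitions its rows by destination node, and every node
# receives data from every other node at once. Names are illustrative.
def shuffle(node_data, n_nodes, key_fn):
    """Re-partition rows so each row lands on the node its key hashes to."""
    inbound = {dest: [] for dest in range(n_nodes)}
    for node, rows in node_data.items():
        for row in rows:
            inbound[key_fn(row) % n_nodes].append(row)
    return inbound

before = {0: [1, 4, 7], 1: [2, 5, 8], 2: [3, 6, 9]}
after = shuffle(before, 3, key_fn=lambda r: r)
```

In a real appliance these transfers happen concurrently over the fabric, which is why congestion-free cross-sectional bandwidth matters: a shuffle is only as fast as its slowest link.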
