H/W Architecture

Netezza follows the Asymmetric Massively Parallel Processing (AMPP) architecture.

Architectural principles

The Netezza appliances integrate database, processing, and storage in a compact system
optimized for analytical processing and designed for flexible growth. The system architecture
is based on the following core tenets that have been a hallmark of Netezza leadership in the
industry:

  • Processing close to the data source
  • Balanced, massively parallel architecture
  • Platform for advanced analytics
  • Appliance simplicity
  • Accelerated innovation and performance improvements
  • Flexible configurations and extreme scalability

Processing close to the data source
The Netezza architecture is based on a fundamental computer science principle: when
operating on large data sets, do not move data unless absolutely necessary. Netezza fully
exploits this principle by using commodity components called Field Programmable
Gate Arrays (FPGAs) to filter out extraneous data as early in the data stream as possible and
as fast as data streams off the disk. This process of data elimination close to the data source
removes I/O bottlenecks and frees up downstream components such as the CPU, memory,
and network from processing superfluous data, thus having a significant multiplier effect on
system performance.
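
The effect of filtering at the source can be illustrated with a small software analogy. This is only a sketch: the real filtering happens in FPGA hardware, and the names here (scan_disk, filter_at_source, TABLE, run_query) are hypothetical, not part of any Netezza interface. It shows that when the row predicate and column projection are applied as rows stream off the "disk", only a fraction of the data ever reaches the downstream stages.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def scan_disk(rows: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical stand-in for rows streaming off a disk."""
    for row in rows:
        yield row


def filter_at_source(
    stream: Iterable[Row],
    predicate: Callable[[Row], bool],
    columns: List[str],
) -> Iterator[Row]:
    """Drop unwanted rows and columns before anything downstream sees them.

    This plays the role the text assigns to the FPGA: restriction
    (row filter) and projection (column filter) applied on the stream.
    """
    for row in stream:
        if predicate(row):
            yield {name: row[name] for name in columns}


# Illustrative table: 1000 wide rows, one quarter of them in region "EU".
TABLE: List[Row] = [
    {
        "id": i,
        "region": "EU" if i % 4 == 0 else "US",
        "amount": i * 10,
        "note": "x" * 50,  # wide column the query never needs
    }
    for i in range(1000)
]


def run_query() -> Tuple[int, int, int]:
    """Return (rows scanned, rows passed downstream, SUM(amount))."""
    survivors = list(
        filter_at_source(
            scan_disk(TABLE),
            predicate=lambda r: r["region"] == "EU",
            columns=["id", "amount"],
        )
    )
    total = sum(r["amount"] for r in survivors)
    return len(TABLE), len(survivors), total


if __name__ == "__main__":
    scanned, passed, total = run_query()
    print(f"scanned={scanned} passed_downstream={passed} sum={total}")
```

In this toy case three quarters of the rows, and the wide note column of every row, never reach the aggregation step, which is the multiplier effect the paragraph above describes.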

Balanced, massively parallel architecture
The Netezza architecture combines the best elements of Symmetric Multiprocessing (SMP)
and Massively Parallel Processing (MPP) to create an appliance purpose-built for analyzing
petabytes of data quickly. Every component of the architecture, including the processor,
FPGA, memory, and network, is carefully selected and optimized to service data as fast as
the physics of the disk allows, while minimizing cost and power consumption. The Netezza
software orchestrates these components to operate concurrently on the data stream in a
pipeline fashion, thus maximizing utilization and extracting the utmost throughput from each
MPP node. In addition to raw performance, this balanced architecture delivers linear
scalability to more than a thousand processing streams executing in parallel, while offering a
very economical total cost of ownership.
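
The parallel pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Netezza's implementation: rows are spread across a fixed number of data slices, each worker aggregates only its own slice, and a host step merges the small partial results. The function names and the modulo-based distribution are hypothetical, and Python threads stand in for independent MPP nodes purely to show the structure (they do not give true CPU parallelism in CPython).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

KeyedRow = Tuple[int, float]  # (distribution key, value)


def distribute(rows: Sequence[KeyedRow], n_slices: int) -> List[List[KeyedRow]]:
    """Assign each row to a data slice by its key (assumed scheme: key mod n)."""
    slices: List[List[KeyedRow]] = [[] for _ in range(n_slices)]
    for key, value in rows:
        slices[key % n_slices].append((key, value))
    return slices


def partial_aggregate(data_slice: Sequence[KeyedRow]) -> Tuple[float, int]:
    """Work done locally on one slice: a partial sum and a row count."""
    total = 0.0
    count = 0
    for _, value in data_slice:
        total += value
        count += 1
    return total, count


def parallel_average(rows: Sequence[KeyedRow], n_slices: int = 4) -> float:
    """Compute AVG(value) by merging per-slice partial aggregates."""
    slices = distribute(rows, n_slices)
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        partials = list(pool.map(partial_aggregate, slices))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("cannot average an empty table")
    return total / count


if __name__ == "__main__":
    table = [(i, float(i)) for i in range(100)]
    print(parallel_average(table, n_slices=4))
```

Only one (sum, count) pair per slice crosses to the merge step, so the amount of data moved stays constant as the table grows. That property is what lets this style of design scale by adding slices.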

Platform for advanced analytics
The principles of MPP and processing data close to the source are equally applicable to
advanced analytics on large data sets. Netezza appliances can run complex algorithms
expressed in languages other than SQL on a massively parallel scale, with none of the
intricacies typical of parallel and grid programming. Running analytics of any complexity
on-stream against huge data volumes eliminates the delays and costs incurred in moving data
to separate hardware. This accelerates performance by orders of magnitude, making Netezza
the ideal platform to converge data warehousing with advanced analytics.

Appliance simplicity
By automating and streamlining day-to-day operations, the Netezza architecture shields
users from the underlying complexity of the platform. Simplicity rules whenever there is a
design tradeoff with any other aspect of the appliance. Unlike other solutions, it just runs,
handling demanding queries and mixed workloads with blistering speed, without the tuning
required by other systems. Even normally time-consuming tasks such as installation,
upgrades, and ensuring high availability and business continuity are vastly simplified, saving
precious time and resources.

Accelerated innovation and performance improvements
One of the key goals of the Netezza architecture is to deliver price-performance
improvements and innovative functionality faster than competing technologies over the long
run. While the use of open, blade-based components allows the Netezza architecture to
incorporate technology enhancements very quickly, the turbocharger effect of the FPGA, a
balanced hardware configuration, and tightly coupled intelligent software combine to deliver
overall performance gains far greater than those of the individual elements. In fact, the Netezza
platform has delivered a more than fourfold performance improvement every two years (double
that of Moore's Law) since its introduction.
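
To make the comparison concrete, the short calculation below compounds both rates over the same span. It assumes Moore's Law is read as a doubling every two years, which is the reading implied by "double that" in the text. The helper name growth is hypothetical, and the figures are arithmetic on the stated rates, not measured results.

```python
def growth(factor_per_period: float, period_years: float, years: float) -> float:
    """Cumulative improvement after `years`, compounding once per period."""
    return factor_per_period ** (years / period_years)


if __name__ == "__main__":
    for years in (2, 4, 6):
        fourfold = growth(4, 2, years)  # rate claimed in the text
        moore = growth(2, 2, years)     # assumed: doubling every two years
        print(f"{years} years: {fourfold:.0f}x vs {moore:.0f}x "
              f"(ratio {fourfold / moore:.0f}x)")
```

Because both rates compound, the gap itself doubles every two years: twofold after two years, fourfold after four, and eightfold after six.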

Flexible configurations and extreme scalability
The Netezza platform scales modularly from a few hundred gigabytes to tens of petabytes of
queryable user data. The system architecture serves the needs of different segments of the
data warehouse and analytics market. The use of open blade-based components allows the
disk-processor-memory ratio to be easily modified in configurations that cater to
performance- or storage-centric requirements. The same architecture also supports
memory-based systems that provide extremely fast, real-time analytics for mission-critical
applications.
