17 Oct 2011 Fremont - SGI has established a new world-record performance benchmark for Terasort data processing and analysis using Apache Hadoop clusters running on Cloudera's Distribution Including Apache Hadoop (CDH). The company, which recently joined the Cloudera Connect Partner Programme, has also formed a distribution relationship with Cloudera that will allow it to build, sell and deploy commercial solutions based on and around Hadoop.
Results achieved in September 2011 show that a 20-node SGI Hadoop cluster composed of SGI Rackable C2005-TY6 half-depth servers with Intel Xeon processor E5630 series, 48GB of memory, and 4x 1TB SATA HDDs, running Cloudera CDH3, took only 130 seconds to complete a Terasort with a job size of 100GB. Terasort measures the time needed to sort 1TB, or any other amount, of data in a Hadoop cluster, and is a benchmark that exercises both the HDFS and MapReduce layers of the cluster. In this instance, Terasort scales super-linearly on an SGI Rackable C2005-TY6 cluster running Cloudera's Distribution Including Apache Hadoop (CDH3u0), and was shown to be 81% faster than an Oracle Sun X2270 cluster of similar size.
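To put the reported figures in perspective, the short Python sketch below works out the sort throughput implied by a 100GB job completing in 130 seconds on 20 nodes. It is an illustrative calculation only, not part of SGI's published results. It assumes decimal units (1GB = 10^9 bytes) and the 100-byte row size used by the standard Terasort data generator; the function name `terasort_throughput` is hypothetical.

```python
def terasort_throughput(job_bytes, seconds, nodes):
    """Return (aggregate, per-node) sort throughput in bytes per second."""
    aggregate = job_bytes / seconds
    return aggregate, aggregate / nodes


# Figures quoted in the announcement; decimal units are an assumption.
JOB_BYTES = 100 * 10**9   # 100GB job size
SECONDS = 130             # reported completion time
NODES = 20                # cluster size
ROW_BYTES = 100           # assumed: standard Terasort row size

aggregate, per_node = terasort_throughput(JOB_BYTES, SECONDS, NODES)
rows = JOB_BYTES // ROW_BYTES  # number of rows in a 100GB input

print(f"aggregate: {aggregate / 1e6:.0f} MB/s")
print(f"per node:  {per_node / 1e6:.1f} MB/s")
print(f"rows:      {rows}")
```

Under these assumptions the run corresponds to roughly 769 MB/s across the cluster, or about 38.5 MB/s per node, for an input of one billion rows.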
"SGI has been successfully deploying Hadoop customer installations of up to 40,000 nodes and individual Hadoop clusters of up to 4,000 nodes for a number of years now", stated Bill Mannel, vice president of product marketing at SGI. "This benchmark, our growing presence, and our role in the Hadoop ecosystem, reflect our ongoing commitment to pushing the bar on performance and driving relationships that benefit our customers. As they wrestle with bigger and more complex data challenges every day they can trust SGI to deliver complete Hadoop solutions based on years of experience."
Hadoop is a powerful and disruptive open source technology that addresses the economic, flexibility and scalability challenges of Big Data. Hadoop forms the infrastructure foundation at leading social media companies such as Facebook, LinkedIn and Twitter. It is the fastest growing Big Data technology, with 26% of organisations using it today in data centres and in the Cloud.