TU Ilmenau wanted to replace its existing HPC system with a solution that would improve both performance and data access speeds. This required the expansion of its existing 55-server cluster to support the more complex processing tasks and local storage requirements of its 150-200 active users.
"We're committed to academic leadership, and strive to offer researchers and students the best possible environment and support", stated Hennig Schwanbeck, IT Manager of Datacentre Administration at the Technical University of Ilmenau. "IT plays an essential role in helping us achieve this, but the University's existing architecture could no longer handle the growing data volumes and high-powered applications, such as those used for complex simulation."
As a public-sector institution, TU Ilmenau was required to assess solutions from multiple vendors. The European tendering procedure resulted in a collaboration between Dell and the HPC specialist ClusterVision. TU Ilmenau worked with the ClusterVision technical team to design an HPC cluster based on Dell PowerEdge servers.
"Our main requirement was a fast network interconnect for our nodes, something the old cluster didn't have. In addition, we wanted a parallel file system because of the global data volumes and large file sizes we now use. Our selection criteria were pragmatic: the fastest solution, with the greatest number of cores, and the best benchmark performance."
The close collaboration between ClusterVision and Dell was an important factor in ultimately selecting ClusterVision as the preferred integration partner.
"Many companies adopt a one-size-fits-all approach to tenders. Together, ClusterVision and Dell offered the specialised expertise we needed, and were able to provide a customised, more detailed proposal", explained Hennig Schwanbeck.
"A detailed understanding of our customers' requirements and constraints is always an important part of our process, allowing us to review and select the most appropriate technologies, and design systems which are a best-fit solution to each customer's individual needs", stated Jan Heichler, Country Manager Germany, ClusterVision.
The resulting HPC cluster consists of 49 Dell PowerEdge R815 servers with AMD Opteron 6134 and 6136 eight-core processors, running the Fraunhofer File System. One PowerEdge R715 server with AMD Opteron 6128 processors operates as the storage head node, and four further PowerEdge R510 servers with Intel Xeon 5506 processors deliver the cluster's SAS storage. The University uses a single Dell PowerEdge R510 server for metadata storage, and a Dell PowerVault MD1200 modular disk storage array provides 196 terabytes of gross storage capacity (150 terabytes net). An InfiniBand network delivers high bandwidth and low latency for fast server-to-server interconnects.
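The gross and net storage figures quoted above imply how much raw capacity goes to redundancy and file-system overhead. The following is a minimal sketch of that arithmetic, using only the two figures from the text. The function and variable names are illustrative, and the case study does not say how the overhead is split between RAID parity, spares and formatting.

```python
def usable_fraction(gross_tb: float, net_tb: float) -> float:
    """Return the share of gross capacity that remains usable."""
    if gross_tb <= 0:
        raise ValueError("gross capacity must be positive")
    return net_tb / gross_tb


# Figures taken from the case study text.
GROSS_TB = 196.0
NET_TB = 150.0

fraction = usable_fraction(GROSS_TB, NET_TB)
overhead_tb = GROSS_TB - NET_TB

print(f"usable: {fraction:.1%}, overhead: {overhead_tb:.0f} TB")
# prints "usable: 76.5%, overhead: 46 TB"
```

In other words, roughly three quarters of the installed disk capacity is available to users.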
"We were extremely satisfied with the speed and ease of installation, which took place in around four weeks and caused minimal disruption to researchers and students. ClusterVision delivered the complete solution and installed it without problems and within a short timeframe", stated Hennig Schwanbeck.
Jan Heichler of ClusterVision continued: "Realisation of the TU Ilmenau cluster required the harmonious combination of a number of complex hardware, software and service components. All at ClusterVision are pleased that our team was able to provide TU Ilmenau with a high-quality, right-first-time installation, allowing the University's user community to focus on their applications, with minimum levels of disturbance."
With the HPC environment in place and delivering a data transfer rate of 7 gigabytes per second, TU Ilmenau students can now run highly complex design simulations more quickly.
Hennig Schwanbeck explained: "Our CPU and memory power has increased sevenfold, and the new system fulfils our computing requirements with ease. Feedback has been really positive. Numerical simulations with a previous runtime of one week can now be solved in less than one day."
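The runtime figures in this quote can be turned into a lower bound on the speedup. The sketch below assumes "one week" means 7 × 24 hours and treats "less than one day" as an upper bound of 24 hours. The function name is illustrative, and actual gains will vary by application.

```python
def speedup(old_runtime_h: float, new_runtime_h: float) -> float:
    """Return how many times faster the new runtime is than the old one."""
    if new_runtime_h <= 0:
        raise ValueError("new runtime must be positive")
    return old_runtime_h / new_runtime_h


OLD_RUNTIME_H = 7 * 24      # "one week" on the previous cluster
NEW_RUNTIME_UPPER_H = 24    # "less than one day": treated as an upper bound

min_speedup = speedup(OLD_RUNTIME_H, NEW_RUNTIME_UPPER_H)
print(f"speedup of at least {min_speedup:.0f}x")
# prints "speedup of at least 7x"
```

This lower bound is in line with the sevenfold increase in CPU and memory power that the quote reports.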
Dell PowerEdge servers include Energy Smart technologies for greater energy efficiency. Features such as voltage regulators and improved venting and airflow help the University to maximise performance per watt. Smart management features such as power capping and scheduling allow TU Ilmenau to better manage its energy use. PowerEdge servers also come with a collection of sensors that automatically track thermal activity, helping to regulate temperatures and further lower energy consumption. Overall, power and cooling costs at TU Ilmenau have been reduced by between 15 and 20%.
HPC system demands are significantly higher than in other IT environments. Hennig Schwanbeck explained: "Classical servers normally work at 30% capacity at most. In HPC, we need a maximal usage of CPU and memory - so in the last year our overall cluster usage was about 95%. With our new HPC cluster, we have greater storage capacity and can achieve a higher workload per server, so we're maximising the effectiveness of our infrastructure. The Dell PowerEdge R815 servers also offer more memory in a single machine, and it's more easily accessed, so there's more availability for our shared memory applications, and we no longer have to split the load across two or more machines."
Hennig Schwanbeck continued: "IT personnel find the individual servers in the cluster easy to manage using each machine's Dell Baseboard Management Controller. They can proactively monitor the servers and maximise their performance remotely in relation to the number of requested user jobs. Furthermore, they can log server faults, check hardware temperature, control power and reset the system. In previous clusters I've worked on, making changes was problematic. Management and monitoring are simpler and less time-consuming with Dell Baseboard Management Controller."
TU Ilmenau also chose Dell ProSupport with Mission Critical to maximise performance. The modular support structure means that the University can select the support components that best meet its needs. It has chosen onsite response times of four hours for critical components, with non-critical components protected by the next business day option.