Primeur weekly 2013-07-30

Special

When it comes to memory, bioinformatics is a hungry wolf ...

The Cloud

IBM boosts zEnterprise mainframe portfolio to help clients build better customer experiences ...

CA releases mainframe performance enhancement innovations ...

Desktop Grids

Supercomputer created on the cheap ...

Want to help cure disease or discover new stars? Now you can, using your smartphone ...

Miron Livny earns distributed computing award ...

New app puts idle smartphones to work for science ...

EuroFlash

Customer validation achieved for Genalice map ...

e-IRG's White Paper 2013 published: Europe needs a e-Infrastructure Commons ...

Barco and projectiondesign deliver next-generation projection innovation for theme parks at EAS 2013 ...

Cray awarded $30 million contract to install a Cray XC30 supercomputer for the UK National Supercomputing Facility ...

PRACE launches pilot of high performance computing adoption programme for European SMEs ...

Iberdrola and the Barcelona Supercomputing Center develop the Sedar project ...

HOST project to organize workshop for scientific problems ...

Teenagers use supercomputing to design future aircraft with the Smallpeice Trust ...

USFlash

First Mira runs break new ground with turbulence simulations ...

NVIDIA pushes further into high performance computing with Portland Group acquisition ...

Parallella is shipped to early Kickstarter backers ...

OpenMP 4.0 API specification is released with significant new standard features ...

Studies suggest new key to switching off hypertension ...

Planned Systems International marks 25 years of excellence in technology and service ...

OSC OnDemand gives computational researchers innovative web interface to HPC systems ...

Notre Dame researchers develop system that uses a Big Data approach to personalized health care ...

Los Alamos National Laboratory upgrades its Powerwall Theater with Christie visualization projection system ...

HP and NEC expand enterprise computing alliance to deliver increased reliability and innovation to customers ...

MoSQuIT bags mBillionth South Asia Award 2013 ...

Intel aims to re-architect data centres to meet demand for new services ...

When it comes to memory, bioinformatics is a hungry wolf


19 Jun 2013 Leipzig - In the session "Better understanding brains, genomes and life using HPC systems" at ISC'13 in Leipzig, BingQiang Wang from BGI talked about bioinformatics, Big Data and the involvement of HPC. He stated that bioinformatics is memory hungry and that multithreading in bioinformatics tools should therefore be encouraged.

At BGI, researchers are analysing population genomics and conducting phylogenetic studies as well as genome association studies. Systems biology works with data at various levels, and this has proved not to be easy, as BingQiang Wang explained.

The challenges researchers face include diverse types of workloads that require both high throughput computing and HPC. They need massive compute power and storage capacity, and this increases the computing complexity. The infrastructure has to be scalable and ready for the future. The speaker also called for a balance between compute and I/O, as well as better interaction between scientists and systems. Developers have to make HPC systems more end-user friendly, since training biologists in computer "things" is not easy, BingQiang Wang insisted.

The present status is as follows, the speaker went on. In physics we know how to model solids, waves, and atoms. In chemistry we know less, and in biology we know even less. There is a lack of models, methods and theory.

The challenges ahead lie in the alignment of long, error-prone third generation reads. One approach is a hybrid de novo assembly using both second and third generation reads. The assembly of meta data can amount to several tera base pairs. Researchers have to identify sequencing errors and rare species.

BingQiang Wang described the ideal computational tools. These include, among others, the SOAP3-DP aligner. Sequence alignment is a way of arranging sequences of DNA, RNA, or proteins to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
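To make the idea of sequence alignment concrete, here is a minimal Python sketch of a textbook global alignment score (Needleman-Wunsch dynamic programming). It is not how SOAP3-DP works internally; the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen purely for illustration.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the best global alignment score of sequences a and b.

    Textbook Needleman-Wunsch recurrence, keeping only two rows of the
    dynamic programming table to limit memory use.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Even this simple version needs time proportional to the product of the two sequence lengths, which hints at why production aligners rely on indexing and accelerators for reads counted in the billions.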

SNP calling with GSNP is another tool. A single-nucleotide polymorphism is a DNA sequence variation occurring when a single nucleotide in the genome differs, the speaker explained. With the elapsed time of all steps included, GSNP is around 50x faster than the single-threaded CPU-based method.

Turning to the systems perspective, BingQiang Wang explained that BGI runs a large scale computing facility with a very complicated scenario. There are hundreds of end users, hundreds of analysis tools, and tens of analysis pipelines. Scripting is very popular.

The speaker also highlighted the issues of the current systems. There is an imbalance between compute and I/O, and BGI researchers are dealing with low utilization.
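As a rough illustration of what SNP calling means, the Python sketch below piles up already-aligned reads over a reference and reports positions where most reads disagree with the reference base. This is a naive majority vote, not the statistical model used by GSNP; the function name, the `(start, sequence)` read format, and the `min_depth` threshold are hypothetical.

```python
from collections import Counter


def call_snps(reference: str, reads, min_depth: int = 2):
    """Naive SNP caller over gap-free aligned reads.

    reads is an iterable of (start, sequence) pairs, where start is the
    0-based position of the read on the reference. Returns a list of
    (position, reference_base, observed_base) tuples.
    """
    # One base counter per reference position (the "pileup").
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to say anything
        base, support = counts.most_common(1)[0]
        # Call a SNP only if a strict majority supports a non-reference base.
        if base != reference[pos] and support * 2 > depth:
            snps.append((pos, reference[pos], base))
    return snps


if __name__ == "__main__":
    ref = "ACGTACGT"
    aligned = [(0, "ACTT"), (1, "CTTA"), (2, "TTAC")]
    print(call_snps(ref, aligned))
```

A real caller also has to weigh base qualities, mapping qualities and diploid genotypes, which is where most of the computational cost comes from.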

Compression, on the other hand, does not come for free. BGI uses domain optimized compression and heterogeneous accelerated compression and decompression by means of GPU. The researchers also use generic algorithms instead of specific ones. They try to fully exploit the characteristics of genomics data.
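One simple example of exploiting the characteristics of genomics data is that a nucleotide sequence uses only four letters, so each base fits in two bits rather than one byte. The Python sketch below shows that idea only; it is an illustrative assumption, not BGI's compression scheme, and it ignores ambiguous bases such as N and quality scores. The names `pack` and `unpack` are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a sequence over A/C/G/T into 2 bits per base (4 bases per byte)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Inverse of pack; length is the original number of bases."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


if __name__ == "__main__":
    packed = pack("GATTACA")
    print(len(packed), unpack(packed, 7))
```

The packed form is a quarter of the size of the plain text, but every read and write now pays for the bit manipulation, which is the sense in which compression is not free.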

For job characterization and scheduling, they submit a job to the BGI computing farm using the Sun Grid Engine but they need to specify CPU slots and memory usage, the speaker explained.
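For readers unfamiliar with Grid Engine, the sketch below assembles a `qsub` command line that requests CPU slots and memory. It is a hedged illustration, not BGI's actual setup: the parallel environment name `smp`, the use of the `h_vmem` resource, and the helper name `build_qsub_command` are assumptions, and both the environment name and the way memory limits are accounted depend on how a site has configured its cluster.

```python
def build_qsub_command(script: str, slots: int, mem_per_slot_gb: int,
                       pe_name: str = "smp", job_name: str = ""):
    """Build (but do not run) a Grid Engine qsub command as an argument list.

    pe_name is site-specific; "smp" is only a common convention.
    h_vmem is a standard Grid Engine resource, but whether it is enforced
    per slot depends on the cluster configuration.
    """
    if slots < 1 or mem_per_slot_gb < 1:
        raise ValueError("slots and mem_per_slot_gb must be positive")
    cmd = [
        "qsub",
        "-cwd",                              # run in the current directory
        "-pe", pe_name, str(slots),          # request CPU slots
        "-l", f"h_vmem={mem_per_slot_gb}G",  # request memory
    ]
    if job_name:
        cmd += ["-N", job_name]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # "align.sh" is a placeholder script name.
    print(" ".join(build_qsub_command("align.sh", slots=8, mem_per_slot_gb=4)))
```

Because users have to estimate these numbers in advance, over-requesting memory leaves cores idle, which is one plausible route to the low utilization mentioned earlier.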

To analyse the preliminary results, the researchers use a simulator to investigate the system.

BingQiang Wang concluded that the trend is evolving towards more threads and less memory but it is a fact that bioinformatics is memory hungry.

Leslie Versweyveld
