Back to Table of contents

Primeur weekly 2014-10-20

Special

High Performance Computing Centre in Stuttgart to focus on Cloudification ...

Cash flow calculations using Monte Carlo simulations ...

The Cloud

IBM and SAP partner to accelerate enterprise Cloud adoption ...

EuroFlash

European Commission to launch survey on EU-Brazil co-operation in the area of ICT - Work Programme 2016-2017 ...

Big Data Value Association and European Commission sign Big Data Public Private Partnership ...

New forecasting method: Predicting extreme floods in the Andes mountains ...

Symposium on HPC and Data-Intensive Applications in Earth Sciences: Challenges and Opportunities@ICTP ...

Jülich renews co-operation with Oak Ridge National Laboratory ...

Johannes Gutenberg University Mainz joins Germany's Gauss-Allianz as a full member ...

PRACE supports HPC for Health ...

HP and VMware dramatically simplify and accelerate the delivery of software-defined infrastructure services with EVO: RAIL ...

Supermicro highlights VMware EVO: RAIL, FatTwin Virtual SAN Ready nodes and NVIDIA GRID vGPU SuperServer at VMworld Barcelona ...

Calling on universities to submit their HPCAC-ISC 2015 Student Cluster Competition application ...

PRACE SHAPE and NSilico pool HPC knowledge to develop faster sequencing methods ...

A novel platform for future spintronic technologies ...

Future computers could be built from magnetic 'tornadoes' ...

USFlash

New SGI UV for SAP HANA enables real-time business for large enterprises ...

Cray adds new advanced analytics solution to its Big Data product portfolio ...

Australian teams set new records for silicon quantum computing ...

IBM announces first commercial application of IBM Watson in Africa ...

Among DOE supercomputing facilities, NERSC is at the forefront of data management and storage innovations ...

SC14 announces ACM/IEEE-CS George Michael HPC Fellowships ...

Fujitsu partners with Singapore to set up Centre of Excellence for sustainable urbanisation ...

UC Santa Cruz leads $11 million Center for Big Data in Translational Genomics ...

Australian volcanic mystery explained ...

Supermicro enhances VMware EVO: RAIL and Virtual SAN offerings with Nexenta ...

Dispelling a misconception about Mg-ion batteries ...

UC Santa Cruz leads $11 million Center for Big Data in Translational Genomics

9 Oct 2014 Santa Cruz - The National Institutes of Health (NIH) has awarded $11 million to UC Santa Cruz to create the technical infrastructure needed for the broad application of genomics in medicine and biomedical research. This grant from the National Human Genome Research Institute (NHGRI) funds the Center for Big Data in Translational Genomics, a multi-institutional partnership based at UC Santa Cruz and led by David Haussler, professor of biomolecular engineering and director of the UC Santa Cruz Genomics Institute.

According to David Haussler, the centre's overarching goal is to help the biomedical community use genomic information to better understand human health and disease. To do this, scientists must be able to share and analyze genomic datasets that are orders of magnitude larger than those that can be handled by the existing infrastructure. Advances in DNA sequencing technology have made it increasingly affordable to sequence a person's entire genome, but managing genomic and related data from millions of individuals is a daunting challenge.

"Sequencing technology has run ahead of our ability to handle the data. We need to rework the informatics systems and the way we represent and handle genomic data", David Haussler stated.

At least half of all diseases have a substantial genomic component. Only by studying the genomes and related information from very large numbers of individuals will scientists have the statistical power to discover and understand the contribution to disease of individually rare but collectively common genetic variants, David Haussler said.

"It's hard for people to appreciate the size of these datasets. If you're talking about a million genomes, it's a stunning amount of data, and it's very difficult to move these large datasets, even over optical fiber", he stated.

David Haussler and his team at UC Santa Cruz have extensive experience managing large amounts of genomic data. Charged with creating a repository for The Cancer Genome Atlas and other large projects for the National Cancer Institute, they built the UCSC Cancer Genomics Hub (CGHub), the largest public database of cancer genome sequences in the world. CGHub was the first "NIH Trusted Partner" authorized to distribute genome sequence data to biomedical researchers. It currently holds more than 1.5 petabytes of data - 1,675,348 gigabytes, at the latest count. David Haussler's team also created the UCSC Genome Browser, the most popular web portal for accessing human DNA data.

For the new centre, David Haussler has teamed up with other leading experts in genomics and data science, including principal investigators Laura van 't Veer, director of applied genomics at the UCSF Helen Diller Family Comprehensive Cancer Center, and David Patterson, professor of computer science at UC Berkeley. Other partners include researchers at Wellcome Trust Sanger Institute, Sage Bionetworks, Oregon Health and Science University, California Institute of Technology, the Ontario Institute for Cancer Research, King's College London, and McGill University.

The Center for Big Data in Translational Genomics will develop new protocols and tools for genomic data and test them in four pilot projects. According to David Haussler, the genomics community must develop a standard, globally accepted set of specialized Internet protocols for handling genomic data efficiently. "It turns out that genomic information is quite complicated, so it's a massive undertaking, and we're very excited about building this new infrastructure", he stated.

The pilot projects will not only benefit from the technical infrastructure developed by the centre, but will also help guide the development of that infrastructure by providing essential feedback. These projects include the UK10K project to identify rare genetic changes with harmful phenotypic consequences, led by team member Richard Durbin of the Sanger Institute; the International Cancer Genome Consortium's 2,000 tumor pan-cancer analysis project, co-led by team members Josh Stuart at UC Santa Cruz, Lincoln Stein at the Ontario Institute for Cancer Research, and others; the I-SPY 2 adaptive breast cancer trial, co-led by PI van 't Veer at UCSF; and the Beat AML leukemia therapy project, led by team member Brian Druker at Oregon Health and Science University.

Three of the four projects are cancer-related, not because other disease areas are considered less critical, but because cancer genomics is progressing unusually rapidly and represents a high-water mark for the representation and analysis of genomic information and its translation into clinical practice, David Haussler said. "If you can build general informatics infrastructure for genomics in cancer, with thousands of potential driver mutations and more than 1,000 targeted treatment compounds in the current drug development pipelines, then this general infrastructure will be adaptable to other disease areas without needing to be scaled up", he stated.

Ultimately, the centre aims to extend the platforms developed for genomic research into regular clinical practice. Analyzing the genomic information from individual patients is potentially an extremely powerful clinical tool, and the use of genomics in clinical practice could dramatically increase the amount of genomic data available for study.

According to David Haussler, however, changes are needed not only to the technical infrastructure, but also to the "social infrastructure" related to the sharing of genomic data. "We need to develop the legal, ethical, and social organisation of shared consent so that we can share and learn from DNA sequences without threatening the privacy of individuals", he stated.

This is one of the goals of a new international non-profit alliance Haussler cofounded, the Global Alliance for Genomics and Health, which now includes nearly 200 of the world's largest medical centres, patient advocacy groups, and research institutions. "Right now, genomics data is being siloed away in the databases of individual medical centres. Only a tiny portion is being shared", David Haussler stated. "We want to create a new digital commons for responsible and confidential sharing of genomic and clinical data."

The Center for Big Data in Translational Genomics is one of several centres that have been funded by the NIH Big Data to Knowledge (BD2K) initiative, and it is the only BD2K centre focused on genomics. The goal of the BD2K initiative is to develop innovative and transforming approaches that make Big Data and data science a prominent component of biomedical research. Centres focused on different Big Data areas will work together to achieve this goal.
Source: University of California - Santa Cruz
