Primeur weekly 2018-07-03

Quantum computing

Scientists pump up chances for quantum computing ...

CEA uses Atos simulator at the CCRT to explore the potential of quantum computing in industry ...

Focus on Europe

European parliament votes in favour of EuroHPC JU ...

European Commission welcomes Parliament vote on plans to establish EuroHPC JU ...

Report European Investment Bank: Financing the future of supercomputing - How to increase investments in high performance computing in Europe ...

Looking back on a successful PRACEdays18 ...

Middleware

Aerodynamic science reveals the best position in a peloton of 121 cyclists and calculates unexpected drag reduction for the athletes thanks to the largest simulation ever done in sport ...

Hardware

WatermelonBlock Partners with IBM Watson using its Supercomputer to let crypto investors know what the market is thinking ...

Tech Automotive Leaders Join Forces on Next-Generation In-Vehicle Networking Technologies for Autonomous and Connected Vehicles ...

Chayora Celebrates Major Milestone as It Nears Completion of Phase 1 of Its First Facility at Its Tianjin Hyperscale Data Centre Campus in China ...

Indian Institute of Technology Bombay deploys Cray to Power Research and Education ...

Lisa Compute Cluster expanded with GPUs for machine learning ...

Mellanox announces agreement with Starboard ...

Applications

Simulation for grid transmission and distribution ...

Futuristic data storage ...

GA4GH streaming API htsget a bridge to the future for modern genomic data processing ...

Department of Energy taps Argonne to lead effort focused on energy-water systems ...

Researchers discover new enzyme paradigm for critical reaction in converting lignin to useful products ...

SenseTime debuts in Singapore by signing MOU with local giants NTU, NSCC and Singtel ...

New simulations break down potential impact of a major quake by building location and size ...

AI learns the art of debate ...

Tiny sensors may help avert earthquake damage, track sonar danger, "listen" to pipelines ...

BNAs improve performance of Li-ion batteries ...

RUDN chemists have completely changed the direction of Diels-Alder reaction ...

UMass Amherst geoscientists offer new evidence for how the Adirondack Mountains formed ...

A galactic test will clarify the existence of dark matter ...

The Cloud

HPE accelerates data insight and action across the enterprise with latest Edgeline capabilities ...

Hewlett Packard Enterprise commits $4 billion to accelerate the Intelligent Edge ...

GA4GH streaming API htsget a bridge to the future for modern genomic data processing

22 Jun 2018 HINXTON - The Large Scale Genomics Work Stream of the Global Alliance for Genomics and Health (GA4GH) has announced eight new implementations of its htsget protocol, a standard released in October 2017 for accessing large-scale genomic sequencing data online without using file transfers. The protocol and interoperability testing are reported in a paper released online this week in the journal Bioinformatics.

The cornerstone of solving common diseases such as cancer and diabetes is to be able to compare the genome sequences of thousands of individuals to identify recurring genetic variants. Since no single institution can amass such a dataset on its own, it is critical for organizations to share information across traditional boundaries.

Historically, this has been done through the use of standardised file formats: a file generated at one institution can be downloaded and integrated with files at another institution because they use the same format.

This has worked well since the late 2000s, when these formats were developed as part of the international 1000 Genomes Project, and they have enabled a global ecosystem of interoperable sequence analysis tools and pipelines.

But the field is changing. Genomics is shifting from a research endeavor to one more broadly implemented in routine clinical care; datasets will be so large that the current model of institutionally siloed file systems will not be sufficient to enable global sharing and collaboration.

"Datasets containing hundreds of millions -- rather than hundreds of thousands -- of sequences will be available within the next five years and sharing files of that size is simply not realistic," said Ewan Birney, Director of EMBL-EBI and Chair of GA4GH. "Users would have to download terabyte-sized files just to access data on a small subset of the genome sequence."

At the same time, the world is changing -- from film to financial data, myriad domains are shifting from traditional file-based approaches for storing and processing data to more modern, big-data, cloud-based approaches. Genomics will have to follow suit, but without sacrificing the current standards that make data interoperable.

"We are not attempting to replace the existing file formats," said Thomas Keane, Team Leader of EGA and the Archive Infrastructure at EMBL-EBI and co-chair of the GA4GH Large Scale Genomics Work Stream and its htsget task team. "Doing so would require adaptation of every single bioinformatics tool for processing data that is currently compatible with those formats."

Instead, htsget provides a consistent protocol for researchers to access data stored in different repositories -- whether based in big public clouds or in more traditional infrastructure. It also includes a robust security and authentication mechanism, which is key for sensitive data.

It can be operated efficiently for very large datasets, and, because it uses the existing standards for transmitting data, it can be readily integrated into current pipelines and analytical methods. Users can employ htsget to download only the subsection of a genome sequence in which they are interested rather than the whole file, or they can download the entire genome as a series of "data slices" distributed across multiple disparate machines.
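As a rough illustration of the "data slices" idea, the Python sketch below shows how a client might request one region and reassemble the result. It assumes the htsget convention of a `reads` endpoint that answers with a JSON "ticket" listing block URLs, where each block may carry its own request headers (for example an authorization token). The server address, sample identifier, and the function names `build_htsget_url` and `assemble_slices` are hypothetical, and the `fetch` callable stands in for whatever HTTP client a pipeline already uses.

```python
import base64
from urllib.parse import quote, urlencode


def build_htsget_url(base_url, record_id, reference_name=None,
                     start=None, end=None, fmt="BAM"):
    """Build a reads request for one genomic region.

    start is 0-based inclusive and end is exclusive, as in htsget.
    """
    params = {"format": fmt}
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    path = quote(record_id, safe="")
    return f"{base_url.rstrip('/')}/reads/{path}?{urlencode(params)}"


def assemble_slices(ticket, fetch):
    """Concatenate, in order, the data blocks listed in a ticket.

    fetch(url, headers) is supplied by the caller and must return bytes;
    it is an assumption of this sketch, not part of any library.
    """
    chunks = []
    for entry in ticket["htsget"]["urls"]:
        url = entry["url"]
        if url.startswith("data:"):
            # Inline block; this sketch only handles base64 data URIs.
            _, payload = url.split(",", 1)
            chunks.append(base64.b64decode(payload))
        else:
            # Remote block; headers may carry credentials for this block.
            chunks.append(fetch(url, entry.get("headers", {})))
    return b"".join(chunks)


# Hypothetical ticket, shaped like a server response for a small region.
example_ticket = {
    "htsget": {
        "format": "BAM",
        "urls": [
            {"url": "data:application/octet-stream;base64,QUJD"},
            {"url": "https://example.org/block/1",
             "headers": {"Authorization": "Bearer TOKEN"}},
        ],
    }
}
```

Because every block is an ordinary URL, the blocks could also be fetched in parallel or by different machines, as long as they are concatenated in the order the ticket lists them. The result is a file in the existing format, so downstream tools need no changes.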

"We've thought of this as a bridge to the future," said Mike Lin, specification maintainer for the GA4GH htsget team. "It's a gradual path to upgrade current file-based pipelines and repositories to a more interoperable, API-based architecture -- which has always been a foundational vision of GA4GH."

Lin will lead a webinar introducing the protocol and answering questions about implementation for the broad community on July 24. Anyone interested in learning more about htsget and how to implement it in their bioinformatics pipelines is invited to attend.

Source: Global Alliance for Genomics and Health
