Primeur weekly 2017-02-20

Focus

HPC expert Genias Benelux to show its skillful expertise in brand-new website ...

Are billion Euro Flagships the right way to finance innovative areas like graphene, human brain research and quantum computing? ...

Exascale supercomputing

Advanced fusion code led by PPPL selected to participate in Early Science Programmes on three new DOE Office of Science pre-exascale supercomputers ...

Focus on Europe

From robotics to particle physics: Data analytics gets the spotlight in Distinguished Talk series at ISC 2017 ...

A new spin on electronics ...

Data mining tools for personalized cancer treatment ...

Why host HPC in Iceland to tackle Big Data for life sciences at Earlham Institute ...

Biological experiments become transparent - anywhere, any time ...

Middleware

IBM delivers new platform to help clients address storage challenges at massive scale ...

Hewlett Packard Enterprise unveils most significant 3PAR Flash storage innovations to date ...

Hardware

Tokyo Institute of Technology partners with DDN on Tsubame3.0 to build forward-looking AI and Big Data computing infrastructure ...

Mellanox demonstrates four times improvement in crypto performance with Innova IPsec 40G Ethernet network adapter ...

Supermicro launches BigTwin - the industry's highest performing Twin multi-node system supporting the full range of CPUs, maximum memory and all-flash NVMe ...

Applications

Researchers catch extreme waves with higher-resolution modelling ...

Researchers are creating software to 'clean' large datasets, making it easier for scientists and the public to use Big Data ...

Designing new materials from 'small' data ...

Success by deception ...

DNA computer brings 'intelligent drugs' a step closer ...

'Lossless' metamaterial could boost efficiency of lasers and other light-based devices ...

Perimeter Institute researchers apply machine learning to condensed matter physics ...

When treating brain aneurysms, two isn't always better than one ...

Real-time MRI analysis powered by supercomputers ...

Analyzing data for transportation systems using TACC's Rustler, XSEDE ECSS support ...

NCSA facilitates performance comparisons with China's nr. 1 supercomputer ...

IBM delivers Watson for cyber security to power cognitive security operations centres ...

The Cloud

Optimizing data centre placement and network design to strengthen Cloud computing ...

Dutch start-up solution impacts data centres ...

OpenFog Consortium releases landmark reference architecture for Fog computing ...

IBM brings machine learning to the private Cloud ...

IBM accelerates hybrid Cloud adoption by enabling channel partners to offer VMware solutions ...

Oracle launches Cloud service to help organisations integrate disparate data and drive real-time analytics ...

Researchers are creating software to 'clean' large datasets, making it easier for scientists and the public to use Big Data

16 Feb 2017 Buffalo - Like a teenager's bedroom, Big Data is often messy. Malfunctioning computers, data entry errors and other hard-to-spot problems can skew datasets and mislead people - everyone from data scientists to data hobbyists - trying to draw conclusions from raw data. Vizier, a software tool under development by a University at Buffalo (UB)-led research team, aims to proactively catch those errors.

The project, backed by a $2.7 million National Science Foundation grant, launched in January. Like Excel and other spreadsheet software, Vizier will allow users to interactively work with datasets. For example, it will help people explore, clean, curate and visualize data in meaningful ways, as well as spot errors and offer solutions.

But unlike spreadsheet software, Vizier is intended for much larger datasets; it will be used to examine millions or billions of data points, as opposed to the hundreds or thousands typically entered into spreadsheet software.

"We are creating a tool that'll let you work with the data you have, and also unobtrusively make helpful observations like 'Hmm, have you noticed that two out of a million records make a 10 percent difference in this average?'" stated Oliver Kennedy, PhD, assistant professor of computer science and engineering at UB, and the grant's principal investigator.
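To make the arithmetic behind that remark concrete, here is a minimal sketch in Python of how two records out of a million can move an average by 10 percent. It is not Vizier's method or API; the function name `mean_shift_without`, the record values and the suspect positions are all hypothetical, chosen only for illustration.

```python
import math


def mean_shift_without(values, suspect_indices):
    """Relative change in the mean attributable to the suspect records.

    Hypothetical helper for illustration only; not part of Vizier.
    """
    suspects = set(suspect_indices)
    total = math.fsum(values)
    suspect_total = math.fsum(values[i] for i in suspects)
    mean_all = total / len(values)
    mean_rest = (total - suspect_total) / (len(values) - len(suspects))
    return (mean_all - mean_rest) / mean_rest


# Assumed data: one million records of 100.0, two of which were mis-entered.
records = [100.0] * 1_000_000
records[10] = 5_000_100.0
records[20] = 5_000_100.0

shift = mean_shift_without(records, [10, 20])
print(f"Two records shift the average by {shift:.1%}")
```

With these assumed values, the average over all records is 110, while the average without the two suspect records is 100, so the two entries account for a 10 percent difference.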

Co-principal investigators include Juliana Freire, professor of computer science and engineering at New York University, and Boris Glavic, assistant professor in the Department of Computer Science at the Illinois Institute of Technology. The award is from NSF's Data Infrastructure Building Blocks (DIBBs) programme.

For years, companies like Google, Microsoft and Apple have utilized Big Data to improve their products and services. That same power is now spreading to the masses as government agencies in the United States and elsewhere publish massive amounts of public data on the internet.

For example, New York City and the federal government have open data portals making it possible for anyone with an internet connection to download information and ask questions about their government. When properly used, these portals can shed light on issues relating to health code violations, discrimination, bias and other matters, Kennedy said. Vizier will be released as free, open-source software.

"We want to make it easier for data scientists - and eventually data hobbyists - to discover and communicate not only what the data says, but why the data says that", he stated.

Source: University at Buffalo
