The keynote at the conference was given by Peter Coveney of University College London. He is a heavy supercomputer user and active in many UK and European eScience projects. Key to making supercomputing, Big Data and other resources available to scientists are workflows that provide easy access. With such workflows, ensemble molecular dynamics can be done and multi-scale physics, using a variety of resources, becomes feasible. When you need to analyse big scientific data, you often need big computers; hence it is useful that a number of EUDAT Big Data centres are also supercomputer centres. Even though supercomputers are expensive, their cost is chicken feed compared to that of the really large scientific instruments, such as the Square Kilometre Array. Peter Coveney does not understand why, on a European scale, we have separate infrastructures for Big Data, HPC and supercomputing.
Big Data over the Internet? The really Big Data can be found at scientific instruments, said Rob van Nieuwpoort in his presentation. The LOFAR telescope, for instance, produces 20 Tbit/s of raw data, while the Amsterdam Internet Exchange (AMS-IX) carries only about 1.5 Tbit/s. Especially in astronomy, Big Data gets so large that you cannot store the raw data; different algorithms are needed that process the data as a stream. Often "old" algorithms are used that were replaced decades ago by more efficient ones, because those more efficient algorithms unfortunately require the data to be in memory.
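To illustrate the streaming-versus-in-memory trade-off Rob van Nieuwpoort describes, here is a generic sketch (not LOFAR's actual pipeline) of Welford's classic online algorithm, which computes the mean and variance of a stream in a single pass with constant memory, where the textbook "efficient" approach would first load all samples into memory:

```python
import math

def streaming_mean_var(samples):
    """Welford's online algorithm: one pass, O(1) memory.

    Works on any iterable, including an unbounded stream that
    could never fit in memory.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in samples:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    # population variance; NaN for an empty stream
    return mean, (m2 / n if n else float("nan"))
```

The function never holds more than one sample at a time, so it can be fed directly from an instrument's data stream; the price is a slightly more involved update rule than the two-pass in-memory formula.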
Next, Willem Bouten explained e-Ecology, which is currently at the level of data-driven statistical science. Ecological systems, too, are simply too complex for full simulations.
Piek Vossen explained next that sensor data is not actually raw data: it is already an interpretation. In the humanities it is even worse: researchers have their own, often subjective, interpretation of the data. Surprisingly, that does not hinder the uptake of Big Data analytics techniques, said Piek Vossen. A subjective interpretation of the data is not a problem as long as the interpretations can be formalised and compared. He works on a project called NewsReader ( http://www.newsreader-project.eu/ ) that aims to prove the usefulness of Big Data analytics by collecting and interpreting all the news in the world: millions of items each day. On a local university cluster at the VU Amsterdam, processing one day of news would take some 15 years.
The Netherlands Forensic Institute ("Dutch CSI") needs to process terabytes of digital media data within an hour. That time constraint is the challenge, explained Erwin van Eijk; it follows from legal requirements: the judge needs the results. Provenance, access to the data and interpretation of the data all add further complexity.
What does the average Dutch man or woman look like? Look at the genes, of course. Swertz explained how they created an "ultra-sharp genetic group portrait of the Dutch". They used a light path to transport data between Groningen, where the data repository is, and Amsterdam, where the computing infrastructure is.
Do you believe the weather forecast? It is fine when you live in a rural area, but cities contain micro-climates that can be rather different. Bert Holtslag is working on a project, "Summer in the City", to measure and model these micro-climates. He started in his home town of Wageningen; next year he will model Amsterdam.
How does Bert Holtslag collect measurements in a city? Typically Dutch: with a bicycle packed with measurement equipment.