
Primeur weekly 2014-08-18

Special

HPC computing based on the working of the human brain ...

Exascale supercomputing

NERSC, Intel, and Cray team up to prepare users for transition to exascale computing ...

The Cloud

Made in IBM Labs: Governing geographically dispersed Cloud data ...

IBM acquires Cloud security services provider Lighthouse Security Group ...

IBM opens first SoftLayer data centre in Canada ...

EuroFlash

European Commission to launch public survey on Net Innovation for the Work Programme 2016-2017 ...

SysFera joins ETP4HPC ...

Iberdrola starts work on construction of 66 megawatt Pier II wind farm in Mexico ...

Quantum simulators explained ...

USFlash

University of Michigan computer scientist reviews frontier technologies to determine fundamental limits of computer scaling ...

Finalists compete for coveted ACM Gordon Bell Prize in High Performance Computing ...

Florida Polytechnic University, Flagship Solutions Group and IBM announce supercomputing centre ...

Vancouver Film School selects HP Z workstations with AMD FirePro professional graphics ...

TACC's Stampede and Lonestar systems aid microbiome research of gum disease, diabetes, and Crohn's disease ...

CacheBox application acceleration software increases on-line transactions per second by 56x while reducing latency by 98% ...

Data-visualization tool identifies sources of aberrant results and recomputes visualizations without them ...

Penguin Computing's new line of Arctica Ethernet switches featuring x86 application compatibility now available ...

Altera releases Quartus II software Arria 10 edition v14.0 ...

IBM to receive approval for selling x86 supercomputer business to Lenovo ...

HPC computing based on the working of the human brain


26 Jun 2014 Leipzig - The closing keynote at ISC'14 in Leipzig, Germany, was presented by Karlheinz Meier of the Ruprecht-Karls University of Heidelberg. He talked about the achievements and challenges of brain-derived computing beyond von Neumann: he showed a few remarkable efforts in current high performance computing, compared the working of the human brain with the processing of HPC systems, and tried to figure out how the two can support each other in order to build more energy-efficient, yet more powerful HPC architectures.

Karlheinz Meier went back to the beginning of HPC history with Konrad Zuse's Z3, which had disjoint programme and data memories, and the Harvard architecture developed by Howard Aiken at Harvard with IBM, likewise built on separate instruction and data memories.

At present, machines are highly modular, engineering friendly, easily mapped to programming models, and theoretically sound. The Blue Brain Project in Lausanne, however, is different: it is rather uniform, has no separation of memory and computing, is not programmed, and has no established theory. As such, it is not a von Neumann type of machine, according to Karlheinz Meier.

The individual cells in the brain are spatially separate objects, communicating through action potentials, as discovered by Julius Bernstein and Santiago Ramon y Cajal. There is interaction over a distance, and spatial and temporal integration.

There are sparse, stereotypic pulses or spikes with very efficient coding of Shannon information: rare events carry a lot of information.

This extreme piece of matter, the human brain, has 10^15 connections and 10^11 cells.

It is in constant interaction with the environment, stochastic, far from equilibrium, and has dynamic long-range and short-range interactions.

Researchers have tried to put models of the brain into a computer, explained Karlheinz Meier. One such simulation, performed by Markus Diesmann of the Human Brain Project, runs on the K computer at the RIKEN lab.

How does the compute time behave? It is pretty flat, according to the results.

You can do neuroscience on a computer but this is not what we want to do: we want to do neuroscience to build a computer. This is the other way round, the speaker pointed out.

A biological cell-level simulation has been performed in the Blue Brain Project: a cortical column with 10.000 detailed, reconstructed neurons. It runs 1000 times slower than biology, at an energy ratio of 1:10^14. These simulations do not run in real time.

Researchers have also tried simulating the brain of a mouse. To picture the factor of 10^14, Karlheinz Meier compared the size of a mouse with the size of the solar system: the ratio is again 1:10^14.

When looking at the energy scales, the energy used for a synaptic transmission in biology versus in simulation shows an enormous gap: a 10 to 14 orders of magnitude difference for "the same thing", according to Karlheinz Meier.

The energy efficiency is amazing.

How much does a neural computation cost?

From the top down: the human brain has a total power of 20 W, equally shared. 100 billion neurons firing at 1 Hz gives 10^-10 Joule per action potential; 10^15 synapses transmitting at 1 Hz gives 10^-14 Joule per synaptic transmission.

From the bottom up: approximately 10^9 ATP molecules are hydrolyzed for an action potential and approximately 10^5 ATP molecules for a synaptic transmission, according to D. Attwell and S. B. Laughlin.

One ATP molecule yields about 10^-19 Joule, or approximately 1 eV, according to Dennis Bray in "Cell Movements", New York: Garland, 1992.

So one ends up with 10^-10 Joule (100.000 fJ = 0.1 nJ) per action potential and 10^-14 Joule (10 fJ) per synaptic transmission.
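The top-down and bottom-up estimates agree. As a sketch of the arithmetic, assuming the 20 W are split evenly between neurons and synapses as stated above:

\begin{align*}
E_{\mathrm{AP}} &\approx \frac{10\,\mathrm{W}}{10^{11} \times 1\,\mathrm{Hz}} = 10^{-10}\,\mathrm{J} \approx 10^{9}\ \mathrm{ATP} \times 10^{-19}\,\mathrm{J} \\
E_{\mathrm{syn}} &\approx \frac{10\,\mathrm{W}}{10^{15} \times 1\,\mathrm{Hz}} = 10^{-14}\,\mathrm{J} \approx 10^{5}\ \mathrm{ATP} \times 10^{-19}\,\mathrm{J}
\end{align*}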

If we consider electronics versus biology at the device level, there is not a big difference. Switching a MOS element costs approximately 1 fJ; a synaptic transmission costs about 10 fJ. So we are talking about 10 low-tech CMOS transistors here, explained Karlheinz Meier. In the end, it is not the devices but the architecture and the computational model that count.

It takes about 15 years to wire up the adolescent brain during development. From age 5 to 15, one loses gray matter volume. The brain develops and the wiring process evolves, which is called plasticity.

This development happens in years in nature, but in simulation it would take 1000 years. There is an enormous gap again, Karlheinz Meier showed.

Over 80 years of normal aging, there is volume loss in the lateral prefrontal cortex, the region involved in planning complex cognitive behaviour, personality expression, decision making, and moderating social behaviour. The loss is enormous.

For the crustacean stomatogastric ganglion of the lobster, there are 400.000 degenerate solutions for a model control network with 17 cell parameters. So there is variability.

In neuromorphic computing, the neural cell is an entity on a silicon substrate with no global synchronization. Neuromorphic systems are energy efficient, fault tolerant, self-organized, fast and compact, and might solve the energy problem, reliability problem, software problem, simulation time problem, and size problem of traditional computers, according to the speaker.

In the traditional role model, the state-of-the-art devices are replaced generation after generation, but the memory-processor architecture (von Neumann) is maintained, as are its universality and its scaling.

John von Neumann and Robert Oppenheimer worked on the architecture. Computers are universal devices.

There is a rapidly growing number of projects in the European Union and the US, said Karlheinz Meier. There are five complementary approaches to neuromorphic computing:

1. commodity microprocessors, in SpiNNaker and the Human Brain Project (HBP), with soft binary code

2. custom, fully digital hardware, by IBM Almaden, with hard binary code

3. custom mixed-signal hardware, in BrainScaleS and the HBP, with an accelerated physical model

4. custom subthreshold analogue cells, by Stanford, with a real-time physical model

5. custom hybrid hardware, by Qualcomm, with a hybrid model

The complementarity of the approaches is essential. All are massively parallel, communicate asynchronously, and are configurable.

The communication is the challenge, according to Karlheinz Meier.

There are some 10.000 inputs into each cortical neuron; with 10^11 neurons, this corresponds to the 10^15 synaptic connections in the human brain.
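Spelled out, the fan-in arithmetic is:

\[
10^{4}\ \mathrm{inputs/neuron} \times 10^{11}\ \mathrm{neurons} = 10^{15}\ \mathrm{synapses}
\]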

Karlheinz Meier showed models with discrete time and discrete signals, with continuous time and discrete signals, and with continuous time and continuous signals, as different ways to implement the communication.

The SpiNNaker project in the UK has 18 ARM968 cores per chip, uses integer arithmetic and a 200 MHz processor clock, shared system RAM on the die, and 128 Mbyte of DRAM stacked on the die. Each chip has 6 bi-directional links carrying 6 million spikes/s/link. It is a real-time simulator.

As for the connectivity, it is a small-packet-based communication network, with toroidal link geometry, a maximum of 3 routings between any pair of nodes, and 6 million spikes per second per bi-directional link.

The system of the IBM Almaden group is fully custom, with a fully digital design. It exploits economy of scale, based on a simple LIF (leaky integrate-and-fire) neuron model with 1-bit crossbar synapses. It shows plasticity. There is a large effort towards usability, according to Karlheinz Meier: Corelet is a language for composing networks of TrueNorth neurosynaptic cores.
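To make the LIF model concrete, here is a minimal leaky integrate-and-fire sketch in Python; the parameter values are illustrative and not those of the IBM design:

    # Minimal leaky integrate-and-fire (LIF) neuron; illustrative parameters only.
    dt = 1e-4          # integration time step: 0.1 ms
    tau_m = 20e-3      # membrane time constant: 20 ms
    v_rest = 0.0       # resting potential (normalized units)
    v_thresh = 1.0     # firing threshold
    v_reset = 0.0      # potential right after a spike

    def simulate_lif(input_current, n_steps):
        """Integrate dv/dt = (-(v - v_rest) + input_current) / tau_m, collect spikes."""
        v = v_rest
        spike_times = []
        for step in range(n_steps):
            v += dt * (-(v - v_rest) + input_current) / tau_m
            if v >= v_thresh:              # threshold crossing emits a spike
                spike_times.append(step * dt)
                v = v_reset                # the membrane is reset, the leak starts over
        return spike_times

    # A constant suprathreshold input yields a regular spike train.
    print(simulate_lif(input_current=1.5, n_steps=5000))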

The BrainScaleS group has developed a physical model, with analogue local computing and binary continuous-time communication. It has wafer-scale integration of 200.000 neurons and 50.000.000 synapses. There is short-term and long-term plasticity, 10.000 times faster than real time. There is a separation of neural computing from monitoring and control. It is a multi-scale circuit with plastic synapses, a high input count, and network chips.

The Boahen group at Stanford has developed a system of analogue neural cell bodies (somas) with transistors operating in the low-power subthreshold range. It uses 3 W for 1 million somas. All network connectivity is virtualized off-chip, which is not energy efficient. It has biologically realistic membrane responses. It is a real-time modelling system.

Qualcomm is working on Neural Processing Units, a new class of processors mimicking human perception and cognition. They are massively parallel and reprogrammable, and come with comprehensive tools for human-like functions.

Next, Karlheinz Meier described the software for this hardware with the following considerations: generic network description, configuration databases, place and route, verification, executable system specification, experiment control, visualization, and data analysis or reduction.

Examples of ongoing unifying approaches are NENGO from the University of Waterloo, PyNN from the FACETS project, TrueNorth from IBM, MUSIC from INCF, and the HBP Cockpit.

What to do with it, asked Karlheinz Meier. He cited four principles:

1. basic dynamical properties of isolated cells and circuits with cell firing patterns, synchronisation, stability and order or chaos

2. biologically realistic, reverse engineered circuits in closed loops with small brains, cortical structures, cortical columns, and functional units

3. implement and test fundamental, generic concepts and theories with liquid computing, probabilistic inference, and Boltzmann machines

4. generic neuromorphic computing outside neuroscience with neuromorphic controllers, spatio-temporal pattern detection in data streams, causal relations in Big Data, and approximate computing.

He showed the 3-layer spiking neuron network derived from the insect olfactory system. Layer one has receptor neurons, layer two performs decorrelation through lateral inhibition, and layer three performs association, or soft WTA (winner-take-all), through strong inhibitory populations. He also showed the neuromorphic network activity before and after training.
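The soft winner-take-all computed by such an inhibitory population is often modelled as a softmax over the inputs. A minimal sketch in Python, purely illustrative and not the spiking network from the talk:

    import math

    def soft_wta(inputs, beta=5.0):
        """Softmax: shared inhibition normalizes the responses so the
        strongest input dominates without silencing the others completely."""
        exps = [math.exp(beta * x) for x in inputs]
        total = sum(exps)
        return [e / total for e in exps]

    # The strongest input wins most of the activity ("soft" selection).
    print(soft_wta([1.0, 0.8, 0.3]))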

Karlheinz Meier returned to the energy used for a synaptic transmission. Neuromorphic hardware fills part of the gap: it is typically 10.000.000 times more energy efficient than state-of-the-art HPC, but still 10.000 times less efficient than biology.

On the time scales, development that takes 1 year in nature takes 1000 years in simulation, but only 3000 s in the accelerated model.
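The last figure follows directly from the 10.000 times acceleration of the physical model:

\[
\frac{1\,\mathrm{year}}{10^{4}} \approx \frac{3.15 \times 10^{7}\,\mathrm{s}}{10^{4}} \approx 3000\,\mathrm{s}
\]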

In a learning demonstration, 64 neuronal spiking signal sources provide the presynaptic input. The receiving postsynaptic neuron learns phase locking by STDP (spike-timing-dependent plasticity), according to Karlheinz Meier. One can run 100 emulations of 20 ms duration each.
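Pair-based STDP changes a synaptic weight according to the time difference between pre- and postsynaptic spikes. A minimal sketch in Python, with illustrative amplitudes and time constants:

    import math

    A_PLUS, A_MINUS = 0.01, 0.012        # learning amplitudes (illustrative)
    TAU_PLUS, TAU_MINUS = 20e-3, 20e-3   # STDP time constants: 20 ms

    def stdp_dw(t_pre, t_post):
        """Weight change for one pre/post spike pair (times in seconds)."""
        dt = t_post - t_pre
        if dt > 0:   # pre before post: causal pairing, potentiation
            return A_PLUS * math.exp(-dt / TAU_PLUS)
        else:        # post before pre: anti-causal pairing, depression
            return -A_MINUS * math.exp(dt / TAU_MINUS)

    # A pre spike 5 ms before the post spike strengthens the synapse;
    # the reverse order weakens it.
    print(stdp_dw(t_pre=0.010, t_post=0.015))
    print(stdp_dw(t_pre=0.015, t_post=0.010))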

In the Boltzmann machine with artificial spiking neurons, one can draw samples from a joint probability distribution. The network explores two modes with stochastic switches. Karlheinz Meier demonstrated sampling from a stationary probability distribution.
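To make "drawing samples from a joint distribution" concrete: a Boltzmann machine assigns each binary state a probability proportional to exp(s*W*s/2 + b*s), and Gibbs sampling visits states with exactly those probabilities. A minimal two-unit sketch in Python, not the spiking implementation from the talk; the couplings are chosen so that the network has two equally preferred modes:

    import math, random

    # Tiny Boltzmann machine over binary units s_i in {0, 1}.
    W = [[0.0, 2.0], [2.0, 0.0]]   # symmetric coupling: the units prefer to agree
    b = [-1.0, -1.0]               # biases

    def gibbs_step(s):
        """Resample each unit from its conditional distribution given the rest."""
        for i in range(len(s)):
            # local field: input from the other units plus the bias
            h = b[i] + sum(W[i][j] * s[j] for j in range(len(s)) if j != i)
            p_on = 1.0 / (1.0 + math.exp(-h))
            s[i] = 1 if random.random() < p_on else 0
        return s

    # A long run switches stochastically between the two modes (0,0) and (1,1),
    # sampling the stationary joint distribution.
    s, counts = [0, 0], {}
    for _ in range(20000):
        s = gibbs_step(s)
        key = tuple(s)
        counts[key] = counts.get(key, 0) + 1
    print(counts)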

The Human Brain Project is a co-ordinated effort to understand, improve and exploit the brain. The project selection took place in January 2013, with approval of a 30-month ramp-up. The starting date was October 1, 2013. The initial project size amounts to 80 partners and the initial EU contribution to 54 million euro. The framework partnership agreement, covering 7.5 years, is currently in preparation.

There are 6 ICT platforms in the Human Brain Project:

1. the neuroinformatics platform for aggregating neuroscience data and delivering brain atlases

2. the medical informatics platform for aggregating clinical records and classifying brain diseases

3. the brain simulation platform for developing software tools and running closed loop brain simulations

4. the HPC platform for developing and operating HPC systems optimized for brain simulations

5. the neuromorphic computing platform for developing and operating novel brain derived computing hardware

6. the neurorobotics platform for developing virtual robotic systems for closed loop cognitive experiments

There are also 2 complementary approaches in the HBP project. The first one is a many-core, clocked digital processor system with simplified ARM processors and address-based, small-packet, asynchronous communication, effectively running in biological real time. The second one is a physical model system with many analogue elements with physical time constants, and binary, asynchronous, continuous-time, high-density communication, effectively running at 10.000 times biological real time.

Is it computing for neuroscience or neuroscience for computing, Karlheinz Meier asked. As computational neuroscience simulators, neuromorphic systems are reliable, fast, efficient, and compact. Alternatively, neuromorphic systems provide the highest possible degree of biologically relevant complexity under user control.

The HBP project does not start from zero. There are two large systems: the HBP many-core system at the Manchester site and the HBP physical model system at the Heidelberg site, the latter with 4 million AdEx neurons, 1 billion conductance-based synapses, and a 10.000 times acceleration beyond real time.

Spikey in Heidelberg has 384 neurons and 100.000 plastic synapses while SpiNNaker in Manchester has 18-72 ARM cores, 36-144 million spikes per second in real time with various I/O options. Both plug into a laptop via USB.

Karlheinz Meier thought it was time for both of them to leave the lab.

He ended by stating that there is a consistent set of concepts for a brain-derived, non-von Neumann computer architecture. It is accessible to available device technologies but also an attractive application for future component technologies. The key features are universality, scalability, fault tolerance, power efficiency, speed, and learning. The accelerated operation is the only known approach to bridge all timescales relevant for circuit dynamics. The important next step is to give up bit-precise simulation as a reference, and to exploit the physical properties of the devices.

Leslie Versweyveld
