Preprint FZJ-2022-01560

Routing brain traffic through the von Neumann bottleneck: Efficient cache usage in spiking neural network simulation code on general purpose computers


2021
arXiv

arXiv: 2109.12855 [doi: 10.48550/ARXIV.2109.12855]

Please use a persistent id in citations: doi:10.48550/ARXIV.2109.12855

Abstract: Simulation is a third pillar next to experiment and theory in the study of complex dynamic systems such as biological neural networks. Contemporary brain-scale networks correspond to directed graphs of a few million nodes, each with an in-degree and out-degree of several thousand edges, where nodes and edges correspond to the fundamental biological units, neurons and synapses, respectively. For connectivity resembling a random graph, each node's edges are distributed across thousands of parallel processes. The activity in neuronal networks is also sparse: each neuron occasionally transmits a brief signal, called a spike, via its outgoing synapses to the corresponding target neurons. This spatial and temporal sparsity represents an inherent bottleneck for simulations on conventional computers: fundamentally irregular memory-access patterns cause poor cache utilization. Using an established neuronal network simulation code as a reference implementation, we investigate how common techniques to recover cache performance, such as software-induced prefetching and software pipelining, can benefit a real-world application. The algorithmic changes reduce simulation time by up to 50%. The study exemplifies that many-core systems assigned an intrinsically parallel computational problem can overcome the von Neumann bottleneck of conventional computer architectures.
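The abstract names software-induced prefetching as one technique for recovering cache performance when spike delivery produces irregular, effectively random memory accesses. The following is a minimal sketch of that idea, not the reference simulation code: it assumes a GCC/Clang toolchain (for the __builtin_prefetch builtin), and the types Neuron, deliver_spike, and the lookahead distance are illustrative inventions.

// Minimal sketch (illustrative, not the reference implementation):
// spike delivery as an indirect, irregular update over target neurons,
// with software-induced prefetching to overlap the expected cache misses.
#include <cstddef>
#include <vector>

struct Neuron {
  double input_current = 0.0;  // accumulated synaptic input
};

// Deliver one spike with weight `w` to all targets listed by index.
// The target indices are effectively random, so each access is likely
// a cache miss; prefetching a few iterations ahead hides part of the latency.
void deliver_spike(std::vector<Neuron>& neurons,
                   const std::vector<std::size_t>& targets,
                   double w) {
  constexpr std::size_t lookahead = 8;  // tuning parameter (assumption)
  for (std::size_t i = 0; i < targets.size(); ++i) {
    if (i + lookahead < targets.size()) {
      // GCC/Clang builtin; issues a non-binding prefetch hint for a future target.
      __builtin_prefetch(&neurons[targets[i + lookahead]], 1 /*rw: write*/, 1 /*locality*/);
    }
    neurons[targets[i]].input_current += w;
  }
}

The lookahead distance trades off how early a target's cache line is requested against the risk of evicting it again before use; in practice it would have to be tuned per architecture and per access pattern.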

Keyword(s): Distributed, Parallel, and Cluster Computing (cs.DC) ; FOS: Computer and information sciences


Contributing Institute(s):
  1. Computational and Systems Neuroscience (INM-6)
  2. Theoretical Neuroscience (IAS-6)
  3. Jara-Institut Brain structure-function relationships (INM-10)
Research Program(s):
  1. 5234 - Emerging NC Architectures (POF4-523) (POF4-523)
  2. HBP SGA2 - Human Brain Project Specific Grant Agreement 2 (785907) (785907)
  3. HBP SGA3 - Human Brain Project Specific Grant Agreement 3 (945539) (945539)
  4. DEEP-EST - DEEP - Extreme Scale Technologies (754304) (754304)
  5. ACA - Advanced Computing Architectures (SO-092) (SO-092)
  6. GRK 2416:  MultiSenses-MultiScales: Novel approaches to decipher neural processing in multisensory integration (368482240) (368482240)
  7. ATMLPP - ATML Parallel Performance (ATMLPP) (ATMLPP)

Appears in the scientific report 2022
Database coverage:
OpenAccess

The record appears in these collections:
Institute collections > INM > INM-10
Institute collections > IAS > IAS-6
Institute collections > INM > INM-6
Document types > Reports > Preprints
Workflow collections > Public records
Publications database
Open Access

Record created on 2022-03-11, last modified on 2025-03-14

