Master Thesis FZJ-2020-05175

Decomposition of Deep Neural Networks into Correlation Functions


2020

91 p. = Master's thesis, RWTH Aachen University, 2020

Abstract: Recent years have seen great success of deep neural networks. One active field of research investigates the working mechanisms of such networks, both with respect to network expressivity and to the information processing within the network. In this thesis, we describe the input-output mapping implemented by deep neural networks in terms of correlation functions. To trace the transformation of correlation functions within neural networks, we make use of methods from statistical physics. Using a quadratic approximation for non-linear activation functions, we obtain recursive relations in a perturbative manner by means of Feynman diagrams. Our results yield a characterization of the network as a non-linear mapping of mean and covariance, which can be extended by including corrections from higher-order correlations. Furthermore, re-expressing the training objective in terms of data correlations allows us to study their role for solutions to a given task. First, we investigate an adaptation of the XOR problem, in which case the solutions implemented by neural networks can largely be described in terms of the mean and covariance of each class. Second, we study the MNIST database as an example of a non-synthetic dataset. For MNIST, solutions based on empirical estimates of the mean and covariance of each class already capture a large amount of the variability within the dataset, but still exhibit a non-negligible performance gap in comparison to solutions based on the actual dataset. Lastly, we introduce an example task in which higher-order correlations exclusively encode class membership, which allows us to explore their role in the solutions found by neural networks. Finally, our framework also allows us to make predictions regarding the correlation functions that are inferable from data, yielding insights into network expressivity. This work thereby creates a link between statistical physics and machine learning, aiming towards explainable AI.
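The core idea described in the abstract, namely tracing mean and covariance through a network layer with a quadratically approximated activation, can be illustrated with a minimal sketch. This is not the thesis's actual derivation; the function name `propagate_layer` and the quadratic coefficients `a0`, `a1`, `a2` are illustrative choices, and the output covariance uses only a leading-order (local-slope) term, omitting the higher-order corrections the thesis computes via Feynman diagrams.

```python
import numpy as np

def propagate_layer(mu, Sigma, W, b, a0=0.0, a1=1.0, a2=0.1):
    """Propagate Gaussian statistics (mu, Sigma) through one layer.

    The activation is approximated quadratically, elementwise:
        phi(y) ~ a0 + a1*y + a2*y**2
    Coefficients and the leading-order covariance rule are
    illustrative assumptions, not the thesis's exact expressions.
    """
    # Affine part W @ x + b: exact for Gaussian inputs.
    mu_y = W @ mu + b
    Sigma_y = W @ Sigma @ W.T

    # Mean of the quadratic activation of a Gaussian variable:
    # E[phi(y_i)] = a0 + a1*mu_i + a2*(mu_i**2 + Var[y_i]).
    var_y = np.diag(Sigma_y)
    mu_out = a0 + a1 * mu_y + a2 * (mu_y**2 + var_y)

    # Leading-order covariance: linearize phi around the mean,
    # with local slope phi'(mu_y) = a1 + 2*a2*mu_y.
    slope = a1 + 2.0 * a2 * mu_y
    Sigma_out = np.outer(slope, slope) * Sigma_y
    return mu_out, Sigma_out

# Usage: one layer acting on a standard-normal 3d input.
rng = np.random.default_rng(0)
mu = np.zeros(3)
Sigma = np.eye(3)
W = rng.standard_normal((3, 3)) / np.sqrt(3)
b = np.zeros(3)
mu1, Sigma1 = propagate_layer(mu, Sigma, W, b)
```

Stacking such calls gives the recursive mean/covariance mapping the abstract refers to; the omitted higher-order terms are precisely the corrections the thesis organizes diagrammatically.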


Note: Master's thesis, RWTH Aachen University, 2020

Contributing Institute(s):
  1. Computational and Systems Neuroscience (INM-6)
  2. Theoretical Neuroscience (IAS-6)
  3. JARA-Institute Brain structure-function relationships (INM-10)
Research Program(s):
  1. 574 - Theory, modelling and simulation (POF3-574)
  2. RenormalizedFlows - Transparent Deep Learning with Renormalized Flows (BMBF-01IS19077A)
  3. MSNN - Theory of multi-scale neuronal networks (HGF-SMHB-2014-2018)
  4. neuroIC002 - Recurrence and stochasticity for neuro-inspired computation (EXS-SF-neuroIC002)

Appears in the scientific report 2020
Database coverage:
OpenAccess

The record appears in these collections:
Document types > University theses > Master's theses
Institute collections > INM > INM-10
Institute collections > IAS > IAS-6
Institute collections > INM > INM-6
Workflow collections > Public entries
Publikationsdatenbank
Open Access

 Record created on 2020-12-10, last modified on 2024-03-13


OpenAccess:
Download full text PDF
External link:
Full text from OpenAccess repository