% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
@MASTERSTHESIS{Fischer:888741,
author = {Fischer, Kirsten},
othercontributors = {Helias, Moritz and Dahmen, David},
title = {{D}ecomposition of {D}eep {N}eural {N}etworks into
{C}orrelation {F}unctions},
school = {RWTH Aachen University},
type = {Master's thesis},
reportid = {FZJ-2020-05175},
pages = {91 p.},
year = {2020},
note = {Master's thesis, RWTH Aachen University, 2020},
abstract = {Recent years have seen great success of deep neural
networks. One active field of research investigates how such
networks function, both in terms of their expressivity and of
the information processing within them. In this thesis, we
describe the
input-output mapping implemented by deep neural networks in
terms of correlation functions. To trace the transformation
of correlation functions within neural networks, we make use
of methods from statistical physics. Using a quadratic
approximation for non-linear activation functions, we obtain
recursive relations in a perturbative manner by means of
Feynman diagrams. Our results yield a characterization of
the network as a non-linear mapping of mean and covariance,
which can be extended by including corrections from
higher-order correlations. Furthermore, re-expressing the
training objective in terms of data correlations allows us
to study their role in solutions to a given task. First, we
investigate an adaptation of the XOR problem, for which the
solutions implemented by neural networks can largely be
described in terms of the mean and covariance of each class.
Second, we study the MNIST database as an example of a
non-synthetic dataset. For MNIST, solutions based on
empirical estimates of the mean and covariance of each class
already capture a large amount of the variability within the
dataset, but still exhibit a non-negligible performance gap
compared to solutions based on the actual dataset.
Third, we introduce an example task in which class
membership is encoded exclusively in higher-order
correlations, allowing us to explore their role in the
solutions found by neural networks. Finally, our framework
also allows us to
make predictions regarding the correlation functions that
are inferable from data, yielding insights into network
expressivity. This work thereby creates a link between
statistical physics and machine learning, aiming towards
explainable AI.},
cin = {INM-6 / IAS-6 / INM-10},
cid = {I:(DE-Juel1)INM-6-20090406 / I:(DE-Juel1)IAS-6-20130828 /
I:(DE-Juel1)INM-10-20170113},
pnm = {574 - Theory, modelling and simulation (POF3-574) /
RenormalizedFlows - Transparent Deep Learning with
Renormalized Flows (BMBF-01IS19077A) / MSNN - Theory of
multi-scale neuronal networks (HGF-SMHB-2014-2018) /
neuroIC002 - Recurrence and stochasticity for neuro-inspired
computation (EXS-SF-neuroIC002)},
pid = {G:(DE-HGF)POF3-574 / G:(DE-Juel-1)BMBF-01IS19077A /
G:(DE-Juel1)HGF-SMHB-2014-2018 / G:(DE-82)EXS-SF-neuroIC002},
typ = {PUB:(DE-HGF)19},
url = {https://juser.fz-juelich.de/record/888741},
}
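% A minimal sketch (not code from the thesis) of the layer-wise propagation
% of mean and covariance that the abstract describes, assuming a quadratic
% approximation of the activation, phi(z) = z + a*z^2. The function name and
% the coefficient `a` are illustrative; the block is kept as a `%` comment so
% that this file remains valid BibTeX.
%
% import numpy as np
%
% def propagate_gaussian(mu, Sigma, W, b, a=0.1):
%     """Propagate mean and covariance through one layer z = W x + b
%     followed by phi(z) = z + a z^2, using Gaussian moments."""
%     m = W @ mu + b          # pre-activation mean
%     C = W @ Sigma @ W.T     # pre-activation covariance
%     # Wick's theorem for z ~ N(m, C): <phi(z_i)> = m_i + a (m_i^2 + C_ii)
%     mu_out = m + a * (m**2 + np.diag(C))
%     # Covariance for the quadratic activation under Gaussian statistics:
%     # Cov_ij = C_ij (1 + 2a m_i)(1 + 2a m_j) + 2 a^2 C_ij^2
%     J = 1.0 + 2.0 * a * m
%     Sigma_out = C * np.outer(J, J) + 2.0 * a**2 * C**2
%     return mu_out, Sigma_out
%
% Example use (shapes only, all values illustrative):
% mu, Sigma = propagate_gaussian(np.zeros(3), np.eye(3),
%                                np.random.randn(4, 3), np.zeros(4))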