% IMPORTANT: The following is UTF-8 encoded. This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.
% Master's thesis record (Forschungszentrum Jülich repository export).
% Fields reportid/cin/cid/pnm/pid/typ are repository-internal annotations;
% standard styles silently ignore them, so they are kept verbatim.
@mastersthesis{Aach:892991,
  author   = {Aach, Marcel},
  % Whole-word brace protection: only the product name "Unreal" must keep
  % its capitalisation under sentence-casing styles. The original export's
  % per-letter bracing ({D}eep {L}earning ...) breaks kerning/hyphenation.
  title    = {Deep Learning for Prediction and Control of Cellular
              Automata in {Unreal} Environments},
  school   = {University of Cologne},
  type     = {Masterarbeit},
  reportid = {FZJ-2021-02488},
  % Bare page count; the bibliography style supplies the word "pages".
  pages    = {76},
  year     = {2021},
  note     = {Masterarbeit, University of Cologne, 2021},
  abstract = {In this thesis, we show the ability of a deep convolutional
              neural network to understand the underlying transition rules
              of two-dimensional cellular automata by pure observation. To
              do so, we evaluate the network on a prediction task, where
              it has to predict the next state of some cellular automata,
              and a control task, where it has to intervene in the
              evolution of a cellular automaton to achieve a state of
              standstill. The cellular automata we use in this case are
              based on the classical Game of Life by John Conway and
              implemented in the Unreal Engine. With the usage of the
              Unreal Engine for data generation, a technical pipeline for
              processing output images with neural networks is
              established. Cellular automata in general are chaotic
              dynamical systems, making any sort of prediction or control
              very challenging, but using convolutional neural networks to
              exploit the locality of their interactions is a promising
              approach to solve these problems. The network we present in
              this thesis follows the Encoder-Decoder structure and
              features residual skip connections that serve as shortcuts
              in between the different layers. Recent advancements in the
              field of image recognition and segmentation have shown that
              both of these aspects are the key to success. The evaluation
              of the prediction task is split into several levels of
              generalization: we train the developed network on
              trajectories of several hundred different cellular automata,
              varying in their transition rules and neighborhood sizes.
              Results on a test set show that the network is able to learn
              the rules of even more complex cellular automata (with an
              accuracy of $\approx 93\%$). To some extent, it is even able
              to interpolate and generalize to completely unseen rules
              (with an accuracy of $\approx 77\%$). A qualitative
              investigation shows that static rules (not forcing many
              changes in between time steps) are among the easiest to
              predict. For the control task, we combine the encoder part
              of the developed neural network with a reinforcement agent
              and train it to stop all movements on the grid of the
              cellular automata as quickly as possible. To do so, the
              agent can change the state of a single cell per time step. A
              comparison between giving back rewards to agents
              continuously and giving them only in the case of success or
              failure shows that Proximal Policy Optimization agents do
              better with receiving sparse rewards while Deep Q-Network
              agents fare better with continuously receiving them. Both
              algorithms beat random agents on training data, but their
              generalization ability remains limited.},
  cin      = {JSC},
  cid      = {I:(DE-Juel1)JSC-20090406},
  % Plain-text ampersand escape; math mode ($\&$) is not needed here.
  pnm      = {511 - Enabling Computational- \& Data-Intensive Science
              and Engineering (POF4-511)},
  pid      = {G:(DE-HGF)POF4-511},
  typ      = {PUB:(DE-HGF)19},
  url      = {https://juser.fz-juelich.de/record/892991},
}