Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning

Pleines, Marco; Preuss, Mike; Zimmer, Frank; Jitsev, Jenia

doi:10.1109/CoG47356.2020.9231802

Marc 21

001			890092
005			20231116095325.0
020	_	_	\|a 978-1-7281-4533-4
024	7	_	\|a 10.1109/CoG47356.2020.9231802 \|2 doi
024	7	_	\|a 2128/26998 \|2 Handle
024	7	_	\|a WOS:000632592300058 \|2 WOS
037	_	_	\|a FZJ-2021-00681
100	1	_	\|a Pleines, Marco \|0 P:(DE-HGF)0 \|b 0 \|e Corresponding author
111	2	_	\|a 2020 IEEE Conference on Games (CoG) \|c Osaka \|d 2020-08-24 - 2020-08-27 \|w Japan
245	_	_	\|a Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning
260	_	_	\|c 2020 \|b IEEE
295	1	0	\|a 2020 IEEE Conference on Games (CoG) : [Proceedings] - IEEE, 2020
300	_	_	\|a 447 - 454
336	7	_	\|a CONFERENCE_PAPER \|2 ORCID
336	7	_	\|a Conference Paper \|0 33 \|2 EndNote
336	7	_	\|a INPROCEEDINGS \|2 BibTeX
336	7	_	\|a conferenceObject \|2 DRIVER
336	7	_	\|a Output Types/Conference Paper \|2 DataCite
336	7	_	\|a Contribution to a conference proceedings \|b contrib \|m contrib \|0 PUB:(DE-HGF)8 \|s 1611566175_25895 \|2 PUB:(DE-HGF)
336	7	_	\|a Contribution to a book \|0 PUB:(DE-HGF)7 \|2 PUB:(DE-HGF) \|m contb
520	_	_	\|a The Obstacle Tower Challenge is the task of mastering a procedurally generated chain of levels that become progressively harder to complete. Whereas most of the top-performing entries of last year's competition used human demonstrations or reward shaping to learn how to cope with the challenge, we present an approach that performed competitively (placed 7th) but starts completely from scratch by means of Deep Reinforcement Learning with a relatively simple feed-forward deep network structure. We look in particular at the generalization performance of this approach across different seeds and the various visual themes that became available after the competition, and investigate where the agent fails and why. Note that our approach has no short-term memory, such as recurrent hidden states. With this work, we hope to contribute to a better understanding of what is possible with a relatively simple, flexible solution that can be applied to learning in environments featuring complex 3D visual input where the abstract task structure itself is still fairly simple.
536	_	_	\|a 512 - Data-Intensive Science and Federated Computing (POF3-512) \|0 G:(DE-HGF)POF3-512 \|c POF3-512 \|f POF III \|x 0
588	_	_	\|a Dataset connected to CrossRef Conference
700	1	_	\|a Jitsev, Jenia \|0 P:(DE-Juel1)158080 \|b 1 \|e Corresponding author \|u fzj
700	1	_	\|a Preuss, Mike \|0 P:(DE-HGF)0 \|b 2
700	1	_	\|a Zimmer, Frank \|0 P:(DE-HGF)0 \|b 3
773	_	_	\|a 10.1109/CoG47356.2020.9231802
856	4	_	\|u https://juser.fz-juelich.de/record/890092/files/Obstacle%20tower%20without%20human%20demonstrations%20How%20far%20a%20deep%20feed-forward%20network%20goes%20with%20reinforcement%20learning%20-%202020%20-%20Pleines%20et%20al..pdf \|y OpenAccess
909	C	O	\|o oai:juser.fz-juelich.de:890092 \|p openaire \|p open_access \|p VDB \|p driver \|p dnbdelivery
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 1 \|6 P:(DE-Juel1)158080
913	1	_	\|a DE-HGF \|b Key Technologies \|l Supercomputing & Big Data \|1 G:(DE-HGF)POF3-510 \|0 G:(DE-HGF)POF3-512 \|3 G:(DE-HGF)POF3 \|2 G:(DE-HGF)POF3-500 \|4 G:(DE-HGF)POF \|v Data-Intensive Science and Federated Computing \|x 0
914	1	_	\|y 2020
915	_	_	\|a OpenAccess \|0 StatID:(DE-HGF)0510 \|2 StatID
920	_	_	\|l yes
920	1	_	\|0 I:(DE-Juel1)JSC-20090406 \|k JSC \|l Jülich Supercomputing Center \|x 0
980	_	_	\|a contrib
980	_	_	\|a VDB
980	_	_	\|a UNRESTRICTED
980	_	_	\|a contb
980	_	_	\|a I:(DE-Juel1)JSC-20090406
980	1	_	\|a FullTexts
