001     1037904
005     20250203103256.0
024 7 _ |a 10.34734/FZJ-2025-01042
|2 datacite_doi
037 _ _ |a FZJ-2025-01042
100 1 _ |a Finkbeiner, Jan Robert
|0 P:(DE-Juel1)190112
|b 0
|u fzj
245 _ _ |a On-Chip Learning via Transformer In-Context Learning
260 _ _ |c 2024
336 7 _ |a Preprint
|b preprint
|m preprint
|0 PUB:(DE-HGF)25
|s 1738239208_31383
|2 PUB:(DE-HGF)
336 7 _ |a WORKING_PAPER
|2 ORCID
336 7 _ |a Electronic Article
|0 28
|2 EndNote
336 7 _ |a preprint
|2 DRIVER
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a Output Types/Working Paper
|2 DataCite
520 _ _ |a Autoregressive decoder-only transformers have become key components of scalable sequence processing and generation models. However, the transformer's self-attention mechanism requires transferring prior token projections from main memory at each time step (token), severely limiting their performance on conventional processors. Self-attention can be viewed as a dynamic feed-forward layer whose weight matrix depends on the input sequence, similar to the result of local synaptic plasticity. Using this insight, we present a neuromorphic decoder-only transformer model that utilizes an on-chip plasticity processor to compute self-attention. Interestingly, training transformers enables them to "learn" the input context during inference. We demonstrate this in-context learning ability of transformers on the Loihi 2 processor by solving a few-shot classification problem. With this, we emphasize the importance of pretrained models, especially their ability to find simple, local, backpropagation-free learning rules that enable on-chip learning and adaptation in a hardware-friendly manner.
536 _ _ |a 5234 - Emerging NC Architectures (POF4-523)
|0 G:(DE-HGF)POF4-5234
|c POF4-523
|f POF IV
|x 0
536 _ _ |a BMBF 03ZU1106CA - NeuroSys: Algorithm-Hardware Co-Design (Projekt C) - A (03ZU1106CA)
|0 G:(BMBF)03ZU1106CA
|c 03ZU1106CA
|x 1
536 _ _ |a BMBF 03ZU1106CB - NeuroSys: Algorithm-Hardware Co-Design (Projekt C) - B (BMBF-03ZU1106CB)
|0 G:(DE-Juel1)BMBF-03ZU1106CB
|c BMBF-03ZU1106CB
|x 2
700 1 _ |a Neftci, Emre
|0 P:(DE-Juel1)188273
|b 1
|u fzj
856 4 _ |u https://arxiv.org/abs/2410.08711
856 4 _ |u https://juser.fz-juelich.de/record/1037904/files/arxiv_On-Chip%20Learning%20via%20Transformer%20In-Context%20Learning.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1037904
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)190112
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)188273
913 1 _ |a DE-HGF
|b Key Technologies
|l Natural, Artificial and Cognitive Information Processing
|1 G:(DE-HGF)POF4-520
|0 G:(DE-HGF)POF4-523
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Neuromorphic Computing and Network Dynamics
|9 G:(DE-HGF)POF4-5234
|x 0
914 1 _ |y 2024
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)PGI-15-20210701
|k PGI-15
|l Neuromorphic Software Ecosystems
|x 0
980 1 _ |a FullTexts
980 _ _ |a preprint
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)PGI-15-20210701

