001 | 1037904 | ||
005 | 20250203103256.0 | ||
024 | 7 | _ | |a 10.34734/FZJ-2025-01042 |2 datacite_doi |
037 | _ | _ | |a FZJ-2025-01042 |
100 | 1 | _ | |a Finkbeiner, Jan Robert |0 P:(DE-Juel1)190112 |b 0 |u fzj |
245 | _ | _ | |a On-Chip Learning via Transformer In-Context Learning |
260 | _ | _ | |c 2024 |
336 | 7 | _ | |a Preprint |b preprint |m preprint |0 PUB:(DE-HGF)25 |s 1738239208_31383 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a WORKING_PAPER |2 ORCID |
336 | 7 | _ | |a Electronic Article |0 28 |2 EndNote |
336 | 7 | _ | |a preprint |2 DRIVER |
336 | 7 | _ | |a ARTICLE |2 BibTeX |
336 | 7 | _ | |a Output Types/Working Paper |2 DataCite |
520 | _ | _ | |a Autoregressive decoder-only transformers have become key components of scalable sequence processing and generation models. However, the transformer's self-attention mechanism requires transferring prior token projections from main memory at each time step (token), which severely limits its performance on conventional processors. Self-attention can be viewed as a dynamic feed-forward layer whose weight matrix depends on the input sequence, similar to the result of local synaptic plasticity. Using this insight, we present a neuromorphic decoder-only transformer model that utilizes an on-chip plasticity processor to compute self-attention. Interestingly, the training of transformers enables them to "learn" the input context during inference. We demonstrate this in-context learning ability of transformers on the Loihi 2 processor by solving a few-shot classification problem. With this, we emphasize the importance of pretrained models, especially their ability to find simple, local, backpropagation-free learning rules that enable on-chip learning and adaptation in a hardware-friendly manner. |
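[Editor's note] The abstract's view of self-attention as a dynamic feed-forward layer updated by a local, plasticity-like rule can be illustrated with a minimal sketch. The snippet below is not the authors' Loihi 2 implementation; it assumes a linear (softmax-free) fast-weight formulation of causal attention, with made-up dimensions and a ReLU feature map, purely to show how the attention "weight matrix" can be grown token by token via local outer-product updates.

# Minimal sketch (assumption, not the paper's kernel): causal self-attention
# computed as a Hebbian-style outer-product update on a fast-weight matrix.
import numpy as np

def fast_weight_attention(x, Wq, Wk, Wv):
    """x: (T, d_in) token embeddings; Wq, Wk, Wv: (d_in, d) projections."""
    T = x.shape[0]
    d = Wq.shape[1]
    W = np.zeros((d, d))          # fast weights, built from local updates only
    z = np.zeros(d)               # running normalizer
    out = np.zeros((T, d))
    for t in range(T):            # one local update per token; no replay of past tokens
        q, k, v = x[t] @ Wq, x[t] @ Wk, x[t] @ Wv
        q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)  # positive feature map (assumption)
        W += np.outer(v, k)       # plasticity-like outer-product update
        z += k
        out[t] = (W @ q) / (z @ q + 1e-9)
    return out

# Toy usage with random data
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
print(fast_weight_attention(x, Wq, Wk, Wv).shape)  # (5, 4)

Because the per-token update touches only the current key/value projections, this form avoids re-reading all prior token projections from main memory, which is the bottleneck the abstract describes for conventional processors.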
536 | _ | _ | |a 5234 - Emerging NC Architectures (POF4-523) |0 G:(DE-HGF)POF4-5234 |c POF4-523 |f POF IV |x 0 |
536 | _ | _ | |a BMBF 03ZU1106CA - NeuroSys: Algorithm-Hardware Co-Design (Projekt C) - A (03ZU1106CA) |0 G:(BMBF)03ZU1106CA |c 03ZU1106CA |x 1 |
536 | _ | _ | |a BMBF 03ZU1106CB - NeuroSys: Algorithm-Hardware Co-Design (Projekt C) - B (BMBF-03ZU1106CB) |0 G:(DE-Juel1)BMBF-03ZU1106CB |c BMBF-03ZU1106CB |x 2 |
700 | 1 | _ | |a Neftci, Emre |0 P:(DE-Juel1)188273 |b 1 |u fzj |
856 | 4 | _ | |u https://arxiv.org/abs/2410.08711 |
856 | 4 | _ | |u https://juser.fz-juelich.de/record/1037904/files/arxiv_On-Chip%20Learning%20via%20Transformer%20In-Context%20Learning.pdf |y OpenAccess |
909 | C | O | |o oai:juser.fz-juelich.de:1037904 |p openaire |p open_access |p VDB |p driver |p dnbdelivery |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)190112 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)188273 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Natural, Artificial and Cognitive Information Processing |1 G:(DE-HGF)POF4-520 |0 G:(DE-HGF)POF4-523 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Neuromorphic Computing and Network Dynamics |9 G:(DE-HGF)POF4-5234 |x 0 |
914 | 1 | _ | |y 2024 |
915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)PGI-15-20210701 |k PGI-15 |l Neuromorphic Software Eco System |x 0 |
980 | 1 | _ | |a FullTexts |
980 | _ | _ | |a preprint |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a I:(DE-Juel1)PGI-15-20210701 |