TY  - CONF
AU  - Tran, Viet Anh Khoa
AU  - Neftci, Emre
AU  - Wybo, Willem
TI  - Continual learning using dendritic modulations on view-invariant feedforward weights
M1  - FZJ-2024-02719
PY  - 2024
AB  - The brain is remarkably adept at learning from a continuous stream of data without significantly forgetting previously learnt skills. Conventional machine learning models struggle at continual learning, as weight updates that optimize the current task interfere with previously learnt tasks. A simple remedy to catastrophic forgetting is freezing a network pretrained on a set of base tasks, and training task-specific readouts on this shared trunk. However, this assumes that representations in the frozen network are separable under new tasks, therefore leading to sub-par performance. To continually learn on novel task data, previous methods suggest weight consolidation – preserving weights that are most impactful for the performance of previous tasks – and memory-based approaches – where the network is allowed to see a subset of images from previous tasks. For biological networks, prior work showed that dendritic top-down modulations provide a powerful mechanism to learn novel tasks while initial feedforward weights solely extract generic view-invariant features. Therefore, we propose a continual learner that optimizes the feedforward weights towards view-invariant representations while training task-specific modulations towards separable class clusters. In a task-incremental setting, we train feedforward weights using a self-supervised algorithm, while training the task-specific modulations and readouts in a supervised fashion, both exclusively through current-task data. We show that this simple approach avoids catastrophic forgetting of class clusters, as opposed to training the whole network in a supervised manner, while also outperforming (a) a task-specific readout without modulations and (b) frozen feedforward weights. This suggests that (a) top-down modulations are necessary and sufficient to shift the representations towards separable clusters and that (b) the SSL objective learns novel features based on the newly presented objects while maintaining features relevant to previous tasks, without requiring specific synaptic consolidation mechanisms.
T2  - Computational and Systems Neuroscience 2024
CY  - 29 Feb 2024 - 3 Mar 2024, Lisbon (Portugal)
Y2  - 29 Feb 2024 - 3 Mar 2024
M2  - Lisbon, Portugal
LB  - PUB:(DE-HGF)24
DO  - 10.34734/FZJ-2024-02719
UR  - https://juser.fz-juelich.de/record/1025142
ER  -