Preprint FZJ-2025-00072

Equivariant Representation Learning for Augmentation-based Self-Supervised Learning via Image Reconstruction

2024
arXiv

arXiv [doi:10.48550/arXiv.2412.03314]

Please use a persistent id in citations: doi:10.48550/arXiv.2412.03314

Abstract: Augmentation-based self-supervised learning methods have shown remarkable success in self-supervised visual representation learning, excelling in learning invariant features but often neglecting equivariant ones. This limitation reduces the generalizability of foundation models, particularly for downstream tasks requiring equivariance. We propose integrating an image reconstruction task as an auxiliary component in augmentation-based self-supervised learning algorithms to facilitate equivariant feature learning without additional parameters. Our method implements a cross-attention mechanism to blend features learned from two augmented views, subsequently reconstructing one of them. This approach is adaptable to various datasets and augmented-pair based learning methods. We evaluate its effectiveness on learning equivariant features through multiple linear regression tasks and downstream applications on both artificial (3DIEBench) and natural (ImageNet) datasets. Results consistently demonstrate significant improvements over standard augmentation-based self-supervised learning methods and state-of-the-art approaches, particularly excelling in scenarios involving combined augmentations. Our method enhances the learning of both invariant and equivariant features, leading to more robust and generalizable visual representations for computer vision tasks.
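
The blending step mentioned in the abstract can be illustrated with a minimal sketch of scaled dot-product cross-attention between the token features of two augmented views. This is an assumption-laden toy, not the authors' implementation: the record does not say which view supplies the queries, how the decoder reconstructs the image, or how the attention is configured. The function names `softmax` and `cross_attention`, the toy features, and the omission of learned projections are all hypothetical simplifications chosen for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention without learned projections.

    queries:     token features from one view (list of d-dimensional lists)
    keys_values: token features from the other view, used as keys and values
    Returns (blended, weights): one blended d-dimensional vector per query
    token, plus the attention weights (one row per query token).
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    blended, weights = [], []
    for q in queries:
        # Similarity of this query token to every token of the other view.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys_values]
        w = softmax(scores)
        # Weighted average of the other view's token features.
        mixed = [sum(wj * v[i] for wj, v in zip(w, keys_values)) for i in range(d)]
        blended.append(mixed)
        weights.append(w)
    return blended, weights


# Hypothetical toy features: 2 tokens from view A, 3 tokens from view B, d = 2.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
blended, weights = cross_attention(view_a, view_b)
```

In a full pipeline of the kind the abstract describes, the blended features would be passed to a decoder that reconstructs one of the two views; that part is not sketched here because the record gives no details about it.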

Keyword(s): Computer Vision and Pattern Recognition (cs.CV) ; FOS: Computer and information sciences


Contributing Institute(s):
  1. Datenanalyse und Maschinenlernen (IAS-8)
  2. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
  2. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  3. SLNS - SimLab Neuroscience (Helmholtz-SLNS)

Appears in the scientific report 2024
Database coverage:
OpenAccess

The record appears in these collections:
Institute Collections > IAS > IAS-8
Document types > Reports > Preprints
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2025-01-06, last modified 2025-02-03

