TY  - CONF
AU  - Wang, Qin
AU  - Krajsek, Kai
AU  - Scharr, Hanno
TI  - Rescuing Easy Samples in Self-Supervised Pretraining
PB  - SCITEPRESS - Science and Technology Publications
M1  - FZJ-2025-04879
SN  - 978-989-758-728-3
SP  - 400
EP  - 409
PY  - 2025
AB  - Many recent self-supervised pretraining methods use augmented versions of the same image as samples for their learning schemes. We observe that 'easy' samples, i.e. samples that are too similar to each other after augmentation, have only limited value as a learning signal. We therefore propose to rescue easy samples and make them harder. To do so, we select the top k easiest samples using cosine similarity, strongly augment them, forward-pass them through the model, calculate the cosine similarity of the outputs as a loss, and add it to the original loss in a weighted fashion. This method can be applied to all contrastive or other augmented-pair-based learning methods, whether they involve negative pairs or not, as it changes only the handling of easy positives. This simple but effective approach introduces greater variability into such self-supervised pretraining processes, significantly increasing performance on various downstream tasks, as observed in our experiments. We pretrain models of different sizes, i.e. ResNet-50, ViT-S, ViT-B, or ViT-L, using ImageNet with SimCLR, MoCo v3, or DINOv2 training schemes. For example, we consistently find improved ImageNet top-1 accuracy with a linear classifier, establishing a new SOTA for this task.
T2  - 20th International Conference on Computer Vision Theory and Applications
CY  - 26 Feb 2025 - 28 Feb 2025, Porto (Portugal)
Y2  - 26 Feb 2025 - 28 Feb 2025
M2  - Porto, Portugal
LB  - PUB:(DE-HGF)8 ; PUB:(DE-HGF)7
DO  - 10.5220/0013167900003912
UR  - https://juser.fz-juelich.de/record/1048764
ER  -