001     1020896
005     20240226075320.0
020 _ _ |a 9781713871088
024 7 _ |a K10Plus:1857944542
|2 K10Plus
024 7 _ |a 10.34734/FZJ-2024-00372
|2 datacite_doi
037 _ _ |a FZJ-2024-00372
100 1 _ |a Schuhmann, Christoph
|0 P:(DE-HGF)0
|b 0
|e Corresponding author
111 2 _ |a 9781713871088
|g NeurIPS 2022
|c New Orleans, Louisiana
|d 2022-11-28 - 2022-12-09
|w USA
245 _ _ |a LAION-5B: An open large-scale dataset for training next generation image-text models
260 _ _ |a Red Hook, NY
|c 2022
|b Curran Associates, Inc.
300 _ _ |a 25278 - 25294
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1704965075_2197
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
490 0 _ |a Advances in neural information processing systems
|v 35
500 _ _ |a Also on arXiv: https://doi.org/10.48550/arXiv.2210.08402
520 _ _ |a Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classification with noteworthy out-of-distribution robustness. Since then, large-scale language-vision models like ALIGN, BASIC, GLIDE, Flamingo and Imagen made further improvements. Studying the training and capabilities of such models requires datasets containing billions of image-text pairs. Until now, no datasets of this size have been made openly available for the broader research community. To address this problem and democratize research on large-scale multi-modal models, we present LAION-5B - a dataset consisting of 5.85 billion CLIP-filtered image-text pairs, of which 2.32B contain English language. We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale. Additionally we provide several nearest neighbor indices, an improved web-interface for dataset exploration and subset generation, and detection scores for watermark, NSFW, and toxic content detection.
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 0
588 _ _ |a Dataset connected to K10Plus
700 1 _ |a Beaumont, Romain
|0 P:(DE-HGF)0
|b 1
|e Corresponding author
700 1 _ |a Vencu, Richard
|0 P:(DE-HGF)0
|b 2
|e Corresponding author
700 1 _ |a Gordon, Cade
|0 P:(DE-HGF)0
|b 3
|e Corresponding author
700 1 _ |a Wightman, Ross
|0 P:(DE-HGF)0
|b 4
|e Corresponding author
700 1 _ |a Cherti, Mehdi
|0 P:(DE-Juel1)180894
|b 5
|e Corresponding author
|u fzj
700 1 _ |a Coombes, Theo
|0 P:(DE-HGF)0
|b 6
700 1 _ |a Katta, Aarush
|0 P:(DE-HGF)0
|b 7
700 1 _ |a Mullis, Clayton
|0 P:(DE-HGF)0
|b 8
700 1 _ |a Wortsman, Mitchell
|0 P:(DE-HGF)0
|b 9
700 1 _ |a Schramowski, Patrick
|0 P:(DE-HGF)0
|b 10
700 1 _ |a Kundurthy, Srivatsa
|0 P:(DE-HGF)0
|b 11
700 1 _ |a Crowson, Katherine
|0 P:(DE-HGF)0
|b 12
700 1 _ |a Schmidt, Ludwig
|0 P:(DE-HGF)0
|b 13
700 1 _ |a Kaczmarczyk, Robert
|0 P:(DE-HGF)0
|b 14
700 1 _ |a Jitsev, Jenia
|0 P:(DE-Juel1)158080
|b 15
|u fzj
773 _ _ |v 35
856 4 _ |u https://proceedings.neurips.cc/paper_files/paper/2022/file/a1859debfb3b59d094f3504d5ebb6c25-Paper-Datasets_and_Benchmarks.pdf
856 4 _ |u https://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.pdf
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.gif?subformat=icon
|x icon
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-1440
|x icon-1440
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-180
|x icon-180
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-640
|x icon-640
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1020896
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 5
|6 P:(DE-Juel1)180894
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 15
|6 P:(DE-Juel1)158080
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 0
914 1 _ |y 2023
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

