001020896 001__ 1020896
001020896 005__ 20240226075320.0
001020896 020__ $$a9781713871088
001020896 0247_ $$2K10Plus$$aK10Plus:1857944542
001020896 0247_ $$2datacite_doi$$a10.34734/FZJ-2024-00372
001020896 037__ $$aFZJ-2024-00372
001020896 1001_ $$0P:(DE-HGF)0$$aSchuhmann, Christoph$$b0$$eCorresponding author
001020896 1112_ $$a36th Conference on Neural Information Processing Systems$$cNew Orleans, Louisiana$$d2022-11-28 - 2022-12-09$$gNeurIPS 2022$$wUSA
001020896 245__ $$aLAION-5B: An open large-scale dataset for training next generation image-text models
001020896 260__ $$aRed Hook, NY$$bCurran Associates, Inc.$$c2022
001020896 300__ $$a25278 - 25294
001020896 3367_ $$2ORCID$$aCONFERENCE_PAPER
001020896 3367_ $$033$$2EndNote$$aConference Paper
001020896 3367_ $$2BibTeX$$aINPROCEEDINGS
001020896 3367_ $$2DRIVER$$aconferenceObject
001020896 3367_ $$2DataCite$$aOutput Types/Conference Paper
001020896 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1704965075_2197
001020896 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
001020896 4900_ $$aAdvances in neural information processing systems$$v35
001020896 500__ $$aAlso on arXiv: https://doi.org/10.48550/arXiv.2210.08402
001020896 520__ $$aGroundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classification with noteworthy out-of-distribution robustness. Since then, large-scale language-vision models like ALIGN, BASIC, GLIDE, Flamingo and Imagen made further improvements. Studying the training and capabilities of such models requires datasets containing billions of image-text pairs. Until now, no datasets of this size have been made openly available for the broader research community. To address this problem and democratize research on large-scale multi-modal models, we present LAION-5B - a dataset consisting of 5.85 billion CLIP-filtered image-text pairs, of which 2.32B contain English language. We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale. Additionally, we provide several nearest neighbor indices, an improved web-interface for dataset exploration and subset generation, and detection scores for watermark, NSFW, and toxic content detection.
001020896 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001020896 588__ $$aDataset connected to K10Plus
001020896 7001_ $$0P:(DE-HGF)0$$aBeaumont, Romain$$b1$$eCorresponding author
001020896 7001_ $$0P:(DE-HGF)0$$aVencu, Richard$$b2$$eCorresponding author
001020896 7001_ $$0P:(DE-HGF)0$$aGordon, Cade$$b3$$eCorresponding author
001020896 7001_ $$0P:(DE-HGF)0$$aWightman, Ross$$b4$$eCorresponding author
001020896 7001_ $$0P:(DE-Juel1)180894$$aCherti, Mehdi$$b5$$eCorresponding author$$ufzj
001020896 7001_ $$0P:(DE-HGF)0$$aCoombes, Theo$$b6
001020896 7001_ $$0P:(DE-HGF)0$$aKatta, Aarush$$b7
001020896 7001_ $$0P:(DE-HGF)0$$aMullis, Clayton$$b8
001020896 7001_ $$0P:(DE-HGF)0$$aWortsman, Mitchell$$b9
001020896 7001_ $$0P:(DE-HGF)0$$aSchramowski, Patrick$$b10
001020896 7001_ $$0P:(DE-HGF)0$$aKundurthy, Srivatsa$$b11
001020896 7001_ $$0P:(DE-HGF)0$$aCrowson, Katherine$$b12
001020896 7001_ $$0P:(DE-HGF)0$$aSchmidt, Ludwig$$b13
001020896 7001_ $$0P:(DE-HGF)0$$aKaczmarczyk, Robert$$b14
001020896 7001_ $$0P:(DE-Juel1)158080$$aJitsev, Jenia$$b15$$ufzj
001020896 773__ $$v35
001020896 8564_ $$uhttps://proceedings.neurips.cc/paper_files/paper/2022/file/a1859debfb3b59d094f3504d5ebb6c25-Paper-Datasets_and_Benchmarks.pdf
001020896 8564_ $$uhttps://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.pdf$$yOpenAccess
001020896 8564_ $$uhttps://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.gif?subformat=icon$$xicon$$yOpenAccess
001020896 8564_ $$uhttps://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001020896 8564_ $$uhttps://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001020896 8564_ $$uhttps://juser.fz-juelich.de/record/1020896/files/NeurIPS-2022-laion-5b-an-open-large-scale-dataset-for-training-next-generation-image-text-models-Paper-Datasets_and_Benchmarks.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001020896 909CO $$ooai:juser.fz-juelich.de:1020896$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
001020896 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180894$$aForschungszentrum Jülich$$b5$$kFZJ
001020896 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)158080$$aForschungszentrum Jülich$$b15$$kFZJ
001020896 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001020896 9141_ $$y2023
001020896 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001020896 920__ $$lyes
001020896 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001020896 980__ $$acontrib
001020896 980__ $$aVDB
001020896 980__ $$aUNRESTRICTED
001020896 980__ $$acontb
001020896 980__ $$aI:(DE-Juel1)JSC-20090406
001020896 9801_ $$aFullTexts