Google Dataset Search: Meta-search across millions of datasets on the web with schema.org-powered indexing. 0400 Datasets & Labeling. Tags: catalog, dataset search, metadata
Open Images Dataset: Large-scale annotated image dataset with boxes, masks, relationships, and more. 0430 Datasets & Labeling. Tags: annotations, detection, Open Images
doccano (OSS): Open-source text annotation for classification, NER, and seq2seq tasks. 0470 Datasets & Labeling. Tags: doccano, NER, open source
Kaggle Datasets: Community-driven repository of datasets with notebooks, discussions, and trending signals. 0420 Datasets & Labeling. Tags: community, datasets, Kaggle
LAION-5B: Massive open image–text dataset (multilingual) widely used for generative models. 0380 Datasets & Labeling. Tags: CLIP, image-text, LAION
Ollama Library (for local label assist & synthetic data): Run open-weight LLMs locally to assist labeling or generate synthetic datasets. 0450 Datasets & Labeling. Tags: label assist, local LLM, offline
Hugging Face Datasets Hub: Huge catalog of ML-ready datasets with cards, viewers, and the 🤗 Datasets library. 0370 Datasets & Labeling. Tags: dataset cards, datasets, hub
Roboflow Universe: Huge community hub of computer-vision datasets and pre-trained models. 0460 Datasets & Labeling. Tags: community, models, Roboflow
Google Cloud Public Datasets: Curated, analysis-ready public datasets hosted on Google Cloud/BigQuery. 0420 Datasets & Labeling. Tags: analytics, BigQuery, Google Cloud
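
The Google Dataset Search entry above mentions schema.org-powered indexing. As a rough illustration of what that kind of metadata can look like, the sketch below builds a schema.org Dataset description as JSON-LD and wraps it in a script tag for a dataset landing page. The helper names (dataset_jsonld, to_script_tag), the example.org URL, and the sample dataset are hypothetical and exist only for this illustration; only the schema.org property names (name, description, url, keywords, license) come from the schema.org vocabulary. This is a minimal sketch, not a statement of what any particular crawler requires.

```python
import json


def dataset_jsonld(name, description, url, keywords, license_url=None):
    """Build a schema.org Dataset description as a JSON-LD dict.

    Hypothetical helper for illustration; the keys are schema.org
    Dataset properties, the function itself is not part of any library.
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
    }
    # The license is optional here, so only emit it when provided.
    if license_url:
        doc["license"] = license_url
    return doc


def to_script_tag(doc):
    """Serialize the JSON-LD dict into an embeddable script tag."""
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Made-up sample dataset; every value below is a placeholder.
example = dataset_jsonld(
    name="Example street-sign images",
    description="Placeholder description of an annotated image dataset.",
    url="https://example.org/datasets/street-signs",
    keywords=["annotations", "detection"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

if __name__ == "__main__":
    print(to_script_tag(example))
```

Embedding the resulting tag in the HTML of a dataset's landing page is one common way to publish such metadata; which fields a given search service actually reads should be checked against its own documentation.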