Data.gov (US Open Data): The U.S. government’s open data portal aggregating hundreds of thousands of datasets. Category: Datasets & Labeling. Tags: API, catalog, open government.
CVAT: Leading open-source image/video annotation tool with auto-annotation and team workflows. Category: Datasets & Labeling. Tags: annotation, computer vision, CVAT.
OpenML: Open platform to share datasets, tasks, and benchmarks for machine learning. Category: Datasets & Labeling. Tags: benchmarks, datasets, experiments.
Scale Nucleus: Dataset management and curation: debug models, fix labels, and improve data quality. Category: Datasets & Labeling. Tags: curation, dataset management, Nucleus.
UCI Machine Learning Repository: Classic collection of ML datasets used in research and education. Category: Datasets & Labeling. Tags: benchmark, education, ML datasets.
SuperAnnotate: AI-assisted annotation and services with workflow, QA, and multimodal support. Category: Datasets & Labeling. Tags: annotation, multimodal, QA.
data.world Community: Social data catalog to share, query, and collaborate on open datasets. Category: Datasets & Labeling. Tags: catalog, community, data.world.
V7 Darwin: Professional CV labeling with model-in-the-loop and medical/video tooling. Category: Datasets & Labeling. Tags: annotation, auto-annotate, Darwin.
Zenodo: CERN-hosted open repository for research data with DOIs and long-term preservation. Category: Datasets & Labeling. Tags: CERN, DOI, open repository.
Encord Annotate: AI-assisted, human-in-the-loop (HITL) labeling with customizable workflows, analytics, and ontology management. Category: Datasets & Labeling. Tags: AI-assisted, Encord, HITL.
Common Crawl: Free, open web-crawl corpus for large-scale text/data mining. Category: Datasets & Labeling. Tags: Common Crawl, corpus, open.
Prodigy: Developer-centric annotation tool with strong active learning for NLP, CV, and audio/video. Category: Datasets & Labeling. Tags: active learning, annotation, CV.
Google Dataset Search: Meta-search across millions of datasets on the web with schema.org-powered indexing. Category: Datasets & Labeling. Tags: catalog, dataset search, metadata.
Open Images Dataset: Large-scale annotated image dataset with boxes, masks, relationships, and more. Category: Datasets & Labeling. Tags: annotations, detection, Open Images.
doccano (OSS): Open-source text annotation for classification, NER, and seq2seq tasks. Category: Datasets & Labeling. Tags: doccano, NER, open source.