Name and URL |
Category |
1000 Genomes |
Biology |
American Gut (Microbiome Project) |
Biology |
Animal species occurrence |
Biology |
Bird invasions |
Biology |
Bird-building collisions |
Biology |
Broad Bioimage Benchmark Collection (BBBC) |
Biology |
Cell Image Library |
Biology |
Chimp personalities |
Biology |
Complete Genomics Public Data |
Biology |
Coremine Medical |
Biology |
Declared Dangerous Dogs |
Biology |
EBI ArrayExpress – ArrayExpress Archive of Functional Genomics Data |
Biology |
EBI Protein Data Bank in Europe |
Biology |
Electron Microscopy Pilot Image Archive (EMPIAR) |
Biology |
ENCODE project (Encyclopedia of DNA Elements) |
Biology |
Ensembl Genomes |
Biology |
Equine deaths in New York |
Biology |
Gene Expression Omnibus (GEO) |
Biology |
Gene Ontology (GO) |
Biology |
Global Biotic Interactions (GloBI) |
Biology |
Harvard Medical School (HMS) LINCS Project |
Biology |
Human Genome Diversity Project – Stanford |
Biology |
Human Microbiome Project (HMP) |
Biology |
ICOS PSP Benchmark |
Biology |
Imported bats |
Biology |
Invasive species |
Biology |
Journal of Cell Biology DataViewer |
Biology |
KEGG |
Biology |
Mammal data |
Biology |
MIT Cancer Genomics Data |
Biology |
National Centers for Environmental Information |
Biology |
National Oceanic and Atmospheric Administration Fisheries |
Biology |
NCBI Proteins |
Biology |
NCBI Taxonomy |
Biology |
NCI Genomic Data Commons |
Biology |
OpenSNP genotypes data |
Biology |
Palmer Penguins |
Biology |
Pathguide Protein Interactions Catalog |
Biology |
Penguins |
Biology |
Protein Data Bank |
Biology |
Psychiatric Genomics Consortium |
Biology |
PubChem Project |
Biology |
Rfam |
Biology |
Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) |
Biology |
Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) |
Biology |
Sequence Read Archive (SRA) |
Biology |
Shrinking salmon |
Biology |
Stowers Institute Original Data Repository |
Biology |
Systems Science of Biological Dynamics (SSBD) Database |
Biology |
The Cancer Genome Atlas (TCGA) |
Biology |
The Catalogue of Life |
Biology |
The Personal Genome Project |
Biology |
UCSC Public Data |
Biology |
UniGene |
Biology |
Universal Protein Resource (UniProt) |
Biology |
Zoo animal lifespans |
Biology |
Actuaries Climate Index |
Climate and Weather |
Australian Weather |
Climate and Weather |
Aviation Weather Center |
Climate and Weather |
Canadian Meteorological Centre |
Climate and Weather |
Charting The Global Climate Change News Narrative 2009-2020 |
Climate and Weather |
Climate Data from UEA |
Climate and Weather |
Dutch Weather |
Climate and Weather |
European Climate Assessment & Dataset |
Climate and Weather |
German Climate Data Center |
Climate and Weather |
Global Climate Data Since 1929 |
Climate and Weather |
NASA Global Imagery Browse Services |
Climate and Weather |
NOAA Bering Sea Climate |
Climate and Weather |
NOAA Climate Datasets |
Climate and Weather |
NOAA Real-time Weather Models |
Climate and Weather |
NOAA SURFRAD Meteorology and Radiation Datasets |
Climate and Weather |
UEA Climatic Research Unit |
Climate and Weather |
Washington Post Climate Change |
Climate and Weather |
WorldClim – Global Climate Data |
Climate and Weather |
WU Historical Weather Worldwide |
Climate and Weather |
AMiner Citation Network Dataset |
Complex Networks |
Community Resource for Archiving Wireless Data |
Complex Networks |
CrossRef DOI URLs |
Complex Networks |
DBLP Citation dataset |
Complex Networks |
DIMACS Road Networks Collection |
Complex Networks |
NBER Patent Citations |
Complex Networks |
Network Repository |
Complex Networks |
Small Network Data |
Complex Networks |
Stanford GraphBase |
Complex Networks |
Stanford Large Network Dataset Collection |
Complex Networks |
The Laboratory for Web Algorithmics (UNIMI) |
Complex Networks |
UCI Network Data Repository |
Complex Networks |
UFL sparse matrix collection |
Complex Networks |
3.5B Web Pages from CommonCrawl 2012 |
Computer Networks |
53.5B Web clicks of 100K users in Indiana Univ. |
Computer Networks |
CAIDA Internet Datasets |
Computer Networks |
ClueWeb09 – 1B web pages |
Computer Networks |
ClueWeb12 – 733M web pages |
Computer Networks |
CommonCrawl Web Data |
Computer Networks |
Internet-Wide Scan Data Repository |
Computer Networks |
OONI: Open Observatory of Network Interference |
Computer Networks |
Rapid7 Sonar Internet Scans |
Computer Networks |
The Peer-to-Peer Trace Archive |
Computer Networks |
UCSD Network Telescope IPv4 /8 net |
Computer Networks |
38-Cloud (Cloud Detection) |
Earth |
African agricultural survey |
Earth |
Air quality |
Earth |
Alabama Real-Time Coastal Observing System |
Earth |
AQUASTAT Global water resources and uses |
Earth |
BODC Marine Data |
Earth |
CERN Open Data Portal |
Earth |
Cloud data |
Earth |
Complete Plants Checklist (US Department of Agriculture) |
Earth |
Crystallography Open Database |
Earth |
EOSDIS – NASA’s earth observing system data |
Earth |
Global dataset of historical crop yields |
Earth |
Global Wind Atlas |
Earth |
Grapes |
Earth |
Harvesting statistics |
Earth |
Historical crop yields |
Earth |
Hyperspectral benchmark dataset on soil moisture |
Earth |
IceCube – South Pole Neutrino Observatory |
Earth |
Integrated Marine Observing System (IMOS) |
Earth |
Lemons quality control dataset |
Earth |
LIGO Open Science Center (LOSC) |
Earth |
Marinexplore – Open Oceanographic Data |
Earth |
NASA Earth Science |
Earth |
NASA Exoplanet Archive |
Earth |
NASA Space |
Earth |
National Estuarine Research Reserves System-Wide Monitoring Program |
Earth |
NSSDC (NASA) data of 550 spacecraft |
Earth |
Oil and Gas Authority Open Data |
Earth |
Optimized Soil Adjusted Vegetation Index |
Earth |
Sloan Digital Sky Survey (SDSS) – Mapping the Universe |
Earth |
Smithsonian Institution Global Volcano and Eruption Database |
Earth |
US forest tree distributions |
Earth |
USGS Earthquake Archives |
Earth |
World glaciers |
Earth |
Academic parental leave policies |
Economics |
Aggregated smartphone movements |
Economics |
Alcohol content in products |
Economics |
American Economic Association (AEA) |
Economics |
Building footprints |
Economics |
Construction spending |
Economics |
Corporate misconduct |
Economics |
Corporate risk conversations |
Economics |
DBnomics – the world’s economic database |
Economics |
Elevators in NYC |
Economics |
Euro-bank speeches |
Economics |
FCC telemarketer and robocall complaints |
Economics |
Florida billboards |
Economics |
Global economic forecasts |
Economics |
Global inequality |
Economics |
Historical empires |
Economics |
Historical MacroEconomic Statistics |
Economics |
Home Price indexes |
Economics |
Hotel bookings |
Economics |
Housework |
Economics |
Household surveys |
Economics |
How we spend our time |
Economics |
Humans using computers |
Economics |
Internet Product Code Database |
Economics |
Iowa liquor purchases |
Economics |
Job data and numbers |
Economics |
Joint External Debt Data Hub |
Economics |
Jon Haveman International Trade Data Links |
Economics |
Late-medieval English immigrants |
Economics |
London grocery purchases |
Economics |
Long-Term Productivity Database |
Economics |
Maternity leave policies for US companies |
Economics |
Multinational corporations |
Economics |
NEH grants |
Economics |
New York real estate brokers |
Economics |
Nike factories |
Economics |
Old British lighthouses |
Economics |
OpenCorporates Database of Companies in the World |
Economics |
Our World in Data |
Economics |
Patents and trademarks |
Economics |
Population estimates |
Economics |
Public-sector employment |
Economics |
Religion in America |
Economics |
Restaurant menus |
Economics |
Roman empire movements |
Economics |
Rooftop water tanks |
Economics |
Science grants |
Economics |
Sidewalk grates |
Economics |
Space dollars |
Economics |
The Atlas of Economic Complexity |
Economics |
The Big Mac Index |
Economics |
The Center for International Data |
Economics |
The early Islamic world |
Economics |
UK companies |
Economics |
UN Commodity Trade Statistics |
Economics |
UN Human Development Reports |
Economics |
US consumer information |
Economics |
US house construction |
Economics |
US liquor licenses |
Economics |
US manufacturing |
Economics |
US petroleum supply and exports |
Economics |
US power consumption and generation |
Economics |
US retail data |
Economics |
USA Trade, Imports, Exports |
Economics |
War debt |
Economics |
Welsh shipping crews |
Economics |
Bad words |
Education |
College Scorecard Data |
Education |
Education data, unified |
Education |
Encyclopaedia Britannica |
Education |
New York State Education Department Data |
Education |
Pages of books |
Education |
Poems by kids |
Education |
Seattle library checkouts |
Education |
Student Data from Free Code Camp |
Education |
Wikipedia revisions for editors |
Education |
AMPds – The Almanac of Minutely Power dataset |
Energy |
BLUEd – Building-Level fully labelled Electricity Disaggregation dataset |
Energy |
COMBED |
Energy |
DBFC – Direct Borohydride Fuel Cell (DBFC) Dataset |
Energy |
DEL – Domestic Electrical Load study datasets for South Africa (1994 – 2014) |
Energy |
ECO |
Energy |
EIA |
Energy |
Electricity prices |
Energy |
Electricity utilities |
Energy |
Emissions indicators |
Energy |
Global Power Plant Database |
Energy |
HFED |
Energy |
Home energy consumption |
Energy |
iAWE |
Energy |
PEM1 – Proton Exchange Membrane (PEM) Fuel Cell Dataset |
Energy |
Power outages |
Energy |
REDD |
Energy |
Smart Meter Data Portal |
Energy |
State-owned oil companies |
Energy |
SYND |
Energy |
The Public Utility Data Liberation Project (PUDL) |
Energy |
Tracebase |
Energy |
UK-DALE – UK Domestic Appliance-Level Electricity |
Energy |
Ukraine Energy Centre Datasets |
Energy |
US electricity prices |
Energy |
A decade of TV news words |
Entertainment |
Anthony Bourdain’s travels |
Entertainment |
BFI film industry statistics |
Entertainment |
Billboard music hits and lyrics |
Entertainment |
Boy bands |
Entertainment |
Breaking Bad data |
Entertainment |
Classical music data |
Entertainment |
Crossword data |
Entertainment |
Drama |
Entertainment |
Friends TV show analysis |
Entertainment |
Hunger Games survival |
Entertainment |
Indian movie theatres |
Entertainment |
Movie dialog |
Entertainment |
Movie scripts and genders |
Entertainment |
Music artists |
Entertainment |
NYC film and TV permits |
Entertainment |
Pinball |
Entertainment |
Ramen ratings |
Entertainment |
Rotten Tomatoes reviews |
Entertainment |
Star Trek data |
Entertainment |
Studio Ghibli data |
Entertainment |
Talk radio transcripts |
Entertainment |
Tarantino movie data |
Entertainment |
Billionaire list |
Finance |
BIS Statistics |
Finance |
Blockmodo Coin Registry |
Finance |
CBOE Futures Exchange |
Finance |
Federal Reserve forecasts |
Finance |
Financial access |
Finance |
Financial crisis events |
Finance |
Global debt |
Finance |
Global finance history |
Finance |
Kansas City cars at auctions |
Finance |
Low-Income College Graduation Rates |
Finance |
St Louis Federal |
Finance |
Taxation data |
Finance |
UK post-graduation earnings |
Finance |
Antarctic icebergs |
Geographical |
Antarctic infrastructure |
Geographical |
ArcGIS Open Data portal |
Geographical |
Awesome 3D Semantic City Models |
Geographical |
British Isle bumps |
Geographical |
Cambridge MA US GIS data on GitHub |
Geographical |
Countries, States, subdivisions, provinces |
Geographical |
Country Typology Codes |
Geographical |
Deepwater Horizon’s effects |
Geographical |
Every place name in the United States |
Geographical |
Forest cover |
Geographical |
Geo Maps |
Geographical |
Geo Wiki Project |
Geographical |
GeoFabrik |
Geographical |
GeoNames Worldwide |
Geographical |
Global Administrative Areas Database (GADM) |
Geographical |
Homeland Infrastructure Foundation-Level Data |
Geographical |
IEEE Geoscience and Remote Sensing Society DASE Website |
Geographical |
Landsat 8 on AWS |
Geographical |
List of all countries in all languages |
Geographical |
National Park data |
Geographical |
National park trails |
Geographical |
National Weather Service GIS Data Portal |
Geographical |
Natural Earth – vectors and rasters of the world |
Geographical |
Nighttime brightness in Niger and Nigeria |
Geographical |
North America ecoregions |
Geographical |
Offshore drilling |
Geographical |
OpenAddresses |
Geographical |
OpenStreetMap (OSM) |
Geographical |
Palm trees |
Geographical |
Patent geography |
Geographical |
Permafrost thickness |
Geographical |
Pleiades – Gazetteer and graph of ancient places |
Geographical |
Reverse Geocoder using OSM data |
Geographical |
Robin Wilson Free GIS Datasets |
Geographical |
San Francisco Bay water |
Geographical |
The height of the frozen world |
Geographical |
TIGER/Line U.S. boundaries and roads |
Geographical |
Tropical cyclone simulations |
Geographical |
TwoFishes – Foursquare’s coarse geocoder |
Geographical |
TZ Timezones shapefile |
Geographical |
UK noise pollution |
Geographical |
UN Environmental Data |
Geographical |
US beach data |
Geographical |
US national park visitors |
Geographical |
World boundaries from the U.S. Department of State |
Geographical |
World countries in multiple formats |
Geographical |
Affordable Care Act data |
Government |
Albuquerque, New Mexico |
Government |
Anti-press incidents |
Government |
Antwerp Belgium |
Government |
Austin, Texas, USA |
Government |
Australia |
Government |
Austria |
Government |
Beersheba, Israel |
Government |
Belgium |
Government |
Brazil |
Government |
Buenos Aires, Argentina |
Government |
Calgary, Alberta, Canada |
Government |
Cambridge, Massachusetts, US |
Government |
Canada |
Government |
Chicago |
Government |
Chicago prosecutions |
Government |
Chile |
Government |
City of Berkeley |
Government |
Congressional district demographics |
Government |
Congressional whip counts |
Government |
Constitution proposed amendments |
Government |
Crime in cities |
Government |
Dallas |
Government |
DataBC (British Columbia) |
Government |
Datos Argentina |
Government |
Debt to the Penny |
Government |
Denver, USA |
Government |
Dual citizenship policies |
Government |
Durham, North Carolina, USA |
Government |
Edmonton, Alberta, Canada |
Government |
England |
Government |
EU laws |
Government |
European electoral polling |
Government |
European opinion polls |
Government |
EuroStat |
Government |
EveryPolitician |
Government |
Exonerations in the USA |
Government |
Federal Committee on Statistical Methodology (FCSM) |
Government |
Finland |
Government |
Foreign ministers |
Government |
France |
Government |
Freedom of Information Act reports |
Government |
Germany |
Government |
Ghent Belgium |
Government |
Glasgow Scotland UK |
Government |
Greece |
Government |
Halifax NS Canada |
Government |
Helsinki Region Finland |
Government |
Hong Kong |
Government |
Hyperlocal Biden/Trump results |
Government |
Indian Government Data |
Government |
Indonesian Data Portal |
Government |
International labor treaties |
Government |
Iowa |
Government |
Ireland |
Government |
Israel |
Government |
Istanbul Municipality |
Government |
Italy |
Government |
Jail deaths in America |
Government |
Japan |
Government |
Laval QC Canada |
Government |
Lexington KY |
Government |
Licensed Firearms Dealers |
Government |
London UK |
Government |
Los Angeles |
Government |
Luxembourg |
Government |
Mass shootings in America |
Government |
MassGIS Massachusetts USA |
Government |
Metropolitan Transportation Commission (MTC) California US |
Government |
Mexico |
Government |
Mississauga ON Canada |
Government |
Moldova |
Government |
Moncton NB Canada |
Government |
Montreal QC Canada |
Government |
Mother Jones – US mass shootings |
Government |
Mountain View California US (GIS) |
Government |
Netherlands |
Government |
New York Department of Sanitation Monthly Tonnage |
Government |
New York personalised license plates |
Government |
New Zealand |
Government |
NYC BetaNYC |
Government |
NYC Open Data |
Government |
Oakland California US |
Government |
OECD |
Government |
Oklahoma |
Government |
Open Data for Africa |
Government |
OpenDataPhilly |
Government |
OpenDataSoft |
Government |
Oregon |
Government |
Ottawa ON Canada |
Government |
Palo Alto California US |
Government |
Peace agreements |
Government |
Police Open Data Census |
Government |
Political conditions |
Government |
Political resistance campaigns |
Government |
Portland Oregon |
Government |
Portugal |
Government |
Presidential popularity |
Government |
Privacy policies |
Government |
Public health policy |
Government |
Public pension plans |
Government |
Puerto Rico Government |
Government |
Quebec Province of Canada |
Government |
R&D spending |
Government |
Regina SK Canada |
Government |
Retiree language preference |
Government |
Rio de Janeiro Brazil |
Government |
Romania |
Government |
Ruling elites |
Government |
Russia |
Government |
San Antonio TX |
Government |
San Diego CA |
Government |
San Francisco Data sets |
Government |
San Jose California US |
Government |
San Mateo County California US |
Government |
Saskatchewan Province of Canada |
Government |
School shootings |
Government |
SCOTUS confirmation transcripts |
Government |
Seattle |
Government |
ShotSpotter (gunshot detections) data |
Government |
Singapore |
Government |
Social scientists testifying |
Government |
South Africa |
Government |
State Department daily travel allowance |
Government |
State of Utah US |
Government |
Supreme Court v Congress |
Government |
Switzerland |
Government |
Taiwan |
Government |
Terrorism prosecutions |
Government |
Testing in schools |
Government |
Texas |
Government |
Texas licenses |
Government |
The World Bank |
Government |
Trade agreements |
Government |
UK 2011 Census Open Atlas Project |
Government |
UK fire stats |
Government |
UK general elections |
Government |
UK Government Data |
Government |
UK rules data |
Government |
UK wine cellar |
Government |
Ukraine |
Government |
UN security council debates |
Government |
UN vote history |
Government |
United Nations |
Government |
Uruguay |
Government |
US Cabinet turnover |
Government |
US citizens’ deaths overseas |
Government |
US Counties |
Government |
US county-level and precinct-level results |
Government |
US domestic radicalisation |
Government |
US Executive orders |
Government |
US government court payouts |
Government |
US government paperwork |
Government |
US House of Representatives gift travel |
Government |
US local justice data |
Government |
US marriage, divorce, pregnancy, and infertility |
Government |
US national opinions |
Government |
US regional Medicare usage |
Government |
US state governor addresses |
Government |
US state relations |
Government |
US state-level election results |
Government |
US Supreme Court transcripts |
Government |
US veterans |
Government |
USA American Community Survey |
Government |
USA CDC Public Health datasets |
Government |
USA Census Bureau |
Government |
USA Congressional Research Service (CRS) Reports |
Government |
USA Department of Housing and Urban Development (HUD) |
Government |
USA Federal Government Agencies |
Government |
USA Federal Government Data Catalog |
Government |
USA Food and Drug Administration (FDA) |
Government |
USA National Center for Education Statistics (NCES) |
Government |
USA Open Government |
Government |
USA Patent and Trademark Office (USPTO) Bulk Data Products |
Government |
Valley Transportation Authority (VTA) California US |
Government |
Victoria BC Canada |
Government |
Vienna Austria |
Government |
Voter attitudes and choices |
Government |
Voting equipment |
Government |
Windy City murals |
Government |
2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE |
Health |
Allen Institute Datasets |
Health |
Animals with SARS-CoV-2 |
Health |
Britain’s diet |
Health |
CDC behavioural risk |
Health |
Collaborative Research in Computational Neuroscience (CRCNS) |
Health |
Composition of Foods Raw Processed Prepared USDA National Nutrient Database for Standard |
Health |
Consumer-product chemicals |
Health |
Cooks and kitchen tasks |
Health |
Coronavirus (Covid-19) Data in the United States |
Health |
Coronavirus research papers |
Health |
COVID-19 Case Surveillance Public Use Data |
Health |
COVID-19 cases |
Health |
COVID-19 in prisons |
Health |
COVID-19 Reported Patient Impact and Hospital Capacity by Facility |
Health |
COVID-related behaviour |
Health |
Deaths on the job |
Health |
Dog bites |
Health |
Drug-free school zones in Tennessee |
Health |
Drugs and side effects |
Health |
Early Onset Prostate Cancer Germany |
Health |
EHDP Large Health Data Sets |
Health |
Fatal car crashes |
Health |
FCP-INDI |
Health |
Food Outbreak Online Database |
Health |
Gapminder World demographic databases |
Health |
GDC |
Health |
GENIE – Data from the Genomics Evidence Neoplasia Information Exchange |
Health |
Genomic Hallmarks Prostate Adenocarcinoma CPC GENE |
Health |
Health data breaches |
Health |
Healthcare service in Africa |
Health |
Hospitals |
Health |
Human Connectome Project |
Health |
ICGC Cancer Datasets |
Health |
Informatics for Integrating Biology & the Bedside |
Health |
MeDAL |
Health |
Medicare Coverage Database (MCD) U.S. |
Health |
Medicare Data Engine of medicare.gov Data |
Health |
Medicare drug prices |
Health |
NaF-Prostate |
Health |
NDAR |
Health |
NeuroData |
Health |
Neuroelectro |
Health |
Neuroendocrine Prostate Cancer |
Health |
NeuroMorpho – NeuroMorpho.Org is a centrally curated inventory of |
Health |
Number of Ebola Cases and Deaths in Affected Countries (2014) |
Health |
OASIS |
Health |
Open-ODS (structure of the UK NHS) |
Health |
OpenNEURO |
Health |
OpenPaymentsData Healthcare |
Health |
Pandemic travel restrictions |
Health |
PhysioBank Databases |
Health |
PLCO Datasets |
Health |
School vaccination rates |
Health |
Severe workplace injuries |
Health |
Social assistance programs |
Health |
Study Forrest |
Health |
Subnational COVID-19 case counts |
Health |
The Cancer Genome Atlas project (TCGA) |
Health |
The Cancer Imaging Archive (TCIA) |
Health |
The COVID Tracking Project |
Health |
Two decades of tobacco (and e-cigarette) laws |
Health |
US dairy information |
Health |
US emergency room visits |
Health |
US vaccination rates in adults |
Health |
US workplace safety |
Health |
World Health Organization Global Health Observatory |
Health |
Yahoo Knowledge Graph COVID-19 Datasets |
Health |
Zika virus data |
Health |
Canada Science and Technology Museums Corporation’s Open Data |
Museums |
Cooper-Hewitt’s Collection Database |
Museums |
Metropolitan Museum of Art Collection API |
Museums |
Minneapolis Institute of Arts |
Museums |
Museum of Modern Art collection |
Museums |
Natural History Museum (London) Data Portal |
Museums |
Rijksmuseum Historical Art Collection |
Museums |
Tate Collection metadata |
Museums |
The Getty vocabularies |
Museums |
Automatic Keyphrase Extraction |
Natural Language |
Blizzard Challenge Speech |
Natural Language |
DBpedia |
Natural Language |
Dirty Words |
Natural Language |
Flickr Personal Taxonomies |
Natural Language |
German Political Speeches Corpus |
Natural Language |
Google Books Ngrams (2.2TB) |
Natural Language |
Google MC-AFP |
Natural Language |
Google Web 5gram (1TB 2006) |
Natural Language |
Hansards text chunks of Canadian Parliament |
Natural Language |
LJ Speech |
Natural Language |
Machine Comprehension Test (MCTest) of text from Microsoft Research |
Natural Language |
Machine Translation of European languages |
Natural Language |
Making Sense of Microposts 2016 |
Natural Language |
Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) |
Natural Language |
Multi-Domain Sentiment Dataset (version 2.0) |
Natural Language |
Noisy speech database for training speech enhancement algorithms and TTS |
Natural Language |
Open Multilingual Wordnet |
Natural Language |
SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic 30K articles) |
Natural Language |
SMS Spam Collection in English |
Natural Language |
Stanford Question Answering Dataset (SQuAD) |
Natural Language |
The Big Bad NLP Database |
Natural Language |
Universal Dependencies |
Natural Language |
USENET postings corpus of 2005~2011 |
Natural Language |
Webhose – News/Blogs in multiple languages |
Natural Language |
Wikidata – Wikipedia databases |
Natural Language |
A range of miscellaneous data |
Other |
Adult height over time |
Other |
Advice sought |
Other |
Broadband access in the US |
Other |
Catholic women |
Other |
Confidence |
Other |
Dating survey |
Other |
Emotion and word associations |
Other |
Graphic Design data |
Other |
Happy moments |
Other |
Jeans pockets |
Other |
Joke database |
Other |
Makeup shades |
Other |
Photographer biographies |
Other |
Random card choices |
Other |
Academic Torrents |
Public |
Amazon |
Public |
Archive IT |
Public |
Archive.org Datasets |
Public |
CMU JASA data archive |
Public |
CMU StatLab collections |
Public |
Data.World |
Public |
Datahub.io |
Public |
Domains Project |
Public |
Google |
Public |
Grand Comics Database |
Public |
Harvard Dataverse Network of scientific data |
Public |
ICPSR (UMICH) |
Public |
Microsoft Research Open Data |
Public |
Open Data Certificates (beta) |
Public |
Open Library Data Dumps |
Public |
OpenDataNetwork |
Public |
Reddit Datasets |
Public |
Sample R data sets |
Public |
Statista |
Public |
The Washington Post List |
Public |
UCLA SOCR data collection |
Public |
UFO Reports |
Public |
Wikileaks 911 pager intercepts |
Public |
Yahoo Webscope |
Public |
Zenodo |
Public |
2021 Portuguese Elections Twitter Dataset |
Social Network |
43k+ Donald Trump Twitter Screenshots |
Social Network |
Annotated Reddit conversations |
Social Network |
China Biographical Database |
Social Network |
CMU Enron Email of 150 users |
Social Network |
Dataset from Reddit Place project |
Social Network |
Facebook Data Scrape (2005) |
Social Network |
Facebook Social Connectedness Index |
Social Network |
Facebook Social Networks from LAW (since 2007) |
Social Network |
Foursquare from UMN/Sarwat (2013) |
Social Network |
GitHub Collaboration Archive |
Social Network |
Indie Map |
Social Network |
Mobile Social Networks from UMASS |
Social Network |
Network Twitter Data |
Social Network |
Reddit Comments |
Social Network |
September 2009 – January 2010 Twitter Scrape |
Social Network |
Skytrax’ Air Travel Reviews Dataset |
Social Network |
Snapchat political ads |
Social Network |
Social Twitter Data |
Social Network |
SourceForge.net Research Data |
Social Network |
Twitter Data for Online Reputation Management |
Social Network |
Twitter Dataset of 40+ million tweets related to COVID-19 |
Social Network |
United States Congress Twitter Data |
Social Network |
Word usage on Twitter |
Social Network |
Yahoo! Graph and Social Data |
Social Network |
Bruteforce Database |
Software |
Bug bounties |
Software |
Bug fixes |
Software |
California large systems |
Software |
CCCS-CIC-AndMal-2020 Malware List |
Software |
Code duplicates |
Software |
Cyber wargames |
Software |
Development projects and outcomes |
Software |
GitHub Commit messages |
Software |
GitHub Pull Request review comments |
Software |
Government-sponsored cyberattacks |
Software |
Libraries.io Open Source Repository and Dependency Metadata |
Software |
Newsgroup emails |
Software |
Phishing data |
Software |
Public Git Archive |
Software |
SEC server logs |
Software |
Software development time estimates |
Software |
Source Code Identifiers |
Software |
Spelling auto-corrections |
Software |
StackExchange Data Explorer |
Software |
STEM surveys |
Software |
Traffic and Log Data Captured During a Cyber Defense Exercise |
Software |
US Government popular web pages |
Software |
American Ninja Warrior Obstacles |
Sport |
Baseball ballparks |
Sport |
College football songs |
Sport |
Cricsheet Matches (Cricket) |
Sport |
Equity in Athletics |
Sport |
Ergast Formula 1 |
Sport |
Football stats |
Sport |
Football/Soccer Data Sets |
Sport |
Golfing discs |
Sport |
Lahman’s Baseball Database |
Sport |
NFL play-by-play data |
Sport |
OpenDota |
Sport |
Pinhooker: Thoroughbred Bloodstock Sale Data |
Sport |
Retrosheet Baseball Statistics |
Sport |
Soccer/football play-by-play |
Sport |
Tennis database ATP |
Sport |
Tennis database WTA |
Sport |
US soccer salaries |
Sport |
USA Soccer Teams and Locations |
Sport |
3W dataset |
Time Series |
Hard Drive Failure Rates |
Time Series |
Heart Rate Time Series from MIT |
Time Series |
Time Series Data Library (TSDL) from MU |
Time Series |
Turing Change Point Dataset |
Time Series |
UC Riverside Time Series Dataset |
Time Series |
Aircraft data |
Transport |
Airlines OD Data 1987-2008 |
Transport |
Aeroplane strikes with animals |
Transport |
Bike Share Systems (BSS) |
Transport |
Car makes and specifications |
Transport |
Car model fuel efficiency |
Transport |
Commercial vehicle safety |
Transport |
Dutch Traffic Information |
Transport |
GeoLife GPS Trajectory from Microsoft Research |
Transport |
German train system by Deutsche Bahn |
Transport |
Lasers and aeroplanes |
Transport |
London bike infrastructure |
Transport |
Lyft Level 5 autonomous driving |
Transport |
NYC parking tickets |
Transport |
NYC Taxi Trip Data 2009 |
Transport |
NYC Taxi Trip Data 2013 (FOIA/FOILed) |
Transport |
NYC Uber trip data April 2014 to September 2014 |
Transport |
Open Flight data |
Transport |
Open Traffic collection |
Transport |
OpenFlights – airport airline and route data |
Transport |
Parking meters in NYC |
Transport |
Parking tickets in Chicago |
Transport |
Plane Crash Database since 1920 |
Transport |
Renfe (Spanish National Railway Network) dataset |
Transport |
Toronto Bike Share Stations (JSON and GBFS files) |
Transport |
Transport for London (TFL) |
Transport |
Travel Tracker Survey (TTS) for Chicago |
Transport |
TSA Claims |
Transport |
U.S. Freight Analysis Framework since 2007 |
Transport |
Uber movement data |
Transport |
Urban traffic |
Transport |
US Bureau of Transport Statistics |
Transport |
US public transport |
Transport |
Vehicle Management Library |
Transport |
ACLED (Armed Conflict Location & Event Data Project) |
World |
Authoritarian Ruling Elites Database |
World |
Canadian Legal Information Institute |
World |
Center for Systemic Peace Datasets |
World |
Correlates of War Project |
World |
Coups In The World, 1950-Present |
World |
Cryptome Conspiracy Theory Items |
World |
Europe in translation |
World |
European Social Survey |
World |
FBI Hate Crime 2013 |
World |
FiveThirtyEight |
World |
GDELT Global Events Database |
World |
General Social Survey (GSS) |
World |
German Social Survey |
World |
Global Open Data Index |
World |
Global Religious Futures Project |
World |
Gun Violence Data |
World |
Humanitarian Data Exchange |
World |
Hunter-gatherers |
World |
INI Benchmark |
World |
Institute for Demographic Studies |
World |
International Networks Archive |
World |
International Studies Compendium Project |
World |
James McGuire Cross-National Data |
World |
Kaggle Datasets |
World |
Low-frequency sound reports |
World |
MacroData Guide |
World |
Mass Mobilization Data Project |
World |
Minnesota Population Center |
World |
MIT Reality Mining Dataset |
World |
Notre Dame Global Adaptation Index (ND-GAIN) |
World |
Open Crime and Policing Data in England Wales and Northern Ireland |
World |
OpenSanctions |
World |
Paul Hensel General International Data Page |
World |
PewResearch Society Data Collection |
World |
Population density and locations |
World |
Rebel groups and natural resources |
World |
SIPRI Arms Transfers Database |
World |
Survey of Scottish Witchcraft |
World |
Texas Inmates Executed Since 1984 |
World |
TransNewGuinea language translations |
World |
UCLA Social Sciences Data Archive |
World |
UN Civil Society Database |
World |
UN daily refugee movements |
World |
Universities Worldwide |
World |
UPJOHN for Labor Employment Research |
World |
Uppsala Conflict Data Program |
World |
World Bank Open Data |
World |
World Inequality Database |
World |
WorldPop project |
World |
Waste