800 Free Data Sets

Do you want to practice your SQL, database, or data analysis skills?

If so, you’ll need some data, or a data set, to work on.

On this page, you can find a list of several hundred data sets you can use.


The table below contains about 800 free data sets on a range of topics, compiled from a variety of sources.

To use them:

  • Click the name to visit the website mentioned
  • Download the files (the process is different for each one)
  • Load them into a database
  • Practice your queries!
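If you'd like to see the last two steps in action, here is a minimal sketch using Python's built-in `csv` and `sqlite3` modules. It assumes the file you downloaded is a CSV with a header row. The sample rows, the `penguins` table name, and the `load_csv` helper are all made up for illustration, so swap in your own file and names.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded CSV file.
# With a real download, use open("your_file.csv", newline="") instead.
SAMPLE_CSV = """species,island,body_mass_g
Adelie,Torgersen,3750
Adelie,Biscoe,3800
Gentoo,Biscoe,5000
Chinstrap,Dream,3500
"""


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # Columns are created without types, so every value is stored as text.
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
load_csv(conn, "penguins", io.StringIO(SAMPLE_CSV))

# Practice query: row count and average body mass per island.
rows = conn.execute(
    "SELECT island, COUNT(*) AS n, AVG(CAST(body_mass_g AS REAL)) AS avg_mass "
    "FROM penguins GROUP BY island ORDER BY island"
).fetchall()
for row in rows:
    print(row)
```

Because the sketch stores everything as text, the query casts `body_mass_g` to a number before averaging. For a real data set you would probably want to declare proper column types, or use your database's own import tool.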

Some of the sites below offer a single data set, while others host a collection of data sets (e.g. government websites).

Some of them may require registration, but they should all be free. If you notice that any are not free or no longer work, or if you have other data sets to suggest, let me know in the comments below.

Also, if you want to see more data sets, check out the listings on these sites:

Let’s see these data sets!


Free Data Sets

Name and URL Category
1000 Genomes Biology
American Gut (Microbiome Project) Biology
Animal species occurrence Biology
Bird invasions Biology
Bird-building collisions Biology
Broad Bioimage Benchmark Collection (BBBC) Biology
Cell Image Library Biology
Chimp personalities Biology
Complete Genomics Public Data Biology
Coremine Medical Biology
Declared Dangerous Dogs Biology
EBI ArrayExpress – ArrayExpress Archive of Functional Genomics Data Biology
EBI Protein Data Bank in Europe Biology
Electron Microscopy Pilot Image Archive (EMPIAR) Biology
ENCODE project (Encyclopedia of DNA Elements) Biology
Ensembl Genomes Biology
Equine deaths in New York Biology
Gene Expression Omnibus (GEO) Biology
Gene Ontology (GO) Biology
Global Biotic Interactions (GloBI) Biology
Harvard Medical School (HMS) LINCS Project Biology
Human Genome Diversity Project – Stanford Biology
Human Microbiome Project (HMP) Biology
ICOS PSP Benchmark Biology
Imported bats Biology
Invasive species Biology
Journal of Cell Biology DataViewer Biology
KEGG Biology
Mammal data Biology
MIT Cancer Genomics Data Biology
National Centers for Environmental Information Biology
National Oceanic and Atmospheric Administration Fisheries Biology
NCBI Proteins Biology
NCBI Taxonomy Biology
NCI Genomic Data Commons Biology
OpenSNP genotypes data Biology
Palmer Penguins Biology
Pathguide Protein Interactions Catalog Biology
Penguins Biology
Protein Data Bank Biology
Psychiatric Genomics Consortium Biology
PubChem Project Biology
Rfam Biology
Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) Biology
Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) Biology
Sequence Read Archive (SRA) Biology
Shrinking salmon Biology
Stowers Institute Original Data Repository Biology
Systems Science of Biological Dynamics (SSBD) Database Biology
The Cancer Genome Atlas (TCGA) Biology
The Catalogue of Life Biology
The Personal Genome Project Biology
UCSC Public Data Biology
UniGene Biology
Universal Protein Resource (UniProt) Biology
Zoo animal lifespans Biology
Actuaries Climate Index Climate and Weather
Australian Weather Climate and Weather
Aviation Weather Center Climate and Weather
Canadian Meteorological Centre Climate and Weather
Charting The Global Climate Change News Narrative 2009-2020 Climate and Weather
Climate Data from UEA Climate and Weather
Dutch Weather Climate and Weather
European Climate Assessment & Dataset Climate and Weather
German Climate Data Center Climate and Weather
Global Climate Data Since 1929 Climate and Weather
NASA Global Imagery Browse Services Climate and Weather
NOAA Bering Sea Climate Climate and Weather
NOAA Climate Datasets Climate and Weather
NOAA Real-time Weather Models Climate and Weather
NOAA SURFRAD Meteorology and Radiation Datasets Climate and Weather
UEA Climatic Research Unit Climate and Weather
Washington Post Climate Change Climate and Weather
WorldClim – Global Climate Data Climate and Weather
WU Historical Weather Worldwide Climate and Weather
AMiner Citation Network Dataset Complex Networks
Community Resource for Archiving Wireless Data Complex Networks
CrossRef DOI URLs Complex Networks
DBLP Citation dataset Complex Networks
DIMACS Road Networks Collection Complex Networks
NBER Patent Citations Complex Networks
Network Repository Complex Networks
Small Network Data Complex Networks
Stanford GraphBase Complex Networks
Stanford Large Network Dataset Collection Complex Networks
The Laboratory for Web Algorithmics (UNIMI) Complex Networks
UCI Network Data Repository Complex Networks
UFL sparse matrix collection Complex Networks
3.5B Web Pages from CommonCrawl 2012 Computer Networks
53.5B Web clicks of 100K users in Indiana Univ. Computer Networks
CAIDA Internet Datasets Computer Networks
ClueWeb09 – 1B web pages Computer Networks
ClueWeb12 – 733M web pages Computer Networks
CommonCrawl Web Data Computer Networks
Internet-Wide Scan Data Repository Computer Networks
OONI: Open Observatory of Network Interference Computer Networks
Rapid7 Sonar Internet Scans Computer Networks
The Peer-to-Peer Trace Archive Computer Networks
UCSD Network Telescope IPv4 /8 net Computer Networks
38-Cloud (Cloud Detection) Earth
African agricultural survey Earth
Air quality Earth
Alabama Real-Time Coastal Observing System Earth
AQUASTAT Global water resources and uses Earth
BODC Marine Data Earth
CERN Open Data Portal Earth
Cloud data Earth
Complete Plants Checklist (US Department of Agriculture) Earth
Crystallography Open Database Earth
EOSDIS – NASA’s earth observing system data Earth
Global dataset of historical crop yields Earth
Global Wind Atlas Earth
Grapes Earth
Harvesting statistics Earth
Historical crop yields Earth
Hyperspectral benchmark dataset on soil moisture Earth
IceCube – South Pole Neutrino Observatory Earth
Integrated Marine Observing System (IMOS) Earth
Lemons quality control dataset Earth
LIGO Open Science Center (LOSC) Earth
Marinexplore – Open Oceanographic Data Earth
NASA Earth Science Earth
NASA Exoplanet Archive Earth
NASA Space Earth
National Estuarine Research Reserves System-Wide Monitoring Program Earth
NSSDC (NASA) data of 550 spacecraft Earth
Oil and Gas Authority Open Data Earth
Optimized Soil Adjusted Vegetation Index Earth
Sloan Digital Sky Survey (SDSS) – Mapping the Universe Earth
Smithsonian Institution Global Volcano and Eruption Database Earth
US forest tree distributions Earth
USGS Earthquake Archives Earth
World glaciers Earth
Academic parental leave policies Economics
Aggregated smartphone movements Economics
Alcohol content in products Economics
American Economic Association (AEA) Economics
Building footprints Economics
Construction spending Economics
Corporate misconduct Economics
Corporate risk conversations Economics
DBnomics – the world’s economic database Economics
Elevators in NYC Economics
Euro-bank speeches Economics
FCC telemarketer and robocall complaints Economics
Florida billboards Economics
Global economic forecasts Economics
Global inequality Economics
Historical empires Economics
Historical MacroEconomic Statistics Economics
Home Price indexes Economics
Hotel bookings Economics
Housework Economics
Household surveys Economics
How we spend our time Economics
Humans using computers Economics
Internet Product Code Database Economics
Iowa liquor purchases Economics
Job data and numbers Economics
Joint External Debt Data Hub Economics
Jon Haveman International Trade Data Links Economics
Late-medieval English immigrants Economics
London grocery purchases Economics
Long-Term Productivity Database Economics
Maternity leave policies for US companies Economics
Multinational corporations Economics
NEH grants Economics
New York real estate brokers Economics
Nike factories Economics
Old British lighthouses Economics
OpenCorporates Database of Companies in the World Economics
Our World in Data Economics
Patents and trademarks Economics
Population estimates Economics
Public-sector employment Economics
Religion in America Economics
Restaurant menus Economics
Roman empire movements Economics
Rooftop water tanks Economics
Science grants Economics
Sidewalk grates Economics
Space dollars Economics
The Atlas of Economic Complexity Economics
The Big Mac Index Economics
The Center for International Data Economics
The early Islamic world Economics
UK companies Economics
UN Commodity Trade Statistics Economics
UN Human Development Reports Economics
US consumer information Economics
US house construction Economics
US liquor licenses Economics
US manufacturing Economics
US petroleum supply and exports Economics
US power consumption and generation Economics
US retail data Economics
USA Trade, Imports, Exports Economics
War debt Economics
Welsh shipping crews Economics
Bad words Education
College Scorecard Data Education
Education data, unified Education
Encyclopaedia Britannica Education
New York State Education Department Data Education
Pages of books Education
Poems by kids Education
Seattle library checkouts Education
Student Data from Free Code Camp Education
Wikipedia revisions for editors Education
AMPds – The Almanac of Minutely Power dataset Energy
BLUEd – Building-Level fully labelled Electricity Disaggregation dataset Energy
DBFC – Direct Borohydride Fuel Cell (DBFC) Dataset Energy
DEL – Domestic Electrical Load study datasets for South Africa (1994 – 2014) Energy
ECO Energy
EIA Energy
Electricity prices Energy
Electricity utilities Energy
Emissions indicators Energy
Global Power Plant Database Energy
HFED Energy
Home energy consumption Energy
iAWE Energy
PEM1 – Proton Exchange Membrane (PEM) Fuel Cell Dataset Energy
Power outages Energy
REDD Energy
Smart Meter Data Portal Energy
State-owned oil companies Energy
SYND Energy
The Public Utility Data Liberation Project (PUDL) Energy
Tracebase Energy
UK-DALE – UK Domestic Appliance-Level Electricity Energy
Ukraine Energy Centre Datasets Energy
US electricity prices Energy
A decade of TV news words Entertainment
Anthony Bourdain’s travels Entertainment
BFI film industry statistics Entertainment
Billboard music hits and lyrics Entertainment
Boy bands Entertainment
Breaking Bad data Entertainment
Classical music data Entertainment
Crossword data Entertainment
Drama Entertainment
Friends TV show analysis Entertainment
Hunger Games survival Entertainment
Indian movie theatres Entertainment
Movie dialog Entertainment
Movie scripts and genders Entertainment
Music artists Entertainment
NYC film and TV permits Entertainment
Pinball Entertainment
Ramen ratings Entertainment
Rotten Tomatoes reviews Entertainment
Star Trek data Entertainment
Studio Ghibli data Entertainment
Talk radio transcripts Entertainment
Tarantino movie data Entertainment
Billionaire list Finance
BIS Statistics Finance
Blockmodo Coin Registry Finance
CBOE Futures Exchange Finance
Federal Reserve forecasts Finance
Financial access Finance
Financial crisis events Finance
Global debt Finance
Global finance history Finance
Kansas City cars at auctions Finance
Low-Income College Graduation Rates Finance
St Louis Federal Finance
Taxation data Finance
UK post-graduation earnings Finance
Antarctic icebergs Geographical
Antarctic infrastructure Geographical
ArcGIS Open Data portal Geographical
Awesome 3D Semantic City Models Geographical
British Isle bumps Geographical
Cambridge MA US GIS data on GitHub Geographical
Countries, States, subdivisions, provinces Geographical
Country Typology Codes Geographical
Deepwater Horizon’s effects Geographical
Every place name in the United States Geographical
Forest cover Geographical
Geo Maps Geographical
Geo Wiki Project Geographical
GeoFabrik Geographical
GeoNames Worldwide Geographical
Global Administrative Areas Database (GADM) Geographical
Homeland Infrastructure Foundation-Level Data Geographical
IEEE Geoscience and Remote Sensing Society DASE Website Geographical
Landsat 8 on AWS Geographical
List of all countries in all languages Geographical
National Park data Geographical
National park trails Geographical
National Weather Service GIS Data Portal Geographical
Natural Earth – vectors and rasters of the world Geographical
Nighttime brightness in Niger and Nigeria Geographical
North America ecoregions Geographical
Offshore drilling Geographical
OpenAddresses Geographical
OpenStreetMap (OSM) Geographical
Palm trees Geographical
Patent geography Geographical
Permafrost thickness Geographical
Pleiades – Gazetteer and graph of ancient places Geographical
Reverse Geocoder using OSM data Geographical
Robin Wilson Free GIS Datasets Geographical
San Francisco Bay water Geographical
The height of the frozen world Geographical
TIGER/Line U.S. boundaries and roads Geographical
Tropical cyclone simulations Geographical
TwoFishes – Foursquare’s coarse geocoder Geographical
TZ Timezones shapefile Geographical
UK noise pollution Geographical
UN Environmental Data Geographical
US beach data Geographical
US national park visitors Geographical
World boundaries from the U.S. Department of State Geographical
World countries in multiple formats Geographical
Affordable Care Act data Government
Albuquerque, New Mexico Government
Anti-press incidents Government
Antwerp Belgium Government
Austin, Texas, USA Government
Australia Government
Austria Government
Beersheba, Israel Government
Belgium Government
Brazil Government
Buenos Aires, Argentina Government
Calgary, Alberta, Canada Government
Cambridge, Massachusetts, US Government
Canada Government
Chicago Government
Chicago prosecutions Government
Chile Government
City of Berkeley Government
Congressional district demographics Government
Congressional whip counts Government
Constitution proposed amendments Government
Crime in cities Government
Dallas Government
DataBC (British Columbia) Government
Datos Argentina Government
Debt to the Penny Government
Denver, USA Government
Dual citizenship policies Government
Durham, North Carolina, USA Government
Edmonton, Alberta, Canada Government
England Government
EU laws Government
European electoral polling Government
European opinion polls Government
EuroStat Government
EveryPolitician Government
Exonerations in the USA Government
Federal Committee on Statistical Methodology (FCSM) Government
Finland Government
Foreign ministers Government
France Government
Freedom of Information Act reports Government
Germany Government
Ghent Belgium Government
Glasgow Scotland UK Government
Greece Government
Halifax NS Canada Government
Helsinki Region Finland Government
Hong Kong Government
Hyperlocal Biden/Trump results Government
Indian Government Data Government
Indonesian Data Portal Government
International labor treaties Government
Iowa Government
Ireland Government
Israel Government
Istanbul Municipality Government
Italy Government
Jail deaths in America Government
Japan Government
Laval QC Canada Government
Lexington KY Government
Licensed Firearms Dealers Government
London UK Government
Los Angeles Government
Luxembourg Government
Mass shootings in America Government
MassGIS Massachusetts USA Government
Metropolitan Transportation Commission (MTC) California US Government
Mexico Government
Mississauga ON Canada Government
Moldova Government
Moncton NB Canada Government
Montreal QC Canada Government
Mother Jones – US mass shootings Government
Mountain View California US (GIS) Government
Netherlands Government
New York Department of Sanitation Monthly Tonnage Government
New York personalised license plates Government
New Zealand Government
NYC betanyc Government
NYC Open Data Government
Oakland California US Government
OECD Government
Oklahoma Government
Open Data for Africa Government
OpenDataPhilly Government
OpenDataSoft Government
Oregon Government
Ottawa ON Canada Government
Palo Alto California US Government
Peace agreements Government
Police Open Data Census Government
Political conditions Government
Political resistance campaigns Government
Portland Oregon Government
Portugal Government
Presidential popularity Government
Privacy policies Government
Public health policy Government
Public pension plans Government
Puerto Rico Government Government
Quebec Province of Canada Government
R&D spending Government
Regina SK Canada Government
Retiree language preference Government
Rio de Janeiro Brazil Government
Romania Government
Ruling elites Government
Russia Government
San Antonio TX Government
San Diego CA Government
San Francisco Data sets Government
San Jose California US Government
San Mateo County California US Government
Saskatchewan Province of Canada Government
School shootings Government
SCOTUS confirmation transcripts Government
Seattle Government
Shot Spotter (gunshot detections) data Government
Singapore Government
Social scientists testifying Government
South Africa Government
State Department daily travel allowance Government
State of Utah US Government
Supreme Court v Congress Government
Switzerland Government
Taiwan Government
Terrorism prosecutions Government
Testing in schools Government
Texas Government
Texas licenses Government
The World Bank Government
Trade agreements Government
UK 2011 Census Open Atlas Project Government
UK fire stats Government
UK general elections Government
UK Government Data Government
UK rules data Government
UK wine cellar Government
Ukraine Government
UN security council debates Government
UN vote history Government
United Nations Government
Uruguay Government
US Cabinet turnover Government
US citizens’ deaths overseas Government
US Counties Government
US county-level and precinct-level results Government
US domestic radicalisation Government
US Executive orders Government
US government court payouts Government
US government paperwork Government
US House of Representatives gift travel Government
US local justice data Government
US marriage, divorce, pregnancy, and infertility Government
US national opinions Government
US regional Medicare usage Government
US state governor addresses Government
US state relations Government
US state-level election results Government
US Supreme Court transcripts Government
US veterans Government
USA American Community Survey Government
USA CDC Public Health datasets Government
USA Census Bureau Government
USA Congressional Research Service (CRS) Reports Government
USA Department of Housing and Urban Development (HUD) Government
USA Federal Government Agencies Government
USA Federal Government Data Catalog Government
USA Food and Drug Administration (FDA) Government
USA National Center for Education Statistics (NCES) Government
USA Open Government Government
USA Patent and Trademark Office (USPTO) Bulk Data Products Government
Valley Transportation Authority (VTA) California US Government
Victoria BC Canada Government
Vienna Austria Government
Voter attitudes and choices Government
Voting equipment Government
Windy City murals Government
2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE Health
Allen Institute Datasets Health
Animals with SARS-CoV-2 Health
Britain’s diet Health
CDC behavioural risk Health
Collaborative Research in Computational Neuroscience (CRCNS) Health
Composition of Foods Raw Processed Prepared USDA National Nutrient Database for Standard Health
Consumer-product chemicals Health
Cooks and kitchen tasks Health
Coronavirus (Covid-19) Data in the United States Health
Coronavirus research papers Health
COVID-19 Case Surveillance Public Use Data Health
COVID-19 cases Health
COVID-19 in prisons Health
COVID-19 Reported Patient Impact and Hospital Capacity by Facility Health
COVID-related behaviour Health
Deaths on the job Health
Dog bites Health
Drug-free school zones in Tennessee Health
Drugs and side effects Health
Early Onset Prostate Cancer Germany Health
EHDP Large Health Data Sets Health
Fatal car crashes Health
Food Outbreak Online Database Health
Gapminder World demographic databases Health
GDC Health
GENIE – Data from the Genomics Evidence Neoplasia Information Exchange Health
Genomic Hallmarks Prostate Adenocarcinoma CPC GENE Health
Health data breaches Health
Healthcare service in Africa Health
Hospitals Health
Human Connectome Project Health
ICGC Cancer Datasets Health
Informatics for Integrating Biology & the Bedside Health
MeDAL Health
Medicare Coverage Database (MCD) U.S. Health
Medicare Data Engine of medicare.gov Data Health
Medicare drug prices Health
NaF-Prostate Health
NDAR Health
NeuroData Health
Neuroelectro Health
Neuroendocrine Prostate Cancer Health
NeuroMorpho – NeuroMorpho.Org is a centrally curated inventory of Health
Number of Ebola Cases and Deaths in Affected Countries (2014) Health
OASIS Health
Open-ODS (structure of the UK NHS) Health
OpenNEURO Health
OpenPaymentsData Healthcare Health
Pandemic travel restrictions Health
PhysioBank Databases Health
PLCO Datasets Health
School vaccination rates Health
Severe workplace injuries Health
Social assistance programs Health
Study Forrest Health
Subnational COVID-19 case counts Health
The Cancer Genome Atlas project (TCGA) Health
The Cancer Imaging Archive (TCIA) Health
The COVID Tracking Project Health
Two decades of tobacco (and e-cigarette) laws Health
US dairy information Health
US emergency room visits Health
US vaccination rates in adults Health
US workplace safety Health
World Health Organization Global Health Observatory Health
Yahoo Knowledge Graph COVID-19 Datasets Health
Zika virus data Health
Canada Science and Technology Museums Corporation’s Open Data Museums
Cooper-Hewitt’s Collection Database Museums
Metropolitan Museum of Art Collection API Museums
Minneapolis Institute of Arts Museums
Museum of Modern Art collection Museums
Natural History Museum (London) Data Portal Museums
Rijksmuseum Historical Art Collection Museums
Tate Collection metadata Museums
The Getty vocabularies Museums
Automatic Keyphrase Extraction Natural Language
Blizzard Challenge Speech Natural Language
DBpedia Natural Language
Dirty Words Natural Language
Flickr Personal Taxonomies Natural Language
German Political Speeches Corpus Natural Language
Google Books Ngrams (2.2TB) Natural Language
Google MC-AFP Natural Language
Google Web 5gram (1TB 2006) Natural Language
Hansards text chunks of Canadian Parliament Natural Language
LJ Speech Natural Language
Machine Comprehension Test (MCTest) of text from Microsoft Research Natural Language
Machine Translation of European languages Natural Language
Making Sense of Microposts 2016 Natural Language
Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) Natural Language
Multi-Domain Sentiment Dataset (version 2.0) Natural Language
Noisy speech database for training speech enhancement algorithms and TTS Natural Language
Open Multilingual Wordnet Natural Language
SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic 30K articles) Natural Language
SMS Spam Collection in English Natural Language
Stanford Question Answering Dataset (SQuAD) Natural Language
The Big Bad NLP Database Natural Language
Universal Dependencies Natural Language
USENET postings corpus of 2005~2011 Natural Language
Webhose – News/Blogs in multiple languages Natural Language
Wikidata – Wikipedia databases Natural Language
A range of miscellaneous data Other
Adult height over time Other
Advice sought Other
Broadband access in the US Other
Catholic women Other
Confidence Other
Dating survey Other
Emotion and word associations Other
Graphic Design data Other
Happy moments Other
Jeans pockets Other
Joke database Other
Makeup shades Other
Photographer biographies Other
Random card choices Other
Academic Torrents Public
Amazon Public
Archive IT Public
Archive.org Datasets Public
CMU JASA data archive Public
CMU StatLab collections Public
Data.World Public
Datahub.io Public
Domains Project Public
Google Public
Grand Comics Database Public
Harvard Dataverse Network of scientific data Public
Microsoft Research Open Data Public
Open Data Certificates (beta) Public
Open Library Data Dumps Public
OpenDataNetwork Public
Reddit Datasets Public
Sample R data sets Public
Statista Public
The Washington Post List Public
UCLA SOCR data collection Public
UFO Reports Public
Wikileaks 911 pager intercepts Public
Yahoo Webscope Public
Zenodo Public
2021 Portuguese Elections Twitter Dataset Social Network
43k+ Donald Trump Twitter Screenshots Social Network
Annotated Reddit conversations Social Network
China Biographical Database Social Network
CMU Enron Email of 150 users Social Network
Dataset from Reddit Place project Social Network
Facebook Data Scrape (2005) Social Network
Facebook Social Connectedness Index Social Network
Facebook Social Networks from LAW (since 2007) Social Network
Foursquare from UMN/Sarwat (2013) Social Network
GitHub Collaboration Archive Social Network
Indie Map Social Network
Mobile Social Networks from UMASS Social Network
Network Twitter Data Social Network
Reddit Comments Social Network
September 2009 – January 2010 Twitter Scrape Social Network
Skytrax’ Air Travel Reviews Dataset Social Network
Snapchat political ads Social Network
Social Twitter Data Social Network
SourceForge.net Research Data Social Network
Twitter Data for Online Reputation Management Social Network
Twitter Dataset of 40+ million tweets related to COVID-19 Social Network
United States Congress Twitter Data Social Network
Word usage on Twitter Social Network
Yahoo! Graph and Social Data Social Network
Bruteforce Database Software
Bug bounties Software
Bug fixes Software
California large systems Software
CCCS-CIC-AndMal-2020 Malware List Software
Code duplicates Software
Cyber wargames Software
Development projects and outcomes Software
GitHub Commit messages Software
GitHub Pull Request review comments Software
Government-sponsored cyberattacks Software
Libraries.io Open Source Repository and Dependency Metadata Software
Newsgroup emails Software
Phishing data Software
Public Git Archive Software
SEC server logs Software
Software development time estimates Software
Source Code Identifiers Software
Spelling auto-corrections Software
StackExchange Data Explorer Software
STEM surveys Software
Traffic and Log Data Captured During a Cyber Defense Exercise Software
US Government popular web pages Software
American Ninja Warrior Obstacles Sport
Baseball ballparks Sport
College football songs Sport
Cricsheet Matches (Cricket) Sport
Equity in Athletics Sport
Ergast Formula 1 Sport
Football stats Sport
Football/Soccer Data Sets Sport
Golfing discs Sport
Lahman’s Baseball Database Sport
NFL play-by-play data Sport
OpenDota Sport
Pinhooker: Thoroughbred Bloodstock Sale Data Sport
Retrosheet Baseball Statistics Sport
Soccer/football play-by-play Sport
Tennis database ATP Sport
Tennis database WTA Sport
US soccer salaries Sport
USA Soccer Teams and Locations Sport
3W dataset Time Series
Hard Drive Failure Rates Time Series
Heart Rate Time Series from MIT Time Series
Time Series Data Library (TSDL) from MU Time Series
Turing Change Point Dataset Time Series
UC Riverside Time Series Dataset Time Series
Aircraft data Transport
Airlines OD Data 1987-2008 Transport
Aeroplane strikes with animals Transport
Bike Share Systems (BSS) Transport
Car makes and specifications Transport
Car model fuel efficiency Transport
Commercial vehicle safety Transport
Dutch Traffic Information Transport
GeoLife GPS Trajectory from Microsoft Research Transport
German train system by Deutsche Bahn Transport
Lasers and aeroplanes Transport
London bike infrastructure Transport
Lyft Level 5 autonomous driving Transport
NYC parking tickets Transport
NYC Taxi Trip Data 2009 Transport
NYC Taxi Trip Data 2013 (FOIA/FOILed) Transport
NYC Uber trip data April 2014 to September 2014 Transport
Open Flight data Transport
Open Traffic collection Transport
OpenFlights – airport airline and route data Transport
Parking meters in NYC Transport
Parking tickets in Chicago Transport
Plane Crash Database since 1920 Transport
Renfe (Spanish National Railway Network) dataset Transport
Toronto Bike Share Stations (JSON and GBFS files) Transport
Transport for London (TFL) Transport
Travel Tracker Survey (TTS) for Chicago Transport
TSA Claims Transport
U.S. Freight Analysis Framework since 2007 Transport
Uber movement data Transport
Urban traffic Transport
US Bureau of Transport Statistics Transport
US public transport Transport
Vehicle Management Library Transport
ACLED (Armed Conflict Location & Event Data Project) World
Authoritarian Ruling Elites Database World
Canadian Legal Information Institute World
Center for Systemic Peace Datasets World
Correlates of War Project World
Coups In The World, 1950-Present World
Cryptome Conspiracy Theory Items World
Europe in translation World
European Social Survey World
FBI Hate Crime 2013 World
FiveThirtyEight World
GDELT Global Events Database World
General Social Survey (GSS) World
German Social Survey World
Global Open Data Index World
Global Religious Futures Project World
Gun Violence Data World
Humanitarian Data Exchange World
Hunter-gatherers World
INI Benchmark World
Institute for Demographic Studies World
International Networks Archive World
International Studies Compendium Project World
James McGuire Cross-National Data World
Kaggle Datasets World
Low-frequency sound reports World
MacroData Guide World
Mass Mobilization Data Project World
Minnesota Population Center World
MIT Reality Mining Dataset World
Notre Dame Global Adaptation Index (ND-GAIN) World
Open Crime and Policing Data in England Wales and Northern Ireland World
OpenSanctions World
Paul Hensel General International Data Page World
PewResearch Society Data Collection World
Population density and locations World
Rebel groups and natural resources World
SIPRI Arms Transfers Database World
Survey of Scottish Witchcraft World
Texas Inmates Executed Since 1984 World
TransNewGuinea language translations World
UCLA Social Sciences Data Archive World
UN Civil Society Database World
UN daily refugee movements World
Universities Worldwide World
UPJOHN for Labor Employment Research World
Uppsala Conflict Data Program World
World Bank Open Data World
World Inequality Database World
WorldPop project World



I hope you’ve found the table above useful. Let me know in the comments below:

  • if you know of any data sets to add to the list
  • if you have any suggestions that would make this list more useful
  • if a data set on this list is not free

