FB

Sample Database: Olympic Games (ERD and SQL)

When you’re learning SQL or database design, it’s helpful to use other databases as a reference.

Many articles online refer to Oracle’s HR database, or SQL Server’s AdventureWorks database. These can be helpful, but often you’re looking for answers that these databases don’t help with.

Which is where this post comes in.

This post describes a sample database containing data about Olympic events and competitors.

It includes:

  • An ERD (entity relationship diagram) for the sample Olympics database
  • An explanation of the tables and columns
  • A download of sample data to create and populate this database
  • An example query on the database

Why is this helpful? Firstly, you can understand more about how an Olympic competitor database might work.

Also, you can practice SQL against realistic data and write your own queries, both simple (who won medals at the 2004 Olympics) and complex (which competitors won a medal in their first Olympic games).

So, let’s take a look at the database.

 

Sample Olympic Database: ERD

The ERD for the Olympic database is shown here (click to open large version):

Olympics ERD

It contains about 260,000 separate results for competitors at all events since 1896, including both summer and winter Olympics. It has all the results: not only the medals but those who competed and did not win a medal. All events for each sport are included.

What do all of these tables and columns mean?

 

Table Explanations

Sport

This table contains a list of all sports in the Olympics: both summer sports and winter sports.

Event

The event table contains all of the different events for each sport. For example, if a sport is “Alpine Skiing”, then an event is “1KM Men’s Alpine Skiing”. There are multiple events per sport, and events are split into Men, Women, and Mixed.

City

This represents a list of many of the cities in the world.

Games

The games table lists all of the Olympic Games since 1896m, the year they were held, and whether they were Summer or Olympic games.

Looking at this data, I learned that the Summer and Winter Olympics used to be in the same year. I’ve only ever known them to be in alternating years.

Games City

This table shows all of the cities for each Olympic games. This is a joining table between games and city.

Why do we need a joining table? Aren’t the Olympics always in one city?

Well, as I was compiling this data, I got some duplicate rows in my results. I traced this back to a particular set of events for one Olympics. It turns out that in 1956, the Summer Olympics was in Melbourne for most of the events, but due to Australia’s strict horse quarantine rules, they couldn’t bring the horses for the equestrian events. So they performed all of the Equestrian events in Stockholm earlier in 1956.

So, the 1956 Olympics was in both Melbourne and Stockholm.

I didn’t realise this was the case until I looked at the data.

It’s a good lesson: we shouldn’t assume we know how data is structured until we look at the details.

NOC Region

This contains a list of NOC (National Olympic Committee) codes and their names. This translates roughly to a country that competes in the Olympics.

Person

This table lists all people that have competed in an Olympic games. It has their name, gender, height (in CM) and weight (in KG). The height and weight did not differ between Olympic games. If it did, it would have been stored in a different table.

Person Region

This is a joining table that lists all people and the NOC regions (countries) they competed for. This captures the fact that some people competed for more than one country, which is another scenario I didn’t realise until I looked at the data.

Games Competitor

This table is a joining table that relates a person to an Olympic games, which shows who competed what each Olympic games.

Medal

This small table lists the different medals available: Gold, Silver, Bronze, and N/A.

Competitor Event

This table lists the combination of competitors (the people and the games they competed in), the event they competed in, and the medal (if any) they received. This is the largest table.

 

Design Comments

There were a few things I had to consider when designing this database, and some were mentioned earlier:

  • It’s possible for an Olympic games to be held in two different cities (e.g. Melbourne and Stockholm in 1956).
  • A person can compete for two different countries in two different Olympics.
  • A person’s height and weight may change between Olympics in reality, but in the data I had, they did not change.

 

Sample Data

I’ve prepared some sample data for this database. You can use this to create this database on your own computer, explore the tables, and write SQL on it.

The sample data is available for Oracle, SQL Server, MySQL, and Postgres, and is stored on my GitHub repository. Find out how to access it and load the data here: Sample Data for SQL Databases

 

Sample Query

With the sample data in the database, let’s take a look at a sample query. This query shows the people who have won the most medals of any type (Gold, Silver, Bronze) at the Olympic Games.

SELECT
p.id,
p.full_name,
COUNT(*)
FROM person p
INNER JOIN games_competitor gc ON p.id = gc.person_id
INNER JOIN competitor_event ce ON gc.id = ce.competitor_id
INNER JOIN medal m ON ce.medal_id = m.id
WHERE medal_name IN ('Gold', 'Silver', 'Bronze')
GROUP BY p.id, p.full_name
ORDER BY COUNT(*) DESC;

 

Results (top 20):

id full_name COUNT(*)
94406 Michael Fred Phelps, II 28
67046 Larysa Semenivna Latynina (Diriy-) 18
4198 Nikolay Yefimovich Andrianov 15
89187 Takashi Ono 13
109161 Borys Anfiyanovych Shakhlin 13
11951 Ole Einar Bjrndalen 13
74420 Edoardo Mangiarotti 13
85286 Aleksey Yuryevich Nemov 12
70965 Ryan Steven Lochte 12
87390 Paavo Johannes Nurmi 12
119922 Jennifer Elisabeth “Jenny” Thompson (-Cumpelik) 12
23426 Natalie Anne Coughlin (-Hall) 12
57998 Sawao Kato 12
121258 Dara Grace Torres (-Hoffman, -Minas) 12
35550 Birgit Fischer-Schmidt 12
89706 Carl Townsend Osburn 11
11642 Matthew Nicholas “Matt” Biondi 11
18826 Vra slavsk (-Odloilov) 11
21402 Viktor Ivanovych Chukarin 11
113912 Mark Andrew Spitz 11

 

There are all kinds of other queries you can write on this data.

 

Conclusion

So that’s the sample database for the Olympic Games. There’s an ERD you can use to help you understand it or to design your own. You can also download the sample database tables and data to run your own queries on it.

6 thoughts on “Sample Database: Olympic Games (ERD and SQL)”

  1. The following narrative describes a simplified version of the organization of Olympic  facilities planned for the summer Olympics. Draw an EER diagram that shows the entity  types, attributes, relationships, and specializations for this application. State anyassumptions you make. The Olympic facilities are divided into sports complexes. Sports  complexes are divided into one-sport and multisport types. Multisport complexes have areas  of the complex designated for each sport with a location indicator (e.g., center, NE corner,  and so on). A complex has a location, chief organizing individual, total occupied area, and so  on. Each complex holds a series of events (e.g., the track stadium may hold many different  races). For each event there is a planned date, duration, number of participants, number of  officials, and so on. A roster of all officials will be maintained together with the list of  events each official will be involved in. Different equipment is needed for the events (e.g.,  goal posts, poles, parallel bars) as well as for maintenance. The two types of facilities (one￾sport and multisport) will have different types of information. For each type, the number  of facilities needed is kept, together with an approximate budget.schama diagram for this

    1. Hi, I believe I found it using an open data source online. I don’t remember the details. How come? Is there an issue with the data?

      1. Hi,
        Yes please if you can find your open data source back that would be great.
        I’ve been looking for a database with athletes of the olympic game and eventually whether they were caught cheating, without success so far.
        Getting the website wher eyou found this one would be a great lead I suppose.

        Thanks a lot for sharing!

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.