FB

Do you want to practice SQL on something other than the basic default database? Here’s a sample ERD and data for a superhero database.

When you’re learning SQL or database design, it’s helpful to use other databases as a reference.

Many articles online refer to Oracle’s HR database, or SQL Server’s AdventureWorks database. These can be helpful, but often you’re looking for answers that these databases don’t help with.

Which is where this post comes in.

This post describes a sample database containing data about superheroes. It includes:

  • An ERD (entity relationship diagram) for the sample superhero database
  • An explanation of the tables and columns
  • A download of sample data to create and populate this database
  • An example query on the database

Why is this helpful? Firstly, you can understand more about how a superhero database might work.

Also, you can practice SQL against realistic data and write your own queries, both simple (which superheroes were created by Marvel) and complex (which superpowers are more prevalent in DC than in Marvel).

So, let’s take a look at the database.

 

Sample Olympic Database: ERD

The ERD for the superhero database is shown here (open in new tab to see a larger version):

It contains about 700 superhero entries from several online sources. Superheroes also have a list of attributes (such as Intelligence and Combat) and a set of superpowers.

However, due to the nature of the source data, not all superheroes have attribute values or superpowers. It’s not ideal, but it’s good practice to deal with incomplete data and working with NULL values.

What do all of these tables and columns mean?

 

Table Explanations

 

Superhero

This is the main list of superheroes. It has their superhero name, full name/real name, a list of IDs linking to other tables, and their height (in centimetres) and weight (in kilograms).

Gender

The Gender table lists three rows: Male, Female, and N/A.

Colour

The Colour table contains references for eye colour, skin colour, and hair colour. There’s a row that indicates “no colour” as well.

Race

This table shows values for the different races of superheroes, such as Human.

Publisher

The Publisher table lists all of the publishers. It includes the two most common: DC and Marvel. It also includes other publishers such as the “Heroes” TV show and Star Trek.

Alignment

This table shows three values that demonstrate how the superhero is aligned: Good, Neutral, and Bad.

Attribute

This table lists six different attributes (e.g. intelligence) that can describe a superhero.

Hero_Attribute

This table is the intersection of attributes and superheroes. A superhero can have multiple attributes, and for each attribute, they have a value from 0-100 indicating their rating. This table contains those ratings (in the attribute_value column).

Superpower

The superpower table lists all of the available abilities or superpowers that someone can have.

Hero_Power

This table lists the IDs of each power that each superhero can have because a superhero can have more than one superpower.

 

Design Comments

There were a few things I had to consider when designing this database:

  • Not all superheroes have attribute values
  • Not all superheroes have superpower values
  • Some superheroes are mentioned multiple times, such as Green Lantern who has had different people take on the role of Green Lantern.
  • Many pieces of information were left out of this data set to keep it simple, such as alias names, birthplaces, and family members.

 

Sample Data

I’ve prepared some sample data for this database. You can use this to create this database on your own computer, explore the tables, and write SQL on it.

This file is set up to run on MySQL.

The file is available here:

sample_data_superheroes.sql (145 KB)

It’s a single SQL file that creates the superhero database, creates the tables, and populates them.

You can run the SQL file itself, or use MySQL Workbench’s Data Import feature to import it.

If you’d like to run this on Oracle or SQL Server, enter your email address below and I’ll send the links to the files.

Download the Sample Data for this Database for

Oracle or SQL Server

 

Sample Query

With the sample data in the database, let’s take a look at a sample query. This query shows all of the superheroes and the sum of their attributes, including those with no attributes.

SELECT
s.id,
s.superhero_name,
p.publisher_name,
SUM(ha.attribute_value) AS total_attributes
FROM superhero s
INNER JOIN publisher p ON s.publisher_id = p.id
LEFT JOIN hero_attribute ha ON s.id = ha.hero_id
LEFT JOIN attribute a ON ha.attribute_id = a.id
GROUP BY s.id, s.superhero_name, p.publisher_name
ORDER BY SUM(ha.attribute_value)  DESC;

Results (top 20):

# idsuperhero_namepublisher_nametotal_attributes
284General ZodDC Comics600
541Power GirlDC Comics600
441Man of MiraclesImage Comics600
518One-Above-AllMarvel Comics600
665Superboy-PrimeDC Comics600
666SupergirlDC Comics600
84BeyonderMarvel Comics600
667SupermanDC Comics600
683The PresenceDC Comics600
424Living TribunalMarvel Comics600
28AngelaImage Comics595
348HyperionMarvel Comics595
637SpectreDC Comics595
91BinaryMarvel Comics595
484MonarchDC Comics595
82Ben 10DC Comics590
604SentryMarvel Comics590
43ArdinaMarvel Comics585
474Mister MxyzptlkDC Comics585
622Silver SurferMarvel Comics580

There are all kinds of other queries you can write on this data.

 

Conclusion

So that’s the sample database for superheroes. There’s an ERD you can use to help you understand it or to design your own. You can also download the sample database tables and data to run your own queries on it.

Get Your SQL Cheat Sheets Now: