FB

Do you want to practice SQL on something other than the basic default database? Here’s a sample ERD and data for a superhero database.

When you’re learning SQL or database design, it’s helpful to use other databases as a reference.

Many articles online refer to Oracle’s HR database, or SQL Server’s AdventureWorks database. These can be helpful, but often you’re looking for answers that these databases don’t help with.

Which is where this post comes in.

This post describes a sample database containing data about superheroes. It includes:

  • An ERD (entity relationship diagram) for the sample superhero database
  • An explanation of the tables and columns
  • A download of sample data to create and populate this database
  • An example query on the database

Why is this helpful? Firstly, you can understand more about how a superhero database might work.

Also, you can practice SQL against realistic data and write your own queries, both simple (which superheroes were created by Marvel) and complex (which superpowers are more prevalent in DC than in Marvel).

So, let’s take a look at the database.

 

Sample Olympic Database: ERD

The ERD for the superhero database is shown here (open in new tab to see a larger version):

It contains about 700 superhero entries from several online sources. Superheroes also have a list of attributes (such as Intelligence and Combat) and a set of superpowers.

However, due to the nature of the source data, not all superheroes have attribute values or superpowers. It’s not ideal, but it’s good practice to deal with incomplete data and working with NULL values.

What do all of these tables and columns mean?

 

Table Explanations

 

Superhero

This is the main list of superheroes. It has their superhero name, full name/real name, a list of IDs linking to other tables, and their height (in centimetres) and weight (in kilograms).

Gender

The Gender table lists three rows: Male, Female, and N/A.

Colour

The Colour table contains references for eye colour, skin colour, and hair colour. There’s a row that indicates “no colour” as well.

Race

This table shows values for the different races of superheroes, such as Human.

Publisher

The Publisher table lists all of the publishers. It includes the two most common: DC and Marvel. It also includes other publishers such as the “Heroes” TV show and Star Trek.

Alignment

This table shows three values that demonstrate how the superhero is aligned: Good, Neutral, and Bad.

Attribute

This table lists six different attributes (e.g. intelligence) that can describe a superhero.

Hero_Attribute

This table is the intersection of attributes and superheroes. A superhero can have multiple attributes, and for each attribute, they have a value from 0-100 indicating their rating. This table contains those ratings (in the attribute_value column).

Superpower

The superpower table lists all of the available abilities or superpowers that someone can have.

Hero_Power

This table lists the IDs of each power that each superhero can have because a superhero can have more than one superpower.

 

Design Comments

There were a few things I had to consider when designing this database:

  • Not all superheroes have attribute values
  • Not all superheroes have superpower values
  • Some superheroes are mentioned multiple times, such as Green Lantern who has had different people take on the role of Green Lantern.
  • Many pieces of information were left out of this data set to keep it simple, such as alias names, birthplaces, and family members.

 

Sample Data

I’ve prepared some sample data for this database. You can use this to create this database on your own computer, explore the tables, and write SQL on it.

This file is set up to run on MySQL.

The file is available here:

sample_data_superheroes.sql (145 KB)

It’s a single SQL file that creates the superhero database, creates the tables, and populates them.

You can run the SQL file itself, or use MySQL Workbench’s Data Import feature to import it.

If you’d like to run this on Oracle or SQL Server, enter your email address below and I’ll send the links to the files.

 

Sample Query

With the sample data in the database, let’s take a look at a sample query. This query shows all of the superheroes and the sum of their attributes, including those with no attributes.

SELECT
s.id,
s.superhero_name,
p.publisher_name,
SUM(ha.attribute_value) AS total_attributes
FROM superhero s
INNER JOIN publisher p ON s.publisher_id = p.id
LEFT JOIN hero_attribute ha ON s.id = ha.hero_id
LEFT JOIN attribute a ON ha.attribute_id = a.id
GROUP BY s.id, s.superhero_name, p.publisher_name
ORDER BY SUM(ha.attribute_value)  DESC;

Results (top 20):

# id superhero_name publisher_name total_attributes
284 General Zod DC Comics 600
541 Power Girl DC Comics 600
441 Man of Miracles Image Comics 600
518 One-Above-All Marvel Comics 600
665 Superboy-Prime DC Comics 600
666 Supergirl DC Comics 600
84 Beyonder Marvel Comics 600
667 Superman DC Comics 600
683 The Presence DC Comics 600
424 Living Tribunal Marvel Comics 600
28 Angela Image Comics 595
348 Hyperion Marvel Comics 595
637 Spectre DC Comics 595
91 Binary Marvel Comics 595
484 Monarch DC Comics 595
82 Ben 10 DC Comics 590
604 Sentry Marvel Comics 590
43 Ardina Marvel Comics 585
474 Mister Mxyzptlk DC Comics 585
622 Silver Surfer Marvel Comics 580

There are all kinds of other queries you can write on this data.

 

Conclusion

So that’s the sample database for superheroes. There’s an ERD you can use to help you understand it or to design your own. You can also download the sample database tables and data to run your own queries on it.

Share via
Copy link
Powered by Social Snap