SQL Subqueries: The Complete Guide

Oct 22, 2018 · 15 min read · sql ·

Subqueries are powerful features of SQL.

If you want to move beyond the basics of SQL, then subqueries are something you need to know about.

This article will show you everything you need to know about subqueries in SQL.

What is a Subquery?

A subquery is a SELECT query inside another query. It’s also called a nested query in SQL. A subquery is often used inside SELECT queries but can also be used in other types of queries.

The concept of a query inside a query might seem simple, but it can be a hard concept to truly understand. I’ll cover everything you need to know about subqueries in this article.

Let’s say you had a simple query like this:

1SELECT id, last_name
2FROM employee;

This is a standard SELECT query that shows you some data from the employee table.

Now, let’s say you had a query like this:

1SELECT id, last_name
2FROM employee
3WHERE salary > (
4  SELECT salary
5  FROM some_other_table
6);

There are a few extra features in this query that you might not be used to. But the main thing to point out is that there is a subquery in this query. The subquery is:

1SELECT salary
2FROM some_other_table

That’s the subquery because it’s a SELECT query inside another query. We’ll have some more examples later in this article, but this just demonstrates the concept.

Why Use SQL Subqueries?

The main reason to use a subquery is to improve the maintainability and performance of your query and application.

I’ll demonstrate this with an example.

Let’s say you had to display a report in your application that showed all employees that had an above average salary. To find these records, you would need to do two things:

Find the average salary of all employees
List all employees whose salary is larger than that average

Our employee table looks like this:

ID	LAST_NAME	SALARY
1	SMITH	40000
2	ANDERSON	60000
3	JONES	45000
4	LANDY	82000
5	CHARLESTON	21000
6	JOHNSON	51000
7	BROWN	93000
8	HARDEN	29000
9	CONNOR	52000
10	PIERCE	39000

First, let’s find the average salary of all employees, using an aggregate function:

1SELECT AVG(salary)
2FROM employee;

Assuming you have this table structure and data set up (which you can get as part of this article), you should get this result:

AVG(SALARY)
51200

The average salary is 51,200. Now, we can use this value to find all of the employees whose salary is larger than this average of 51,200.

1SELECT id, last_name, salary
2FROM employee
3WHERE salary > 51200;

The results of this query are:

ID	LAST_NAME	SALARY
2	ANDERSON	60000
4	LANDY	82000
7	BROWN	93000
9	CONNOR	52000

This gives us the result we want. However, this is done in two steps.

Using a subquery will let you do this in one step. That means you won’t need to adjust the query each time a new employee is entered to update the average salary. It will run correctly every time.

We’ll see how to do this with a subquery in the next section.

You can also watch my YouTube video on when to use subqueries here:

Subqueries in a WHERE Clause

One place where you can use subqueries is in the WHERE clause. It’s probably the most common place to use a subquery that I’ve seen, both in online examples and in code that I’ve written.

Using a subquery in a WHERE clause means that we want to use the results of a query as a WHERE clause for another query.

Example: Find Employees with Above-Average Salary

An example of this is finding all employees with a salary greater than the average employee, which we saw how to do earlier in two separate steps.

We can do this in a single step using a subquery. Our query would look like this:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary > (
4  SELECT AVG(salary)
5  FROM employee
6);

Let’s break this query down:

Line 1: this shows the SELECT clause. These are the columns that are displayed when the query is run: id, last_name, and salary.
Line 2: this is the FROM clause. We are selecting data from the employee table.
Line 3: This is the WHERE clause. We want to see records where the salary is greater than something. We open a bracket here, which is closed later in the query.
Line 4: This is another SELECT clause, which selects the AVG of the salary column. This is the start of the subquery - a query within a query.
Line 5: Another FROM clause, which is related to the subquery. It’s also the employee table.
Line 6: We close the brackets for the subquery, and end with a semicolon.

In this query, the bold part is the subquery, or inner query. The unbolded part is the outer query. The subquery is also indented, to make it easier to ready and easier to identify in your script that it’s a subquery.

1SELECT id, last_name, salary
2FROM employee
3WHERE salary > (
4  SELECT AVG(salary)
5  FROM employee
6);

If you run this query, you’ll get this result:

ID	LAST_NAME	SALARY
2	ANDERSON	60000
4	LANDY	82000
7	BROWN	93000
9	CONNOR	52000

It’s the same result as running it in separate queries, which we saw earlier. But it’s run in a single query, which means it’s easier for you to run and will properly cater to any changes in the data.

Example: Find Employee with the Highest Salary

Another way we can use a subquery is to find a record that exactly matches the result of a subquery. An example of this would be finding the employee that has the highest salary, which is a common type of query no matter what data you have (find the details of the row with the highest value).

To do this using a subquery, it would look like this:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary = (
4  SELECT MAX(salary)
5  FROM employee
6);

This is similar to the earlier example. However, there are some differences.

The subquery finds the maximum salary from the employee table: SELECT MAX(salary).

Also, the WHERE clause in the outer query says WHERE salary =. This means that the salary of the employee needs to be equal to the maximum, which is from the subquery.

If you run this query, you’ll get this result:

ID	LAST_NAME	SALARY
7	BROWN	93000

This shows Brown with the highest salary. You can check this by running a query using ORDER BY:

1SELECT id, last_name, salary
2FROM employee
3ORDER BY salary DESC;

ID	LAST_NAME	SALARY
7	BROWN	93000
4	LANDY	82000
2	ANDERSON	60000
9	CONNOR	52000
6	JOHNSON	51000
3	JONES	45000
1	SMITH	40000
10	PIERCE	39000
8	HARDEN	29000
5	CHARLESTON	21000

This shows that Brown does have the highest salary.

Example: Multiple Values

The previous example worked because the subquery only returned a single value:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary = (
4  SELECT MAX(salary)
5  FROM employee
6);

When you specify “WHERE salary =”, the value you’re comparing the salary to needs to be a single value. This means the subquery needs to return a single value. Luckily, using MAX or AVG will ensure a single value is returned. However, what happens when your subquery does not return a single value?

Look at this example:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary = (
4  SELECT salary
5  FROM employee
6  WHERE last_name LIKE 'C%'
7);

This will look for all salaries that are the same as those employees whose last_names start with “C”. If you run this query, you’ll get this error:

ORA-01427: single-row subquery returns more than one row

This means that your subquery has returned more than one row. The subquery can be run by itself, which shows these results:

1SELECT salary
2FROM employee
3WHERE last_name LIKE 'C%';

SALARY
21000
52000

If these values were used in your query instead of the subquery, it would look like this:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary = (21000, 52000);

You can't have a salary that is equal to two values. One of the rules of using a subquery is that if you use an equals sign, you must return a single value.

That's why you get the ORA-01427: single-row subquery returns more than one row. Your subquery has returned more than one row, and the outer query expects a single value (because of an = sign). Resolve this by using an IN operator of changing your subquery.

Example: IN Operator

You can match several values in a subquery by using an IN operator. The IN operator will check that the value you specify matches any of the values inside the IN operator.

For example, the earlier query can be written like this:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary IN (21000, 52000);

Now, this isn't a subquery, but you can write a query with a subquery in the same way:

1SELECT id, last_name, salary
2FROM employee
3WHERE salary IN (
4  SELECT salary
5FROM employee
6WHERE last_name LIKE 'C%'
7);

All we've done here is replace the = sign with an IN clause, and the query will work:

ID	LAST_NAME	SALARY
5	CHARLESTON	21000
9	CONNOR	52000

There are all kinds of uses for subqueries. Whenever you think you need to use the result of one query as an input or a filter in another query, a subquery is probably the best way to do it.

Inline Views: Subqueries in a FROM Clause

Another place you can use a subquery is in a FROM clause. The results of a subquery are treated just like a table, which you can select values from.

Let's say you had this query which showed you the average salary per department:

1SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
2FROM employee
3GROUP BY dept_id;

The results are:

DEPT_ID	AVG_SALARY
1	55333.33
2	51333.33
3	48000

You have two columns here: dept_id and avg_salary, with three results Let's say you wanted to treat these results as a table without actually creating a table. You can do this by using this query as a subquery in the from clause.

1SELECT dept_id, avg_salary
2FROM (
3  SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
4  FROM employee
5  GROUP BY dept_id
6);

Running this query will give you the same results:

DEPT_ID	AVG_SALARY
1	55333.33
2	51333.33
3	48000

However, you can use the results of this subquery to join to other tables:

 1SELECT
 2sub.dept_id,
 3d.dept_name,
 4sub.avg_salary
 5FROM
 6(
 7  SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
 8  FROM employee
 9  GROUP BY dept_id
10) sub
11INNER JOIN department d ON sub.dept_id = d.id;

This query does a few things.

First, we're selecting columns from two tables: the sub and d tables. The sub table is actually the result of the subquery, which is the dept_id and avg_salary columns.

We are then joining that subquery to the department table. The subquery has been named sub, and it is then treated just like a table or a view. This type of query is called an inline view.

What is an inline view? It's a subquery that's inside the FROM clause of a query. It's called an inline view because it acts just like a view object, but no view object is created on the database.

If you run this query, you'll get this result:

DEPT_ID	DEPT_NAME	AVG_SALARY
1	SALES	55333.33
2	HR	51333.33
3	IT	48000

This way you can see the dept_name for each department, and their average salary. There may be other ways to get this data, but this is just an example of using a subquery in the FROM clause and using a join to another table.

Subqueries in a SELECT Clause

One more place you can use a subquery is in the SELECT clause.

If a subquery in the WHERE clause acts as a filter, and a subquery in the FROM clause acts as a view, then a subquery in the SELECT clause acts like a column that is to be shown in your output.

To use a subquery in your SELECT clause, you add it in the place of a column.

For example, let's say you wanted to show the average salary alongside each employee record.

You could try a query like this:

1SELECT id, last_name, salary, AVG(salary)
2FROM employee;

However, if you run this query, you'll get this error:

ORA-00937: not a single-group group function

This happens because we have an aggregate function and no GROUP BY clause.

We can adjust our query to use a subquery. The subquery would get the average salary, and that would be used as a "column" in the SELECT clause:

1SELECT id, last_name, salary, (
2  SELECT AVG(salary)
3  FROM employee
4)
5FROM employee;

The result from the SELECT AVG(salary) subquery is used as a value on every row that is returned. The results are:

ID	LAST_NAME	SALARY	(SELECTAVG(SALARY)FROMEMPLOYEE)
1	SMITH	40000	51200
2	ANDERSON	60000	51200
3	JONES	45000	51200
4	LANDY	82000	51200
5	CHARLESTON	21000	51200
6	JOHNSON	51000	51200
7	BROWN	93000	51200
8	HARDEN	29000	51200
9	CONNOR	52000	51200
10	PIERCE	39000	51200

The name of this column looks messy, but that's just how Oracle handles it by default. You can assign a column alias using the AS keyword to make it neater:

1SELECT id, last_name, salary, (
2  SELECT AVG(salary)
3  FROM employee
4) AS avg_salary
5FROM employee;

This query will now show this result:

ID	LAST_NAME	SALARY	AVG_SALARY
1	SMITH	40000	51200
2	ANDERSON	60000	51200
3	JONES	45000	51200
4	LANDY	82000	51200
5	CHARLESTON	21000	51200
6	JOHNSON	51000	51200
7	BROWN	93000	51200
8	HARDEN	29000	51200
9	CONNOR	52000	51200
10	PIERCE	39000	51200

The same value of 51200 is shown for every row. If that's your requirement, then you can use a subquery in this way to show the value.

This alias is required in MySQL but not in other SQL vendors.

Multiple Subqueries (Nested Subqueries)

Now, what if you wanted to use a subquery inside another subquery? This is possible in SQL. In fact, the maximum number of subqueries inside other subqueries you can use is 255. However, you shouldn't even get close to that many subqueries. If you've got more than about 5 subqueries then you should look to redesign your query.

Let's say you wanted to see each employee's salary and the percentage of that salary compared to the average. You can use the earlier query, which showed each employee along with the average salary for all employees:

1SELECT id, last_name, salary, (
2  SELECT AVG(salary)
3  FROM employee
4) AS avg_salary
5FROM employee;

ID	LAST_NAME	SALARY	AVG_SALARY
1	SMITH	40000	51200
2	ANDERSON	60000	51200
3	JONES	45000	51200
4	LANDY	82000	51200
5	CHARLESTON	21000	51200
6	JOHNSON	51000	51200
7	BROWN	93000	51200
8	HARDEN	29000	51200
9	CONNOR	52000	51200
10	PIERCE	39000	51200

You could adjust your query like this to show the percentage of the average salary:

1SELECT id, last_name, salary, (
2  SELECT AVG(salary)
3  FROM employee
4) AS avg_salary,
5ROUND(salary / (
6  SELECT AVG(salary)
7  FROM employee
8), 2) AS pct_avg_salary
9FROM employee;

However, this is running the same subquery twice, which is a waste. The results are:

ID	LAST_NAME	SALARY	AVG_SALARY	PCT_AVG_SALARY
1	SMITH	40000	51200	.78
2	ANDERSON	60000	51200	1.17
3	JONES	45000	51200	.88
4	LANDY	82000	51200	1.6
5	CHARLESTON	21000	51200	.41
6	JOHNSON	51000	51200	1
7	BROWN	93000	51200	1.82
8	HARDEN	29000	51200	.57
9	CONNOR	52000	51200	1.02
10	PIERCE	39000	51200	.76

You could write this as a nested subquery. You use the original results as a subquery into another query:

 1SELECT id,
 2last_name,
 3salary
 4avg_salary,
 5ROUND(salary/avg_salary, 2) AS pct_avg_salary
 6FROM (
 7  SELECT id, last_name, salary, (
 8    SELECT AVG(salary)
 9    FROM employee
10  ) AS avg_salary
11  FROM employee
12);

This query will find all employee records, show the average salary for each of them, and then use the results of that in the outer query.

This shows that you can use several levels of subqueries.

Correlated Subqueries

The final concept on subqueries that I'll cover is a correlated subquery.

What's a correlated subquery? It's when a subquery refers to a column that exists in the outer query. The subquery and the outer query are said to be correlated, as they are linked to each other.

So far, our subqueries are independent queries, with the results used inside an outer query. However, you can refer to a column in the outer query from within the subquery.

An example of this would be a query to find the employees with a salary greater than the average salary in their department.

To find the average salary in the department, we can use this query:

1SELECT dept_id, AVG(salary)
2FROM employee
3GROUP BY dept_id;

The results are:

DEPT_ID	AVG(SALARY)
1	55333.33333
2	51333.33333
3	48000

Now, to find the employees with a salary greater than the average salary in their department, we can use a subquery. The subquery needs to match one of these averages here to the employee record, based on their department ID.

1SELECT e.id, e.last_name, e.salary, e.dept_id
2FROM employee e
3WHERE e.salary > (
4  SELECT AVG(s.salary)
5  FROM employee s
6  WHERE s.dept_id = e.dept_id
7  GROUP BY s.dept_id
8);

This query is different to the others. It selects a few columns from the employee table, with an alias of "e". It then has a subquery that looks at the average of the salary value.

However, this is not just a simple average. It looks for the average where the dept_id matches the employee's dept_id from the outer query.

This allows the query to find the average for each employee's department, and check if their salary is greater than this average.

The results of this query are:

ID	LAST_NAME	SALARY	DEPT_ID
2	ANDERSON	60000	3
4	LANDY	82000	1
6	JOHNSON	51000	3
7	BROWN	93000	2
9	CONNOR	52000	3

These are the only employees that have a salary greater than their department's average. There may be employees with a salary greater than averages of other departments, but the correlated subquery lets you find results based on the related records: in this case, the average salary of a department ID.

Conclusion

Subqueries are powerful tools to use in your SQL queries. It adds to the complexity of a query, but if your requirements say that you need a certain data set, often the only way to do that is using a subquery.