FB

SQL Subqueries: The Complete Guide

Subqueries are powerful features of SQL.

If you want to move beyond the basics of SQL, then subqueries are something you need to know about.

This article will show you everything you need to know about subqueries in SQL.

What is a Subquery?

A subquery is a SELECT query inside another query. It’s also called a nested query in SQL. A subquery is often used inside SELECT queries but can also be used in other types of queries.SQL Subqueries

The concept of a query inside a query might seem simple, but it can be a hard concept to truly understand. I’ll cover everything you need to know about subqueries in this article.

Let’s say you had a simple query like this:

SELECT id, last_name
FROM employee;

This is a standard SELECT query that shows you some data from the employee table.

Now, let’s say you had a query like this:

SELECT id, last_name
FROM employee
WHERE salary > (
  SELECT salary
  FROM some_other_table
);

There are a few extra features in this query that you might not be used to. But the main thing to point out is that there is a subquery in this query. The subquery is:

SELECT salary
FROM some_other_table

That’s the subquery because it’s a SELECT query inside another query. We’ll have some more examples later in this article, but this just demonstrates the concept.

 

Why Use SQL Subqueries?

The main reason to use a subquery is to improve the maintainability and performance of your query and application.

I’ll demonstrate this with an example.

Let’s say you had to display a report in your application that showed all employees that had an above average salary. To find these records, you would need to do two things:

  1. Find the average salary of all employees
  2. List all employees whose salary is larger than that average

Our employee table looks like this:

ID LAST_NAME SALARY
1 SMITH 40000
2 ANDERSON 60000
3 JONES 45000
4 LANDY 82000
5 CHARLESTON 21000
6 JOHNSON 51000
7 BROWN 93000
8 HARDEN 29000
9 CONNOR 52000
10 PIERCE 39000

First, let’s find the average salary of all employees, using an aggregate function:

SELECT AVG(salary)
FROM employee;

Assuming you have this table structure and data set up (which you can get as part of this article), you should get this result:

AVG(SALARY)
51200

The average salary is 51,200. Now, we can use this value to find all of the employees whose salary is larger than this average of 51,200.

SELECT id, last_name, salary
FROM employee
WHERE salary > 51200;

The results of this query are:

ID LAST_NAME SALARY
2 ANDERSON 60000
4 LANDY 82000
7 BROWN 93000
9 CONNOR 52000

This gives us the result we want. However, this is done in two steps.

Using a subquery will let you do this in one step. That means you won’t need to adjust the query each time a new employee is entered to update the average salary. It will run correctly every time.

We’ll see how to do this with a subquery in the next section.

You can also watch my YouTube video on when to use subqueries here:

 

Subqueries in a WHERE Clause

One place where you can use subqueries is in the WHERE clause. It’s probably the most common place to use a subquery that I’ve seen, both in online examples and in code that I’ve written.

Using a subquery in a WHERE clause means that we want to use the results of a query as a WHERE clause for another query.

Example: Find Employees with Above-Average Salary

An example of this is finding all employees with a salary greater than the average employee, which we saw how to do earlier in two separate steps.

We can do this in a single step using a subquery. Our query would look like this:

SELECT id, last_name, salary
FROM employee
WHERE salary > (
  SELECT AVG(salary)
  FROM employee
);

Let’s break this query down:

  • Line 1: this shows the SELECT clause. These are the columns that are displayed when the query is run: id, last_name, and salary.
  • Line 2: this is the FROM clause. We are selecting data from the employee table.
  • Line 3: This is the WHERE clause. We want to see records where the salary is greater than something. We open a bracket here, which is closed later in the query.
  • Line 4: This is another SELECT clause, which selects the AVG of the salary column. This is the start of the subquery – a query within a query.
  • Line 5: Another FROM clause, which is related to the subquery. It’s also the employee table.
  • Line 6: We close the brackets for the subquery, and end with a semicolon.

In this query, the bold part is the subquery, or inner query. The unbolded part is the outer query. The subquery is also indented, to make it easier to ready and easier to identify in your script that it’s a subquery.

SELECT id, last_name, salary
FROM employee
WHERE salary > (
  SELECT AVG(salary)
  FROM employee
);

If you run this query, you’ll get this result:

ID LAST_NAME SALARY
2 ANDERSON 60000
4 LANDY 82000
7 BROWN 93000
9 CONNOR 52000

It’s the same result as running it in separate queries, which we saw earlier. But it’s run in a single query, which means it’s easier for you to run and will properly cater to any changes in the data.

 

Example: Find Employee with the Highest Salary

Another way we can use a subquery is to find a record that exactly matches the result of a subquery. An example of this would be finding the employee that has the highest salary, which is a common type of query no matter what data you have (find the details of the row with the highest value).

To do this using a subquery, it would look like this:

SELECT id, last_name, salary
FROM employee
WHERE salary = (
  SELECT MAX(salary)
  FROM employee
);

This is similar to the earlier example. However, there are some differences.

The subquery finds the maximum salary from the employee table: SELECT MAX(salary).

Also, the WHERE clause in the outer query says WHERE salary =. This means that the salary of the employee needs to be equal to the maximum, which is from the subquery.

If you run this query, you’ll get this result:

ID LAST_NAME SALARY
7 BROWN 93000

This shows Brown with the highest salary. You can check this by running a query using ORDER BY:

SELECT id, last_name, salary
FROM employee
ORDER BY salary DESC;
ID LAST_NAME SALARY
7 BROWN 93000
4 LANDY 82000
2 ANDERSON 60000
9 CONNOR 52000
6 JOHNSON 51000
3 JONES 45000
1 SMITH 40000
10 PIERCE 39000
8 HARDEN 29000
5 CHARLESTON 21000

This shows that Brown does have the highest salary.

 

Example: Multiple Values

The previous example worked because the subquery only returned a single value:

SELECT id, last_name, salary
FROM employee
WHERE salary = (
  SELECT MAX(salary)
  FROM employee
);

When you specify “WHERE salary =”, the value you’re comparing the salary to needs to be a single value. This means the subquery needs to return a single value. Luckily, using MAX or AVG will ensure a single value is returned. However, what happens when your subquery does not return a single value?

Look at this example:

SELECT id, last_name, salary
FROM employee
WHERE salary = (
  SELECT salary
  FROM employee
  WHERE last_name LIKE 'C%'
);

This will look for all salaries that are the same as those employees whose last_names start with “C”. If you run this query, you’ll get this error:

ORA-01427: single-row subquery returns more than one row

This means that your subquery has returned more than one row. The subquery can be run by itself, which shows these results:

SELECT salary
FROM employee
WHERE last_name LIKE 'C%';
SALARY
21000
52000

If these values were used in your query instead of the subquery, it would look like this:

SELECT id, last_name, salary
FROM employee
WHERE salary = (21000, 52000);

You can’t have a salary that is equal to two values. One of the rules of using a subquery is that if you use an equals sign, you must return a single value.

That’s why you get the ORA-01427: single-row subquery returns more than one row. Your subquery has returned more than one row, and the outer query expects a single value (because of an = sign). Resolve this by using an IN operator of changing your subquery.

 

Example: IN Operator

You can match several values in a subquery by using an IN operator. The IN operator will check that the value you specify matches any of the values inside the IN operator.

For example, the earlier query can be written like this:

SELECT id, last_name, salary
FROM employee
WHERE salary IN (21000, 52000);

Now, this isn’t a subquery, but you can write a query with a subquery in the same way:

SELECT id, last_name, salary
FROM employee
WHERE salary IN (
  SELECT salary
FROM employee
WHERE last_name LIKE 'C%'
);

All we’ve done here is replace the = sign with an IN clause, and the query will work:

ID LAST_NAME SALARY
5 CHARLESTON 21000
9 CONNOR 52000

There are all kinds of uses for subqueries. Whenever you think you need to use the result of one query as an input or a filter in another query, a subquery is probably the best way to do it.

 

Inline Views: Subqueries in a FROM Clause

Another place you can use a subquery is in a FROM clause. The results of a subquery are treated just like a table, which you can select values from.

Let’s say you had this query which showed you the average salary per department:

SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
FROM employee
GROUP BY dept_id;

The results are:

DEPT_ID AVG_SALARY
1 55333.33
2 51333.33
3 48000

You have two columns here: dept_id and avg_salary, with three results Let’s say you wanted to treat these results as a table without actually creating a table. You can do this by using this query as a subquery in the from clause.

SELECT dept_id, avg_salary
FROM (
  SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
  FROM employee
  GROUP BY dept_id
);

Running this query will give you the same results:

DEPT_ID AVG_SALARY
1 55333.33
2 51333.33
3 48000

However, you can use the results of this subquery to join to other tables:

SELECT
sub.dept_id,
d.dept_name,
sub.avg_salary
FROM
(
  SELECT dept_id, ROUND(AVG(salary), 2) AS avg_salary
  FROM employee
  GROUP BY dept_id
) sub
INNER JOIN department d ON sub.dept_id = d.id;

This query does a few things.

First, we’re selecting columns from two tables: the sub and d tables. The sub table is actually the result of the subquery, which is the dept_id and avg_salary columns.

We are then joining that subquery to the department table. The subquery has been named sub, and it is then treated just like a table or a view. This type of query is called an inline view.

What is an inline view? It’s a subquery that’s inside the FROM clause of a query. It’s called an inline view because it acts just like a view object, but no view object is created on the database.

If you run this query, you’ll get this result:

DEPT_ID DEPT_NAME AVG_SALARY
1 SALES 55333.33
2 HR 51333.33
3 IT 48000

This way you can see the dept_name for each department, and their average salary. There may be other ways to get this data, but this is just an example of using a subquery in the FROM clause and using a join to another table.

 

Subqueries in a SELECT Clause

One more place you can use a subquery is in the SELECT clause.

If a subquery in the WHERE clause acts as a filter, and a subquery in the FROM clause acts as a view, then a subquery in the SELECT clause acts like a column that is to be shown in your output.

To use a subquery in your SELECT clause, you add it in the place of a column.

For example, let’s say you wanted to show the average salary alongside each employee record.

You could try a query like this:

SELECT id, last_name, salary, AVG(salary)
FROM employee;

However, if you run this query, you’ll get this error:

ORA-00937: not a single-group group function

This happens because we have an aggregate function and no GROUP BY clause.

We can adjust our query to use a subquery. The subquery would get the average salary, and that would be used as a “column” in the SELECT clause:

SELECT id, last_name, salary, (
  SELECT AVG(salary)
  FROM employee
)
FROM employee;

The result from the SELECT AVG(salary) subquery is used as a value on every row that is returned. The results are:

ID LAST_NAME SALARY (SELECTAVG(SALARY)FROMEMPLOYEE)
1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200

The name of this column looks messy, but that’s just how Oracle handles it by default. You can assign a column alias using the AS keyword to make it neater:

SELECT id, last_name, salary, (
  SELECT AVG(salary)
  FROM employee
) AS avg_salary
FROM employee;

This query will now show this result:

ID LAST_NAME SALARY AVG_SALARY
1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200

The same value of 51200 is shown for every row. If that’s your requirement, then you can use a subquery in this way to show the value.

This alias is required in MySQL but not in other SQL vendors.

 

Multiple Subqueries (Nested Subqueries)

Now, what if you wanted to use a subquery inside another subquery? This is possible in SQL. In fact, the maximum number of subqueries inside other subqueries you can use is 255. However, you shouldn’t even get close to that many subqueries. If you’ve got more than about 5 subqueries then you should look to redesign your query.

Let’s say you wanted to see each employee’s salary and the percentage of that salary compared to the average. You can use the earlier query, which showed each employee along with the average salary for all employees:

SELECT id, last_name, salary, (
  SELECT AVG(salary)
  FROM employee
) AS avg_salary
FROM employee;
ID LAST_NAME SALARY AVG_SALARY
1 SMITH 40000 51200
2 ANDERSON 60000 51200
3 JONES 45000 51200
4 LANDY 82000 51200
5 CHARLESTON 21000 51200
6 JOHNSON 51000 51200
7 BROWN 93000 51200
8 HARDEN 29000 51200
9 CONNOR 52000 51200
10 PIERCE 39000 51200

You could adjust your query like this to show the percentage of the average salary:

SELECT id, last_name, salary, (
  SELECT AVG(salary)
  FROM employee
) AS avg_salary,
ROUND(salary / (
  SELECT AVG(salary)
  FROM employee
), 2) AS pct_avg_salary
FROM employee;

However, this is running the same subquery twice, which is a waste. The results are:

ID LAST_NAME SALARY AVG_SALARY PCT_AVG_SALARY
1 SMITH 40000 51200 .78
2 ANDERSON 60000 51200 1.17
3 JONES 45000 51200 .88
4 LANDY 82000 51200 1.6
5 CHARLESTON 21000 51200 .41
6 JOHNSON 51000 51200 1
7 BROWN 93000 51200 1.82
8 HARDEN 29000 51200 .57
9 CONNOR 52000 51200 1.02
10 PIERCE 39000 51200 .76

You could write this as a nested subquery. You use the original results as a subquery into another query:

SELECT id,
last_name,
salary
avg_salary,
ROUND(salary/avg_salary, 2) AS pct_avg_salary
FROM (
  SELECT id, last_name, salary, (
    SELECT AVG(salary)
    FROM employee
  ) AS avg_salary
  FROM employee
);

This query will find all employee records, show the average salary for each of them, and then use the results of that in the outer query.

This shows that you can use several levels of subqueries.

 

Correlated Subqueries

The final concept on subqueries that I’ll cover is a correlated subquery.

What’s a correlated subquery? It’s when a subquery refers to a column that exists in the outer query. The subquery and the outer query are said to be correlated, as they are linked to each other.

So far, our subqueries are independent queries, with the results used inside an outer query. However, you can refer to a column in the outer query from within the subquery.

An example of this would be a query to find the employees with a salary greater than the average salary in their department.

To find the average salary in the department, we can use this query:

SELECT dept_id, AVG(salary)
FROM employee
GROUP BY dept_id;

The results are:

DEPT_ID AVG(SALARY)
1 55333.33333
2 51333.33333
3 48000

Now, to find the employees with a salary greater than the average salary in their department, we can use a subquery. The subquery needs to match one of these averages here to the employee record, based on their department ID.

SELECT e.id, e.last_name, e.salary, e.dept_id
FROM employee e
WHERE e.salary > (
  SELECT AVG(s.salary)
  FROM employee s
  WHERE s.dept_id = e.dept_id
  GROUP BY s.dept_id
);

This query is different to the others. It selects a few columns from the employee table, with an alias of “e”. It then has a subquery that looks at the average of the salary value.

However, this is not just a simple average. It looks for the average where the dept_id matches the employee’s dept_id from the outer query.

This allows the query to find the average for each employee’s department, and check if their salary is greater than this average.

The results of this query are:

ID LAST_NAME SALARY DEPT_ID
2 ANDERSON 60000 3
4 LANDY 82000 1
6 JOHNSON 51000 3
7 BROWN 93000 2
9 CONNOR 52000 3

These are the only employees that have a salary greater than their department’s average. There may be employees with a salary greater than averages of other departments, but the correlated subquery lets you find results based on the related records: in this case, the average salary of a department ID.

 

Conclusion

Subqueries are powerful tools to use in your SQL queries. It adds to the complexity of a query, but if your requirements say that you need a certain data set, often the only way to do that is using a subquery.

8 thoughts on “SQL Subqueries: The Complete Guide”

  1. I agree. This is the best explanation of subqueries I’ve ever read. I wish I had my hands on this a long time ago!

  2. With a select subquery, can you take the result of that subquery column and use it in a calculation in the outer query? For example:

    Select
    col1,
    col2,
    (select sum(col3) from table 2 where conditions) as temp,
    col4 – temp as variance,
    col5
    from table 1

    I keep getting errors trying different things — temp is not recognized

    1. Hi Sharon, you can use it in the outer query, but you’ll have to contain the entire query in a subquery again.
      This is because the alias “temp” is not defined at the point the select columns are required. You could do this:
      Select
      col1,
      col2,
      temp,
      col4 – temp AS variance,
      col5
      FROM (
      Select
      col1,
      col2,
      (select sum(col3) from table 2 where conditions) as temp,
      col4,
      col5
      from table 1
      );

  3. Hello Ben, Databasestar is my new go-to site for all SQL and PLSQL-related questions. I love how every topic is explained in a clean and concise way with examples. I want to subscribe for a yearly membership, but do we have the complete PLSQL course? That’s my question. If we do, then I’m more than happy to be part of the databasestar membership.

    1. Hi Amar, thanks for the kind words! Glad you like the site. Regarding the PLSQL course – I have three courses on PLSQL that cover a range of features like the basics, functions, exceptions, running SQL, parameters, and more. I don’t think it’s everything in PLSQL but it’s a good amount.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.