SQL CTE (WITH Clause): The Ultimate Guide

Apr 1, 2019 · 20 min read · sql ·

A Common Table Expression (or CTE) is a feature in several SQL versions to improve the maintainability and readability of an SQL query.

It goes by a few names:

Common Table Expression
Subquery Factoring
SQL WITH Clause

In this article, you'll learn all about the Common Table Expression, what it's good for, and how to use it.

What Is a Common Table Expression or CTE or With Clause in SQL?

A Common Table Expression (or CTE) is a query you can define within another SQL query.

It's like a subquery. It generates a result that contains rows and columns of data.

The difference is that you can give this result a name, and you can refer to it multiple times within your main query.

You can use a CTE inside the following types of queries:

SELECT
INSERT
UPDATE
DELETE

Why Is This Also Called a WITH Clause or Subquery Factoring?

It's called an SQL WITH clause because it uses the WITH keyword to define this kind of subquery. We'll look at the syntax a little later in this article.

It's sometimes called subquery factoring because you are improving the design of an SQL query, and in software development, the term "refactoring" means to improve the design of code.

Which Databases Support Common Table Expressions?

The Common Table Expression or CTE SQL feature is available in the following databases:

Oracle (as of version 9.2)
SQL Server (as of version 2005)
MySQL (as of version 8.0)
PostgreSQL
SQLite
MariaDB

There may be other databases that include CTEs that are not on this list.

It is part of the SQL-99 standard, so any database that implements that standard will include this feature.

While you're here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here:

Example of a Common Table Expression

Let's say you had this query. It's based on a fictional employee database.

The data for setting up this database is shown below. You can download this script from my GitHub repository here.

 1CREATE TABLE employee (
 2  emp_id NUMBER(5),
 3  first_name VARCHAR2(50),
 4  last_name VARCHAR2(50),
 5  dept_id NUMBER(5),
 6  manager_id NUMBER(5),
 7  office_id NUMBER(5)
 8);
 9 
10CREATE TABLE department (
11  dept_id NUMBER(5),
12  dept_name VARCHAR2(50)
13);
14 
15INSERT ALL
16INTO employee (emp_id, first_name, last_name, dept_id, manager_id, office_id)
17VALUES (1, 'Sally', 'Jones', 3, 2, 5)
18INTO employee (emp_id, first_name, last_name, dept_id, manager_id, office_id)
19VALUES (2, 'Mark', 'Smith', 2, 4, 3)
20INTO employee (emp_id, first_name, last_name, dept_id, manager_id, office_id)
21VALUES (3, 'John', 'Andrews', 1, 4, 3)
22INTO employee (emp_id, first_name, last_name, dept_id, manager_id, office_id)
23VALUES (4, 'Michelle', 'Johnson', 2, NULL, 5)
24INTO employee (emp_id, first_name, last_name, dept_id, manager_id, office_id)
25VALUES (5, 'Brian', 'Grand', 2, 2, 3)
26SELECT * FROM dual;
27 
28INSERT ALL
29INTO department (dept_id, dept_name) VALUES (1, 'Sales')
30INTO department (dept_id, dept_name) VALUES (2, 'IT')
31INTO department (dept_id, dept_name) VALUES (3, 'Support')
32SELECT * FROM dual;

You could have a query that shows the employee names and the number of employees from the same department. The query could look like this:

 1SELECT e.first_name, 
 2e.last_name, 
 3d.dept_count
 4FROM employee e
 5INNER JOIN (
 6  SELECT dept_id, COUNT(*) AS dept_count
 7  FROM employee
 8  GROUP BY dept_id
 9) d
10ON e.dept_id = d.dept_id;

This query uses a subquery to:

Find the department ID and the count of each department
Use this result as an inline view called "d"
Join "d" to the employee table "e"
Show the employee name and department count

If you run this query using the sample data shown above, your results will look like this:

FIRST_NAME	LAST_NAME	DEPT_COUNT
John	Andrews	1
Brian	Grand	3
Michelle	Johnson	3
Mark	Smith	3
Sally	Jones	1

A Common Table Expression will allow you to move the subquery and define it separately. Using a Common Table Expression, the query will look like this:

 1WITH d_count AS (
 2  SELECT dept_id, COUNT(*) AS dept_count
 3  FROM employee
 4  GROUP BY dept_id
 5)
 6SELECT e.first_name, 
 7e.last_name, 
 8d.dept_count
 9FROM employee e
10INNER JOIN d_count d ON e.dept_id = d.dept_id;

Let's break this down.

This is the CTE:

1WITH d_count AS (
2  SELECT dept_id, COUNT(*) AS dept_count
3  FROM employee
4  GROUP BY dept_id
5)

We can tell it's a CTE because it starts with a WITH keyword. We have defined this SELECT query using the WITH keyword, and given it a name of "d_count". This means that the query inside the brackets is called d_count.

Now, the rest of the query comes after the brackets, and is the main query:

1SELECT e.first_name, 
2e.last_name, 
3d.dept_count
4FROM employee e
5INNER JOIN d_count d ON e.department_id = d.department_id;

This is the query that is run and has the results displayed.

Notice that the course_count is being joined in this query. It's treated just like a table or view - which is the advantage of using a CTE.

The main query is easier to read.

Why would we go to all that trouble just to move a subquery from one place to another?

It's not that big of a deal in a query as small as this, but if you had a large query, it's really helpful.

In large queries, you could be referring to the same subquery more than once. If so, moving it to a CTE will make it easier to read and work with.

Why Use Common Table Expressions in SQL?

We've just seen a simple CTE SQL example.

Why would you use one?

Improve readability. Using a Common Table Expression can improve the readability of an SQL query. This is especially evident in large queries and those that use complicated logic on columns. You can move the complicated logic to a CTE and ensure the overall query is simplified.
Use it like a view. A CTE has a name and columns and therefore it can be treated just like a view. You can join to it and filter from it, which is helpful if you don't want to create a new view object or don't have the permissions to do so.
Use recursion or hierarchical queries. The SQL WITH clause allows you to write recursive queries, or hierarchical queries, which are queries that refer to previous rows of the same query. We'll look at this a little later in this article.

Here's my YouTube video on using Common Table Expressions:

Recursive Common Table Expressions/Recursive WITH Clause

A recursive query is a query that operates on its own results. It's a common way to work with hierarchical data.

The concept is probably better explained with an example. Let's use an employee table, which is commonly used in examples online.

Our employee table looks like this table below. Each employee has a name, a department ID, and a manager.

The important part of this table is that the manager_id refers to another employee record's emp_id value.

EMP_ID	FIRST_NAME	DEPT_ID	MANAGER_ID
1	Sally	3	2
2	Mark	2	4
3	John	1	4
4	Michelle	2	(NULL)
5	Brian	2	2

If you select just the employee data from this table, your query might look like this:

1SELECT emp_id, first_name, dept_id, manager_id
2FROM employee e;

You'll get one row for each employee, along with their manager ID. The output will be the table above.

If you want to see the details of their manager, your query might look like this:

1SELECT e.emp_id, e.first_name, e.dept_id,
2m.emp_id, m.first_name, m.dept_id, m.manager_id
3FROM employee e
4LEFT JOIN employee m ON e.manager_id = m.emp_id;

This will show each employee as well as details about their manager. The output will look like this:

EMP_ID	FIRST_NAME	DEPT_ID	EMP_ID	FIRST_NAME	DEPT_ID	MANAGER_ID
1	Sally	3	2	Mark	2	4
2	Mark	2	4	Michelle	2	(NULL)
3	John	1	4	Michelle	2	(NULL)
4	Michelle	2	(NULL)	(NULL)	(NULL)	(NULL)
5	Brian	2	2	Mark	2	4

This shows you what you need, right?

However, the problem is that it only shows one level of the hierarchy.

What if you had multiple levels in your hierarchy and you wanted to show them all? For example, the CEO, senior management, middle management, team leaders, team members.

To show the data for the whole hierarchy with a regular query, you'll need to LEFT JOIN to the same table many times. Or, you'll need to use UNION ALL a few times to get the data for multiple levels. This can make your query very long.

The answer to this is to use a Recursive Common Table Expression.

A Recursive CTE or Recursive Common Table Expression is a CTE that runs a query over each result of itself to generate an overall result.

This will allow you to generate a full list of hierarchical data, such as all employees and managers, or all product categories and their subcategories.

What Does a Recursive CTE Look Like?

So what does a recursive CTE SQL query look like?

Using the employee and manager example, it looks like this:

 1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (
 2  SELECT emp_id, first_name, manager_id, 1
 3  FROM employee
 4  WHERE manager_id IS NULL
 5  UNION ALL
 6  SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
 7  FROM employee e
 8  INNER JOIN cteEmp r
 9  ON e.manager_id = r.emp_id
10)
11SELECT emp_id,
12first_name,
13manager_id,
14emplevel
15FROM cteEmp
16ORDER BY emplevel;

This query looks a bit complicated, so let's break it down.

1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (

This is where we declare the CTE. It has a name, "cteEmp". Inside the brackets are the names we have given each column for use in the main query.

1SELECT emp_id, first_name, manager_id, 1
2FROM employee
3WHERE manager_id IS NULL

We then have the first part of the recursive CTE. This is known as the anchor. It selects records from the emp table where the manager_id is NULL, which translates to the top-level record. We have also selected a value of 1 in the last column, which maps to the emplevel column. This allows us to see the level of each employee, which we will see in the final output.

1UNION ALL
2SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
3FROM employee e
4INNER JOIN cteEmp r
5ON e.manager_id = r.emp_id
6)

This part is called the recursion and is the part that does the recursive query.

We perform a UNION ALL to combine the results of this with the results of the first part.

We select the same columns - emp_id, first_name, manager_id. However we also select another column: r.emp_level + 1. This will select the previous record's emplevel (their parent), and add 1 to it.

We then join the emp table to cteEmp, which is the CTE name. This is what makes it recursive - joining the query to itself. It allows us to see all levels of the hierarchy.

We then join the two on the common column: employee's manager_id to the recursive query's emp_id (which is the parent).

1SELECT emp_id,
2first_name,
3manager_id,
4emplevel
5FROM cteEmp
6ORDER BY emplevel;

Finally, we select some columns from the recursive CTE, ordering by the emplevel column we calculated.

Our output will look like this:

EMP_ID	FIRST_NAME	MANAGER_ID	EMP_LEVEL
4	Michelle	(NULL)	1
2	Mark	4	2
3	John	4	2
1	Sally	2	3
5	Brian	2	3

This result doesn't look that different, but it allows you to work with large sets of data with simple queries. You can also move the complex logic for calculating and joining this data to a CTE, and just display the data how you like.

So, that's how a recursive CTE can work.

Note: This works in all databases that support Common Table Expressions. However, Oracle has a special set of hierarchical keywords that provide an alternative to recursive CTEs. Look up CONNECT BY PRIOR for more information.

While you're here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here:

WITH Clause/Common Table Expression Syntax

The syntax for writing a Common Table Expression in Oracle or SQL Server using the SQL WITH clause is:

1WITH cte_name [(column_aliases)] AS (
2  subquery_sql_statement
3)
4SELECT column_list
5FROM cte_name;

You are able to declare multiple CTEs in a single statement, by separating them with a comma.

1WITH cte_name AS (
2  subquery_sql_statement
3),
4cte_two AS (
5  second_subquery_sql
6)
7SELECT column_list
8FROM cte_name
9INNER JOIN cte_two …;

There are a few things to note with the Oracle syntax:

You can declare optional column aliases for the columns after the brackets like you can with SQL Server syntax. The aliases are added as regular column aliases inside the subquery. This is the [(column_aliases)] in the query above.
Oracle will optimise the queries as though they are inline views, so you shouldn't have to worry about any performance issues over creating a query this way instead of an inline view.

Writing a CTE in SQL Server is done in a similar way to writing a CTE in Oracle, MySQL, and PostgreSQL.

Let's take a look at some examples.

WITH Clause Example

We can use the employee database to demonstrate some examples.

The SQL to create the sample data is available at the top of this article.

Example 1 - Simple WITH Clause

This example shows a basic WITH clause (also known as subquery factoring or Common Table Expression). It shows each employee and the number of employees in the office.

1WITH o_count AS (
2  SELECT office_id, COUNT(*) AS office_count
3  FROM employee
4  GROUP BY office_id
5)
6SELECT e.emp_id, e.first_name, o.office_count
7FROM employee e
8INNER JOIN o_count o ON e.office_id = o.office_id;

The results of this query are:

EMP_ID	FIRST_NAME	OFFICE_COUNT
4	Michelle	2
1	Sally	2
5	Brian	3
3	John	3
2	Mark	3

Example 2 - Recursive WITH Clause

You can use the WITH clause in Oracle, SQL Server, or other databases to create a recursive query.

As mentioned earlier in this article, it's useful for data that is set up in a hierarchy. Oracle offers the CONNECT BY PRIOR syntax as an alternative to the WITH clause for recursive queries, but you can still use the WITH clause to do this.

Let's say we wanted to see the hierarchy of employees and managers. Our query would look like this:

 1WITH cteEmp AS (
 2  SELECT emp_id, first_name, manager_id, 1
 3  FROM employee
 4  WHERE manager_id IS NULL
 5  UNION ALL
 6  SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
 7  FROM employee e
 8  INNER JOIN cteEmp r
 9  ON e.manager_id = r.emp_id
10)
11SELECT emp_id,
12first_name,
13manager_id,
14emplevel
15FROM cteEmp
16ORDER BY emplevel;

If you run this query in Oracle, you'll get this error:

1ORA-32039: recursive WITH clause must have column alias list
232039.00000 -  "recursive WITH clause must have column alias list"
3*Cause:    A WITH clause query referred to itself (recursive) but did
4not have a column alias list specified for it.
5*Action:   Add a column alias list for the WITH clause query name.

How do you resolve this ORA-32039 error?

You need to add in the column alias to the first line of your query, where the SQL WITH clause is declared.

Right now, the line looks like this:

1WITH cteEmp AS (

To work with a recursive WITH clause in Oracle, you need to add column aliases. This can be done by adding opening and closing brackets after the CTE name, and entering in a name for each of the columns returned from the query:

1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (

The full query looks like this:

 1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (
 2  SELECT emp_id, first_name, manager_id, 1
 3  FROM employee
 4  WHERE manager_id IS NULL
 5  UNION ALL
 6  SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
 7  FROM employee e
 8  INNER JOIN cteEmp r
 9  ON e.manager_id = r.emp_id
10)
11SELECT emp_id,
12first_name,
13manager_id,
14emplevel
15FROM cteEmp
16ORDER BY emplevel;

The results from this query are here:

EMP_ID	FIRST_NAME	MANAGER_ID	EMPLEVEL
4	Michelle	(NULL)	1
2	Mark	4	2
3	John	4	2
1	Sally	2	3
5	Brian	2	3

Ordering Data in a Hierarchical Query

If you think about the way that hierarchical data is structured, it can be ordered in different ways:

The records on the same level (the siblings) are shown before any records on lower levels (the children)
The records on lower levels (the children) are shown before any records on the same level (the siblings)

Which order does the query return results in? Siblings first, or children first?

The good thing is that in Oracle you can specify which order you want to use. We can do this by adding the SEARCH clause.

1WITH cte_name AS (
2  cte_query
3)
4SEARCH {DEPTH|BREADTH} FIRST BY id_col ASC SET order_col
5SELECT query;

Using the DEPTH FIRST keyword, the children will be ordered before any siblings. Using BREADTH FIRST, the siblings will be ordered before any children.

Let's see an example.

Ordering Breadth First (Siblings Before Children)

Here is the same query as before, but we are ordering by breadth first (children before siblings):

 1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (
 2  SELECT emp_id, first_name, manager_id, 1
 3  FROM employee
 4  WHERE manager_id IS NULL
 5  UNION ALL
 6  SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
 7  FROM employee e
 8  INNER JOIN cteEmp r
 9  ON e.manager_id = r.emp_id
10)
11SEARCH BREADTH FIRST BY emp_id ASC SET order_col
12SELECT emp_id,
13first_name,
14manager_id,
15emplevel
16FROM cteEmp
17ORDER BY order_col;

We have specified BREADTH FIRST, and after the BY keyword we have used the emp_id as this is the ID column or primary key. We have specified a new pseudocolumn called order_col, which is then used by the main SELECT query to perform the ordering.

The results are:

EMP_ID	FIRST_NAME	MANAGER_ID	EMPLEVEL
4	Michelle	(NULL)	1
2	Mark	4	2
3	John	4	2
1	Sally	2	3
5	Brian	2	3

We can see that both of Michelle's team members (Mark and John) are shown before Mark's team members.

Ordering Depth First (Children Before Siblings)

We can change the query to show all team members for an employee before the other employees in their team by using DEPTH FIRST:

 1WITH cteEmp (emp_id, first_name, manager_id, emplevel) AS (
 2  SELECT emp_id, first_name, manager_id, 1
 3  FROM employee
 4  WHERE manager_id IS NULL
 5  UNION ALL
 6  SELECT e.emp_id, e.first_name, e.manager_id, r.emplevel + 1
 7  FROM employee e
 8  INNER JOIN cteEmp r
 9  ON e.manager_id = r.emp_id
10)
11SEARCH DEPTH FIRST BY emp_id ASC SET order_col
12SELECT emp_id,
13first_name,
14manager_id,
15emplevel
16FROM cteEmp
17ORDER BY order_col;

We have specified DEPTH FIRST, and after the BY keyword, we have used the emp_id as this is the ID column or primary key. We have specified a new pseudocolumn called order_col, which is then used by the main SELECT query to perform the ordering.

The results are:

EMP_ID	FIRST_NAME	MANAGER_ID	EMPLEVEL
4	Michelle	(NULL)	1
2	Mark	4	2
1	Sally	2	3
5	Brian	2	3
3	John	4	2

The difference here is that Sally and Brian are shown in the results before John. This is because Sally and Brian are both in Mark's team.

Common Table Expressions FAQ

Here are the answers to some common questions when working with CTEs or Common Table Expressions.

What Does CTE Stand For?

CTE stands for Common Table Expression. It's a feature of the SQL language that allows you to name a part of a query that is used within the main query, to improve readability and reduce complexity.

Can You Use a CTE in an UPDATE Statement?

In Oracle, you can only use a CTE in a SELECT statement. However, in SQL Server, you can use it in an UPDATE statement.

The syntax for using a CTE in an UPDATE statement looks like this:

1WITH cteName AS (
2  SELECT cols
3  FROM table...
4)
5UPDATE cteName
6SET col = value;

What's the Difference Between a CTE vs Temp Table?

The biggest difference is that a CTE is only available and accessible within the query it is defined, while a temp table can be accessed by either the current session or all database sessions, depending on how you have defined it (either local or global).

The differences between a CTE and a temporary table are listed below.

Criteria	CTE	Temp Table
Indexes	Cannot have indexes	Can be indexed
Constraints	Cannot have constraints	Can have constraints
Lifetime	Only exist when the query is run	Can exist for the connection or for all sessions (local or global)
References	Only inside the query	Can be referenced by stored procedures
Recursive	Can be recursive	Not recursive
Stats	Don't have dedicated stats	Can have dedicated stats
Main usage	Improve readability of complex queries, recursive queries	Large result set, or refer to result set more than once

What about performance? How does the performance compare between a CTE and a temporary table?

Generally, a CTE does not improve performance. It's just a way for your query to be restructured so it's easier to read. However, it depends on many factors such as your query and your data. Measure your query before and after applying a CTE to see if it performs better.

More information can be found in this StackExchange answer.

What's the Difference Between CTE vs Subquery?

A CTE (Common Table Expression) is a way to write part of your query to make your overall query easier to read. The database should run it the same way as if it was the one query.

The way that the database executes the query should be the same regardless of if it was written as a subquery or as a CTE. This is because the database will work out the best way of running the query, and a CTE is essentially changing how the query looks but not creating any new objects like a temporary table would.

Can You Create a CTE in MySQL?

Yes, you can create a CTE (Common Table Expression) in MySQL. The syntax and the way it works is similar to Oracle and SQL Server. You use the WITH keyword, give it a name, define the SELECT query, and then continue with the rest of the main query.

For example:

 1WITH d_count AS (
 2  SELECT dept_id, COUNT(*) AS dept_count
 3  FROM department
 4  GROUP BY dept_id
 5)
 6SELECT e.first_name, 
 7e.last_name, 
 8d.dept_count
 9FROM employee e
10INNER JOIN d_count d ON e.dept_id = d.dept_id;

Can You Create a CTE in PostgreSQL?

Yes, PostgreSQL includes support for CTEs (Common Table Expressions). They work the same as other databases: you declare them using the WITH clause, give it a name, define the SELECT query, then write the rest of the main query. For example:

 1WITH d_count AS (
 2  SELECT dept_id, COUNT(*) AS dept_count
 3  FROM department
 4  GROUP BY dept_id
 5)
 6SELECT e.first_name, 
 7e.last_name, 
 8d.dept_count
 9FROM employee e
10INNER JOIN d_count d ON e.dept_id = d.dept_id;

Can You Use Multiple CTEs in a Query?

Yes, you can use multiple CTEs in a single query. You just separate each CTE query with a comma.

For example:

1WITH firstQuery AS (
2  SELECT columns FROM table
3),
4secondQuery AS (
5  SELECT columns FROM second_table
6)
7SELECT main_columns
8FROM tables;

Notice that after the closing bracket of the first CTE and before the name of the second CTE there is a comma, then we have added the second CTE.

This will allow you to use both CTEs in the one query.

What Is The CTE Performance Like?

A CTE is a feature that lets you change how you write your query. No new objects are created on the database. The database will analyse your query and decide on the best way to run it.

Because a CTE is mainly used to improve readability and reduce complexity, the database will treat the query in the same way as if it was written as one big query. Using a CTE will usually ensure the performance stays the same compared to writing it as one query. I say usually, as I have read cases where a CTE has improved performance.

But I wouldn't rely on a CTE to improve performance.

In short, CTEs should not change performance but test your query before and after.

Conclusion

CTEs or Common Table Expressions are a feature in SQL that creates a named query to use in the rest of your query. They help to improve the readability of your query and are implemented using the WITH clause.

Consider using a CTE in any query you feel is too complex.

While you're here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here: