FB

SQL Set Operators: The Complete Guide to UNION, INTERSECT & MINUS

Are you wondering what set operators are?

Have you seen keywords like UNION or MINUS and want to know what they do and why they are used?

In this article, you’ll learn all about set operators such as these, why they are used, and see some examples.

Table of Contents

Here’s what you’ll learn in this article. Click on each of the headings to be taken to that place in this article.

  1. What are SQL Set Operators?
  2. The Different Types of Set Operators
  3. UNION: Combining Results
  4. UNION ALL: Combining Results in a Different Way
  5. MINUS: Finding Results Rhat Are Missing
  6. INTERSECT: Showing Common Results
  7. EXCEPT: Another Way of Finding Missing Results
  8. Working with Two or More Set Operators

Let’s get into learning about set operators!

 

What are SQL Set Operators?

(Back to Top)

A set operator in SQL is a keyword that lets you combine the results of two queries into a single query.

Sometimes when working with SQL, you’ll have a need to query data from two more tables. But instead of joining these two tables, you’ll need to list the results from both tables in a single result, or in different rows. That’s what set operators do.

 

The Different Types of Set Operators

(Back to Top)

There are a few different set operators that can be used, depending on your needs, and which database vendor you’re using.SQL Set Operators

The different set operators are:

  • UNION
  • UNION ALL
  • MINUS
  • INTERSECT
  • EXCEPT

Let’s take a look at each of these, using some sample data.

 

Using Set Operators

(Back to Top)

Set operators are used like this:

SELECT your_select_query
set_operator
SELECT another_select_query;

It uses two (or more) SELECT queries, with a set operator in the middle.

There are a few things to keep in mind though.

When selecting your columns, the number of columns needs to match between queries, and the data type of each column needs to be compatible.

So, if you select three columns in the first query, you need to select three columns in the second query. The data types also need to be compatible, so if you select a number and two character types in the first query, you need to do the same in the second query.

Also, if you want to order your results, the ORDER BY must go at the end of the last query. You can’t add ORDER BY inside each SELECT query before the set operator.

 

Sample Data

Our sample data for this article uses a customer and employee table.

Customer

first_name last_name
Stephen Jones
Mark Smith
Denise King
Paula Johnson
Richard Archer

Employee

first_name last_name
Christina Jones
Michael McDonald
Paula Johnson
Stephen Jones
Richard Smith

Each of our query examples in this article will use this data.

 

UNION: Combining Results

(Back to Top)

The UNION keyword or set operator will allow you to combine the results of two queries. It removes any duplicate results and shows you the combination of both.

Expressed as a Venn diagram, where each circle represents a query result, it looks like this:

SQL Union

Let’s see an example based on the sample data above. If we wanted to select from both the employee and customer tables, using UNION, our query would look like this:

SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee;

Result:

first_name last_name
Stephen Jones
Mark Smith
Denise King
Paula Johnson
Richard Archer
Christina Jones
Michael McDonald
Richard Smith

This shows us all customer and employee records.

But looking at this, we can’t tell which record comes from each table. Often we don’t need to know, but sometimes we do.

How can we do that? We can add a static value to each query to indicate which table it came from, or what type of record it is. As long as the number and type of columns match, the query will work.

SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee;

Result:

record_type first_name last_name
Customer Stephen Jones
Customer Mark Smith
Customer Denise King
Customer Paula Johnson
Customer Richard Archer
Employee Christina Jones
Employee Michael McDonald
Employee Richard Smith

In this query, I added in a value of ‘Customer’ to display for all records from the Customer table. It has a column alias of record_type. The same column is labelled as ‘Employee’ from the employee table. Looking at the results you can see which records came from which table.

You don’t need to specify the column aliases on the second table. Oracle will know that the first columns match and use the alias already provided in the first query.

To order the results, the ORDER BY needs to go at the end.

SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee
ORDER BY record_type, first_name, last_name;

Results:

record_type first_name last_name
Customer Denise King
Customer Mark Smith
Customer Paula Johnson
Customer Richard Archer
Customer Stephen Jones
Employee Christina Jones
Employee Michael McDonald
Employee Richard Smith

All of the set operators support standard keywords inside the SELECT statement. The set operator is just a way of combining results.

So, you can add WHERE clauses like this if you need to:

SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
WHERE status = ‘Active’
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee
WHERE emp_status = ‘Current’
ORDER BY record_type, first_name, last_name;

 

What’s the Difference Between a UNION and a JOIN?

The UNION and JOIN keywords both combine results from two different tables or queries.

The difference is how they are combined.

UNION combines data into separate rows, and JOIN combines data into separate columns.

When performing a JOIN, there is a column that matches between the two tables, and additional data may be displayed.

If we wanted to JOIN our employee and customer tables, our query might look like this:

SELECT
e.emp_id,
e.first_name,
e.last_name,
c.customer_id,
c.first_name,
c.last_name
FROM employee e
INNER JOIN customer c
ON e.first_name = c.first_name
AND e.last_name = c.last_name;

Result:

cust_id first_name last_name emp_id first_name last_name
15 Stephen Jones 3 Stephen Jones
81 Paula Johnson 9 Paula Johnson

When we use UNION to combine our tables, it looks like this:

SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee;

Result:

first_name last_name
Stephen Jones
Mark Smith
Denise King
Paula Johnson
Richard Archer
Christina Jones
Michael McDonald
Richard Smith

We can use different JOIN types to display different records based on matches being found, but in general, that’s how a join is different to a UNION.

More Information

For more information on UNION:

 

UNION ALL: Combining Results in a Different Way

(Back to Top)

The UNION ALL set operator also combines the results from two queries.

It’s very similar to UNION, but it does not remove duplicates.

Expressed as a Venn diagram, it looks like this:

SQL Union

Let’s see an example. If we wanted to select from both the employee and customer tables, using UNION ALL, our query would look like this:

SELECT first_name, last_name
FROM customer
UNION ALL
SELECT first_name, last_name
FROM employee;

Result:

first_name last_name
Stephen Jones
Mark Smith
Denise King
Paula Johnson
Richard Archer
Christina Jones
Michael McDonald
Paula Johnson
Stephen Jones
Richard Smith

This shows us all customer and employee records.

There are some records in this table that appear twice (Stephen Jones and Paula Johnson). Why is that?

This is because UNION ALL does not remove duplicates. So the same first name and last name values exist in both the customer and employee tables, and this query shows each of them.

We can use the same adjustment to the query as we looked at earlier to see which table the data comes from:

SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION ALL
SELECT ‘Employee’, first_name, last_name
FROM employee;

Result:

record_type first_name last_name
Customer Stephen Jones
Customer Mark Smith
Customer Denise King
Customer Paula Johnson
Customer Richard Archer
Employee Christina Jones
Employee Michael McDonald
Employee Paula Johnson
Employee Stephen Jones
Employee Richard Smith

This will let us see which table the data comes from, for each record.

 

What’s the Difference between UNION and UNION ALL?

The major difference between UNION ALL and UNION in SQL is that UNION removes any duplicate results from the final result set, and UNION ALL does not. UNION performs a DISTINCT on the result set to remove duplicates.

To remember this, consider that “ALL” in UNION ALL means “show all records”.

Therefore, UNION ALL will almost always show more results, as it does not remove duplicate records.

As a result of this, UNION is often slower than UNION ALL, because there is an operation to remove duplicate values (a.k.a DISTINCT), which is often a costly step in a query.

UNION ALL does not perform a distinct, so is usually faster.

So, if you don’t need to have unique rows in your result set, or if you’re sure the rows in the database or query are unique already, then use UNION ALL.

More Information

For more information on UNION ALL:

 

MINUS: Finding Results That Are Missing

(Back to Top)

Another set operator we can use is the MINUS keyword.

The MINUS set operator will return results that are found in the first query specified that don’t exist in the second query.

As a Venn diagram, it looks like this:

SQL Minus

Using our example data, we could use the MINUS set operator to find all names in the customer table that don’t exist in the employee table.

Our query would look like this:

SELECT first_name, last_name
FROM customer
MINUS
SELECT first_name, last_name
FROM employee;

Result:

first_name last_name
Mark Smith
Denise King
Richard Archer

If a result exists in the employee table as well as the customer table, it is not shown. Only the results from the customer table that are not in the employee table are shown.

 

What About the EXCEPT Set Operator?

The EXCEPT keyword is another set operator you might see in your code or in online examples.

EXCEPT is the same as MINUS – they both show results from one query that don’t exist in another query.

However, MINUS is an Oracle-specific keyword, and EXCEPT is in other databases such as SQL Server.

So, if you see EXCEPT anywhere, just know it’s the same as MINUS but for a different database.

More Information

For more information on MINUS and EXCEPT:

 

INTERSECT: Showing Common Results

(Back to Top)

The INTERSECT keyword allows you to find results that exist in both queries. Two SELECT statements are needed, and any results that are found in both of them are returned if INTERSECT is used.

As a Venn diagram, it looks like this:

SQL Intersect

Using our example data, we could use the INTERSECT set operator to find all names in the customer table that don’t exist in the employee table.

Our query would look like this:

SELECT first_name, last_name
FROM customer
INTERSECT
SELECT first_name, last_name
FROM employee;

Result:

first_name last_name
Stephen Jones
Paula Johnson

This shows all names that are in both the customer and employee table.

If we want to order these, we add an ORDER BY at the end.

SELECT first_name, last_name
FROM customer
INTERSECT
SELECT first_name, last_name
FROM employee
ORDER BY first_name, last_name;

Result:

first_name last_name
Paula Johnson
Stephen Jones

 

What’s the Difference between UNION and INTERSECT?

The difference between UNION and INTERSECT is that UNION gets results from both queries and combines them, while INTERSECT gets results that only exist in both queries.

So, if Query 1 returns records A and B, and Query 2 returns records B and C, UNION would return A, B and C. INTERSECT would only return B.

More Information

For more information on the INTERSECT set operator:

 

Working with Two or More Set Operators

(Back to Top)

So far we’ve looked at combining two queries into one using one set operator.

What if you want to combine three queries using two set operators? Or more?

You can do that in SQL.

You can just add another set operator and another query at the end.

For example, let’s say that along with our customer and employee tables, we had a table for referrers (people who referred customers).

If we wanted to see all of the names in all tables, we could write a query like this:

SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee
UNION
SELECT first_name, last_name
FROM referrer
ORDER BY first_name, last_name;

This query will select from the customer table, UNION that to the employee table, and then union the results at that point to the referrer table.

If we wanted to see all records in the customer and employee that weren’t in the referrer table, we can write this query:

SELECT first_name, last_name
FROM customer
UNION ALL
SELECT first_name, last_name
FROM employee
MINUS
SELECT first_name, last_name
FROM referrer
ORDER BY first_name, last_name;

Now, you might be wondering how this works? The first query is run, then a UNION ALL is performed with the second query, then using those results a MINUS is performed with the third query.

Unlike the mathematical operators (such as addition and multiplication), there is no order of operations with set operators. They are treated equally and run in order in the query from start to finish.

However, you can use brackets to force an order of operations with set operators.  Let’s say you wanted to find all customer names, as well as employee names that weren’t referrer names.

Customer + (Employee – Referrer)

The query would look like this:

SELECT first_name, last_name
FROM customer
UNION ALL
(SELECT first_name, last_name
FROM employee
MINUS
SELECT first_name, last_name
FROM referrer)
ORDER BY first_name, last_name;

Adding the brackets around the second and third SELECT queries will ensure those queries are run together first, and the results of that are used to UNION ALL to the first select.

Using different combinations of brackets and set operators can give you the results you need when working with multiple queries.

 

Conclusion

(Back to Top)

Set operators can be useful when you want to combine the results of different queries into different rows. In SQL, the different set operators are UNION, UNION ALL, INTERSECT and MINUS (or EXCEPT, depending on your database).

Do you have any questions about set operators? Let me know in the comments below.

16 thoughts on “SQL Set Operators: The Complete Guide to UNION, INTERSECT & MINUS”

  1. How to match the number of columns..?? If table1 has 3 columns first_name, last_name and salary
    and table2 has only 2 columns first_name and last_name then how to combine data using set operators..?? Also add an example for this…

    1. It would depend what you want to show from your second table in the salary column. Do you want to show 0, or null, or something else?
      If you want to show NULL, for example, it could be:
      SELECT first_name, last_name, salary FROM table1
      UNION ALL
      SELECT first_name, last_name, NULL FROM table2

  2. I really like your presentation style. It is simple and straightforward. However, you really gloss over the idea of proprietary SQL dialect keywords, such as Oracle’s minus. A post about ANSI SQL and proprietary SQL would be welcome. Did you know the Oracle RDBMS engine as of version 20c supports use of the ANSI standard except and except all, in addition to minus?

    1. Hi Todd, thanks for the comment and glad you like the style.
      Yeah I only mention it briefly here. I can write the article you mentioned, sounds like a good idea. I’m not sure if Oralce 20c supports those but I’ll find out.

  3. Anil Kumar Mahato

    HI Ben,

    I have seen spelling mistake in the topic of “The Different Types of Set Operators”.
    There by mistake you have written INTERCEPT in place of INTERSECT .

    Somehow that will make fresher confused that way i have point out.

  4. Hi Ben,
    This is a pretty great and easy to read guide on set operators. I wanted to mention an SQL error that’s listed 4 times with some variation in the union and union all sections above.
    The sample query that demonstrates union and union all is missing a comma in the 2nd SELECT statement, which if run would result in a “query block has incorrect number of result columns” error. Additionally, it would not give the intended result for the 2nd query itself, as it would be replacing the first_name column with a hardcoded string. It caught me for a minute and made me double-check myself, so I thought it was worth mentioning in case it might trip up someone else as well.

    In the original form, the corrected query would be:
    SELECT ‘Customer’ AS record_type, first_name, last_name
    FROM customer
    UNION ALL
    SELECT ‘Employee’, first_name, last_name
    FROM employee;

    Thanks for the free info. I really appreciate it!

    1. Hi Ben,
      Thanks for the feedback and thanks for pointing out the errors! I’ll update the guide now to correct it based on what you’ve pointed out.
      Thanks!

  5. Jyoti Prajapati

    Thanks Ben. Can we use these operators for four sets in similar manner. For eg.
    example Given set A, set B, set C and Set D

    for union n(AUBUCUD) <- sqldf("select distinct id from set A union select distinct id from set B union select distinct id from set C union select distinct id from set D union")

    for intersection

    n(AnBnCnD) <- sqldf("select distinct id from set A intersect select distinct id from set B intersect select distinct id from set C intersect select distinct id from set D intersect")

    for compliments onlyA <-sqldf("select distinct id from set A except select distinct id from set B except select distinct id from set C except select distinct id from set D except")

    on using these kind of queries in sqldf package in r i get the results but i am not sure if its correct as the total doenot match wiht the union. Please let me know? Its relaly an important calculations

    1. Yes I think you can use them on four sets. The Select union select union… you have mentioned should work. I don’t know what the “for union” and “sqldf” means, but the Select query inside should work.

  6. Thank you for this informative presentation of Set operator , it would be great if you could share complex/ tricky queries which are asked on interview how do we do go about and solve it

  7. Dear Ben,

    under INTERSECT chapter there is stated: “Using our example data, we could use the INTERSECT set operator to find all names in the customer table that don’t exist in the employee table..” . It should be EXIST , right?
    Regards, Miro

  8. Very helpful, and clearly explained. Now I know better how to deal with our members, volunteers, and donors whose overlapping rows are just larger versions of your examples!
    Thanks.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.