Are you wondering what set operators are?
Have you seen keywords like UNION or MINUS and want to know what they do and why they are used?
In this article, you’ll learn all about set operators such as these, why they are used, and see some examples.
What are SQL Set Operators?
A set operator in SQL is a keyword that lets you combine the results of two queries into a single query.
Sometimes when working with SQL, you’ll have a need to query data from two more tables. But instead of joining these two tables, you’ll need to list the results from both tables in a single result, or in different rows. That’s what set operators do.
The Different Types of Set Operators
There are a few different set operators that can be used, depending on your needs, and which database vendor you’re using.
The different set operators are:
- UNION
- UNION ALL
- MINUS
- INTERSECT
- EXCEPT
Let’s take a look at each of these, using some sample data.
Using Set Operators
Set operators are used like this:
SELECT your_select_query
set_operator
SELECT another_select_query;
It uses two (or more) SELECT queries, with a set operator in the middle.
There are a few things to keep in mind though.
When selecting your columns, the number of columns needs to match between queries, and the data type of each column needs to be compatible.
So, if you select three columns in the first query, you need to select three columns in the second query. The data types also need to be compatible, so if you select a number and two character types in the first query, you need to do the same in the second query.
Also, if you want to order your results, the ORDER BY must go at the end of the last query. You can’t add ORDER BY inside each SELECT query before the set operator.
Sample Data
Our sample data for this article uses a customer and employee table.
Customer
first_name | last_name |
Stephen | Jones |
Mark | Smith |
Denise | King |
Paula | Johnson |
Richard | Archer |
Employee
first_name | last_name |
Christina | Jones |
Michael | McDonald |
Paula | Johnson |
Stephen | Jones |
Richard | Smith |
Each of our query examples in this article will use this data.
UNION: Combining Results
The UNION keyword or set operator will allow you to combine the results of two queries. It removes any duplicate results and shows you the combination of both.
Expressed as a Venn diagram, where each circle represents a query result, it looks like this:
Let’s see an example based on the sample data above. If we wanted to select from both the employee and customer tables, using UNION, our query would look like this:
SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee;
Result:
first_name | last_name |
Stephen | Jones |
Mark | Smith |
Denise | King |
Paula | Johnson |
Richard | Archer |
Christina | Jones |
Michael | McDonald |
Richard | Smith |
This shows us all customer and employee records.
But looking at this, we can’t tell which record comes from each table. Often we don’t need to know, but sometimes we do.
How can we do that? We can add a static value to each query to indicate which table it came from, or what type of record it is. As long as the number and type of columns match, the query will work.
SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee;
Result:
record_type | first_name | last_name |
Customer | Stephen | Jones |
Customer | Mark | Smith |
Customer | Denise | King |
Customer | Paula | Johnson |
Customer | Richard | Archer |
Employee | Christina | Jones |
Employee | Michael | McDonald |
Employee | Richard | Smith |
In this query, I added in a value of ‘Customer’ to display for all records from the Customer table. It has a column alias of record_type. The same column is labelled as ‘Employee’ from the employee table. Looking at the results you can see which records came from which table.
You don’t need to specify the column aliases on the second table. Oracle will know that the first columns match and use the alias already provided in the first query.
To order the results, the ORDER BY needs to go at the end.
SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee
ORDER BY record_type, first_name, last_name;
Results:
record_type | first_name | last_name |
Customer | Denise | King |
Customer | Mark | Smith |
Customer | Paula | Johnson |
Customer | Richard | Archer |
Customer | Stephen | Jones |
Employee | Christina | Jones |
Employee | Michael | McDonald |
Employee | Richard | Smith |
All of the set operators support standard keywords inside the SELECT statement. The set operator is just a way of combining results.
So, you can add WHERE clauses like this if you need to:
SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
WHERE status = ‘Active’
UNION
SELECT ‘Employee’, first_name, last_name
FROM employee
WHERE emp_status = ‘Current’
ORDER BY record_type, first_name, last_name;
What’s the Difference Between a UNION and a JOIN?
The UNION and JOIN keywords both combine results from two different tables or queries.
The difference is how they are combined.
UNION combines data into separate rows, and JOIN combines data into separate columns.
When performing a JOIN, there is a column that matches between the two tables, and additional data may be displayed.
If we wanted to JOIN our employee and customer tables, our query might look like this:
SELECT
e.emp_id,
e.first_name,
e.last_name,
c.customer_id,
c.first_name,
c.last_name
FROM employee e
INNER JOIN customer c
ON e.first_name = c.first_name
AND e.last_name = c.last_name;
Result:
cust_id | first_name | last_name | emp_id | first_name | last_name |
15 | Stephen | Jones | 3 | Stephen | Jones |
81 | Paula | Johnson | 9 | Paula | Johnson |
When we use UNION to combine our tables, it looks like this:
SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee;
Result:
first_name | last_name |
Stephen | Jones |
Mark | Smith |
Denise | King |
Paula | Johnson |
Richard | Archer |
Christina | Jones |
Michael | McDonald |
Richard | Smith |
We can use different JOIN types to display different records based on matches being found, but in general, that’s how a join is different to a UNION.
More Information
For more information on UNION:
UNION ALL: Combining Results in a Different Way
The UNION ALL set operator also combines the results from two queries.
It’s very similar to UNION, but it does not remove duplicates.
Expressed as a Venn diagram, it looks like this:
Let’s see an example. If we wanted to select from both the employee and customer tables, using UNION ALL, our query would look like this:
SELECT first_name, last_name
FROM customer
UNION ALL
SELECT first_name, last_name
FROM employee;
Result:
first_name | last_name |
Stephen | Jones |
Mark | Smith |
Denise | King |
Paula | Johnson |
Richard | Archer |
Christina | Jones |
Michael | McDonald |
Paula | Johnson |
Stephen | Jones |
Richard | Smith |
This shows us all customer and employee records.
There are some records in this table that appear twice (Stephen Jones and Paula Johnson). Why is that?
This is because UNION ALL does not remove duplicates. So the same first name and last name values exist in both the customer and employee tables, and this query shows each of them.
We can use the same adjustment to the query as we looked at earlier to see which table the data comes from:
SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION ALL
SELECT ‘Employee’, first_name, last_name
FROM employee;
Result:
record_type | first_name | last_name |
Customer | Stephen | Jones |
Customer | Mark | Smith |
Customer | Denise | King |
Customer | Paula | Johnson |
Customer | Richard | Archer |
Employee | Christina | Jones |
Employee | Michael | McDonald |
Employee | Paula | Johnson |
Employee | Stephen | Jones |
Employee | Richard | Smith |
This will let us see which table the data comes from, for each record.
What’s the Difference between UNION and UNION ALL?
The major difference between UNION ALL and UNION in SQL is that UNION removes any duplicate results from the final result set, and UNION ALL does not. UNION performs a DISTINCT on the result set to remove duplicates.
To remember this, consider that “ALL” in UNION ALL means “show all records”.
Therefore, UNION ALL will almost always show more results, as it does not remove duplicate records.
As a result of this, UNION is often slower than UNION ALL, because there is an operation to remove duplicate values (a.k.a DISTINCT), which is often a costly step in a query.
UNION ALL does not perform a distinct, so is usually faster.
So, if you don’t need to have unique rows in your result set, or if you’re sure the rows in the database or query are unique already, then use UNION ALL.
More Information
For more information on UNION ALL:
- UNION ALL Optimisation
- SQL SERVER – Puzzle – Why Does UNION ALL Work but UNION Produces Error?
- Is there any limit on the number of UNION ALL statements in a single SQL query on Oracle database?
MINUS or EXCEPT: Finding Results That Are Missing
Another set operator we can use is the MINUS keyword or the EXCEPT keyword.
The MINUS set operator will return results that are found in the first query specified that don’t exist in the second query.
The EXCEPT keyword is similar to MINUS, but is available in SQL Server and other databases.
As a Venn diagram, it looks like this:
Using our example data, we could use the MINUS set operator to find all names in the customer table that don’t exist in the employee table.
Our query would look like this:
SELECT first_name, last_name
FROM customer
MINUS
SELECT first_name, last_name
FROM employee;
Result:
first_name | last_name |
Mark | Smith |
Denise | King |
Richard | Archer |
If a result exists in the employee table as well as the customer table, it is not shown. Only the results from the customer table that are not in the employee table are shown.
What About the EXCEPT Set Operator?
The EXCEPT keyword is another set operator you might see in your code or in online examples.
EXCEPT is the same as MINUS – they both show results from one query that don’t exist in another query.
However, MINUS is an Oracle-specific keyword, and EXCEPT is in other databases such as SQL Server.
So, if you see EXCEPT anywhere, just know it’s the same as MINUS but for a different database.
More Information
For more information on MINUS and EXCEPT:
- Minus operator in SQL
- SQL Server: EXCEPT and INTERSECT
- SQL Server equivalent of Oracle”s “MINUS” operator?
INTERSECT: Showing Common Results
The INTERSECT keyword allows you to find results that exist in both queries. Two SELECT statements are needed, and any results that are found in both of them are returned if INTERSECT is used.
As a Venn diagram, it looks like this:
Using our example data, we could use the INTERSECT set operator to find all names in the customer table that don’t exist in the employee table.
Our query would look like this:
SELECT first_name, last_name
FROM customer
INTERSECT
SELECT first_name, last_name
FROM employee;
Result:
first_name | last_name |
Stephen | Jones |
Paula | Johnson |
This shows all names that are in both the customer and employee table.
If we want to order these, we add an ORDER BY at the end.
SELECT first_name, last_name
FROM customer
INTERSECT
SELECT first_name, last_name
FROM employee
ORDER BY first_name, last_name;
Result:
first_name | last_name |
Paula | Johnson |
Stephen | Jones |
What’s the Difference between UNION and INTERSECT?
The difference between UNION and INTERSECT is that UNION gets results from both queries and combines them, while INTERSECT gets results that only exist in both queries.
So, if Query 1 returns records A and B, and Query 2 returns records B and C, UNION would return A, B and C. INTERSECT would only return B.
More Information
For more information on the INTERSECT set operator:
- Oracle Documentation
- SQL Server Documentation
- You Probably don’t Use SQL INTERSECT or EXCEPT Often Enough
Working with Two or More Set Operators
So far we’ve looked at combining two queries into one using one set operator.
What if you want to combine three queries using two set operators? Or more?
You can do that in SQL.
You can just add another set operator and another query at the end.
For example, let’s say that along with our customer and employee tables, we had a table for referrers (people who referred customers).
If we wanted to see all of the names in all tables, we could write a query like this:
SELECT first_name, last_name
FROM customer
UNION
SELECT first_name, last_name
FROM employee
UNION
SELECT first_name, last_name
FROM referrer
ORDER BY first_name, last_name;
This query will select from the customer table, UNION that to the employee table, and then union the results at that point to the referrer table.
If we wanted to see all records in the customer and employee that weren’t in the referrer table, we can write this query:
SELECT first_name, last_name
FROM customer
UNION ALL
SELECT first_name, last_name
FROM employee
MINUS
SELECT first_name, last_name
FROM referrer
ORDER BY first_name, last_name;
Now, you might be wondering how this works? The first query is run, then a UNION ALL is performed with the second query, then using those results a MINUS is performed with the third query.
Unlike the mathematical operators (such as addition and multiplication), there is no order of operations with set operators. They are treated equally and run in order in the query from start to finish.
However, you can use brackets to force an order of operations with set operators. Let’s say you wanted to find all customer names, as well as employee names that weren’t referrer names.
Customer + (Employee – Referrer)
The query would look like this:
SELECT first_name, last_name
FROM customer
UNION ALL
(SELECT first_name, last_name
FROM employee
MINUS
SELECT first_name, last_name
FROM referrer)
ORDER BY first_name, last_name;
Adding the brackets around the second and third SELECT queries will ensure those queries are run together first, and the results of that are used to UNION ALL to the first select.
Using different combinations of brackets and set operators can give you the results you need when working with multiple queries.
Conclusion
Set operators can be useful when you want to combine the results of different queries into different rows. In SQL, the different set operators are UNION, UNION ALL, INTERSECT and MINUS (or EXCEPT, depending on your database).
Do you have any questions about set operators? Let me know in the comments below.
Excellent your genus sir.
For getting all records in the customer and employee that weren’t in the referrer table, you have used union all . Shouldn’t union be used instead ? Assuming a customer A who is also an employee and referrer would appear in result set ?
SELECT first_name, last_name
FROM customer
UNION ALL
SELECT first_name, last_name
FROM employee
MINUS
SELECT first_name, last_name
FROM referrer
ORDER BY first_name, last_name;
Consider this:
https://dzone.com/articles/say-no-to-venn-diagrams-when-explaining-joins
Thanks, yes I’ve read that article before and understand where they are coming from. It’s also a good way of explaining joins.
How to match the number of columns..?? If table1 has 3 columns first_name, last_name and salary
and table2 has only 2 columns first_name and last_name then how to combine data using set operators..?? Also add an example for this…
It would depend what you want to show from your second table in the salary column. Do you want to show 0, or null, or something else?
If you want to show NULL, for example, it could be:
SELECT first_name, last_name, salary FROM table1
UNION ALL
SELECT first_name, last_name, NULL FROM table2
I really like your presentation style. It is simple and straightforward. However, you really gloss over the idea of proprietary SQL dialect keywords, such as Oracle’s minus. A post about ANSI SQL and proprietary SQL would be welcome. Did you know the Oracle RDBMS engine as of version 20c supports use of the ANSI standard except and except all, in addition to minus?
Hi Todd, thanks for the comment and glad you like the style.
Yeah I only mention it briefly here. I can write the article you mentioned, sounds like a good idea. I’m not sure if Oralce 20c supports those but I’ll find out.
HI Ben,
I have seen spelling mistake in the topic of “The Different Types of Set Operators”.
There by mistake you have written INTERCEPT in place of INTERSECT .
Somehow that will make fresher confused that way i have point out.
Hi Ben,
This is a pretty great and easy to read guide on set operators. I wanted to mention an SQL error that’s listed 4 times with some variation in the union and union all sections above.
The sample query that demonstrates union and union all is missing a comma in the 2nd SELECT statement, which if run would result in a “query block has incorrect number of result columns” error. Additionally, it would not give the intended result for the 2nd query itself, as it would be replacing the first_name column with a hardcoded string. It caught me for a minute and made me double-check myself, so I thought it was worth mentioning in case it might trip up someone else as well.
In the original form, the corrected query would be:
SELECT ‘Customer’ AS record_type, first_name, last_name
FROM customer
UNION ALL
SELECT ‘Employee’, first_name, last_name
FROM employee;
Thanks for the free info. I really appreciate it!
Hi Ben,
Thanks for the feedback and thanks for pointing out the errors! I’ll update the guide now to correct it based on what you’ve pointed out.
Thanks!
Hi Ben,
Thanks for great explanation. It’s very useful.
Thanks Ben. Can we use these operators for four sets in similar manner. For eg.
example Given set A, set B, set C and Set D
for union n(AUBUCUD) <- sqldf("select distinct id from set A union select distinct id from set B union select distinct id from set C union select distinct id from set D union")
for intersection
n(AnBnCnD) <- sqldf("select distinct id from set A intersect select distinct id from set B intersect select distinct id from set C intersect select distinct id from set D intersect")
for compliments onlyA <-sqldf("select distinct id from set A except select distinct id from set B except select distinct id from set C except select distinct id from set D except")
on using these kind of queries in sqldf package in r i get the results but i am not sure if its correct as the total doenot match wiht the union. Please let me know? Its relaly an important calculations
Yes I think you can use them on four sets. The Select union select union… you have mentioned should work. I don’t know what the “for union” and “sqldf” means, but the Select query inside should work.
Thank you for this informative presentation of Set operator , it would be great if you could share complex/ tricky queries which are asked on interview how do we do go about and solve it
Dear Ben,
under INTERSECT chapter there is stated: “Using our example data, we could use the INTERSECT set operator to find all names in the customer table that don’t exist in the employee table..” . It should be EXIST , right?
Regards, Miro
Very helpful, and clearly explained. Now I know better how to deal with our members, volunteers, and donors whose overlapping rows are just larger versions of your examples!
Thanks.
thank you so much for taking your effort and time for posting this very informative knowledge about the SET Operators. Cheers from India !! :)
Thanks for sharing. Very useful article.