FB

What Does SQL PARTITION BY Do?

Have you ever seen the PARTITION BY keyword used in SQL queries? If so, do you know what it does and how to use it? Learn about the SQL PARTITION BY keyword and see some examples in this article.

What is the SQL PARTITION BY Keyword?

PARTITION BY is a keyword that can be used in aggregate queries in SQL, such as SUM and COUNT.

This keyword, along with the OVER keyword, allows you to specify the range of records that are used for each group within the function.

It works a little like the GROUP BY clause but it’s a bit different.

 

Simple Example of PARTITION BY

Let’s take a look at an example.

Suppose we had a subject table that defined subject information for a school. They had a teacher_id that referred to the teacher of that subject.

Let’s say you wanted to find the subject IDs, teacher IDs, and the number of subjects that teacher teaches using COUNT.

SELECT
subject_id,
teacher_id,
COUNT(*) OVER (PARTITION BY teacher_id) AS teacher_count
FROM subject;

Result:

SUBJECT_ID TEACHER_ID TEACHER_COUNT
1 14 1
2 8 1
3 9 1
4 12 1
5 21 2
6 21 2

This table shows you all of the subject_id and teacher_id values. It also shows the teacher_count, which is the number of subjects for each teacher for each row.

You can see in the last few records, the teacher_count value is 2 because there are 2 records with that teacher_id. The value is repeated for that row.

The other records, which have a different value for the teacher_id, have a different count value.

It’s like you’re performing an aggregate function on a set of records, but without aggregating the entire result set.

 

Why Not Use a Subquery?

Sure, you can also use a subquery to get the same results. However, the subquery would be:

  • Slower (probably)
  • Harder to write

How would you do this in a subquery?

SELECT
s.subject_id,
s.teacher_id,
sub.teacher_count
FROM subject s
INNER JOIN (
  SELECT
  teacher_id,
  COUNT(*) AS teacher_count
  FROM subject
  GROUP BY teacher_id)
sub ON sub.teacher_id = s.teacher_id
ORDER BY s.teacher_id, s.subject_id;

Result:

SUBJECT_ID TEACHER_ID TEACHER_COUNT
1 14 1
2 8 1
3 9 1
4 12 1
5 21 2
6 21 2

You can see that the results are the same, but the query is longer. It also may perform slower because you’re looking at the table twice (once in the outer query and once in the inner query that has been joined to it) instead of just once.

While you’re here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here:

 

Why Should I Use PARTITION BY Instead of GROUP BY?

We can try to use a normal query with a GROUP BY to get the same results.

SELECT
subject_id,
teacher_id,
COUNT(*)
FROM subject
GROUP BY subject_id, teacher_id;

Result:

SUBJECT_ID TEACHER_ID COUNT(*)
1 14 1
2 8 1
3 9 1
4 12 1
5 21 1
6 21 1

You can see that there is a value of 1 for each record. This is because the COUNT is counting the number of unique pairs of subject and teacher, and they are always unique.

How about we change the GROUP BY to only group by teacher?

SELECT
subject_id,
teacher_id,
COUNT(*)
FROM subject
GROUP BY teacher_id;

Result (if we run this in Oracle):

ORA-00979: not a GROUP BY expression

This error appears because of how GROUP BY works. You need to specify each column in the SELECT clause that isn’t an aggregate function.

So, we can see that using the GROUP BY won’t give us the result we want. For this purpose, we can use a subquery (like we did earlier), or use the PARTITION BY keyword.

 

Conclusion

Hopefully, that answers your questions on what the SQL PARTITION BY keyword is and how to use it.

Do you have any questions on the PARTITION BY keyword? Leave them in the comments section below.

While you’re here, if you want an easy-to-use list of the main features in SQL for different vendors, get my SQL Cheat Sheets here:

5 thoughts on “What Does SQL PARTITION BY Do?”

  1. “Let’s say you wanted to find the subject “IDs, teacher IDs, and the number of teachers for that subject using COUNT”.

    Doesn’t COUNT show the number of subjects a teacher teaches? The count of teachers for each subject is one. Teacher 21 teaches two subjects.

    1. Hi Steve,
      Yes, the COUNT function will show the number of subjects a teacher teaches. You’re right, it’s a slight wording difference but a big difference in meaning! I’ve updated the article.
      Thanks,
      Ben

  2. SQL PARTITION BY is a SQL statement that divides a table into multiple partitions. Each partition contains a subset of the table’s data.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Table of Contents