FB

How to Generate a Running Total in SQL

Have you ever needed to calculate a “running total” or “cumulative sum” in SQL?

You can do this in SQL without any other programming language to process the data.

In this guide, you’ll learn how to calculate a running total in SQL. This method will work in Oracle, SQL Server, MySQL, and PostgreSQL.

We’ll look at an example, explain the query, and see how it can be done in groups as well.

Let’s get into it.

What is a Running Total?

A running total, or cumulative sum, is a value that indicates the total of a value in the current row and all rows above it.

It can be used across a span of time, to find the total order value so far for an eCommerce site. Or, it could be used in a sports website to display the team’s score so far for every score that is made.

There are many uses for it, and in this guide we’ll see how to calculate it.

 

Sample Data

We’ll use some sample data to demonstrate the concept of an SQL running total.

Here’s our sample table called sales_history, which represents orders and the value of orders that have been made over time.

order_id sales_date salesperson order_value
1 01 Aug 2021 John 15
2 01 Aug 2021 Sarah 8
3 01 Aug 2021 Sally 19
4 02 Aug 2021 John 2
5 02 Aug 2021 Mark 18
6 03 Aug 2021 Sally 3
7 03 Aug 2021 Mark 21
8 03 Aug 2021 Sarah 16
9 04 Aug 2021 John 4

Here’s the SQL to create the table. This will work in MySQL but you can adjust the data types for other databases.

CREATE TABLE sales_history (
  order_id INT,
  sales_date DATE,
  salesperson VARCHAR(20),
  order_value INT
);

 

INSERT INTO sales_history
(order_id, sales_date, salesperson, order_value) VALUES
(1, '20210801', 'John', 15),
(2, '20210801', 'Sarah', 8),
(3, '20210801', 'Sally', 19),
(4, '20210802', 'John', 2),
(5, '20210802', 'Mark', 18),
(6, '20210803', 'Sally', 3),
(7, '20210803', 'Mark', 21),
(8, '20210803', 'Sarah', 16),
(9, '20210804', 'John', 4);

If we want to select data from this table, our query would look like this:

SELECT
order_id,
sales_date,
salesperson,
order_value
FROM sales_history;

How can we generate a running total?

 

Generate a Running Total in SQL with a Window Function

To generate a running total in SQL, we can use something called a “window function”.

A window function is a way to write a function so it looks over a range of records, or a “window”, instead of all records or a single record.

In this case, we can use a window function to look at the SUM of the number of sales. Our “window” or range of data to look at would be all of the rows that have come before this.

I’ve written a guide to window functions here if you want to learn more about them.

For our example, we can use the SUM function to calculate the running total.

The SUM function, written as a window function, looks like this:

SUM (expression)
OVER (
  [ PARTITION BY partition_clause ]
  [ ORDER BY order_clause ]
)

This has a few components:

  • The expression is the column we want to sum.
  • The partition_clause specifies the range of data to use
  • The order_clause specifies how data is ordered in the range

Let’s see an example of this.

Here’s our Select query from earlier:

SELECT
order_id,
sales_date,
salesperson,
order_value
FROM sales_history;

Next, we add the SUM function in:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM()
FROM sales_history;

We add in the order_value column inside the SUM function:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value)
FROM sales_history;

Then, we add the word OVER and some brackets after the SUM function:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value)
  OVER ()
FROM sales_history;

Inside the OVER clause, or the brackets after OVER, we can add two things:

  • the partition clause, which is the range of data to calculate the sum on
  • the order by clause, which defines how the data is ordered for the calculation

For our running total, we want to calculate the SUM based on all records so far, and not a specific range of data. So, we can leave the partition by clause out.

We want to include the order by clause though. This defines how the running total is calculated.

A running total includes all of the values before it. But how do we know what comes before it? We can use the order_id value. This lets us know what the order of the records are.

So, the running total for order ID 2 includes the order value for orders 1 and 2. For order 3, it includes orders 1, 2, and 3.

We can specify this in the order by clause inside the OVER clause.

Our query would look like this:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value)
  OVER (ORDER BY order_id ASC)
FROM sales_history;

Next, we can give this SUM function a column alias of running_total, so we know what it represents.

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value)
  OVER (ORDER BY order_id ASC) AS running_total
FROM sales_history;

Finally, we should order the entire result set. We’re calculating the running total, but if the rows appear in a different order it may seem confusing. The Order By inside the OVER clause is only for the SUM calculation, and not for showing data.

Here’s the query with an ORDER BY clause which ensures the data is shown in order.

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value)
  OVER (ORDER BY order_id ASC) AS running_total
FROM sales_history
ORDER BY order_id ASC;

Run this query, and you’ll see these results:

order_id sales_date salesperson order_value running_total
1 01 Aug 2021 John 15 15
2 01 Aug 2021 Sarah 8 23
3 01 Aug 2021 Sally 19 42
4 02 Aug 2021 John 2 44
5 02 Aug 2021 Mark 18 62
6 03 Aug 2021 Sally 3 65
7 03 Aug 2021 Mark 21 86
8 03 Aug 2021 Sarah 16 102
9 04 Aug 2021 John 4 106

Awesome! We’ve used the SUM function as a window function to calculate the running total.

Here’s a summary of how to do it:

SELECT
your_columns,
SUM(column_to_sum)
  OVER (ORDER BY column_that_determines_order ASC) AS running_total
FROM your_table
ORDER BY column_that_determines_order ASC;

This includes:

  • your_columns: the columns you want to see
  • column_to_sum: the column that you want to sum for the running total
  • column_that_determines_order: the column that determines how the running total is determined
  • your_table: the table that contains your data

 

Generating a Running Total In Groups

Another scenario for generating a running total is to see a running total in different groups.

Perhaps you want to see a running total of sales per month, or scores in a match by each person, or something else.

You can do this with a slight modification of the SUM function.

Here’s our query from earlier:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value) OVER (
  ORDER BY order_id ASC
) AS running_total
FROM sales_history
ORDER BY order_id ASC;

In the OVER clause, another clause to add is the partition by clause. This allows us to specify groups when calculating the SUM value.

This means, for our data, we could see the running total of orders for each salesperson.

To do this, we partition by the salesperson in our query:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) AS running_total
FROM sales_history
ORDER BY order_id ASC;

If we run this query, this is what we’ll see:

order_id sales_date salesperson order_value running_total
1 01 Aug 2021 John 15 15
2 01 Aug 2021 Sarah 8 8
3 01 Aug 2021 Sally 19 19
4 02 Aug 2021 John 2 17
5 02 Aug 2021 Mark 18 18
6 03 Aug 2021 Sally 3 22
7 03 Aug 2021 Mark 21 39
8 03 Aug 2021 Sarah 16 24
9 04 Aug 2021 John 4 21

The numbers are different from our earlier query. What does this mean?

The running total is calculated separately for each salesperson.

We can see John’s first running total is 15, then on John’s next record (order id 4), the running total increases to 17, and then to 21 for order id 9.

Order id 2, for Sarah, has a running total of 8. Her total then increases on order id 8, to 24 (as the order value of 16 was added).

It may make more sense if we order by the salesperson, then by the order ID.

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) AS running_total
FROM sales_history
ORDER BY salesperson ASC, order_id ASC;

Here are the results:

order_id sales_date salesperson order_value running_total
1 01 Aug 2021 John 15 15
4 02 Aug 2021 John 2 17
9 04 Aug 2021 John 4 21
5 02 Aug 2021 Mark 18 18
7 03 Aug 2021 Mark 21 39
3 01 Aug 2021 Sally 19 19
6 03 Aug 2021 Sally 3 22
2 01 Aug 2021 Sarah 8 8
8 03 Aug 2021 Sarah 16 24

We can see how each salesperson’s running total is incremented, and the running total is then reset for each new salesperson.

So, to calculate a running total for groups, you can use the PARTITION BY clause of the SUM function.

 

Filter on a Running Total

Another scenario you might see when working with running totals is the need to filter on it.

How can you filter on this data? Let’s say you want to see all orders where the running total is greater than 20.

We could try this query, using the WHERE clause:

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) AS running_total
FROM sales_history
WHERE SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) > 20
ORDER BY salesperson ASC, order_id ASC;

Other than the WHERE clause being quite long, it causes an error as we can’t use window functions in a WHERE clause.

What about the HAVING clause?

SELECT
order_id,
sales_date,
salesperson,
order_value,
SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) AS running_total
FROM sales_history
HAVING SUM(order_value) OVER (
  PARTITION BY salesperson
  ORDER BY order_id ASC) > 20
ORDER BY salesperson ASC, order_id ASC;

This doesn’t work either. If you run this in MySQL, for instance, you’ll get an error:

Error Code: 3593. You cannot use the window function ‘sum’ in this context.’

So how can we do it?

We can enclose the entire query inside a subquery, then perform filtering based on those columns.

This is a helpful technique in many situations and we can use it here too.

So, enclose the query in a subquery like this:

SELECT
order_id,
sales_date,
salesperson,
order_value,
running_total
FROM (
  SELECT
  order_id,
  sales_date,
  salesperson,
  order_value,
  SUM(order_value) OVER (
    PARTITION BY salesperson
    ORDER BY order_id ASC) AS running_total
  FROM sales_history
) sub
ORDER BY salesperson ASC, order_id ASC;

We leave the ORDER BY on the outside of the query as that’s how the data is ordered.

Now, we can add a where clause on the running total, because here it’s a derived column (from the subquery) and not a window function.

SELECT
order_id,
sales_date,
salesperson,
order_value,
running_total
FROM (
  SELECT
  order_id,
  sales_date,
  salesperson,
  order_value,
  SUM(order_value) OVER (
    PARTITION BY salesperson
    ORDER BY order_id ASC) AS running_total
  FROM sales_history
) sub
WHERE running_total > 20
ORDER BY salesperson ASC, order_id ASC;

Run this query and this is the result we’ll see:

order_id sales_date salesperson order_value running_total
9 04 Aug 2021 John 4 21
7 03 Aug 2021 Mark 21 39
6 03 Aug 2021 Sally 3 22
8 03 Aug 2021 Sarah 16 24

We’ve filtered the results based on the running total value by enclosing the query inside a subquery and putting the WHERE clause outside the subquery.

Only those rows with a running total value of greater than 20 are shown.

 

Conclusion

So, that’s how to calculate a running total in SQL. This technique (using SUM as a window function) should work in Oracle, SQL Server, MySQL, and PostgreSQL, and many other databases too.

If you have any questions or comments, leave them in the comments below.

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.