Oracle SQL Aggregate FunctionsOracle SQL aggregate functions are very useful, and are some of the most commonly used functions. Learn what Oracle SQL aggregate functions are and what they do in this article.

What Are Oracle SQL Aggregate Functions?

Aggregate functions are functions that allow you to view a single piece of data from multiple pieces of data.

Most functions in Oracle operate on a single row or single record – such as DECODE or LENGTH.

However, aggregate functions are different because you can use them on multiple rows and get a single value as a result.

An example of an aggregate function is SUM. This function allows you to input a range of values, such as those in a column, and get the sum total of those numbers.

If you had a table that looked like this:

SELECT * FROM student;
STUDENT_ID FIRST_NAME LAST_NAME FEES_REQUIRED FEES_PAID ENROLMENT_DATE GENDER ADDRESS_STATE
1 John Smith 500 100 01/FEB/2015 M New York
2 Susan Johnson 150 150 12/JAN/2015 F Colorado
3 Tom Capper 350 320 06/MAR/2015 M Nevada
4 Mark Holloway 500 410 20/JAN/2015 M New York
5 Steven Webber 100 80 09/MAR/2015 M New York
6 Julie Armstrong 100 0 12/FEB/2015 F Texas
7 Michelle Randall 250 23/JAN/2015 F Florida
8 Andrew Cooper 800 400 04/MAR/2015 M Texas
9 Robert Pickering 110 100 30/JAN/2015 M Colorado
10 Tanya Hall 150 150 28/JAN/2015 F Texas

You could then use an aggregate function, such as SUM, to get the total of all values in the fees_paid column.

SELECT SUM(fees_paid) FROM student;
SUM(FEES_PAID)
1710

This function collapses the many rows from the table into a single row that represents the SUM aggregate function.

There are many more aggregate functions, which I’ll explain later in this article.

 

What Is The GROUP BY Clause?

SQL has something called the “group by” clause. What does it mean? What does it do?

The GROUP BY clause allows you to use aggregate functions on parts of your data.

The example above showed the SUM of all of the fees_paid values in the table.

But what if you wanted to see the SUM of fees_paid for each address_state?

SELECT address_state,
SUM(fees_paid)
FROM student;

Result:

ORA-00937: not a single-group group function

You’ll get this error message. This is because we haven’t grouped our data.

The ORA-00937 occurs whenever you use an aggregate function and don’t include all of the non-aggregate functions from the SELECT clause in a GROUP BY clause;

This is where the GROUP BY clause comes in.

The GROUP BY clause allows you to group your aggregate function by other values.

You’ll need to add GROUP BY, then the names of all of the other fields in the SELECT clause.

SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state;

Result:

ADDRESS_STATE SUM(FEES_PAID)
Texas 550
New York 590
Colorado 250
Nevada 320
Florida (null)

This tells Oracle that you want to see the SUM of fees paid, but group by all of the address_state values.

 

What Is The HAVING Clause?

The HAVING clause is another clause that is relevant for aggregate functions.

It allows you to limit the results that are returned based on data after it has been grouped.

What if you wanted to see the SUM of the fees_paid values for each state, but only where the sum was over 500?

You could try use a WHERE clause and say WHERE fees_paid > 500.

SELECT address_state,
SUM(fees_paid)
FROM student
WHERE fees_paid > 500
GROUP BY address_state;

Result:

ADDRESS_STATE SUM(FEES_PAID)

However, in this case, you’ll get no results.

Why is this?

This is because the WHERE clause is run first, which eliminates rows that don’t meet the criteria. In this case, all rows where fees_paid <= 500 are removed from the result set.

Then, the SUM and GROUP BY are performed. However, there are no rows in the result set to do this to.

How can we get the result we want?

Use the HAVING clause.

The HAVING clause is similar to the WHERE clause, but it works on data after it has been grouped.

This is an important point to remember.

The WHERE clause works on data before it has been grouped, and the HAVING clause works on data after it has been grouped.

The syntax is similar to a WHERE clause.

SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state
HAVING fees_paid > 500;

Result:

ORA-00979: not a GROUP BY expression

I’m getting this error because I’ve specified the fees_paid in the HAVING clause, and I need to specify the aggregate function as well.

So, instead of saying HAVING fees_paid, I need to say HAVING SUM(fees_paid), because that’s the value we are checking.

SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state
HAVING SUM(fees_paid) > 500;

Result:

ADDRESS_STATE SUM(FEES_PAID)
Texas 550
New York 590

This now shows the result that we want.

 

DISTINCT, UNIQUE, and ALL with Aggregate Functions

Oracle allows you to use several optional keywords within aggregate functions, which are:

  • DISTINCT
  • UNIQUE
  • ALL

DISTINCT and UNIQUE are synonymous with each other so I’ll refer to it as DISTINCT in this article.

They are used in this way:

SUM ( [ALL | DISTINCT ] expression)

The ALL keyword is the default, and means that the aggregate function will consider all values when performing its calculation.

The DISTINCT keyword can be specified, and it means the aggregate function will only consider distinct or unique values of the expression in the calculation. It’s also documented on the Oracle website here.

Let’s see an example.

SELECT SUM(fees_paid)
FROM student;

Result:

SUM(FEES_PAID)
1710

This shows the normal SUM function without DISTINCT or ALL.

SELECT SUM(ALL fees_paid)
FROM student;

Result:

SUM(ALLFEES_PAID)
1710

If we add the ALL function, it shows the same as the first example. This is because ALL is the default.

SELECT SUM(DISTINCT fees_paid)
FROM student;
SUM(DISTINCTFEES_PAID)
1460

If we add the DISTINCT keyword, the SUM is smaller, because only distinct values are considered.

 

How Does Oracle Handle NULL Values in Aggregate Functions?

In Oracle, almost all of the aggregate functions ignore NULL values.

The functions that consider NULL values are:

  • COUNT(*)
  • GROUPING
  • GROUPING_ID

If you want to include the records that have NULL values, you can convert them to another value using the NVL function.

What happens if you have NULLs in your data and you use an aggregate function on them?

If you use COUNT and REGR_COUNT, you’ll get either a number for the count of records, or 0.

If you use any other function, and you have only NULL values in the data set, the function will return NULL.

 

Can You Use Nested Aggregate Functions?

Yes, you can.

Let’s say you have the SUM of fees_paid for students in each state, as we mentioned earlier.

SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state;

Result:

ADDRESS_STATE SUM(FEES_PAID)
Texas 550
New York 590
Colorado 250
Nevada 320
Florida (null)

What if you wanted to find the average of the SUM of fees paid in each state?

You can wrap the SUM function in another aggregate function, such as AVG.

SELECT
AVG(SUM(fees_paid))
FROM student
GROUP BY address_state;

Result:

AVG(SUM(FEES_PAID)
427.5

The calculation here will find the SUM of fees_paid for each address_state, and then get the AVG of those values.

However, as you might have noticed, you need to remove the field from the SELECT clause. This is because you’re trying to get the overall average, not the average per state.

 

List of Oracle SQL Aggregate Functions

Here’s a list of Oracle SQL aggregate functions, with a short description of what they do.

I have written detailed guides on many functions, and where a guide exists, I’ve linked the function name to it in this table. So, if you click on the function name, you’ll find more information on how to use it and see some examples.

Function Definition
APPROX_COUNT_DISTINCT Finds the approximate number of distinct rows in the expression.
AVG Finds the average of the expression.
COLLECT Creates a nested table from the provided column.
CORR Finds the coefficient of correlation of a set of number pairs.
CORR_K Calculates the Kendall’s tau-b correlation coefficient.
CORR_S Calculates the Spearman’s rho correlation coefficient.
COUNT Finds the number of rows returned by a query.
COVAR_POP Finds the population covariance of a set of number pairs.
COVAR_SAMP Finds the sample covariance of a set of number pairs.
CUME_DIST Finds the cumulative distribution of a value in a group of values.
DENSE_RANK Finds the rank of a row in an ordered group of rows, and rows with the same value get different ranks.
FIRST Used with ranking functions to get the first value.
GROUP_ID Identifies an ID of a grouping level in GROUP BY clauses.
GROUPING Identifies superaggregate rows from regular grouped rows.
GROUPING_ID Identifies the field that has been grouped by.
LAST Used with ranking functions to get the last value.
LISTAGG Orders data within a set of data and then concatenates it with a specified character.
MAX Finds the maximum value of the expression.
MEDIAN Finds the middle value of a set of data.
MIN Finds the minimum value of the expression.
PERCENT_RANK Finds the percentage rank of a row, similar to the CUME_DIST function.
PERCENTILE_CONT Finds a value that would fall within the specified range using a specified percentage value and sort method.
PERCENTILE_DISC Finds a value that would fall within the specified range using a specified percentage value and sort method.
RANK Calculates a rank of a value in a group of values, and rows with the same value get the same rank
REGR_ Functions Fits an ordinary-least-squares regression line ot a set of number pairs.
STATS_BINOMIAL_TEST Tests the difference between a sample proportion and a given proportion.
STATS_CROSSTAB Uses a method to analyse two variables using crosstab.
STATS_F_TEST Checks whether two values are significantly different.
STATS_KS_TEST Checks whether two samples are from the same population, or different populations with the same distribution.
STATS_MODE Finds the value that occurs the most.
STATS_MW_TEST Performs a Mann Whitney test on two populations.
STATS_ONE_WAY_ANOVA Tests the difference in means for groups for statistical significance.
STATS_T_TEST_* Checks the significance of a difference of means.
STATS_WSR_TEST Performs a Wilcoxon Signed Ranks test to see if the median of the differences is significantly different from 0.
STDDEV Finds the standard deviation of the set of values.
STDDEV_POP Finds the population standard deviation of a set of values.
STDDEV_SAMP Finds the cumulative sample standard deviation of a set of values.
SUM Finds the sum of values.
SYS_OP_ZONE_ID Finds a Zone ID from a specified row ID
SYS_XMLAGG Aggregates XML documents and produces a single XML document.
VAR_POP Finds the population variance of a set of numbers.
VAR_SAMP Finds the sample variance of a set of numbers.
VARIANCE Finds the variance of a set of numbers.
XMLAGG Aggregates XML documents and produces a single XML document.

 

So, that’s how you use aggregate functions in Oracle SQL. If you have any questions, please leave a comment at the end of this post.

Lastly, if you enjoy the information and career advice I’ve been providing, sign up to my newsletter below to stay up-to-date on my articles. You’ll also receive a fantastic bonus. Thanks!

Improve Your Oracle SQL With My 10-Day Email Course

x