Oracle SQL aggregate functions are very useful, and are some of the most commonly used functions. Learn what Oracle SQL aggregate functions are and what they do in this article.
What Are Oracle SQL Aggregate Functions?
Aggregate functions are functions that allow you to view a single piece of data from multiple pieces of data.
Most functions in Oracle operate on a single row or single record – such as DECODE or LENGTH.
However, aggregate functions are different because you can use them on multiple rows and get a single value as a result.
An example of an aggregate function is SUM. This function allows you to input a range of values, such as those in a column, and get the sum total of those numbers.
If you had a table that looked like this:
SELECT * FROM student;
STUDENT_ID | FIRST_NAME | LAST_NAME | FEES_REQUIRED | FEES_PAID | ENROLMENT_DATE | GENDER | ADDRESS_STATE |
1 | John | Smith | 500 | 100 | 01/FEB/2015 | M | New York |
2 | Susan | Johnson | 150 | 150 | 12/JAN/2015 | F | Colorado |
3 | Tom | Capper | 350 | 320 | 06/MAR/2015 | M | Nevada |
4 | Mark | Holloway | 500 | 410 | 20/JAN/2015 | M | New York |
5 | Steven | Webber | 100 | 80 | 09/MAR/2015 | M | New York |
6 | Julie | Armstrong | 100 | 0 | 12/FEB/2015 | F | Texas |
7 | Michelle | Randall | 250 | 23/JAN/2015 | F | Florida | |
8 | Andrew | Cooper | 800 | 400 | 04/MAR/2015 | M | Texas |
9 | Robert | Pickering | 110 | 100 | 30/JAN/2015 | M | Colorado |
10 | Tanya | Hall | 150 | 150 | 28/JAN/2015 | F | Texas |
You could then use an aggregate function, such as SUM, to get the total of all values in the fees_paid column.
SELECT SUM(fees_paid) FROM student;
SUM(FEES_PAID) |
1710 |
This function collapses the many rows from the table into a single row that represents the SUM aggregate function.
There are many more aggregate functions, which I’ll explain later in this article.
You can also do some interesting things with aggregate functions, such as selecting all rows that meet a maximum value, or selecting the maximum value based on a subgroup.
What Is The GROUP BY Clause?
SQL has something called the “group by” clause. What does it mean? What does it do?
The GROUP BY clause allows you to use aggregate functions on parts of your data.
The example above showed the SUM of all of the fees_paid values in the table.
But what if you wanted to see the SUM of fees_paid for each address_state?
SELECT address_state,
SUM(fees_paid)
FROM student;
Result:
ORA-00937: not a single-group group function
You’ll get this error message. This is because we haven’t grouped our data.
The ORA-00937 occurs whenever you use an aggregate function and don’t include all of the non-aggregate functions from the SELECT clause in a GROUP BY clause;
This is where the GROUP BY clause comes in.
The GROUP BY clause allows you to group your aggregate function by other values.
You’ll need to add GROUP BY, then the names of all of the other fields in the SELECT clause.
SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state;
Result:
ADDRESS_STATE | SUM(FEES_PAID) |
Texas | 550 |
New York | 590 |
Colorado | 250 |
Nevada | 320 |
Florida | (null) |
This tells Oracle that you want to see the SUM of fees paid, but group by all of the address_state values.
What Is The HAVING Clause?
The HAVING clause is another clause that is relevant for aggregate functions.
It allows you to limit the results that are returned based on data after it has been grouped.
What if you wanted to see the SUM of the fees_paid values for each state, but only where the sum was over 500?
You could try use a WHERE clause and say WHERE fees_paid > 500.
SELECT address_state,
SUM(fees_paid)
FROM student
WHERE fees_paid > 500
GROUP BY address_state;
Result:
ADDRESS_STATE | SUM(FEES_PAID) |
However, in this case, you’ll get no results.
Why is this?
This is because the WHERE clause is run first, which eliminates rows that don’t meet the criteria. In this case, all rows where fees_paid <= 500 are removed from the result set.
Then, the SUM and GROUP BY are performed. However, there are no rows in the result set to do this to.
How can we get the result we want?
Use the HAVING clause.
The HAVING clause is similar to the WHERE clause, but it works on data after it has been grouped.
This is an important point to remember.
The WHERE clause works on data before it has been grouped, and the HAVING clause works on data after it has been grouped.
The syntax is similar to a WHERE clause.
SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state
HAVING fees_paid > 500;
Result:
ORA-00979: not a GROUP BY expression
I’m getting this error because I’ve specified the fees_paid in the HAVING clause, and I need to specify the aggregate function as well.
So, instead of saying HAVING fees_paid, I need to say HAVING SUM(fees_paid), because that’s the value we are checking.
SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state
HAVING SUM(fees_paid) > 500;
Result:
ADDRESS_STATE | SUM(FEES_PAID) |
Texas | 550 |
New York | 590 |
This now shows the result that we want.
DISTINCT, UNIQUE, and ALL with Aggregate Functions
Oracle allows you to use several optional keywords within aggregate functions, which are:
- DISTINCT
- UNIQUE
- ALL
DISTINCT and UNIQUE are synonymous with each other so I’ll refer to it as DISTINCT in this article.
They are used in this way:
SUM ( [ALL | DISTINCT ] expression)
The ALL keyword is the default, and means that the aggregate function will consider all values when performing its calculation.
The DISTINCT keyword can be specified, and it means the aggregate function will only consider distinct or unique values of the expression in the calculation. It’s also documented on the Oracle website here.
Let’s see an example.
SELECT SUM(fees_paid)
FROM student;
Result:
SUM(FEES_PAID) |
1710 |
This shows the normal SUM function without DISTINCT or ALL.
SELECT SUM(ALL fees_paid)
FROM student;
Result:
SUM(ALLFEES_PAID) |
1710 |
If we add the ALL function, it shows the same as the first example. This is because ALL is the default.
SELECT SUM(DISTINCT fees_paid)
FROM student;
SUM(DISTINCTFEES_PAID) |
1460 |
If we add the DISTINCT keyword, the SUM is smaller, because only distinct values are considered.
How Does Oracle Handle NULL Values in Aggregate Functions?
In Oracle, almost all of the aggregate functions ignore NULL values.
The functions that consider NULL values are:
- COUNT(*)
- GROUPING
- GROUPING_ID
If you want to include the records that have NULL values, you can convert them to another value using the NVL function.
What happens if you have NULLs in your data and you use an aggregate function on them?
If you use COUNT and REGR_COUNT, you’ll get either a number for the count of records, or 0.
If you use any other function, and you have only NULL values in the data set, the function will return NULL.
Can You Use Nested Aggregate Functions?
Yes, you can.
Let’s say you have the SUM of fees_paid for students in each state, as we mentioned earlier.
SELECT address_state,
SUM(fees_paid)
FROM student
GROUP BY address_state;
Result:
ADDRESS_STATE | SUM(FEES_PAID) |
Texas | 550 |
New York | 590 |
Colorado | 250 |
Nevada | 320 |
Florida | (null) |
What if you wanted to find the average of the SUM of fees paid in each state?
You can wrap the SUM function in another aggregate function, such as AVG.
SELECT
AVG(SUM(fees_paid))
FROM student
GROUP BY address_state;
Result:
AVG(SUM(FEES_PAID) |
427.5 |
The calculation here will find the SUM of fees_paid for each address_state, and then get the AVG of those values.
However, as you might have noticed, you need to remove the field from the SELECT clause. This is because you’re trying to get the overall average, not the average per state.
List of Oracle SQL Aggregate Functions
Here’s a list of Oracle SQL aggregate functions, with a short description of what they do.
I have written detailed guides on many functions, and where a guide exists, I’ve linked the function name to it in this table. So, if you click on the function name, you’ll find more information on how to use it and see some examples.
Function | Definition |
APPROX_COUNT_DISTINCT | Finds the approximate number of distinct rows in the expression. |
AVG | Finds the average of the expression. |
COLLECT | Creates a nested table from the provided column. |
CORR | Finds the coefficient of correlation of a set of number pairs. |
CORR_K | Calculates the Kendall’s tau-b correlation coefficient. |
CORR_S | Calculates the Spearman’s rho correlation coefficient. |
COUNT | Finds the number of rows returned by a query. |
COVAR_POP | Finds the population covariance of a set of number pairs. |
COVAR_SAMP | Finds the sample covariance of a set of number pairs. |
CUME_DIST | Finds the cumulative distribution of a value in a group of values. |
DENSE_RANK | Finds the rank of a row in an ordered group of rows, and rows with the same value get the same ranks but rank numbers are not skipped. This can result in a ranking of 1, 2, 3, 3, 4 (rank 4 is still used, even if there is a tie for rank 3). |
FIRST | Used with ranking functions to get the first value. |
GROUP_ID | Identifies an ID of a grouping level in GROUP BY clauses. |
GROUPING | Identifies superaggregate rows from regular grouped rows. |
GROUPING_ID | Identifies the field that has been grouped by. |
LAST | Used with ranking functions to get the last value. |
LISTAGG | Orders data within a set of data and then concatenates it with a specified character. |
MAX | Finds the maximum value of the expression. |
MEDIAN | Finds the middle value of a set of data. |
MIN | Finds the minimum value of the expression. |
PERCENT_RANK | Finds the percentage rank of a row, similar to the CUME_DIST function. |
PERCENTILE_CONT | Finds a value that would fall within the specified range using a specified percentage value and sort method. |
PERCENTILE_DISC | Finds a value that would fall within the specified range using a specified percentage value and sort method. |
RANK | Calculates a rank of a value in a group of values, and rows with the same value get the same rank with rank numbers getting skipped. This can result in a ranking of 1, 2, 3, 3, 5 (rank 4 is skipped because there is a tie). |
REGR_ Functions | Fits an ordinary-least-squares regression line ot a set of number pairs. |
STATS_BINOMIAL_TEST | Tests the difference between a sample proportion and a given proportion. |
STATS_CROSSTAB | Uses a method to analyse two variables using crosstab. |
STATS_F_TEST | Checks whether two values are significantly different. |
STATS_KS_TEST | Checks whether two samples are from the same population, or different populations with the same distribution. |
STATS_MODE | Finds the value that occurs the most. |
STATS_MW_TEST | Performs a Mann Whitney test on two populations. |
STATS_ONE_WAY_ANOVA | Tests the difference in means for groups for statistical significance. |
STATS_T_TEST_* | Checks the significance of a difference of means. |
STATS_WSR_TEST | Performs a Wilcoxon Signed Ranks test to see if the median of the differences is significantly different from 0. |
STDDEV | Finds the standard deviation of the set of values. |
STDDEV_POP | Finds the population standard deviation of a set of values. |
STDDEV_SAMP | Finds the cumulative sample standard deviation of a set of values. |
SUM | Finds the sum of values. |
SYS_OP_ZONE_ID | Finds a Zone ID from a specified row ID |
SYS_XMLAGG | Aggregates XML documents and produces a single XML document. |
VAR_POP | Finds the population variance of a set of numbers. |
VAR_SAMP | Finds the sample variance of a set of numbers. |
VARIANCE | Finds the variance of a set of numbers. |
XMLAGG | Aggregates XML documents and produces a single XML document. |
So, that’s how you use aggregate functions in Oracle SQL. If you have any questions, please leave a comment at the end of this post.
Lastly, if you enjoy the information and career advice I’ve been providing, sign up to my newsletter below to stay up-to-date on my articles. You’ll also receive a fantastic bonus. Thanks!
Very informative. Thank you
i think you need to revise your descriptions of RANK and DENSE_RANK. As I understand it, in both cases, rows with the same value get the same rank. The difference lies in the ranks given to values following a tie.
Example:
RANKs for (10, 20, 20, 30) are (1, 2, 2, 4).
DENSE_RANKs for the same sequence are (1, 2, 2, 3)
Hi Nelson, yes you’re right, thanks for letting me know. I’ll update the descriptions!
Thank you Mr Ben for this information