Oracle SQL Analytic FunctionsIn this article, I explain what Oracle SQL analytic functions are, how they are different from other functions, and show you some examples.

What Are Oracle SQL Analytic Functions?

Analytic functions are functions that calculate an aggregate value based on a group of rows.

A recent article I wrote on SQL aggregate functions mention that aggregate functions calculate an aggregate value for a result set, which must be grouped by the columns in the SELECT statement, or to find an overall value.

With analytic functions, however, you can group the data that is used to calculate the function’s result, but show a value one each record in the result set, without grouping.

An analytic function generally looks like this:

function_name ( arguments ) OVER ( [query_partition_clause] 
[ORDER BY order_by_clause [windowing_clause] ] )

Many of the aggregate functions can also be used as an analytic function.

 

SQL Analytic Function Example

Let’s take a look at an example.

If you had a table that looked like this:

SELECT * FROM student;
STUDENT_IDFIRST_NAMELAST_NAMEFEES_REQUIREDFEES_PAIDENROLMENT_DATEGENDERADDRESS_STATE
1JohnSmith5001001/Feb/15MNew York
2SusanJohnson15015012/Jan/15FColorado
3TomCapper3503206/Mar/15MNevada
4MarkHolloway50041020/Jan/15MNew York
5StevenWebber100809/Mar/15MNew York
6JulieArmstrong100012/Feb/15FTexas
7MichelleRandall25023/Jan/15FFlorida
8AndrewCooper8004004/Mar/15MTexas
9RobertPickering11010030/Jan/15MColorado
10TanyaHall15015028/Jan/15FTexas

What if you want to find the number of students in each state, and display that next to each student record?

SELECT address_state,
COUNT(*)
FROM student
GROUP BY address_state;
ADDRESS_STATECOUNT(*)
Texas3
New York3
Colorado2
Nevada1
Florida1

This would work, but it would only show the states and the count. It doesn’t show the student information.

I could add the student information, but then the COUNT(*) records would only show 1, because they are grouped by the student’s first_name and last_name.

This is where analytic functions come in.

You can use the COUNT function as an analytic function

The COUNT function as an analytic function would look like this:

COUNT(*) OVER (PARTITION BY address_state)

This will find the COUNT of records for each state – but will not require any grouping in the SELECT statement.

SELECT first_name,
last_name,
address_state,
COUNT(*) OVER (PARTITION BY address_state) AS state_count
FROM student;
FIRST_NAMELAST_NAMEADDRESS_STATESTATE_COUNT
RobertPickeringColorado2
SusanJohnsonColorado2
MichelleRandallFlorida1
TomCapperNevada1
MarkHollowayNew York3
JohnSmithNew York3
StevenWebberNew York3
AndrewCooperTexas3
TanyaHallTexas3
JulieArmstrongTexas3

As you can see, all of the student records are shown, along with their state, and the number of records that state contains.

 

What Is The PARTITION BY Clause?

As you can see in the earlier example, the PARTITION BY specifies how the function groups the data before calculating the result.

It works similar to a GROUP BY, but it doesn’t reduce the number of rows that are shown in the result set.

If you don’t include the PARTITION BY, then you’ll get this result:

SELECT first_name, last_name, address_state,
COUNT(*) OVER () AS state_count
FROM student;
FIRST_NAMELAST_NAMEADDRESS_STATESTATE_COUNT
JohnSmithNew York10
SusanJohnsonColorado10
TomCapperNevada10
MarkHollowayNew York10
StevenWebberNew York10
JulieArmstrongTexas10
MichelleRandallFlorida10
AndrewCooperTexas10
RobertPickeringColorado10
TanyaHallTexas10

The state_count column shows an overall COUNT for all records, and displays it for each row.

 

Why Can’t I Just Use Subqueries?

You might have noticed that you can get the same results using subqueries and joins.

This is true. But, there are some advantages of using Oracle SQL analytic functions instead of subqueries and joins to get this result.

  • Easier to write. Once you understand how analytic functions work, they are easier to write than subqueries and joins to get the same result.
  • Easier to maintain. If you have a complex query, it can be harder to maintain it if you are using subqueries to get the same result that an analytic function would.
  • May be faster. Most of the time, using Oracle’s built-in functionality is faster than writing your own. Using these analytic functions will likely make your code run faster than using subqueries.

 

When Are Analytic Functions Performed?

When Oracle processes a query, the analytic functions are the last set of operations performed, except for the ORDER BY clause. This means that the joins, the WHERE clause, GROUP BY clause, and HAVING clause are all performed first, then the analytic functions are performed.

This means that the analytic functions can only appear in the SELECT list or the ORDER BY clause.

 

Can You Nest Analytic Functions?

Yes and no.

While aggregate functions allow you to nest functions, you can’t do the same thing with analytic functions.

However, you can specify an analytic function inside a subquery, and then perform another analytic function on that column inside your main query.

 

Can You Use Multiple Analytic Functions in the Same Query?

Yes, you can.

You can have different analytic functions in the same SELECT statement, and they can have the same query_partition_clause or different query_partition_clause values.

You can also have the same analytic functions but use different query_partition_clause values as well.

 

What Is The ORDER_BY_Clause Within Analytic Functions?

The order_by_clause within an analytic function is different to the ORDER BY clause for the entire query.

This clause specifies how data is ordered within a partition.

For some analytic functions, such as COUNT and MAX, the order does not matter.

However, other functions such as LEAD, LAG, and RANK, the order does matter. So, this is how the ordering is done for those functions.

The order_by_clause looks like this:

ORDER BY expr [ASC | DESC] [NULLS [FIRST | LAST ] ]

In this clause:

  • expr: This is the expression or column to order by.
  • ASC|DESC: This specifies that the expr should be ordered in ascending or descending order. The default is ascending.
  • NULLS FIRST|LAST: This specifies where rows with NULL values should appear – either first in the list, or last. For ascending order sorts, NULLS LAST is the default. For descending order sorts, NULLS FIRST is the default.

 

What Is The Windowing Clause?

The windowing_clause allows you to specify a range of rows that are used to perform the calculations for the current row.

The syntax of the windowing_clause is quite complicated, but here it is:

 ROWS|RANGE BETWEEN start_expr AND end_expr

The start_expr can be any of:

  • UNBOUNDED_PRECEDING
  • CURRENT ROW
  • expr PRECEDING|FOLLOWING

And the end_expr can be any of:

  • UNBOUNDED_FOLLOWING
  • CURRENT ROW
  • expr PRECEDING|FOLLOWING

The ROWS and RANGE keywords specify the window for each row for calculating the result of the function. ROWS specifies the window using rows, and RANGE specifies the window as a logical offset.

Let’s take a look at an example using the ROW type syntax.

SELECT first_name,
last_name,
address_state,
COUNT(*) OVER (PARTITION BY address_state 
ORDER BY address_state ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING) AS state_count
FROM student;
FIRST_NAMELAST_NAMEADDRESS_STATESTATE_COUNT
RobertPickeringColorado2
SusanJohnsonColorado2
MichelleRandallFlorida1
TomCapperNevada1
MarkHollowayNew York2
JohnSmithNew York3
StevenWebberNew York3
AndrewCooperTexas2
TanyaHallTexas3
JulieArmstrongTexas3

You can see here that the state_count is a little different. It only counts the records in the range that has been mentioned in the ROWS clause.

The ROWS BETWEEN 2 PRECEDING AND 1 FOLLOWING means that rows are only counted where the address_state equals the current records address_state, and only for the two rows before it and the one row after it.

This is why some of the Texas rows have a state_count of 2 and others are 3.

What Is The List Of Oracle SQL Analytic Functions?

Here’s a list of the Oracle SQL analytic functions. Many of them are also aggregate functions.

FunctionDefinition
AVGFinds the average of the expression.
CORRFinds the coefficient of correlation of a set of number pairs.
COUNTFinds the number of rows returned by a query.
COVAR_POPFinds the population covariance of a set of number pairs.
COVAR_SAMPFinds the sample covariance of a set of number pairs.
CUME_DISTFinds the cumulative distribution of a value in a group of values.
DENSE_RANKFinds the rank of a row in an ordered group of rows, and rows with the same value get different ranks.
FIRSTUsed with ranking functions to get the first value.
FIRST_VALUEFinds the first value in an ordered set of values.
LAGGet data from the preceding row without using a join
LASTUsed with ranking functions to get the last value.
LAST_VALUEFinds the last value in an ordered set of values.
LEADGet data from the following row without using a join
LISTAGGOrders data within a set of data and then concatenates it with a specified character.
MAXFinds the maximum value of the expression.
MEDIANFinds the middle value of a set of data.
MINFinds the minimum value of the expression.
NTH_VALUEReturns the nth value in an ordered set of values
NTILEDivides an ordered data set into buckets and assigns a bucket number to each row
PERCENT_RANKFinds the percentage rank of a row, similar to the CUME_DIST function.
PERCENTILE_CONTFinds a value that would fall within the specified range using a specified percentage value and sort method.
PERCENTILE_DISCFinds a value that would fall within the specified range using a specified percentage value and sort method.
RANKCalculates a rank of a value in a group of values, and rows with the same value get the same rank
RATIO_TO_REPORTFinds the ratio of a value to the sum of values.
REGR_ FunctionsFits an ordinary-least-squares regression line ot a set of number pairs.
ROW_NUMBERAssigns a unique number to each row.
STDDEVFinds the standard deviation of the set of values.
STDDEV_POPFinds the population standard deviation of a set of values.
STDDEV_SAMPFinds the cumulative sample standard deviation of a set of values.
SUMFinds the sum of values.
VAR_POPFinds the population variance of a set of numbers.
VAR_SAMPFinds the sample variance of a set of numbers.
VARIANCEFinds the variance of a set of numbers.

 

Conclusion

So, in conclusion, analytic functions can be used to find totals of data within different groups. They are very useful for analysing data, whether you’re getting data for an application or working on a data warehouse.

Lastly, if you enjoy the information and career advice I’ve been providing, sign up to my newsletter below to stay up-to-date on my articles. You’ll also receive a fantastic bonus. Thanks!

Want To Improve Your Oracle SQL?

Improve your Oracle SQL queries by signing up to my free 10-day Oracle Tips email course.

BONUS: Oracle SQL functions PDF cheat sheet

We won't send you spam. Unsubscribe at any time. Powered by ConvertKit