Do you want to learn how to generate pivoted data or pivot tables in MySQL?
In this article, you’ll learn:
- what the pivot concept is
- how to generate a pivot table in MySQL
- how to generate the column headings dynamically
Let’s get into the guide.
What is PIVOT?
The concept of a “pivot” is where data in rows is transformed to be shown in columns instead. It’s commonly used in Microsoft Excel where you create a Pivot Table to display data differently.
It groups your data based on some values and shows these in columns instead.
Some databases, such as SQL Server, have the PIVOT feature built-in.
However, in MySQL, there is no PIVOT feature or keyword. Fortunately, we can still generate this pivot table output.
Let’s see the sample data we’ll use, and then the code to generate the pivot table.
Sample Data and Expected Output
Let’s see some sample data and our expected output to help demonstrate what PIVOT can do.
We’ve got a product_sales table which has a product name, store location, and number of sales. Each of these are separate columns:
Here’s the sample data for this table: mysql_pivot_data.sql
You can download that SQL file to create and populate the tables yourself.
Let’s say we wanted to see this data with product names on the left and store locations across the top, with the number of sales at each intersection:
We could extract it into an Excel file and create a pivot table.
Or, we could use SQL for this. Let’s see how we can do this.
MySQL Pivot using CASE
You can use a combination of an aggregate function and the CASE statement to show a pivot table.
How can we write a query to do this?
First, we write a SELECT query that gets the product names:
SELECT product_name FROM product_sales;
Then we add in the SUM function as another column:
SELECT product_name, SUM() FROM product_sales;
For this SUM function, we want to show the sum of the num_sales column, but only where the store_location is North.
We can do this kind of logic using a CASE statement. And the important thing here is that a CASE statement can go inside a function.
So, we put a CASE statement inside the SUM function. We want the CASE statement to say “if the store location is North, then show the number of sales, otherwise show 0”.
SELECT product_name, SUM(CASE WHEN store_location = 'North' THEN num_sales ELSE 0 END ) AS north FROM product_sales;
In this example we’ve also put a column alias of “north”, so we get that as a column heading instead of a long CASE statement.
Next, we add a GROUP BY clause, because we are showing a column and an aggregate function.
SELECT product_name, SUM(CASE WHEN store_location = 'North' THEN num_sales ELSE 0 END ) AS north FROM product_sales GROUP BY product_name;
We can run this query, even though there is only one column being calculated, and this is what will be shown:
We can now add on the rest of the columns. Add the same SUM and CASE functions but change the store_location to the value you want to check.
The query should look like this:
SELECT product_name, SUM(CASE WHEN store_location = 'North' THEN num_sales ELSE 0 END ) AS north, SUM(CASE WHEN store_location = 'Central' THEN num_sales ELSE 0 END ) AS central, SUM(CASE WHEN store_location = 'South' THEN num_sales ELSE 0 END ) AS south, SUM(CASE WHEN store_location = 'West' THEN num_sales ELSE 0 END ) AS west FROM product_sales GROUP BY product_name;
When you run this query, you should get these results:
We can see the different locations as columns, the different products as rows, and the sum of sales at the intersection of product and location.
That’s how you can do a pivot table in MySQL. You can change the columns and rows to use by changing your query, but we’ve seen the general structure here.
Dynamic Pivot Columns
In the example above we generated a pivot table using an aggregate function (SUM) and a CASE statement. The downside to that approach is that we need to know the column headings when we write the query, and the more columns that appear the more code we need to write.
This may be OK for smaller results, but what if you don’t know all of the possible values? Or what if the values change?
There is a way to dynamically generate the columns of a PIVOT table output. We use the function called GROUP_CONCAT.
Here’s what the structure of the code is:
SET @sql = NULL; SELECT GROUP_CONCAT(logic) INTO @sql FROM your_table; SET @sql = CONCAT('select…', @sql, 'from…'); PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt;
We use the variable of @sql to store the results of GROUP_CONCAT to avoid characters after the 1024 limit of this function from being cut off.
We then add this value into the rest of the query by concatenating with a SELECT and FROM clause.
We then run it using PREPARE and EXECUTE.
Here’s what our real query would look like on the product_sales table.
SET @sql = NULL; SELECT GROUP_CONCAT(DISTINCT CONCAT( 'SUM( CASE WHEN store_location = "', store_location, '" THEN num_sales ELSE 0 END) AS ', store_location) ) INTO @sql FROM product_sales; SET @sql = CONCAT('SELECT product_name, ', @sql, ' FROM product_sales GROUP BY product_name'); SELECT @sql; PREPARE stmt FROM @sql; EXECUTE stmt; DEALLOCATE PREPARE stmt;
We use the DISTINCT keyword inside GROUP_CONCAT so we get a unique list of store_locations. Otherwise, we’d get a very long list and our query would show an error.
Now when we run this query, this is what we see:
With this example, your column headers are generated based on the values in the table.
While there are no MySQL PIVOT keywords available, you can use the SUM function (or other aggregate functions) as well as the CASE statement to generate a pivot table in MySQL.
You can also use the GROUP_CONCAT function and some procedural code to generate the list of columns dynamically.
10 thoughts on “MySQL Pivot: How To Generate a Pivot Table”
Perhaps you should mention the security risk in using dynamic SQL, and also the greater difficulty in debugging it when it doesn’t work as intended? Otherwise the tutorial is a clear exposition of the technique, and straightforward in explanation.
If the final procedure only has to change infrequently, then it is a viable solution, otherwise avoid it if you can.
Great points on security risk and debugging! I’ll mention that in the article.
looking forward to see this, thanks for the article
Wow, this was exactly what I needed. I needed a pivot table to manipulate data to pass to FusionCharts. Problem is I’m using the PHP wrapper to FusionCharts, and the query errors out when being passed in PHP. So I still need to figure out how to use this solution in that context. But still, one step closer. Thanks.
Hello, thanks for this its a great explanation, Ive built something like this but have a small issue with mine.
I cant get my head around what is wrong. I’ve been looking so long I cant see it myself?
This outputs a column called LeadAge which shows how old the record is allowing me to filter in php via a select on the front end.
If I select 7 days from the front end, it outputs correctly but not with today or yesterdays records. If I select this month, it outputs correctly but doesnt include the last 7 days records, yearly shows all except this month records. And This month shows November from every year in the table when it should be just this month (November)
Is what I am trying to not possible?
DateDiff(CurDate(), tbl_invoice_data.entry_date) As Age,
When tbl_invoice_data.entry_date = CurDate()
When tbl_invoice_data.entry_date = CurDate() – Interval 1 Day
When tbl_invoice_data.entry_date >= CurDate() – Interval 6 Day
Then ‘Last 7 days’
When Month(tbl_invoice_data.entry_date) = Month(CurDate())
Then ‘This Month’
When Month(tbl_invoice_data.entry_date) = Month(Date_Sub(CurDate(), Interval 1 Month))
Then ‘Last Month’
When Year(tbl_invoice_data.entry_date) = Year(Date_Sub(CurDate(), Interval 1 Year))
Then ‘Last Year’
When Year(tbl_invoice_data.entry_date) = Year(CurDate())
Then ‘This Year’
Else ‘+30 days’
End) As ‘LeadAge’,
(Year(tbl_invoice_data.entry_date) = Year(CurDate()) Or
Year(tbl_invoice_data.entry_date) = Year(Date_Sub(CurDate(), Interval 1 Year)))
That’s an interesting issue! I would suggest running this query in your SQL editor (e.g. MySQL Workbench) and checking that the data returns correctly.
I think the issue may be due to how CASE works: if a record has matched on one of the CASE criteria it is not evaluated on another criteria. So, today’s date will return a LeadAge date of “Today”, and therefore won’t display if you’re filtering on “Last 7 Days”. The same can be said for all of your other filters.
I don’t know what the best way to solve this would be. Perhaps you can translate each of your selections into a combination of “start date” and “end date” values, and use that to filter in the WHERE clause. Or you could have a range of separate CASE statements to return a 1 if a value meets a criteria. Something like this:
Case When tbl_invoice_data.entry_date = CurDate() THEN 1 ELSE 0 END AS is_today,
CASE When tbl_invoice_data.entry_date = CurDate() - Interval 1 Day THEN 1 ELSE 0 END AS is_yesterday,
Hope that helps!
Great article, thank you!
I am doing like this
CASE WHEN year(created_at) = “‘, year(created_at), ‘” THEN 1 ELSE 0 END)
AS ‘, date_format(created_at,”%e_%b_%y”))
and i am getting very long list of days, can we put some condition here ?
please let me know
Thanks, exactly what I needed, first I tried to use pivot function just to realize that in mysql doesn’t exist. Anyway, great explanation, pretty clear :)
Extraordinarily awesome as usual. This is incredibly handy at the mechanical data wrangling level.
Any thoughts on “the security risk in using dynamic SQL” mentioned in Phil Roberts initial reply? I’ve read the code a few times and can’t see any problems that can’t be worked around.