Build Pivot Tables in MySQL Using User Variables

Learn how to use MySQL user variables along with the CONCAT and GROUP_CONCAT functions to generate dynamic pivot tables from large data sets

18 days ago   •   7 min read

By Shanika Wickramasinghe
Table of contents

One of the most prominent tasks when dealing with databases is to properly filter and extract meaningful data from the underlying database. Due to that, there can be requirements to pivot the data from rows to columns leading to the creation of pivot tables to visualize data better.  

Some databases like Microsoft SQL Server or Oracle come with inbuilt functionality to create a pivot table using the inbuilt pivot() function. However, this function is not available in some databases such as MySQL and MariaDB. In this post, we will discuss how to create pivot tables in MySQL without depending on any special functions.

Pivot Table Data

The best way to create a pivot table in MySQL is using a SELECT statement since it allows us to create the structure of a pivot table by mixing and matching the required data. The most important segment within a SELECT statement is the required fields that directly correspond to the pivot table structure.

In the following sections, let’s see how to create a pivot table in MySQL using a SELECT statement. All the examples presented in this post are based on the Arctype SQL client in a Windows environment using MySQL database.

First, we need a data set to get started. We will be using the following data set that contains historical monthly stock data of a set of organizations. You can download it here.

formatted stock data
Formatted stock data with open and close values for each month

Using the above data set (pivot_stock_data table), we will create a pivot table to identify the changes in the stock prices on a monthly basis.

Using a CASE Statement to Create a Pivot Table

The simplest way to create a pivot table is using CASE statements within the SELECT statement to get the necessary columns and then group them by companies. Let’s assume that we need the pivot table to display data from February 2013 to June 2013. We can achieve this by creating multiple CASE statements to get data for each required month.

SELECT
  `name`,
  MAX(
    CASE
      WHEN formatted_date = '2013-02' THEN delta_pct
      ELSE NULL
    END
  ) AS '2013-02',
  MAX(
    CASE
      WHEN formatted_date = '2013-03' THEN delta_pct
      ELSE NULL
    END
  ) AS '2013-03',
  MAX(
    CASE
      WHEN formatted_date = '2013-04' THEN delta_pct
      ELSE NULL
    END
  ) AS '2013-04',
  MAX(
    CASE
      WHEN formatted_date = '2013-05' THEN delta_pct
      ELSE NULL
    END
  ) AS '2013-05',
  MAX(
    CASE
      WHEN formatted_date = '2013-06' THEN delta_pct
      ELSE NULL
    END
  ) AS '2013-06'
FROM
  pivot_stock_data
GROUP BY
  `name`
ORDER BY
  `name` ASC;

In the above SQL statement, we have configured individual CASE statements to get the ‘delta_pct’ value for the corresponding month using the ‘formatted_date’ field. The MAX operator is used to obtain the maximum value for the given month. However, it can be any operator such as SUM, AVG, etc., depending on the requirement. Finally, we will group the result set by the ‘name’ column to obtain an aggregated result.

case statement hard coded pivot table
A hard-coded pivot table

While this is a straightforward solution for a simple pivot table, it is not scalable for dealing with larger data sets. Moreover, creating individual CASE statements will be a tedious and time-consuming task, and any changes to the underlying data set will require manual modifications to the statement. However,  we can create reusable SQL statements by creating CASE statements programmatically.

Using GROUP_CONCAT and SQL Variables to Create a Pivot Table

When it comes to creating CASE statements programmatically, we can use the GROUP_CONCACT function to retrieve a string of concatenated values from a group. Then we can store the result in a SQL user variable to be used when creating the pivot table.

Programmatically Creating the CASE Statement

In the below SQL statement, we are programmatically creating the CASE statement by creating a statement using CONTACT to fill the formatted_date field automatically. There, all the distinct results will again be concatenated using the GROUP CONCAT function and assigned to the ‘@sql’ user variable.

SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN formatted_date = '",
          formatted_date,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          formatted_date,
          "'"
        )
      )
    FROM
      pivot_stock_data
  );

We need to configure the maximum length for the GROUP_CONCAT function before executing the above query. By default, it is limited to 1024 characters which will cause anything exceeding that limit to be truncated. We can mitigate this issue by setting a custom max length to the GROUP_CONTACT function before executing the query.

SET
  SESSION group_concat_max_len = 100000;
SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN formatted_date = '",
          formatted_date,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          formatted_date,
          "'"
        )
      )
    FROM
      pivot_stock_data
  );
SELECT
  @sql;
group concat insert into result view
What our @sql variable looks like with the GROUP_CONCAT data

In the above code block, we have set the group_concat_max_len value to 100000 for the current session so that nothing will be truncated. Furthermore, we can obtain the output of the variable using a SELECT statement. We can verify whether the CASE statements have been successfully created by looking at the result string.

formatted group concat insert into variable value
The @sql variable contents, formatted
The fast and easy-to-use SQL client for developers and teams

Creating the Complete SQL Statement

Now we have created the CASE statement, yet we need to add it to the SELECT query to execute successfully. For that, we will again use the CONCAT function to include the ‘@sql’ variable in a select statement.

SET
  @pivot_statement = CONCAT(
    "SELECT name,",
    @sql,
    " FROM pivot_stock_data GROUP BY name ORDER BY name ASC"
  );

In the above statement, we assign the concatenated SELECT statement that includes the ‘@sql’ variable to a new variable called ‘@pivot_statement.’ We can verify the final statement using a SELECT statement as shown below.

SET
  SESSION group_concat_max_len = 100000;
SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN formatted_date = '",
          formatted_date,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          formatted_date,
          "'"
        )
      )
    FROM
      pivot_stock_data
  );
SET
  @pivot_statement = CONCAT(
    "SELECT name,",
    @sql,
    " FROM pivot_stock_data GROUP BY name ORDER BY name ASC"
  );
SELECT
  @pivot_statement;

Result

pivot statement complete SQL statement variable
The entire SELECT statement in our @pivot_statement variable

This will result in a complete SQL statement, as shown below.

We've collapsed the middle ~300 lines, but this is the full statement in the @pivot_statement user variable

Executing the SQL Statement.

Now we have a complete SQL statement in the ‘@pivot_statement’ variable, and we need to execute this statement to create the resulting pivot table. To create an executable SQL statement, we will be using the PREPARE function, which will prepare a SQL statement and assign it a name.

PREPARE complete_pivot_statment
FROM
  @pivot_statement;

We have assigned the complete_pivot_statment as the name of the prepared statement. Then we can execute this by referring to the assigned name using the EXECUTE function.

EXECUTE complete_pivot_statment;

That’s it, and now we have a reusable SQL statement to build pivot tables. The complete code block will be as follows;

-- Set GROUP_CONTACT Max Length
SET
  SESSION group_concat_max_len = 100000;
-- Create GROUP_CONTACT Statement
SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN formatted_date = '",
          formatted_date,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          formatted_date,
          "'"
        )
      )
    FROM
      pivot_stock_data
  );
-- Create the Complete SQL Statement
SET
  @pivot_statement = CONCAT(
    "SELECT name,",
    @sql,
    " FROM pivot_stock_data GROUP BY name ORDER BY name ASC"
  );
-- Prepare and Execute
PREPARE complete_pivot_statment
FROM
  @pivot_statement;
EXECUTE complete_pivot_statment;

RESULT

pivot table results view
Voila! Here's our dynamically-generated pivot table!

Filtering Pivot Table Data

We can further filter the resulting data set by modifying the GROUP_CONCAT statement. The way you filter will depend on the data set, fields, and the data types of the targeted table. For example, we can add a WHERE statement to filter data within the year 2013 and retrieve data between ‘2013-01-01’ to ‘2013-12-31’.

-- Set GROUP_CONTACT Max Length
SET
  SESSION group_concat_max_len = 100000;
-- Create GROUP_CONTACT Statement
SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN formatted_date = '",
          formatted_date,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          formatted_date,
          "'"
        )
      )
    FROM
      pivot_stock_data
    WHERE
      yearmonth BETWEEN '2013-01-01 00:00:00' AND '2013-12-31 00:00:00'
  );
-- Create the Complete SQL Statement
SET
  @pivot_statement = CONCAT(
    "SELECT name,",
    @sql,
    " FROM pivot_stock_data GROUP BY name ORDER BY name ASC"
  );
-- Prepare and Execute
PREPARE complete_pivot_statment
FROM
  @pivot_statement;
EXECUTE complete_pivot_statment;

RESULT

pivot table with filtering
Our pivot table with some filtering applied

Let’s look at another example. Assume that you want to change the way data is presented with the months as rows and stocks as the columns. Since it will change both the columns and rows, we need to change both the GROUP_CONCAT statement to include the stocks (name) and the CONTACT statement for the ‘@pivot_statement’ variable to reflect the dates (formatted_date) as rows.

-- Set GROUP_CONTACT Max Length
SET
  SESSION group_concat_max_len = 100000;
-- Create GROUP_CONTACT Statement
SET
  @sql = (
    SELECT
      GROUP_CONCAT (
        DISTINCT CONCAT(
          "MAX(CASE WHEN name = '",
          name,
          "' THEN `delta_pct` ELSE NULL END) AS '",
          name,
          "'"
        )
      )
    FROM
      pivot_stock_data
  );
-- Create the Complete SQL Statement
SET
  @pivot_statement = CONCAT(
    "SELECT formatted_date,",
    @sql,
    " FROM pivot_stock_data GROUP BY formatted_date ORDER BY formatted_date ASC"
  );
-- Prepare and Execute
PREPARE complete_pivot_statment
FROM
  @pivot_statement;
EXECUTE complete_pivot_statment;

RESULT

another pivot table example result view
A pivot table with the company stock tickers as columns and months as rows

Conclusion

In this article, we covered how to pivot data in MySQL using the GROUP_CONCAT function with user variables. We can follow this approach to build a custom SQL statement that will create a pivot table as the output. As with any SQL statement, the resulting pivot table depends on the underlying data set and the structure of the created statement. With careful consideration, we can create any kind of pivot table programmatically using the methods mentioned above.

JOIN the Arctype Newsletter
Programming stories, tutorials, and database tips every 2 weeks

Spread the word