# Pandas Pivot: A Guide with Examples

Python’s Pandas library is one of the most popular tools in the data scientist’s toolbelt. Data scientists use Pandas to explore, clean, and understand datasets.

In a previous article, we saw how to work with the Pandas DataFrame, the core data structure in the Pandas library. If you’re new to Pandas, that post is a great way to get started.

In this post, we’ll learn how to reshape your DataFrames by using the Pivot method. Pivot allows you to twist your data into a different shape for easier analysis.

We’ll cover:

- Why to pivot your data;
- How to use the
`pivot`

method; - When to use
`pivot`

method vs`pivot_table`

method; - How to use the
`pivot_table`

method.

Let’s get started.

**Why to **`pivot`

your data

`pivot`

your dataBefore we get into details how to pivot, it’s important to know why you want to pivot.

Pivoting your data allows you to reshape it in a way that makes it easier to understand or analyze. Often you’ll use a pivot to demonstrate the relationship between two columns that can be difficult to reason about before the pivot.

If you’re an Excel wizard who has spent a lot of time in spreadsheets, the idea of a pivot may be easy for you. If not, it can be hard to understand without an example. Let’s see how it works.

In the example below, I’ll create a Pandas DataFrame from some stock trading data that I’ve used in the previous Pandas articles.

`>>> import pandas as pd`

>>> stocks = pd.read_csv('https://gist.githubusercontent.com/alexdebrie/b3f40efc3dd7664df5a20f5eee85e854/raw/ee3e6feccba2464cbbc2e185fb17961c53d2a7f5/stocks.csv')

>>> stocks

date symbol open high low close volume

0 2019-03-01 AMZN 1655.13 1674.26 1651.00 1671.73 4974877

1 2019-03-04 AMZN 1685.00 1709.43 1674.36 1696.17 6167358

2 2019-03-05 AMZN 1702.95 1707.80 1689.01 1692.43 3681522

3 2019-03-06 AMZN 1695.97 1697.75 1668.28 1668.95 3996001

4 2019-03-07 AMZN 1667.37 1669.75 1620.51 1625.95 4957017

5 2019-03-01 AAPL 174.28 175.15 172.89 174.97 25886167

6 2019-03-04 AAPL 175.69 177.75 173.97 175.85 27436203

7 2019-03-05 AAPL 175.94 176.00 174.54 175.53 19737419

8 2019-03-06 AAPL 174.67 175.49 173.94 174.52 20810384

9 2019-03-07 AAPL 173.87 174.44 172.02 172.50 24796374

10 2019-03-01 GOOG 1124.90 1142.97 1124.75 1140.99 1450316

11 2019-03-04 GOOG 1146.99 1158.28 1130.69 1147.80 1446047

12 2019-03-05 GOOG 1150.06 1169.61 1146.19 1162.03 1443174

13 2019-03-06 GOOG 1162.49 1167.57 1155.49 1157.86 1099289

14 2019-03-07 GOOG 1155.72 1156.76 1134.91 1143.30 1166559

When we print out the code, you can see that the data has a number of columns and that the rows are organized by trading date and stock symbol.

That organization may be helpful for some analysis, but it can be hard to glean information about trading volume across dates and stock symbols. Let’s reshape our data to look closer at volume.

`>>> stocks.pivot(index='symbol', columns='date', values='volume')`

date 2019-03-01 2019-03-04 2019-03-05 2019-03-06 2019-03-07

symbol

AAPL 25886167 27436203 19737419 20810384 24796374

AMZN 4974877 6167358 3681522 3996001 4957017

GOOG 1450316 1446047 1443174 1099289 1166559

In the example above, I use the pivot method to reshape the data so that the rows are indexed by stock symbol and the columns are trading dates. The value in each cell is the volume on that day. This makes it easy to compare the volume for a stock over time, by reading horizontally, or to compare volume across stocks on a particular day, by reading vertically.

This reshaping power of pivot makes it much easier to understand relationships in your datasets.

**How to use the Pandas **`pivot method`

`pivot method`

To use the pivot method in Pandas, you need to specify three parameters:

**Index**: Which column should be used to identify and order your rows vertically**Columns:**Which column should be used to create the new columns in our reshaped DataFrame. Each unique value in the column stated here will create a column in our new DataFrame.**Values:**Which column(s) should be used to fill the values in the cells of our DataFrame.

Let’s walk through these in an example with our stock trading data. In the example below, I use pivot to examine the closing trading price for each stock symbol over our trading window.

`>>> stocks`

date symbol open high low close volume

0 2019-03-01 AMZN 1655.13 1674.26 1651.00 1671.73 4974877

1 2019-03-04 AMZN 1685.00 1709.43 1674.36 1696.17 6167358

2 2019-03-05 AMZN 1702.95 1707.80 1689.01 1692.43 3681522

3 2019-03-06 AMZN 1695.97 1697.75 1668.28 1668.95 3996001

4 2019-03-07 AMZN 1667.37 1669.75 1620.51 1625.95 4957017

5 2019-03-01 AAPL 174.28 175.15 172.89 174.97 25886167

6 2019-03-04 AAPL 175.69 177.75 173.97 175.85 27436203

7 2019-03-05 AAPL 175.94 176.00 174.54 175.53 19737419

8 2019-03-06 AAPL 174.67 175.49 173.94 174.52 20810384

9 2019-03-07 AAPL 173.87 174.44 172.02 172.50 24796374

10 2019-03-01 GOOG 1124.90 1142.97 1124.75 1140.99 1450316

11 2019-03-04 GOOG 1146.99 1158.28 1130.69 1147.80 1446047

12 2019-03-05 GOOG 1150.06 1169.61 1146.19 1162.03 1443174

13 2019-03-06 GOOG 1162.49 1167.57 1155.49 1157.86 1099289

14 2019-03-07 GOOG 1155.72 1156.76 1134.91 1143.30 1166559

>>> stocks.pivot(index='symbol', columns='date', values='close')

date 2019-03-01 2019-03-04 2019-03-05 2019-03-06 2019-03-07

symbol

AAPL 174.97 175.85 175.53 174.52 172.50

AMZN 1671.73 1696.17 1692.43 1668.95 1625.95

GOOG 1140.99 1147.80 1162.03 1157.86 1143.30

First, I printed out our DataFrame to see how it is organized by default. Then, I use the pivot method to change the data.

Notice that for the **index** parameter, I used `symbol`

. As a result, each unique value for the symbol column — AAPL, AMZN, GOOG — is used as the index, the leftmost column in our DataFrame.

**Faster**?

I used `date`

for the **column** parameter. This resulted in five non-index columns across the top of our DataFrame, one for each unique value in the `date`

column of our initial DataFrame.

Finally, I used `close`

as the **values** parameter. This specifies which value should be placed in each column. I’m interested in the closing price for each stock across the trading days, so I use the `close`

column.

Note that the `index`

and `column`

parameters are interchangeable. If you want to reorganize so that the dates are used as the index and the stock symbols are my columns, you can just flip the parameters:

`>>> stocks.pivot(index='date', columns='symbol', values='close')`

symbol AAPL AMZN GOOG

date

2019-03-01 174.97 1671.73 1140.99

2019-03-04 175.85 1696.17 1147.80

2019-03-05 175.53 1692.43 1162.03

2019-03-06 174.52 1668.95 1157.86

2019-03-07 172.50 1625.95 1143.30

**When to use **`pivot`

vs `pivot_table`

in Pandas

`pivot`

vs `pivot_table`

in PandasSo far we’ve only been using the term ‘pivot’ broadly, but there are actually two Pandas methods for pivoting. The first is the `pivot`

method, which we reviewed in this section. The second is the `pivot_table`

method, which we’ll learn about in the next section.

When deciding between using the `pivot`

or `pivot_table`

method, you need to ask yourself one question:

*Will the results of my pivot have more than one entry in any index + column?*

If the answer to this question is “yes”, you *must* use the `pivot_table`

method. If the answer to this question is “no”, you *may* use the `pivot`

method.

Note that any use of pivot can be switched to `pivot_table`

, but the reverse is not true. If you try to use the pivot method where there would be more than one entry in any index + column combination, it will throw a `ValueError`

.

In the next section, we’ll take a look at how the `pivot_table`

method works in practice.

**How to use the Pandas **`pivot_table`

method

`pivot_table`

methodFor those familiar with Excel or other spreadsheet tools, the pivot table is more familiar as an aggregation tool. Pandas pivot tables are used to group similar columns to find totals, averages, or other aggregations.

You can accomplish this same functionality in Pandas with the `pivot_table`

method. For example, imagine we wanted to find the mean trading volume for each stock symbol in our DataFrame. You could do so with the following use of `pivot_table`

:

`>>> stocks.pivot_table(index='symbol', values='volume')`

volume

symbol

AAPL 23733309.4

AMZN 4755355.0

GOOG 1321077.0

Like with pivot, we specify the index we want to to use for our data as well as the column to use to fill in the values.

Notice that each stock symbol in our index will have five values for the volume column as there are five trading days for each stock. The `pivot_table`

method aggregates these values and applies an aggregate function to reduce it to a single value.

By default, the aggregate function is the mean function from NumPy, but you can pass in a custom aggregation function. If I want to combine my values into a total, I could use NumPy’s sum function:

`>>> import numpy as np`

>>> stocks.pivot_table(index='symbol', values='volume', aggfunc=np.sum)

volume

symbol

AAPL 118666547

AMZN 23776775

GOOG 6605385

The pivot table method is really powerful when using it with a MultiIndex, which allows you to have hierarchies in your index. For example, imagine you had a larger stock trading dataset that included training data over an entire year. You could use a MultiIndex to create a pivot table where values were grouped by stock symbol and month, allowing you to quickly explore how trading volume and other statistics changed on a month-over-month basis for particular stocks.

**Faster**?

**Conclusion**

In this post, we learned about pivoting your DataFrames in Pandas with the pivot and pivot_table functions. We saw why you would want to pivot your data as well as walkthroughs of using both `pivot`

and `pivot_table`

.

Just like Pandas makes it easy to work with data, the Kite plugin for your IDE makes it easy to work with Python. Kite gives you an AI-powered autocomplete in the editor, which saves you keystrokes and helps you code faster on the fly. It also has inline documentation for popular libraries so you don’t have to break your flow. Download it today!