Reshaping with Pivot

Pivot

404 image
cereal_long
name mfr nutrition value
0 Special K K calories 110
1 Special K K protein 6
2 Apple Jacks K calories 110
... ... ... ... ...
7 Cheerios G protein 6
8 Wheaties G calories 100
9 Wheaties G protein 3

10 rows × 4 columns

cereal_long.head(5)
name mfr nutrition value
0 Special K K calories 110
1 Special K K protein 6
2 Apple Jacks K calories 110
3 Apple Jacks K protein 2
4 Raisin Bran K calories 120


cereal_wide = cereal_long.pivot(index='name', columns='nutrition', values='value')
cereal_wide
nutrition calories protein
name
Apple Jacks 110 2
Cheerios 110 6
Raisin Bran 120 3
Special K 110 6
Wheaties 100 3
cereal_long.head(3)
name mfr nutrition value
0 Special K K calories 110
1 Special K K protein 6
2 Apple Jacks K calories 110
cereal_wide = cereal_long.pivot(index='name', columns='nutrition', values='value')
cereal_wide
nutrition calories protein
name
Apple Jacks 110 2
Cheerios 110 6
Raisin Bran 120 3
Special K 110 6
Wheaties 100 3

.pivot() takes 3 arguments:

  • index: Used to make the new dataframe’s index.
  • columns: The column that currently exists but that we want to create new columns labels from. Each unique value in this column will become a new column label.
  • values: The name of the column that currently exists but that contains the cell values we want to relocate to new columns. These values will be displayed in the respective newly created columns.
cereal_long
name mfr nutrition value
0 Special K K calories 110
1 Special K K protein 6
2 Apple Jacks K calories 110
... ... ... ... ...
7 Cheerios G protein 6
8 Wheaties G calories 100
9 Wheaties G protein 3

10 rows × 4 columns


cereal_wide
nutrition calories protein
name
Apple Jacks 110 2
Cheerios 110 6
Raisin Bran 120 3
Special K 110 6
Wheaties 100 3



404 image



404 image


cereal_wide = cereal_long.pivot(index='name', columns='nutrition', values='value')

Resetting the Index

cereal_wide.head(5)
nutrition calories protein
name
Apple Jacks 110 2
Cheerios 110 6
Raisin Bran 120 3
Special K 110 6
Wheaties 100 3


cereal_wide_messy = cereal_wide.reset_index()
cereal_wide_messy.head(5)
nutrition name calories protein
0 Apple Jacks 110 2
1 Cheerios 110 6
2 Raisin Bran 120 3
3 Special K 110 6
4 Wheaties 100 3
cereal_wide_messy.head()
nutrition name calories protein
0 Apple Jacks 110 2
1 Cheerios 110 6
2 Raisin Bran 120 3
3 Special K 110 6
4 Wheaties 100 3


cereal_wide_cleaned = cereal_wide_messy.rename_axis('', axis='columns')
cereal_wide_cleaned.head()
name calories protein
0 Apple Jacks 110 2
1 Cheerios 110 6
2 Raisin Bran 120 3
3 Special K 110 6
4 Wheaties 100 3

Let’s apply what we learned!