
Reshaping A Pandas Dataframe Into Stacked/record/database/long Format

What is the best way to convert a pandas DataFrame from wide format into stacked/record/database/long format? Here's a small code example: Wide format: date hour1 hour2 ho

Solution 1:

You can use melt to convert a DataFrame from wide format to long format:

import pandas as pd

df = pd.DataFrame({'date': ['2012-12-31', '2012-12-30', '2012-12-29', '2012-12-28', '2012-12-27'],
                   'hour1': [9.18, 13.91, 12.97, 22.01, 11.44],
                   'hour2': [-0.1, 0.09, 11.82, 16.04, 0.07]})
# Keep 'date' as the identifier and unpivot the hour columns into (hour, price) pairs
print(pd.melt(df, id_vars=['date'], value_vars=['hour1', 'hour2'], var_name='hour', value_name='price'))

Output:

         date   hour  price
0  2012-12-31  hour1   9.18
1  2012-12-30  hour1  13.91
2  2012-12-29  hour1  12.97
3  2012-12-28  hour1  22.01
4  2012-12-27  hour1  11.44
5  2012-12-31  hour2  -0.10
6  2012-12-30  hour2   0.09
7  2012-12-29  hour2  11.82
8  2012-12-28  hour2  16.04
9  2012-12-27  hour2   0.07
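The call above lists value_vars explicitly. As a minimal sketch, assuming every column other than date should be unpivoted, value_vars can be left out and melt will pick up all remaining columns. The names wide and long_df are illustrative only, and the frame reuses the first two dates from the question's data. Sorting afterwards is optional; it just groups the rows by date rather than by original column.

```python
import pandas as pd

# Illustrative frame built from the first two dates in the question's data
wide = pd.DataFrame({
    'date': ['2012-12-31', '2012-12-30'],
    'hour1': [9.18, 13.91],
    'hour2': [-0.10, 0.09],
    'hour3': [-7.00, -0.96],
    'hour4': [-64.92, 0.08],
})

# With value_vars omitted, melt unpivots every column not listed in id_vars
long_df = wide.melt(id_vars='date', var_name='hour', value_name='price')

# melt orders rows by original column; sort to group them by date instead
long_df = long_df.sort_values(['date', 'hour'], ignore_index=True)
print(long_df)
```

This avoids having to keep the value_vars list in sync when more hour columns are added, at the cost of unpivoting any unrelated column that happens to be in the frame.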

Solution 2:

You could use stack to pivot the DataFrame. First set date as the index column:

>>> df.set_index('date').stack()
date             
2012-12-31  hour1      9.18
            hour2     -0.10
            hour3     -7.00
            hour4    -64.92
2012-12-30  hour1     13.91
            hour2      0.09
            hour3     -0.96
            hour4      0.08
...

This returns a Series with a MultiIndex (date first, then the original column name). To get a DataFrame like the one you specify, reset the MultiIndex after stacking and rename the columns:

>>> stacked = df.set_index('date').stack()
>>> df2 = stacked.reset_index()
>>> df2.columns = ['date', 'hour', 'price']
>>> df2
          date   hour   price
0   2012-12-31  hour1    9.18
1   2012-12-31  hour2   -0.10
2   2012-12-31  hour3   -7.00
3   2012-12-31  hour4  -64.92
4   2012-12-30  hour1   13.91
5   2012-12-30  hour2    0.09
6   2012-12-30  hour3   -0.96
7   2012-12-30  hour4    0.08
...
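The same stack, reset and rename steps can also be written as one chain. This is a sketch rather than the answer's own code: it assumes rename_axis is used to name the two index levels and that reset_index(name=...) names the value column, and it uses an illustrative frame (wide, long_df) built from the first two dates shown above.

```python
import pandas as pd

# Illustrative frame built from the first two dates in the output above
wide = pd.DataFrame({
    'date': ['2012-12-31', '2012-12-30'],
    'hour1': [9.18, 13.91],
    'hour2': [-0.10, 0.09],
    'hour3': [-7.00, -0.96],
    'hour4': [-64.92, 0.08],
})

long_df = (
    wide.set_index('date')
        .stack()                          # Series indexed by (date, column name)
        .rename_axis(['date', 'hour'])    # name both index levels
        .reset_index(name='price')        # index levels become columns; values go to 'price'
)
print(long_df)
```

Naming the levels before resetting the index means the column names do not have to be assigned positionally afterwards.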
