Reshaping a Pandas DataFrame into Stacked/Record/Database/Long Format
What is the best way to convert a pandas DataFrame from wide format into stacked/record/database/long format? Here's a small code example: Wide format: date hour1 hour2 ho
Solution 1:
You can use melt to convert a DataFrame from wide format to long format:
import pandas as pd
df = pd.DataFrame({'date': ['2012-12-31', '2012-12-30', '2012-12-29', '2012-12-28', '2012-12-27'],
                   'hour1': [9.18, 13.91, 12.97, 22.01, 11.44],
                   'hour2': [-0.1, 0.09, 11.82, 16.04, 0.07]})
# Keep 'date' as the identifier; unpivot the hour columns into 'hour'/'price' pairs
print(pd.melt(df, id_vars=['date'], value_vars=['hour1', 'hour2'], var_name='hour', value_name='price'))
Output:
         date   hour  price
0  2012-12-31  hour1   9.18
1  2012-12-30  hour1  13.91
2  2012-12-29  hour1  12.97
3  2012-12-28  hour1  22.01
4  2012-12-27  hour1  11.44
5  2012-12-31  hour2  -0.10
6  2012-12-30  hour2   0.09
7  2012-12-29  hour2  11.82
8  2012-12-28  hour2  16.04
9  2012-12-27  hour2   0.07
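Note that melt emits all hour1 rows before any hour2 rows. If you would rather have the rows grouped by date, a minimal sketch (assuming a two-row subset of the data above, and that every column other than date should be unpivoted, so value_vars can be omitted) might look like this. The names wide and long_df are only illustrative:

```python
import pandas as pd

# Two-row subset of the data used above (illustrative only)
wide = pd.DataFrame({
    'date': ['2012-12-31', '2012-12-30'],
    'hour1': [9.18, 13.91],
    'hour2': [-0.10, 0.09],
})

# With value_vars omitted, melt unpivots every column not listed in id_vars.
long_df = (
    wide.melt(id_vars='date', var_name='hour', value_name='price')
        # ISO-formatted date strings sort correctly as plain strings
        .sort_values(['date', 'hour'], ascending=[False, True])
        .reset_index(drop=True)
)
print(long_df)
```

Sorting afterwards is optional; it only changes the row order, not the content of the long table.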
Solution 2:
You could use stack to pivot the DataFrame. First set date as the index column:
>>> df.set_index('date').stack()
date
2012-12-31 hour1 9.18
hour2 -0.10
hour3 -7.00
hour4 -64.92
2012-12-30 hour1 13.91
hour2 0.09
hour3 -0.96
hour4 0.08
...
This actually returns a Series with a MultiIndex. To create a DataFrame like the one you specify you could just reset the MultiIndex after stacking and rename the columns:
>>> stacked = df.set_index('date').stack()
>>> df2 = stacked.reset_index()
>>> df2.columns = ['date', 'hour', 'price']
>>> df2
date hour price
0 2012-12-31 hour1 9.18
1 2012-12-31 hour2 -0.10
2 2012-12-31 hour3 -7.00
3 2012-12-31 hour4 -64.92
4 2012-12-30 hour1 13.91
5 2012-12-30 hour2 0.09
6 2012-12-30 hour3 -0.96
7 2012-12-30 hour4 0.08
...
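The same stack approach can also be written as a single chain. The sketch below is self-contained and uses only the two dates whose four hourly values appear in the output above; the variable names wide and long_df are hypothetical. It names the index levels with rename_axis and passes name= to reset_index instead of assigning df2.columns afterwards:

```python
import pandas as pd

# Only the two dates shown in full above; the remaining rows are elided in the source
wide = pd.DataFrame({
    'date': ['2012-12-31', '2012-12-30'],
    'hour1': [9.18, 13.91],
    'hour2': [-0.10, 0.09],
    'hour3': [-7.00, -0.96],
    'hour4': [-64.92, 0.08],
})

long_df = (
    wide.set_index('date')
        .stack()                        # Series indexed by (date, column label)
        .rename_axis(['date', 'hour'])  # name both levels of the MultiIndex
        .reset_index(name='price')      # levels become columns; values become 'price'
)
print(long_df)
```

Unlike melt, this keeps the rows for each date together, because stack walks the frame row by row.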