Python PANDAS: Stack By Enumerated Date To Create Records Vectorized
I have a dataframe in the following general format:
id,transaction_dt,units,measures
1,2018-01-01,4,30.5
1,2018-01-03,4,26.3
2,2018-01-01,3,12.7
2,2018-01-03,3,8.8
What I am tryi
Solution 1:
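You can use numpy.repeat to repeat each index label as many times as the units column specifies, then pass the result to loc to duplicate the rows. Next, number the copies of each original row with groupby(level=0).cumcount, convert that counter to timedeltas with to_timedelta, and add it to the transaction_dt column. Finally, call reset_index to restore a default unique index: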
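import numpy as np
import pandas as pd
# Repeat every row `units` times; copies keep the original index label
df = df.loc[np.repeat(df.index, df['units'])]
# Shift each copy by its position (0, 1, 2, ...) within its source row, in days
df['transaction_dt'] += pd.to_timedelta(df.groupby(level=0).cumcount(), unit='D')
df = df.reset_index(drop=True)
print(df)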
    id transaction_dt  units  measures
0    1     2018-01-01      4      30.5
1    1     2018-01-02      4      30.5
2    1     2018-01-03      4      30.5
3    1     2018-01-04      4      30.5
4    1     2018-01-03      4      26.3
5    1     2018-01-04      4      26.3
6    1     2018-01-05      4      26.3
7    1     2018-01-06      4      26.3
8    2     2018-01-01      3      12.7
9    2     2018-01-02      3      12.7
10   2     2018-01-03      3      12.7
11   2     2018-01-03      3       8.8
12   2     2018-01-04      3       8.8
13   2     2018-01-05      3       8.8
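For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch built from the sample rows in the question. It assumes transaction_dt is parsed as a datetime column (the timedelta addition needs that) and that units holds non-negative integers. The names csv_text, expanded and day_offset are only illustrative and are not part of the original answer.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question.
csv_text = """id,transaction_dt,units,measures
1,2018-01-01,4,30.5
1,2018-01-03,4,26.3
2,2018-01-01,3,12.7
2,2018-01-03,3,8.8
"""

# Assumption: the date column has to be datetime64 for the addition below.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["transaction_dt"])

# Repeat each row `units` times; the repeated index labels identify the source row.
expanded = df.loc[np.repeat(df.index, df["units"])].copy()

# cumcount() numbers the copies of each source row 0, 1, 2, ...
day_offset = expanded.groupby(level=0).cumcount()
expanded["transaction_dt"] = expanded["transaction_dt"] + pd.to_timedelta(day_offset, unit="D")

# Replace the duplicated labels with a fresh 0..n-1 index.
expanded = expanded.reset_index(drop=True)
print(expanded)
```

The groupby(level=0) step relies on the duplicated index labels that loc leaves behind, so reset_index should only be called after the dates have been shifted.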