Pandas Rolling On A Shifted Dataframe
Here's a piece of code. I don't understand why the last column, rm-5, has NaN for the first 4 items. I understand that for the rm columns the first 4 items aren't filled because there i
Solution 1:
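You can change the order of operations. Right now you shift first and take the rolling mean afterwards. The shift by -5 puts NaNs at the end of the series, and the rolling mean over the shifted series still needs 5 values before it produces a result, so the first 4 rows are NaN as well. If you take the rolling mean first and shift afterwards, the 4 leading NaNs are shifted out of the frame and only the NaNs at the end remain.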
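import numpy as np
import pandas as pd

index = pd.date_range('2000-1-1', periods=100, freq='D')
df = pd.DataFrame(data=np.random.randn(100), index=index, columns=['A'])
# pd.rolling_mean was removed from pandas; use .rolling(window).mean() instead
df['rm'] = df['A'].rolling(5).mean()
df['shift'] = df['A'].shift(-5)
# shift first, then average: NaN in the first 4 rows and in the last 5 rows
df['rm-5-shift_first'] = df['A'].shift(-5).rolling(5).mean()
# average first, then shift: the leading NaNs are shifted out of the frame
df['rm-5-mean_first'] = df['A'].rolling(5).mean().shift(-5)
print(df.head(n=8))
print(df.tail(n=8))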
                    A         rm      shift  rm-5-shift_first  rm-5-mean_first
2000-01-01  -0.120808        NaN   0.830231               NaN         0.184197
2000-01-02   0.029547        NaN   0.047451               NaN         0.187778
2000-01-03   0.002652        NaN   1.040963               NaN         0.395440
2000-01-04  -1.078656        NaN  -1.118723               NaN         0.387426
2000-01-05   1.137210  -0.006011   0.469557          0.253896         0.253896
2000-01-06   0.830231   0.184197  -0.390506          0.009748         0.009748
2000-01-07   0.047451   0.187778  -1.624492         -0.324640        -0.324640
2000-01-08   1.040963   0.395440  -1.259306         -0.784694        -0.784694

                    A         rm      shift  rm-5-shift_first  rm-5-mean_first
2000-04-02  -1.283123  -0.270381   0.226257          0.760370         0.760370
2000-04-03   1.369342   0.288072   2.367048          0.959912         0.959912
2000-04-04   0.003363   0.299997   1.143513          1.187941         1.187941
2000-04-05   0.694026   0.400442        NaN               NaN              NaN
2000-04-06   1.508863   0.458494        NaN               NaN              NaN
2000-04-07   0.226257   0.760370        NaN               NaN              NaN
2000-04-08   2.367048   0.959912        NaN               NaN              NaN
2000-04-09   1.143513   1.187941        NaN               NaN              NaN
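Because the output above comes from random data, the pattern can be easier to check on a small deterministic series. The following is only a minimal sketch, assuming a pandas version that provides Series.rolling; the toy series 0..9 and the variable names are made up for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: the values 0..9, so every mean can be verified by hand.
s = pd.Series(np.arange(10, dtype=float))

window, lag = 5, -5  # same window size and shift as in the answer

# Shift first, then average: the window must fill up again after the shift.
shift_first = s.shift(lag).rolling(window).mean()

# Average first, then shift: the leading NaNs of the rolling mean are shifted away.
mean_first = s.rolling(window).mean().shift(lag)

# Row positions that hold NaN in each result.
nan_shift_first = shift_first.isna().to_numpy().nonzero()[0].tolist()
nan_mean_first = mean_first.isna().to_numpy().nonzero()[0].tolist()

print(nan_shift_first)
print(nan_mean_first)
print(shift_first.iloc[4], mean_first.iloc[4])
```

With this series, the shift-first result is NaN everywhere except row 4, while the mean-first result is NaN only in the last 5 rows. In the one row where both are defined they agree, which matches the identical values in the two rightmost columns of the output above.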
For more see:
http://pandas.pydata.org/pandas-docs/stable/computation.html#moving-rolling-statistics-moments
http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.shift.html