Pandas: How To Groupby Based On Series Pattern
Given the following df: pd.DataFrame({'bool':[True,True,True, False,True,True,True], 'foo':[1,3,2,6,2,4,7]}) which results in: bool foo 0 True 1 1 T
Solution 1:
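You could do:
import numpy as np
# Index labels of the rows where "bool" is True
# (this relies on the default integer index, so labels equal positions)
x = df[df["bool"]].index.values
# Split wherever two neighbouring labels are more than 1 apart
groups = np.split(x, np.where(np.diff(x) > 1)[0] + 1)
df_groups = [df.iloc[gr, :] for gr in groups]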
The output looks like:
df_groups[0]
Out[56]:
   bool  foo
0  True    1
1  True    3
2  True    2
df_groups[1]
Out[57]:
   bool  foo
4  True    2
5  True    4
6  True    7
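To see why the split lands where it does, it can help to print the intermediate arrays. The following is a small sketch using the df from the question; the names gaps and breaks are introduced here only for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'bool': [True, True, True, False, True, True, True],
                   'foo': [1, 3, 2, 6, 2, 4, 7]})

x = df[df["bool"]].index.values      # labels of the True rows
gaps = np.diff(x)                    # distance between neighbouring True rows
breaks = np.where(gaps > 1)[0] + 1   # positions in x where a new run starts
groups = np.split(x, breaks)         # one array of labels per run

print(x.tolist())                    # [0, 1, 2, 4, 5, 6]
print(gaps.tolist())                 # [1, 1, 2, 1, 1]
print(breaks.tolist())               # [3]
print([g.tolist() for g in groups])  # [[0, 1, 2], [4, 5, 6]]
```

A gap larger than 1 means at least one False row sits between two True rows, so that is where a new group begins.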
Solution 2:
Here is a simple way to do it:
# Split the dataframe by `Series` using `cumsum`
g = (~df['bool']).cumsum().where(df['bool'])
dfs = {'group_' + str(i + 1): v for i, (k, v) in enumerate(df[['foo']].groupby(g))}
You can access each dataframe using the keys 'group_'+str(i+1), such as group_1, group_2, etc.:
print(dfs['group_1'])
   foo
0    1
1    3
2    2
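The grouping key is easier to follow when built step by step. This is a sketch against the df from the question; the names inverted and counter are illustrative only. Each False row bumps the running counter, and where() then blanks out the False rows themselves so that groupby drops them.

```python
import pandas as pd

df = pd.DataFrame({'bool': [True, True, True, False, True, True, True],
                   'foo': [1, 3, 2, 6, 2, 4, 7]})

inverted = ~df['bool']          # True only on the separator rows
counter = inverted.cumsum()     # increases by 1 at every separator row
g = counter.where(df['bool'])   # NaN on separator rows, run id elsewhere

print(g.tolist())               # [0.0, 0.0, 0.0, nan, 1.0, 1.0, 1.0]

# groupby skips NaN keys by default, so the False row joins no group
dfs = {'group_' + str(i + 1): v
       for i, (k, v) in enumerate(df[['foo']].groupby(g))}

print(list(dfs))                        # ['group_1', 'group_2']
print(dfs['group_2']['foo'].tolist())   # [2, 4, 7]
```

Unlike the np.split approach, this key depends only on the boolean column and not on the index values, so it should also work when the index is not the default integer range.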