Pandas: How To Groupby Based On Series Pattern

Given the following df: pd.DataFrame({'bool':[True,True,True, False,True,True,True], 'foo':[1,3,2,6,2,4,7]}) which results in: bool foo 0 True 1 1 T

Solution 1:

You could do:

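import numpy as np

# Index labels of the rows where "bool" is True (assumes the default 0..n-1 index)
x = df[df["bool"]].index.values
# Split wherever consecutive labels jump by more than 1, i.e. where a False row sat
groups = np.split(x, np.where(np.diff(x) > 1)[0] + 1)
# One sub-DataFrame per run of consecutive True rows
df_groups = [df.iloc[gr, :] for gr in groups]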

The output looks like:

df_groups[0]
Out[56]: 
   bool  foo
0  True    1
1  True    3
2  True    2

df_groups[1]
Out[57]: 
   bool  foo
4  True    2
5  True    4
6  True    7
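
To see why the split lands where it does, it can help to look at the intermediate arrays. The following is only a sketch: it rebuilds the DataFrame from the question, and it swaps `.index.values` for `np.flatnonzero` so that `iloc` receives positions rather than index labels (that swap and the name `breaks` are my own, not part of the answer above).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'bool': [True, True, True, False, True, True, True],
                   'foo': [1, 3, 2, 6, 2, 4, 7]})

# Positions (not labels) of the rows where "bool" is True
x = np.flatnonzero(df['bool'].to_numpy())      # [0 1 2 4 5 6]

# np.diff(x) is [1 1 2 1 1]; a step larger than 1 means a False row was skipped
breaks = np.where(np.diff(x) > 1)[0] + 1       # [3]

# Cut x at those offsets: [0 1 2] and [4 5 6]
groups = np.split(x, breaks)

# One sub-DataFrame per run of consecutive True rows
df_groups = [df.iloc[gr] for gr in groups]

for part in df_groups:
    print(part, end='\n\n')
```

Because this works on positions, it should behave the same if the DataFrame has a non-default index, which the `.index.values` version does not guarantee.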

Solution 2:

Here is a simple way to do it:

# `data` is the DataFrame from the question
# Split the dataframe by `Series` using `cumsum`
g = (~data['bool']).cumsum().where(data['bool'])

dfs = {'group_' + str(i + 1): v for i, (k, v) in enumerate(data[['foo']].groupby(g))}

You can access each DataFrame through its key, 'group_' + str(i + 1), i.e. group_1, group_2, etc.:

print(dfs['group_1'])

   foo
0    1
1    3
2    2
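
For anyone wondering what the grouping key actually contains, here is a small self-contained sketch of the same idea with the intermediate step pulled out. The name `counter` is mine, and the DataFrame is rebuilt from the question; treat it as an illustration rather than a different method.

```python
import pandas as pd

data = pd.DataFrame({'bool': [True, True, True, False, True, True, True],
                     'foo': [1, 3, 2, 6, 2, 4, 7]})

# Every False row bumps the running count, so each run of True rows shares one id
counter = (~data['bool']).cumsum()        # 0 0 0 1 1 1 1

# Blank out the id on the False rows; groupby skips NaN keys by default
g = counter.where(data['bool'])           # 0 0 0 NaN 1 1 1

dfs = {f'group_{i + 1}': v
       for i, (_, v) in enumerate(data[['foo']].groupby(g))}

for name, part in dfs.items():
    print(name)
    print(part, end='\n\n')
```

Unlike the NumPy approach, this does not rely on the index being consecutive integers, since the runs are detected from the boolean column itself.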
