
Efficiently Grouping The Rows Of A Pandas Dataframe By The Value Of A Column?

I have a Pandas DataFrame df, with two columns A and B. A is also the index. B has a very small range of permissible values (in my case, B is a boolean). How do I quickly answer th

Solution 1:

Here is an example based on the tutorial at http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/. I also recommend working through that tutorial alongside the pandas documentation.

>>> import pandas as pd
>>> data = {'year': [2010, 2011, 2012, 2011, 2012, 2010, 2011, 2012],
...         'team': ['Bears', 'Bears', 'Bears', 'Packers', 'Packers', 'Lions', 'Lions', 'Lions'],
...         'wins': [11, 8, 10, 15, 11, 6, 10, 4],
...         'losses': [5, 8, 6, 1, 5, 10, 6, 12]}
>>> football = pd.DataFrame(data, columns=['year', 'team', 'wins', 'losses'])
>>> football
   year     team  wins  losses
0  2010    Bears    11       5
1  2011    Bears     8       8
2  2012    Bears    10       6
3  2011  Packers    15       1
4  2012  Packers    11       5
5  2010    Lions     6      10
6  2011    Lions    10       6
7  2012    Lions     4      12

To select only the rows where a column has a given value, index the DataFrame with a boolean mask:

>>> football[football['team'] == 'Lions']
   year   team  wins  losses
5  2010  Lions     6      10
6  2011  Lions    10       6
7  2012  Lions     4      12

[3 rows x 4 columns]

In your case, substitute your own column name and value to select the rows you need from the DataFrame:

df[df['B'] == True]

I gave the example above so you can get familiar with the operation and experiment with it yourself.
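Since the question is about a boolean column B with A as the index, the same idea can be sketched as follows. The data here is made up for illustration and the variable names (true_rows, false_rows, parts) are hypothetical. The sketch assumes B holds real booleans with no missing values, so the column can serve directly as a mask and groupby can split the frame by each value of B in one pass.

```python
import pandas as pd

# Hypothetical data: A is the index, B is a boolean column.
df = pd.DataFrame(
    {"A": [10, 11, 12, 13, 14], "B": [True, False, True, True, False]}
).set_index("A")

# A boolean column can be used directly as a mask; ~ inverts it.
true_rows = df[df["B"]]
false_rows = df[~df["B"]]

# groupby yields one (value, sub-frame) pair per distinct value of B,
# so both groups come out of a single call.
parts = {bool(value): part for value, part in df.groupby("B")}
```

Here parts[True] and parts[False] hold the same rows as true_rows and false_rows. Whether masking or groupby is faster will depend on the size of the frame and how many groups you need, so it is worth timing both on your own data.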
