How Do I Use Pandas Groupby Function To Apply A Formula Based On The Groupby Value
My question may be a little confusing, so let me explain. I have a dataframe of information that I would like to group by the unique order id that will produce the following column
Solution 1:
Writing a named function and using apply works:
import pandas as pd

def func(group):
    # Total quantity for the order
    sum_ = group.qty.sum()
    # Sum of the per-row csv/qty ratios
    es = (group.csv / group.qty).sum()
    return pd.Series([sum_, es], index=['qty', 'es'])
trades.groupby('ordrefno').apply(func)
Result:
            qty      es
ordrefno
983375   -10000 -0.0015
984702      100  0.0003
984842   -25100 -0.0008
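For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The column names ordrefno, qty and csv come from the question, but the values are made up for illustration. Selecting the two value columns before apply is my own addition; it sidesteps differences between pandas versions in whether the grouping column is passed to the function.

```python
import pandas as pd

# Hypothetical data: only the column names come from the question.
trades = pd.DataFrame({
    'ordrefno': [1, 1, 2],
    'qty': [100, 200, 50],
    'csv': [50.0, 50.0, 25.0],
})

def func(group):
    # Total quantity for the order
    sum_ = group['qty'].sum()
    # Sum of the per-row csv/qty ratios
    es = (group['csv'] / group['qty']).sum()
    return pd.Series([sum_, es], index=['qty', 'es'])

# Restrict to the value columns so func never sees the grouping key.
result = trades.groupby('ordrefno')[['qty', 'csv']].apply(func)
print(result)
# Order 1: es = 50/100 + 50/200 = 0.75
# Order 2: es = 25/50 = 0.5
```

Because func returns a Series that mixes an integer and a float, the qty column of the result comes back as float.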
Solution 2:
Assuming you want the ratio of the sums rather than the sum of the ratios (the wording of the question suggests this, but the function in your code would give the sum of the ratios if applied to the df), I think the cleanest way to do this is in two steps. First get the sums of the two columns, then divide:
agg_td = trades.groupby('ordrefno')[['qty', 'csv']].sum()
# eval returns a new DataFrame by default, so keep the result
agg_td = agg_td.eval('es = csv / qty')
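To see why the distinction between the two quantities matters, here is a small sketch on made-up data (only the column names are taken from the question). It uses plain column division in place of eval for the second step, which should give the same es column.

```python
import pandas as pd

# Hypothetical data: only the column names come from the question.
trades = pd.DataFrame({
    'ordrefno': [1, 1, 2],
    'qty': [100, 200, 50],
    'csv': [50.0, 50.0, 25.0],
})

# Ratio of the sums: aggregate first, then divide.
agg_td = trades.groupby('ordrefno')[['qty', 'csv']].sum()
agg_td['es'] = agg_td['csv'] / agg_td['qty']

# Sum of the ratios: divide row by row, then aggregate.
sum_of_ratios = (trades['csv'] / trades['qty']).groupby(trades['ordrefno']).sum()

print(agg_td)
print(sum_of_ratios)
# Order 1: ratio of sums = 100/300 (about 0.333), sum of ratios = 0.75
# Order 2: both are 0.5, since the order has a single row
```

The two only agree when an order has a single row (or every row has the same ratio), so it is worth deciding up front which one you actually need.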
You could also write a custom function and pass it to the groupby apply method:
es = trades.groupby('ordrefno').apply(lambda df: df.csv.sum() / df.qty.sum())
But this will only get you the 'es' column. The problem with using agg is that the dict of functions is column-specific, whereas here you need to combine two columns.
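If you still prefer agg, one possible workaround is to let it do the per-column sums and then combine the columns in a separate step. This is only a sketch on made-up data; it assumes a pandas version with named aggregation (0.25 or later), and the variable name summary is my own.

```python
import pandas as pd

# Hypothetical data: only the column names come from the question.
trades = pd.DataFrame({
    'ordrefno': [1, 1, 2],
    'qty': [100, 200, 50],
    'csv': [50.0, 50.0, 25.0],
})

summary = (
    trades.groupby('ordrefno')
    # agg handles each column on its own ...
    .agg(qty=('qty', 'sum'), csv=('csv', 'sum'))
    # ... so the cross-column ratio is computed afterwards.
    .assign(es=lambda d: d['csv'] / d['qty'])
)
print(summary[['qty', 'es']])
```

This keeps both qty and es in one result and leaves qty as an integer column, since no mixed-type Series is built per group.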