
Pandas Computation In Each Group

I have a grouped data frame. Here is one group as an example: name pH salt id sample 7.5 50 1 0.48705 2 0.42875

Solution 1:

I think @filmor answered your question. Probably you misunderstood it.

I made up a dataframe by repeating the data you gave and modifying the indices.

In [117]: df
Out[117]: 
                          mass
name   pH  salt id            
sample 7.5 50   1      0.48705
                2      0.42875
                3      0.38885
                4      0.34615
                5      0.35060
                6      0.29280
                7      0.28210
                8      0.24535
                stock  0.66090
           150  1      0.48705
                2      0.42875
                3      0.38885
                4      0.34615
                5      0.35060
                6      0.29280
                7      0.28210
                8      0.24535
                stock  0.66090
       8.5 50   1      0.48705
                2      0.42875
                3      0.38885
                4      0.34615
                5      0.35060
                6      0.29280
                7      0.28210
                8      0.24535
                stock  0.66090
           150  1      0.48705
                2      0.42875
                3      0.38885
                4      0.34615
                5      0.35060
                6      0.29280
                7      0.28210
                8      0.24535
                stock  0.66090

[36 rows x 1 columns]

If stock is not guaranteed to be the last row of each group, select it by label: df.groupby(level=[0, 1, 2]).apply(lambda g: g - g[g.index.get_level_values('id') == 'stock'].values[0]) should work. If you are sure stock is always last in each group (after sorting if necessary), you can subtract the last row of each group instead:

In [118]: df.groupby(level= [0,1,2]).apply(lambda g: g - g.iloc[-1,0])
Out[118]: 
                          mass
name   pH  salt id            
sample 7.5 50   1     -0.17385
                2     -0.23215
                3     -0.27205
                4     -0.31475
                5     -0.31030
                6     -0.36810
                7     -0.37880
                8     -0.41555
                stock  0.00000
           150  1     -0.17385
                2     -0.23215
                3     -0.27205
                4     -0.31475
                5     -0.31030
                6     -0.36810
                7     -0.37880
                8     -0.41555
                stock  0.00000
       8.5 50   1     -0.17385
                2     -0.23215
                3     -0.27205
                4     -0.31475
                5     -0.31030
                6     -0.36810
                7     -0.37880
                8     -0.41555
                stock  0.00000
           150  1     -0.17385
                2     -0.23215
                3     -0.27205
                4     -0.31475
                5     -0.31030
                6     -0.36810
                7     -0.37880
                8     -0.41555
                stock  0.00000

[36 rows x 1 columns]
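
For reference, here is a self-contained sketch of the same subtraction that selects stock by label rather than by position. The frame is rebuilt from the values shown above, with the id level stored as strings (an assumption; the original index may mix integers and the label stock). It uses where plus transform("first") instead of apply, because newer pandas releases may prepend the group keys to the index of a frame returned from apply. The names is_stock, stock_per_row and relative are only illustrative.

```python
import pandas as pd

# Rebuild the example frame: one "sample", two pH values, two salt levels,
# eight numbered ids plus "stock" in each group.
masses = [0.48705, 0.42875, 0.38885, 0.34615, 0.35060,
          0.29280, 0.28210, 0.24535, 0.66090]
ids = ["1", "2", "3", "4", "5", "6", "7", "8", "stock"]  # ids as strings (assumption)

index = pd.MultiIndex.from_product(
    [["sample"], [7.5, 8.5], [50, 150], ids],
    names=["name", "pH", "salt", "id"],
)
df = pd.DataFrame({"mass": masses * 4}, index=index)

# Boolean mask marking the stock row of every group.
is_stock = df.index.get_level_values("id") == "stock"

# Keep only the stock masses (NaN elsewhere), then broadcast the first
# non-null value of each (name, pH, salt) group to all rows of that group.
stock_per_row = (
    df["mass"]
    .where(is_stock)
    .groupby(level=["name", "pH", "salt"])
    .transform("first")
)

# Mass of every row relative to the stock of its own group.
relative = df["mass"] - stock_per_row
print(relative)
```

Because the stock row is found through the mask, this does not depend on the row order within a group, and the result keeps the original four-level index.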

Solution 2:

You can use groupby for this, in particular df_grouped.groupby(level=[0, 1, 2]).apply(fancy_func) in your case, where fancy_func takes a sub-dataframe and returns a value.

The result will then be a series of values, indexed by the same levels.
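
As an illustration of that pattern, the sketch below uses a hypothetical fancy_func that reduces each group to one number: the mean mass of the numbered samples minus the stock mass. Both the function body and the rebuilt frame (id level stored as strings) are assumptions made for the example, not part of the original question.

```python
import pandas as pd

# Example frame with the same layout as in Solution 1 (ids as strings: assumption).
masses = [0.48705, 0.42875, 0.38885, 0.34615, 0.35060,
          0.29280, 0.28210, 0.24535, 0.66090]
ids = ["1", "2", "3", "4", "5", "6", "7", "8", "stock"]
index = pd.MultiIndex.from_product(
    [["sample"], [7.5, 8.5], [50, 150], ids],
    names=["name", "pH", "salt", "id"],
)
df_grouped = pd.DataFrame({"mass": masses * 4}, index=index)


def fancy_func(group):
    """Hypothetical reducer: mean mass of the numbered rows minus the stock mass."""
    is_stock = group.index.get_level_values("id") == "stock"
    stock_mass = group.loc[is_stock, "mass"].iloc[0]
    return group.loc[~is_stock, "mass"].mean() - stock_mass


# One scalar per (name, pH, salt) group, returned as a Series
# indexed by those three levels.
result = df_grouped.groupby(level=[0, 1, 2]).apply(fancy_func)
print(result)
```

If fancy_func returned a frame of the same shape as its input instead of a scalar, the result would be a frame rather than a series, which is what Solution 1 relies on.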

