
Setting Values On A Subset Of Rows (indexing, Boolean Setting)

This is a followup to: What is correct syntax to swap column values for selected rows in a pandas data frame using just one line? No credit will be given for a workaround, only for

Solution 1:

Pandas aligns the right-hand side of a setting operation. It then takes the left-hand side mask and sets the selected values equal to the aligned right-hand side.

So this is the left-hand indexer. The right-hand side has to end up with this same shape (or be broadcastable to it).

In [61]: df1.loc[df1.idx,['L','R']]
Out[61]: 
       L     R
1  right  left
3  right  left

Here is the first case. I am only going to show the aligned right-hand side (the y).

In [49]: x, y = df1.align(df1.loc[df1.idx,['L','R']])

In [51]: y
Out[51]: 
       L     R  idx  num
0    NaN   NaN  NaN  NaN
1  right  left  NaN  NaN
2    NaN   NaN  NaN  NaN
3  right  left  NaN  NaN

So even though you reversed the columns in the input on the right hand side, aligning put them back in order. So you are setting the same values, hence no change.
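The no-op described above can be reproduced with a small frame. This is a minimal sketch: the frame is an assumed stand-in reconstructed from the outputs shown (the original `df1` is not included in this answer), and passing a bare array via `to_numpy()` is one commonly used way to skip alignment, not something this answer itself proposes.

```python
import pandas as pd

# Assumed stand-in for df1, reconstructed from the outputs above.
df = pd.DataFrame({
    "L": ["left", "right", "left", "right"],
    "R": ["right", "left", "right", "left"],
    "idx": [False, True, False, True],
    "num": [0, 1, 2, 3],
})
before = df.copy()
mask = df["idx"]

# Right-hand side is a DataFrame: its columns are matched to 'L' and 'R'
# by label, so the same values are written back and nothing changes.
df.loc[mask, ["L", "R"]] = df.loc[mask, ["R", "L"]]
unchanged = bool((df[["L", "R"]] == before[["L", "R"]]).all().all())

# Right-hand side is a plain NumPy array: there are no labels to align,
# so values are assigned by position and the two columns are swapped.
swapped = before.copy()
swapped.loc[mask, ["L", "R"]] = swapped.loc[mask, ["R", "L"]].to_numpy()

print(unchanged)
print(swapped)
```

In this sketch only the rows where `idx` is True are touched, so after the positional assignment every row holds `left` in `L` and `right` in `R`.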

In [63]: x, y = df2.align(df2[['R','L']])

In [65]: y
Out[65]: 
       L      R  idx  num
0   left  right  NaN  NaN
1  right   left  NaN  NaN
2   left  right  NaN  NaN
3  right   left  NaN  NaN

Notice the difference from the above. This is still a full frame (it is not sub-selected), so the shape of the right-hand side is now different from the shape of the left-hand side, unlike in the example above.

There is a reindexing step at this point. It might be a bug, as I think this should come out the same as your first example. Please file a bug report for this (e.g. your example using df1 and df2). They should come out == df after the assignment.
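What reindexing a full-frame right-hand side to the left-hand labels looks like can be sketched with an explicit `reindex` call. This is only an illustration on an assumed stand-in frame; it is not a claim about the exact code path pandas takes internally during assignment.

```python
import pandas as pd

# Assumed stand-in frame, reconstructed from the outputs above.
df = pd.DataFrame({
    "L": ["left", "right", "left", "right"],
    "R": ["right", "left", "right", "left"],
    "idx": [False, True, False, True],
    "num": [0, 1, 2, 3],
})

lhs = df.loc[df["idx"], ["L", "R"]]  # sub-selected: 2 rows, 2 columns
rhs = df[["R", "L"]]                 # full frame, columns reversed: 4 rows

# Reindexing the right-hand side to the left-hand side's labels drops the
# unselected rows and puts the columns back in L, R order.
rhs_like_lhs = rhs.reindex(index=lhs.index, columns=lhs.columns)

print(lhs.shape, rhs.shape, rhs_like_lhs.shape)
print(rhs_like_lhs)
```

Once the labels match, the reindexed right-hand side holds the same values as the left-hand selection, which is why an assignment would be expected to leave the frame unchanged.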

In [58]: x, y = df1.align(df3['num'],axis=0)

In [60]: y
Out[60]: 
0    0
1    1
2    2
3    3
Name: num, dtype: int64

This one simply broadcasts the results to the left-hand side. That's why the numbers are propagated.
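Row-wise alignment of a Series can be sketched in the same way. The frame and the replacement Series `new_num` are assumed for illustration (`df3` itself is not shown in this answer), and this is the simpler single-column case: it only demonstrates that a Series on the right-hand side is matched to the selected rows by index label.

```python
import pandas as pd

# Assumed stand-in frame, reconstructed from the outputs above.
df = pd.DataFrame({
    "L": ["left", "right", "left", "right"],
    "R": ["right", "left", "right", "left"],
    "idx": [False, True, False, True],
    "num": [0, 1, 2, 3],
})

# Align a Series against the frame along the rows (axis=0), as in In [58].
x, y = df.align(df["num"], axis=0)
aligned_values = y.tolist()

# Hypothetical replacement values, one per row label.
new_num = pd.Series([10, 20, 30, 40], index=df.index)

# Only the masked rows (labels 1 and 3) are written; the Series is matched
# to them by label, so rows 0 and 2 keep their old values.
df.loc[df["idx"], "num"] = new_num

print(aligned_values)
print(df["num"].tolist())
```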

Bottom line: pandas attempts to figure out the right-hand side in the assignment. There are a lot of cases.
