
Pandas - Filtering A Dataframe By Index Of Another Dataframe, Then Combine The Two Dataframes

I have two dataframes as follows:

df1

Index  Fruit
1      Apple
2      Banana
3      Peach

df2

Index  Taste
1      Tasty
1.5    Rotten
2      Tasty
2.6    Tasty

Solution 1:

For each row of df1, find the first row of df2 whose Index is at least 0.5 greater, using NumPy broadcasting and argmax (this relies on df2 being sorted by Index). Then combine the result with df1 using pd.concat.

r = df2.iloc[(df1.Index.values + 0.5 <= df2.Index.values[:, None]).argmax(axis=0)].reset_index(drop=True)

pd.concat([df1, r], axis=1)

   Index   Fruit  Index   Taste
0      1   Apple    1.5  Rotten
1      2  Banana    2.6   Tasty
2      3   Peach    4.0   Tasty

Details

Broadcasting gives:

x = (df1.Index.values + 0.5 <= df2.Index.values[:, None])
array([[False, False, False],
       [ True, False, False],
       [ True, False, False],
       [ True,  True, False],
       [ True,  True, False],
       [ True,  True, False],
       [ True,  True,  True]], dtype=bool)

Taking argmax along axis 0 then gives, for each column (one per df1 row), the position of the first True:

x.argmax(axis=0)
array([1, 3, 6])
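
The broadcasting step can be reproduced end to end. The sketch below is a minimal, self-contained version. Since the question shows only the first rows of df2, the remaining rows are assumed values chosen to agree with the boolean array printed above. It also adds a `has_match` check that is not part of the original answer: argmax returns 0 for a column that contains no True, so without the check an unmatched df1 row would silently be paired with the first row of df2.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Index": [1, 2, 3], "Fruit": ["Apple", "Banana", "Peach"]})

# Only the first four rows appear in the question; the rest are assumed
# for illustration and chosen to be consistent with the printed mask.
df2 = pd.DataFrame({
    "Index": [1.0, 1.5, 2.0, 2.6, 3.0, 3.2, 4.0],
    "Taste": ["Tasty", "Rotten", "Tasty", "Tasty", "Rotten", "Tasty", "Tasty"],
})

# mask[i, j] is True when df2 row i is at least 0.5 above df1 row j.
mask = df1["Index"].to_numpy() + 0.5 <= df2["Index"].to_numpy()[:, None]

# Position of the first True in each column (0 if the column has no True).
positions = mask.argmax(axis=0)

# Guard (an addition to the answer): which df1 rows have a match at all.
has_match = mask.any(axis=0)

matched = df2.iloc[positions].reset_index(drop=True)
combined = pd.concat([df1, matched], axis=1).loc[has_match]
print(combined)
```

With this data every df1 row has a match, so the guard keeps all three rows.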

Solution 2:

Assuming Index is the index of both dataframes and df2 is sorted by it, use searchsorted to find the positions, select those rows with iloc, and finally concat:

df = pd.concat([df1.reset_index(), 
                df2.iloc[df2.index.searchsorted(df1.index + .5)].reset_index()], axis=1)
print(df)
   Index   Fruit  Index   Taste
0      1   Apple    1.5  Rotten
1      2  Banana    2.6   Tasty
2      3   Peach    4.0   Tasty
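
A self-contained sketch of the same approach follows. It assumes Index is the index of both frames, and the df2 rows beyond those shown in the question are illustrative values. The `in_range` guard is an addition to the answer: searchsorted returns len(df2) when no index value is large enough, and passing that position to iloc would raise an IndexError.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Fruit": ["Apple", "Banana", "Peach"]},
    index=pd.Index([1, 2, 3], name="Index"),
)

# Rows after the first four are assumed for illustration.
df2 = pd.DataFrame(
    {"Taste": ["Tasty", "Rotten", "Tasty", "Tasty", "Rotten", "Tasty", "Tasty"]},
    index=pd.Index([1.0, 1.5, 2.0, 2.6, 3.0, 3.2, 4.0], name="Index"),
)

# searchsorted assumes a sorted index, so sort defensively.
df2 = df2.sort_index()

# Default side="left": first position whose index value is >= the target.
pos = df2.index.searchsorted(df1.index.to_numpy() + 0.5)

# Guard (an addition): drop targets that lie beyond the last df2 row.
in_range = pos < len(df2)

out = pd.concat(
    [df1[in_range].reset_index(), df2.iloc[pos[in_range]].reset_index()],
    axis=1,
)
print(out)
```

Because searchsorted is a binary search, it avoids building the full df2-by-df1 boolean array that the broadcasting approach needs.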

Detail:

print(df2.index.searchsorted(df1.index + .5))
[1 3 6]

print(df2.iloc[df2.index.searchsorted(df1.index + .5)])
        Taste
Index        
1.5    Rotten
2.6     Tasty
4.0     Tasty
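
As an alternative that neither answer uses, pandas also provides pd.merge_asof, which with direction="forward" matches each left key to the first right key that is greater than or equal to it. The sketch below assumes Index is a regular column in both frames and that both are sorted by it. The names `key` and `Index_df2` are hypothetical helpers introduced here, and the df2 rows beyond those in the question are assumed values.

```python
import pandas as pd

df1 = pd.DataFrame({"Index": [1, 2, 3], "Fruit": ["Apple", "Banana", "Peach"]})

# Rows after the first four are assumed for illustration.
df2 = pd.DataFrame({
    "Index": [1.0, 1.5, 2.0, 2.6, 3.0, 3.2, 4.0],
    "Taste": ["Tasty", "Rotten", "Tasty", "Tasty", "Rotten", "Tasty", "Tasty"],
})

# Shift the left key by 0.5 so a forward match means "at least 0.5 greater".
left = df1.assign(key=df1["Index"] + 0.5)

# Rename the right key to avoid a name clash with df1's Index column.
right = df2.rename(columns={"Index": "Index_df2"})

result = pd.merge_asof(
    left, right,
    left_on="key", right_on="Index_df2",
    direction="forward",
).drop(columns="key")
print(result)
```

A df1 row with no sufficiently large df2 value is kept with NaN in the df2 columns, so no separate guard is needed.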
