Identify Value Across Multiple Columns In A Dataframe That Contain String From A List In Python
I have a dataframe with multiple columns containing phrases. What I would like to do is identify the column (per row observation) that contains a string that exists within a pre-
Solution 1:
Start with the list of substrings and a one-row data frame like the one in the question:
import pandas as pd

lst = ['a','m','n','o','p']
df = pd.DataFrame({'Observation': [1], 'col1': ['ab'], 'col2': ['dc'], 'col3': ['ef'], 'col4': ['yz']})
df
   Observation col1 col2 col3 col4
0            1   ab   dc   ef   yz
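Scan the values in the row and keep those that contain any string from the list. This works here because the data frame has a single row and exactly one of its cells matches:
df['New_var'] = [x for x in df.values[0] if any(b in str(x) for b in lst)]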
df
   Observation col1 col2 col3 col4 New_var
0            1   ab   dc   ef   yz      ab
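The one-liner above only reads the first row, so it does not carry over directly to a data frame with several observations. A minimal sketch of a row-by-row variant follows. The extra rows, the helper name first_match, and the Matched_col column are assumptions made for illustration and are not part of the original question. It takes the first matching cell per row and leaves None where nothing matches.

```python
import pandas as pd

lst = ['a', 'm', 'n', 'o', 'p']

# Hypothetical multi-row frame; only the first row comes from the example above.
df = pd.DataFrame({
    'Observation': [1, 2, 3],
    'col1': ['ab', 'xy', 'qr'],
    'col2': ['dc', 'gh', 'st'],
    'col3': ['ef', 'no', 'uv'],
    'col4': ['yz', 'kl', 'wx'],
})

# Only search the text columns, not the Observation id.
text_cols = ['col1', 'col2', 'col3', 'col4']


def first_match(row, needles):
    """Return (column name, value) of the first cell containing any needle,
    or (None, None) if no cell in the row matches."""
    for col, value in row.items():
        if any(n in str(value) for n in needles):
            return col, value
    return None, None


# Walk the rows explicitly; fine for small frames, slow for very large ones.
matches = [first_match(row, lst) for _, row in df[text_cols].iterrows()]

df['Matched_col'] = [col for col, _ in matches]
df['New_var'] = [value for _, value in matches]

print(df)
```

If a row can contain more than one matching cell and all of them are needed, the helper could collect the matches into a list instead of returning at the first hit.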