Calculate Cosine Similarity For Vectors Between Two Pandas Columns?
I have the following Pandas DataFrame and need to find the cosine similarity by row, but my code returns a matrix of values. import pandas as pd from sklearn.metrics.pairwise impo
Solution 1:
If you only want the cosine similarity for each row between the value in column a and the value in column b, it is easier to use the cosine distance and subtract the result from 1 to get the cosine similarity.
from scipy.spatial.distance import cosine
df['cosine'] = df.apply(lambda row: 1 - cosine(row['a'], row['b']), axis=1)
df
Output:
                 a                b    cosine
0       [0.1, 0.2]       [0.1, 0.2]  1.000000
1  [0.5, 0.3, 0.3]  [0.2, 0.3, 0.4]  0.877866
2            [0.5]            [0.5]  1.000000
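To make it clearer what "1 minus the cosine distance" computes, here is a minimal sketch that calculates the same quantity by hand: the dot product of the two vectors divided by the product of their lengths. The helper name cosine_similarity, the rows list (copied from the output above), and the choice to return NaN for a zero vector are assumptions for illustration, not part of the original answer.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric sequences (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # The angle to a zero vector is undefined; returning NaN is an assumption here.
        return float("nan")
    return dot / (norm_u * norm_v)

# The (a, b) pairs from the three rows shown in the output above.
rows = [
    ([0.1, 0.2], [0.1, 0.2]),
    ([0.5, 0.3, 0.3], [0.2, 0.3, 0.4]),
    ([0.5], [0.5]),
]
similarities = [cosine_similarity(a, b) for a, b in rows]
```

With the DataFrame from the question, the same helper could be applied row by row in place of the scipy call: df.apply(lambda row: cosine_similarity(row['a'], row['b']), axis=1). Unlike a pairwise function, which compares every vector with every other vector and so returns a matrix, this yields one value per row.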