
Calculate Cosine Similarity For Vectors Between Two Pandas Columns?

I have the following Pandas DataFrame and need to find the cosine similarity row by row, but my code returns a matrix of values. import pandas as pd from sklearn.metrics.pairwise impo

Solution 1:

If you only want the cosine similarity between the vector in column a and the vector in column b of each row, it is easier to compute the cosine distance with SciPy and subtract the result from 1, since cosine similarity = 1 - cosine distance.

from scipy.spatial.distance import cosine

# cosine() returns the cosine distance, so 1 - distance is the similarity
df['cosine'] = df.apply(lambda row: 1 - cosine(row['a'], row['b']), axis=1)
df

Output:

                 a                b    cosine
0       [0.1, 0.2]       [0.1, 0.2]  1.000000
1  [0.5, 0.3, 0.3]  [0.2, 0.3, 0.4]  0.877866
2            [0.5]            [0.5]  1.000000
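
For reference, here is a minimal self-contained sketch of the same row-wise calculation using only NumPy, which spells out the formula dot(u, v) / (|u| * |v|). The DataFrame values are reconstructed from the output shown above, and the helper name cosine_similarity_1d is hypothetical, not part of any library. Returning NaN for a zero-length vector is an assumption about how you may want to handle that edge case.

```python
import numpy as np
import pandas as pd


def cosine_similarity_1d(u, v):
    """Cosine similarity of two equal-length 1-D vectors (hypothetical helper)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    if denom == 0.0:
        # Assumption: similarity is undefined when either vector is all zeros
        return float("nan")
    return float(np.dot(u, v) / denom)


# Sample data reconstructed from the output table in the answer
df = pd.DataFrame({
    "a": [[0.1, 0.2], [0.5, 0.3, 0.3], [0.5]],
    "b": [[0.1, 0.2], [0.2, 0.3, 0.4], [0.5]],
})

# Pair up the two columns row by row; each row yields one scalar
df["cosine"] = [cosine_similarity_1d(a, b) for a, b in zip(df["a"], df["b"])]
print(df)
```

Because each row produces a single scalar, the result is one new column rather than a matrix. A pairwise function that takes the whole column at once compares every row against every other row, which is the likely reason the original attempt returned a matrix.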
