
Extract Dataframe From Duplicated Values

I have a DataFrame with a column A that contains duplicated values, each associated with different data. I don't know how many duplicated values A contains or which values they are, but I need to extract n-

Solution 1:

TRY:

df_list = [k for _,k in df.groupby('A')]

OUTPUT:

[     A    B
 1  120  abc
 5  120  def,
      A    B
 2  121  def
 4  121  abc
 6  121  def
 8  121  ghi,
      A    B
 3  122  ghi
 7  122  abc]
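For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The sample DataFrame is an assumption: the original data is not shown in the question, so it is reconstructed from the output above. The variable name `group` is only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the output above (an assumption; the
# question's real DataFrame is not shown).
df = pd.DataFrame(
    {
        "A": [120, 121, 122, 121, 120, 121, 122, 121],
        "B": ["abc", "def", "ghi", "abc", "def", "def", "abc", "ghi"],
    },
    index=range(1, 9),
)

# Iterating over a groupby yields (key, sub-DataFrame) pairs, one per
# distinct value of A, so the number of groups need not be known up front.
df_list = [group for _, group in df.groupby("A")]

for group in df_list:
    print(group, end="\n\n")
```

Each sub-DataFrame keeps its original index labels, and the groups come out sorted by the value of A because `groupby` sorts keys by default.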

Use the code below if you also want to reset the index of each DataFrame.

df_list = [k.reset_index(drop=True) for _,k in df.groupby('A')]

You can use a dict comprehension if you also need the group names as keys:

df_dict = {g:k.reset_index(drop=True) for g,k in df.groupby('A')}

Dict output:

{120:      A    B
 0  120  abc
 1  120  def,
 121:      A    B
 0  121  def
 1  121  abc
 2  121  def
 3  121  ghi,
 122:      A    B
 0  122  ghi
 1  122  abc}
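Once the dict is built, a single group can be looked up by its value of A. The sketch below again uses sample data reconstructed from the outputs above (an assumption, not the question's actual data), and also shows `get_group` as an alternative when only one group is needed. The names `sub` and `same` are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the outputs above (an assumption).
df = pd.DataFrame(
    {
        "A": [120, 121, 122, 121, 120, 121, 122, 121],
        "B": ["abc", "def", "ghi", "abc", "def", "def", "abc", "ghi"],
    },
    index=range(1, 9),
)

# Map each distinct value of A to its own DataFrame with a fresh 0..n-1 index.
df_dict = {key: group.reset_index(drop=True) for key, group in df.groupby("A")}

# Look up one group by its key.
sub = df_dict[121]
print(sub)

# Alternative for a single group: get_group avoids building the whole dict.
same = df.groupby("A").get_group(121).reset_index(drop=True)
```

The dict is convenient when several groups will be accessed by name; `get_group` is enough for a one-off lookup.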
