
Convert Pandas Series Of Lists To Dataframe

I have a series made of lists:

import pandas as pd
s = pd.Series([[1, 2, 3], [4, 5, 6]])

and I want a DataFrame with each column a list. None of from_items, from_records, DataFram

Solution 1:

As @Hatshepsut pointed out in the comments, from_items is deprecated as of version 0.23. The deprecation notice suggests using from_dict instead, so the old answer can be modified to:

pd.DataFrame.from_dict(dict(zip(s.index, s.values)))
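As a minimal sketch of the from_dict approach, the snippet below builds both orientations that the old answer produced with and without .T. The variable names (data, by_column, by_row) are only illustrative, and it assumes all lists have the same length.

```python
import pandas as pd

s = pd.Series([[1, 2, 3], [4, 5, 6]])

# Pair each index label with its list.
data = dict(zip(s.index, s.values))

# Default orient="columns": each list becomes a column.
by_column = pd.DataFrame.from_dict(data)

# orient="index": each list becomes a row instead.
by_row = pd.DataFrame.from_dict(data, orient="index")

print(by_column)
print(by_row)
```

Using orient="index" should give the same layout as transposing the default result, without the extra .T step.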

--------------------------------------------------OLD ANSWER-------------------------------------------------------------

You can use from_items like this (assuming that your lists are of the same length):

pd.DataFrame.from_items(zip(s.index, s.values))

   0  1
0  1  4
1  2  5
2  3  6

or

pd.DataFrame.from_items(zip(s.index, s.values)).T

   0  1  2
0  1  2  3
1  4  5  6

depending on your desired output.

This can be much faster than using apply (as in @Wen's answer, which, however, also works for lists of different lengths):

%timeit pd.DataFrame.from_items(zip(s.index, s.values))
1000 loops, best of 3: 669 µs per loop

%timeit s.apply(lambda x: pd.Series(x)).T
1000 loops, best of 3: 1.37 ms per loop

and

%timeit pd.DataFrame.from_items(zip(s.index, s.values)).T
1000 loops, best of 3: 919 µs per loop

%timeit s.apply(lambda x: pd.Series(x))
1000 loops, best of 3: 1.26 ms per loop

@Hatshepsut's answer is also quite fast (and also works for lists of different lengths):

%timeit pd.DataFrame(item for item in s)
1000 loops, best of 3: 636 µs per loop

and

%timeit pd.DataFrame(item for item in s).T
1000 loops, best of 3: 884 µs per loop

The fastest solution seems to be @Abdou's answer (tested on Python 2 with izip_longest; it also works for lists of different lengths; in Python 3, use itertools.zip_longest instead):

%timeit pd.DataFrame.from_records(izip_longest(*s.values))
1000 loops, best of 3: 529 µs per loop
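The timings above come from an IPython session on an old pandas and Python 2. To rerun the comparison on a current setup, something like the following sketch with the standard timeit module could be used. It swaps from_items for from_dict (from_items is no longer available in pandas 1.0 and later) and uses zip_longest in place of izip_longest; the candidates dictionary and the loop counts are arbitrary choices for illustration, and the numbers will differ by machine and version.

```python
import timeit
from itertools import zip_longest

import pandas as pd

s = pd.Series([[1, 2, 3], [4, 5, 6]])

# Approaches discussed above, each wrapped in a zero-argument callable.
candidates = {
    "from_dict": lambda: pd.DataFrame.from_dict(dict(zip(s.index, s.values))),
    "apply": lambda: s.apply(pd.Series).T,
    "generator": lambda: pd.DataFrame(item for item in s).T,
    "zip_longest": lambda: pd.DataFrame.from_records(zip_longest(*s.values)),
}

number = 20  # calls per measurement (kept small so the sketch runs quickly)
timings = {
    name: min(timeit.repeat(fn, number=number, repeat=3)) / number
    for name, fn in candidates.items()
}

# Report fastest first, in microseconds per call.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds * 1e6:.0f} µs per call")
```

With only two short lists the differences are dominated by constant overhead, so it is worth timing on data of realistic size before choosing.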

An additional option:

pd.DataFrame(dict(zip(s.index, s.values)))

   0  1
0  1  4
1  2  5
2  3  6

Solution 2:

If the series is very long (more than 1 million elements), you can use:

s = pd.Series([[1, 2, 3], [4, 5, 6]])
pd.DataFrame(s.tolist())
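One detail worth knowing is that s.tolist() returns a plain Python list, so the Series index is not carried over unless it is passed explicitly. The sketch below illustrates this with a hypothetical Series named ragged whose lists have different lengths and whose labels are "a" and "b"; shorter lists are padded with missing values.

```python
import pandas as pd

# Hypothetical example data: lists of unequal length, custom index labels.
ragged = pd.Series([[1, 2, 3], [4, 5]], index=["a", "b"])

# Each list becomes a row; pass the index explicitly to keep the labels.
df = pd.DataFrame(ragged.tolist(), index=ragged.index)

print(df)
```

Append .T to the result if each list should become a column instead of a row.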

Solution 3:

Iterate over the series like this:

series = pd.Series([[1, 2, 3], [4, 5, 6]])
pd.DataFrame(item for item in series)

   0  1  2
0  1  2  3
1  4  5  6

Solution 4:

pd.DataFrame.from_records should also work using itertools.zip_longest:

from itertools import zip_longest

pd.DataFrame.from_records(zip_longest(*s.values))

#    0  1
# 0  1  4
# 1  2  5
# 2  3  6
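The reason to reach for zip_longest rather than plain zip is that it keeps going until the longest list is exhausted, which matters when the lists differ in length. A small sketch with a hypothetical ragged Series:

```python
from itertools import zip_longest

import pandas as pd

# Hypothetical example data: the second list is one element shorter.
ragged = pd.Series([[1, 2, 3], [4, 5]])

# zip_longest transposes the lists and fills the gap with None by default,
# so each original list ends up as a column.
df = pd.DataFrame.from_records(zip_longest(*ragged.values))

print(df)
```

Plain zip would stop at the shortest list and silently drop the 3. Note that the columns are numbered by position here, so the original index labels are not preserved.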

Solution 5:

Try:

import numpy as np
import pandas as pd
s = pd.Series([[1, 2, 3], [4, 5, 6]])
pd.DataFrame(np.vstack(s))
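Unlike several of the other solutions, np.vstack needs all lists to have the same length, because it stacks them into one rectangular array. The sketch below shows this, and also passes the index explicitly since the intermediate array does not carry it. It uses s.tolist() as the input to vstack, which is a small variation on the line above, and the "a"/"b" labels are made up for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([[1, 2, 3], [4, 5, 6]], index=["a", "b"])

# Stack the lists row-wise into a 2-D array, then wrap it in a DataFrame.
arr = np.vstack(s.tolist())
df = pd.DataFrame(arr, index=s.index)
print(df)

# Lists of different lengths cannot be stacked into a rectangular array.
ragged_failed = False
try:
    np.vstack([[1, 2, 3], [4, 5]])
except ValueError:
    ragged_failed = True
print("ragged input rejected:", ragged_failed)
```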
