Can Pandas.DataFrame Have List Type Column?
Is it possible to create a pandas.DataFrame that includes a list-type field? For example, I'd like to load the following CSV into a pandas.DataFrame: id,scores 1,'[1,2,3,4]' 2,'[1,2]' 3,'
Solution 1:
Strip the quotes around the lists:
id,scores
1, [1,2,3,4]
2, [1,2]
3, [0,2,4]
You should then be able to build the DataFrame from nested Python lists:
import pandas
query = [[1, [1, 2, 3, 4]], [2, [1, 2]], [3, [0, 2, 4]]]
df = pandas.DataFrame(query, columns=['id', 'scores'])
print(df)
Solution 2:
You can use:
import pandas as pd
import io
temp=u'''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''
df = pd.read_csv(io.StringIO(temp), sep=',', index_col=[0] )
print(df)
       scores
id
1   [1,2,3,4]
2       [1,2]
3     [0,2,4]
But the dtype of the scores column is object, and each value is a string, not a list.
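To see why the plain read_csv result is not yet a list column, a quick check along these lines inspects a single cell. This is only a sketch; the names csv_text and first are illustrative and not from the original answer.

```python
import io
import pandas as pd

# Same sample data as above (csv_text is an illustrative name).
csv_text = u'''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Take the first cell of the scores column and inspect its Python type.
first = df['scores'].iloc[0]
print(type(first))
print(repr(first))
```

Because the cell is a plain string, an expression such as `2 in first` would raise a TypeError rather than test list membership.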
One approach is to use ast.literal_eval together with the converters parameter of read_csv:
import pandas as pd
import io
from ast import literal_eval
temp=u'''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''
def converter(x):
    # parse a string such as "[1,2,3,4]" into a Python list
    return literal_eval(x)

# map each column name to the function that converts its values
converters = {'scores': converter}
df = pd.read_csv(io.StringIO(temp), sep=',', converters=converters)
print(df)
   id        scores
0   1  [1, 2, 3, 4]
1   2        [1, 2]
2   3     [0, 2, 4]
# check list membership:
print(2 in df.scores[2])
# True
print(1 in df.scores[2])
# False
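If the DataFrame has already been loaded without converters, the same parsing can probably be applied afterwards with Series.apply. The following is a minimal sketch under that assumption; csv_text and lengths are illustrative names, not part of the original answer.

```python
import io
from ast import literal_eval

import pandas as pd

# Same sample data as above (csv_text is an illustrative name).
csv_text = u'''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

# Load first; at this point the scores column holds strings.
df = pd.read_csv(io.StringIO(csv_text))

# Parse each string into a real Python list after loading.
df['scores'] = df['scores'].apply(literal_eval)

# Each cell is now a list, so list operations such as len() work.
lengths = df['scores'].apply(len).tolist()
print(lengths)
```

literal_eval only evaluates Python literals, so it is a safer choice than eval for data read from a file.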