Non-NDFFrame Object Error Using Pandas.SparseSeries.from_coo() Function
I am trying to convert a COO type sparse matrix (from Scipy.Sparse) to a Pandas sparse series. From the documentation(http://pandas.pydata.org/pandas-docs/stable/sparse.html) it sa
Solution 1:
I have an older pandas
. It has the sparse code, but not the tocoo
.
The pandas issue that has been filed in connection with this is:
https://github.com/pydata/pandas/issues/10818
But I found on github
that:
def _coo_to_sparse_series(A, dense_index=False):
""" Convert a scipy.sparse.coo_matrix to a SparseSeries.
Use the defaults given in the SparseSeries constructor. """
s = Series(A.data, MultiIndex.from_arrays((A.row, A.col)))
s = s.sort_index()
s = s.to_sparse() # TODO: specify kind?
# ...
return s
With a smallish sparse matrix I construct and display without problems:
In [259]: Asml=sparse.coo_matrix(np.arange(10*5).reshape(10,5))
In [260]: s=pd.Series(Asml.data,pd.MultiIndex.from_arrays((Asml.row,Asml.col)))
In [261]: s=s.sort_index()
In [262]: s
Out[262]:
0 1 1
2 2
3 3
4 4
1 0 5
1 6
2 7
[... mine]
3 48
4 49
dtype: int32
In [263]: ssml=s.to_sparse()
In [264]: ssml
Out[264]:
0 1 1
2 2
3 3
4 4
1 0 5
[... mine]
2 47
3 48
4 49
dtype: int32
BlockIndex
Block locations: array([0])
Block lengths: array([49])
but with a larger array (more nonzero elements) I get a display error. I'm guessing it happens when the display for the (plain) series starts to use an ellipsis (...). I'm running in Py3, so I get a different error message.
....\pandas\core\base.pyc in __str__(self)
45 if compat.PY3:
46 return self.__unicode__() # py3
47 return self.__bytes__() # py2 route
e.g.:
In [265]: Asml=sparse.coo_matrix(np.arange(10*7).reshape(10,7))
In [266]: s=pd.Series(Asml.data,pd.MultiIndex.from_arrays((Asml.row,Asml.col)))
In [267]: s=s.sort_index()
In [268]: s
Out[268]:
0 1 1
2 2
3 3
4 4
5 5
6 6
1 0 7
1 8
2 9
3 10
4 11
5 12
6 13
2 0 14
1 15
...
7 6 55
8 0 56
1 57
[... mine]
Length: 69, dtype: int32
In [269]: ssml=s.to_sparse()
In [270]: ssml
Out[270]: <repr(<pandas.sparse.series.SparseSeries at 0xaff6bc0c>)
failed: AttributeError: 'SparseArray' object has no attribute '_get_repr'>
I'm not sufficiently familiar with pandas code and structures to deduce much more for now.
Post a Comment for "Non-NDFFrame Object Error Using Pandas.SparseSeries.from_coo() Function"