
Non-NDFrame Object Error Using Pandas.SparseSeries.from_coo() Function

I am trying to convert a COO type sparse matrix (from Scipy.Sparse) to a Pandas sparse series. From the documentation(http://pandas.pydata.org/pandas-docs/stable/sparse.html) it sa

Solution 1:

I have an older pandas. It has the sparse code, but not the to_coo/from_coo conversion. The pandas issue that has been filed in connection with this is: https://github.com/pydata/pandas/issues/10818

But I found this helper in the pandas source on GitHub:

def _coo_to_sparse_series(A, dense_index=False):
    """ Convert a scipy.sparse.coo_matrix to a SparseSeries.
    Use the defaults given in the SparseSeries constructor. """
    s = Series(A.data, MultiIndex.from_arrays((A.row, A.col)))
    s = s.sort_index()
    s = s.to_sparse()  # TODO: specify kind?
    # ...
    return s

Following the same steps by hand with a smallish sparse matrix, I can construct and display the series without problems:

In [259]: Asml=sparse.coo_matrix(np.arange(10*5).reshape(10,5))
In [260]: s=pd.Series(Asml.data,pd.MultiIndex.from_arrays((Asml.row,Asml.col)))
In [261]: s=s.sort_index()
In [262]: s
Out[262]: 
0  1     1
   2     2
   3     3
   4     4
1  0     5
   1     6
   2     7
 [...  mine]
   3    48
   4    49
dtype: int32
In [263]: ssml=s.to_sparse()
In [264]: ssml
Out[264]: 
0  1     1
   2     2
   3     3
   4     4
1  0     5
  [...  mine]
   2    47
   3    48
   4    49
dtype: int32
BlockIndex
Block locations: array([0])
Block lengths: array([49])

But with a larger array (more nonzero elements) I get a display error. My guess is that it happens once the display for the (plain) series starts to use an ellipsis (...). I'm running on Py3, so I get a different error message.

....\pandas\core\base.pyc in __str__(self)
     45         if compat.PY3:
     46             return self.__unicode__()   # py3
     47         return self.__bytes__()         # py2 route

For example, with 10x7 instead of 10x5:

In [265]: Asml=sparse.coo_matrix(np.arange(10*7).reshape(10,7))
In [266]: s=pd.Series(Asml.data,pd.MultiIndex.from_arrays((Asml.row,Asml.col)))
In [267]: s=s.sort_index()
In [268]: s
Out[268]: 
0  1     1
   2     2
   3     3
   4     4
   5     5
   6     6
1  0     7
   1     8
   2     9
   3    10
   4    11
   5    12
   6    13
2  0    14
   1    15
...
7  6    55
8  0    56
   1    57
[... mine]
Length: 69, dtype: int32
In [269]: ssml=s.to_sparse()
In [270]: ssml
Out[270]: <repr(<pandas.sparse.series.SparseSeries at 0xaff6bc0c>)
failed: AttributeError: 'SparseArray' object has no attribute '_get_repr'>

I'm not sufficiently familiar with pandas code and structures to deduce much more for now.
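
For anyone on a newer pandas, here is a minimal sketch of the same three steps (build a Series on a (row, col) MultiIndex, sort it, make it sparse). It assumes a pandas version where, as far as I know, Series.to_sparse() is no longer available and a sparse dtype is used instead; the function name coo_to_sparse_series is my own, not the library's. It also calls the Series.sparse.from_coo accessor for comparison. I have not checked whether the repr problem above still occurs there.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def coo_to_sparse_series(A):
    """Hypothetical helper: same steps as the quoted one, without to_sparse()."""
    # One value per stored (nonzero) entry, indexed by (row, col).
    s = pd.Series(A.data, index=pd.MultiIndex.from_arrays((A.row, A.col)))
    s = s.sort_index()
    # Assumption: astype(SparseDtype) stands in for the old s.to_sparse().
    return s.astype(pd.SparseDtype(s.dtype))


# Same 10x7 matrix as in the failing session above.
A = sparse.coo_matrix(np.arange(10 * 7).reshape(10, 7))

manual = coo_to_sparse_series(A)
builtin = pd.Series.sparse.from_coo(A)  # accessor in newer pandas

print(len(manual), len(builtin))
print(manual.dtype)
```

Both series should hold only the nonzero entries of the matrix, so their lengths match the "Length: 69" shown for the plain series.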

