Understanding Format Of Data In Scikit-learn
I am trying to do multi-label text classification with scikit-learn in Python 3.x. My data is in libsvm format, and I load it with the load_svmlight_file function. The data
Solution 1:
This has nothing to do with multilabel classification per se. The feature matrix X that you get from load_svmlight_file is a SciPy CSR matrix, as explained in the docs, and those print in a rather unfortunate format:
>>> from scipy.sparse import csr_matrix
>>> X = csr_matrix([[0, 0, 1], [2, 3, 0]])
>>> X
<2x3 sparse matrix of type '<type 'numpy.int64'>'
        with 3 stored elements in Compressed Sparse Row format>
>>> X.toarray()
array([[0, 0, 1],
       [2, 3, 0]])
>>> print(X)
  (0, 2)    1
  (1, 0)    2
  (1, 1)    3
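To make the printed format easier to read: each line of print(X) is one stored (non-zero) entry, written as (row, column) followed by the value. Here is a minimal sketch that pulls those same triples out programmatically, using the toy matrix from above rather than real load_svmlight_file output. The variable names triples and row1 are only for illustration.

```python
from scipy.sparse import csr_matrix

# Same toy matrix as in the session above.
X = csr_matrix([[0, 0, 1], [2, 3, 0]])

# Convert to COO format to get explicit row/column/value arrays,
# i.e. the same (row, column) value entries that print(X) lists.
coo = X.tocoo()
triples = [(int(r), int(c), int(v)) for r, c, v in zip(coo.row, coo.col, coo.data)]
print(triples)

# Basic facts about the matrix: dimensions and number of stored entries.
print(X.shape, X.nnz)

# The three arrays that back the CSR representation itself.
print(X.indptr, X.indices, X.data)

# Densify a single row (sample) when you want to look at it directly.
row1 = X[1].toarray().ravel()
print(row1)
```

Calling toarray() on the whole matrix is fine for a small example like this, but with real text features it may need a lot of memory, so slicing out a few rows first is usually the safer way to inspect the data.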