Opening Csv File In A Numpy.txt In Python3

I have a csv file and am trying to open it using numpy.loadtxt. If I open it using pandas, the file looks like this small example: Name Accession Class Specie

Solution 1:

In [19]: txt = '''Name,Accession,Class,Species,Annotation,CF330 
    ...: ,,,,, 
    ...: A2M,NM_000014.4,Endogenous,Hs,,11495 
    ...: ACVR1C,NM_145259.2,Endogenous,Hs,,28 
    ...: ADAM12,NM_003474.5,Endogenous,Hs,,1020 
    ...: ADGRE1,NM_001256252.1,Endogenous,Hs,,42'''

With dtype=None, genfromtxt infers a dtype for each column and gives us a structured array:

In [23]: np.genfromtxt(txt.splitlines(), names=True, dtype=None, encoding=None,delimiter=',')        
Out[23]: 
array([('', '', '', '', False,    -1),
       ('A2M', 'NM_000014.4', 'Endogenous', 'Hs', False, 11495),
       ('ACVR1C', 'NM_145259.2', 'Endogenous', 'Hs', False,    28),
       ('ADAM12', 'NM_003474.5', 'Endogenous', 'Hs', False,  1020),
       ('ADGRE1', 'NM_001256252.1', 'Endogenous', 'Hs', False,    42)],
      dtype=[('Name', '<U6'), ('Accession', '<U14'), ('Class', '<U10'), ('Species', '<U2'), ('Annotation', '?'), ('CF330', '<i8')])
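
As a minimal sketch of working with that structured array: each column can be addressed by its field name, and the blank second row (the one that comes out as '' and -1 above) can be masked out. The variable names `data` and `valid` are illustrative and not part of the original answer.

```python
import numpy as np

# Same sample as above, without the trailing spaces.
txt = """Name,Accession,Class,Species,Annotation,CF330
,,,,,
A2M,NM_000014.4,Endogenous,Hs,,11495
ACVR1C,NM_145259.2,Endogenous,Hs,,28
ADAM12,NM_003474.5,Endogenous,Hs,,1020
ADGRE1,NM_001256252.1,Endogenous,Hs,,42"""

data = np.genfromtxt(txt.splitlines(), names=True, dtype=None,
                     encoding=None, delimiter=",")

# Field names come from the header line because names=True.
print(data.dtype.names)

# Boolean mask on one field drops the row where every column was empty.
valid = data[data["Name"] != ""]
print(valid["Name"])
print(valid["CF330"])
```

Indexing by field name returns an ordinary 1-D array for that column, so `valid["CF330"]` can be used directly in numeric code.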

In dataframe form:

In [26]: pd.DataFrame(_23)                                                                           
Out[26]: 
     Name       Accession       Class Species  Annotation  CF330
0                                                   False     -1
1     A2M     NM_000014.4  Endogenous      Hs       False  11495
2  ACVR1C     NM_145259.2  Endogenous      Hs       False     28
3  ADAM12     NM_003474.5  Endogenous      Hs       False   1020
4  ADGRE1  NM_001256252.1  Endogenous      Hs       False     42

The default dtype for both loadtxt and genfromtxt is float. If the file contains strings that cannot be converted to float, loadtxt raises an error, while genfromtxt puts nan in those positions. The documentation for these functions is long, but worth reading if you want to use them correctly.

np.loadtxt(
    fname,
    dtype=<class 'float'>,     # DEFAULT DTYPE
    comments='#',
    delimiter=None,
    converters=None,
    skiprows=0,
    usecols=None,
    unpack=False,
    ndmin=0,
    encoding='bytes',
    max_rows=None,
)
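
To make the effect of the default float dtype concrete, here is a small sketch using a cut-down two-column sample (the list `lines` and the flag `loadtxt_failed` are made up for illustration). It shows loadtxt refusing the text column and genfromtxt filling it with nan.

```python
import numpy as np

# Hypothetical two-column sample: one text column, one numeric column.
lines = ["Name,CF330", "A2M,11495", "ACVR1C,28"]

# loadtxt with the default float dtype cannot convert 'A2M' and raises ValueError.
try:
    np.loadtxt(lines, delimiter=",", skiprows=1)
    loadtxt_failed = False
except ValueError as err:
    loadtxt_failed = True
    print("loadtxt error:", err)

# genfromtxt with the default float dtype puts nan where conversion fails.
arr = np.genfromtxt(lines, delimiter=",", skip_header=1)
print(arr)
```

Note that the two functions name the header-skipping parameter differently: `skiprows` in loadtxt, `skip_header` in genfromtxt.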

Alternative uses of loadtxt:

In [31]: np.loadtxt(txt.splitlines(), skiprows=1, dtype=str, encoding=None,delimiter=',')            
Out[31]: 
array([['', '', '', '', '', ''],
       ['A2M', 'NM_000014.4', 'Endogenous', 'Hs', '', '11495'],
       ['ACVR1C', 'NM_145259.2', 'Endogenous', 'Hs', '', '28'],
       ['ADAM12', 'NM_003474.5', 'Endogenous', 'Hs', '', '1020'],
       ['ADGRE1', 'NM_001256252.1', 'Endogenous', 'Hs', '', '42']],
      dtype='<U14')
In [32]: np.loadtxt(txt.splitlines(), skiprows=1, dtype=object, encoding=None,delimiter=',')         
Out[32]: 
array([['', '', '', '', '', ''],
       ['A2M', 'NM_000014.4', 'Endogenous', 'Hs', '', '11495'],
       ['ACVR1C', 'NM_145259.2', 'Endogenous', 'Hs', '', '28'],
       ['ADAM12', 'NM_003474.5', 'Endogenous', 'Hs', '', '1020'],
       ['ADGRE1', 'NM_001256252.1', 'Endogenous', 'Hs', '', '42']],
      dtype=object)
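
Both of these calls return every field as a string, including the numeric column. A possible follow-up, sketched here on a shortened sample without the blank row (the names `arr` and `counts` are illustrative), is to convert that column afterwards with astype:

```python
import numpy as np

# Shortened sample without the blank row, so every CF330 field is a valid integer.
lines = [
    "Name,Accession,Class,Species,Annotation,CF330",
    "A2M,NM_000014.4,Endogenous,Hs,,11495",
    "ACVR1C,NM_145259.2,Endogenous,Hs,,28",
]

# Everything is loaded as text first.
arr = np.loadtxt(lines, skiprows=1, dtype=str, encoding=None, delimiter=",")

# The last column holds the counts; cast it from str to int.
counts = arr[:, -1].astype(int)
print(counts)
```

If the blank row were kept, the cast would fail on the empty string, so such rows need to be filtered out or handled before converting.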

Solution 2:

Use dtype=object, so that the fields are kept as strings instead of being parsed as floats.

Example:

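import numpy as np
# skiprows=1 skips the header line; dtype=object avoids the default float conversion
FH = np.loadtxt('datafile1.csv', delimiter=',', skiprows=1, dtype=object)
print(FH)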
