Python Pandas To_csv Zip Format

I am having a peculiar problem when writing zip files through to_csv. Using GZIP: df.to_csv(path_or_buf = 'sample.csv.gz', compression='gzip', index = None, sep = ',', header=True,

Solution 1:

Since pandas 1.0.0 this is straightforward: pass a dict as the compression argument, with method set to 'zip' and archive_name set to the name the CSV should have inside the archive:

filename = 'sample'
compression_options = dict(method='zip', archive_name=f'{filename}.csv')
df.to_csv(f'{filename}.zip', compression=compression_options, ...)

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html


Solution 2:

As the thread linked in the comment discusses, ZIP's directory-like nature makes it hard to do what you want without making a lot of assumptions or complicating the arguments for to_csv.

If your goal is to write the data directly to a ZIP file, that's harder than you'd think.

If you can bear temporarily writing your data to the filesystem, you can use Python's zipfile module to put that file in a ZIP with the name you preferred, and then delete the file.


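import os
import zipfile

# Write the CSV to a temporary file on disk first.
df.to_csv('sample.csv', index=False, sep=',', header=True, encoding='utf-8-sig')

# Add the file to the archive. Without ZIP_DEFLATED, zipfile stores the
# member uncompressed.
with zipfile.ZipFile('sample.zip', 'w', compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write('sample.csv')

# Remove the temporary CSV once it is inside the archive.
os.remove('sample.csv')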
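
If the temporary file is unwanted, one possible variation is to render the CSV to a string in memory and hand it to zipfile's writestr. This is only a sketch: the sample DataFrame is made up for illustration, and it assumes the data fits comfortably in memory.

```python
import zipfile

import pandas as pd

# Hypothetical sample data standing in for the question's df.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# With no path argument, to_csv returns the CSV as a string.
csv_text = df.to_csv(index=False, sep=",", header=True)

# Store the encoded text in the archive under the member name 'sample.csv'.
# 'utf-8-sig' prepends a BOM, matching the encoding used above.
with zipfile.ZipFile("sample.zip", "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("sample.csv", csv_text.encode("utf-8-sig"))
```

Nothing is left on disk to clean up afterwards, at the cost of holding the whole CSV in memory at once.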


Solution 3:

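Since pandas 1.0.0, the compression argument of to_csv() accepts a dict, so both the archive format and the name of the file inside the archive can be set in a single call.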

Example in one line:

df.to_csv('sample.zip', compression={'method': 'zip', 'archive_name': 'sample.csv'})
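
To confirm that the archive really holds a member with the requested name, it can be opened again with zipfile and read back with read_csv. The following is a rough check rather than a definitive test; the sample DataFrame is invented for illustration.

```python
import zipfile

import pandas as pd

# Hypothetical sample data; any DataFrame would do.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

df.to_csv(
    "sample.zip",
    index=False,
    compression={"method": "zip", "archive_name": "sample.csv"},
)

# List the members of the archive that was just written.
with zipfile.ZipFile("sample.zip") as zf:
    members = zf.namelist()

# read_csv infers zip compression from the '.zip' extension.
restored = pd.read_csv("sample.zip")

print(members)
print(restored)
```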
