
Best Data Types For Binary Variables In Pandas CSV Import To Decrease Memory Usage

My original training file is 25 GB, and my machine has 64 GB of RAM. Importing the data with default options always ends in a 'Memory Error'; therefore, after reading some post

Solution 1:

According to the NumPy documentation, the smallest dtype you can choose for the items of an array or list is "int8", which takes 1 byte and corresponds to "int8_t" in C.

For binary lists or list-like objects, the "uint8", "int8", "byte" and "bool" dtypes all allocate the same amount per item: 1 byte.
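
As a rough illustration of the point above, the sketch below checks the item size of each dtype and then reads a small CSV twice, once with defaults and once with dtype="int8". The CSV text and the column names flag_a and flag_b are made up for this example; they stand in for the real 25 GB file, which would be passed to read_csv by path instead.

```python
import io

import numpy as np
import pandas as pd

# Each of these NumPy dtypes stores one item in a single byte.
for name in ("int8", "uint8", "byte", "bool"):
    print(name, np.dtype(name).itemsize)

# Hypothetical CSV with two binary columns, standing in for the real file.
csv_text = "flag_a,flag_b\n0,1\n1,1\n0,0\n1,0\n"

# Default import: integer columns are parsed as int64 (8 bytes per value).
df_default = pd.read_csv(io.StringIO(csv_text))

# Explicit dtype: every column is stored as int8 (1 byte per value).
df_small = pd.read_csv(io.StringIO(csv_text), dtype="int8")

# Compare the bytes used by the column data, leaving the index out.
bytes_default = df_default.memory_usage(index=False).sum()
bytes_small = df_small.memory_usage(index=False).sum()
print(bytes_default, bytes_small)
```

If only some columns are binary, dtype also accepts a dict that maps column names to dtypes, so the remaining columns keep their inferred types.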
