Set Max String Length In Pandas

I want my dataframe to auto-truncate strings that are longer than a certain length. Basically, something like: pd.set_option('auto_truncate_string_exceeding_this_length', 255) Any ideas? I have

Solution 1:

I'm not sure you can do this on the whole df, but the following would work after loading:

In [21]:

df = pd.DataFrame({"a": ['jasjdhadasd']*5, "b": np.arange(5)})
df
Out[21]:
             a  b
0  jasjdhadasd  0
1  jasjdhadasd  1
2  jasjdhadasd  2
3  jasjdhadasd  3
4  jasjdhadasd  4
In [22]:

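for col in df:
    # is_string_like is no longer available; use pandas' own dtype check instead
    if pd.api.types.is_string_dtype(df[col]):
        # keep only the first 5 characters of each string
        df[col] = df[col].str.slice(0, 5)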
df
Out[22]:
       a  b
0  jasjd  0
1  jasjd  1
2  jasjd  2
3  jasjd  3
4  jasjd  4
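
The loop above can be wrapped in a small reusable helper. This is only a sketch, not a pandas feature: the name truncate_strings and its max_len parameter are made up for illustration. It truncates element by element so that missing values and non-string objects are left untouched.

```python
import pandas as pd


def truncate_strings(df, max_len):
    """Return a copy of df with every string value cut to max_len characters.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    def cut(value):
        # Leave None/NaN and any other non-string value as it is.
        return value[:max_len] if isinstance(value, str) else value

    out = df.copy()
    for col in out.columns:
        # Only object or string-typed columns can hold Python strings.
        if pd.api.types.is_object_dtype(out[col]) or pd.api.types.is_string_dtype(out[col]):
            out[col] = out[col].map(cut)
    return out


df = pd.DataFrame({"a": ["jasjdhadasd", None, "abc"], "b": [0, 1, 2]})
short = truncate_strings(df, 5)
print(short)
```

The original frame is not modified; assign the result back (df = truncate_strings(df, 255)) if you want the truncation to stick.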

EDIT

I think if you specified the dtypes in the args to read_csv then you could set the max length:

df = pd.read_csv('file.csv', dtype=(str, maxlen))

I will try this and confirm shortly

UPDATE

Sadly, you cannot specify the length. Passing dtype=(str, 5) raises an error:

NotImplementedError: the dtype <U5 is not supported for parsing

Solution 2:

pd.set_option('display.max_colwidth', 255)
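
Note that display.max_colwidth only controls how wide a column appears when the frame is printed; the stored strings keep their full length. A quick check, using a made-up one-cell frame and a small width of 20 so the effect is visible:

```python
import pandas as pd

df = pd.DataFrame({"a": ["x" * 300]})

# Temporarily limit the printed column width to 20 characters.
with pd.option_context("display.max_colwidth", 20):
    shown = repr(df)

# The printed form is abbreviated with an ellipsis...
print(shown)
# ...but the underlying value still has all 300 characters.
print(len(df.loc[0, "a"]))
```

So this option helps if the goal is tidy output, but not if the data itself has to fit a length limit.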

Solution 3:

You can use read_csv converters. Let's say you want to truncate a column named abc; you can pass a dictionary that maps the column to a function, like this:

def auto_truncate(val):
    return val[:255]
df = pd.read_csv('file.csv', converters={'abc': auto_truncate})

If you have columns with different lengths:

df = pd.read_csv('file.csv', converters={'abc': lambda x: x[:255], 'xyz': lambda x: x[:512]})

Make sure the column type is string. A column index can also be used instead of a name in the converters dict.
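
Here is a self-contained sketch of the converters approach that runs without a file on disk. The CSV text, the column names abc, xyz and n, and the short limits of 5 and 3 are all made up so the truncation is easy to see; swap in your own file and lengths.

```python
import io

import pandas as pd

# In-memory stand-in for 'file.csv' (hypothetical data).
csv_text = "abc,xyz,n\nabcdefghij,klmnopqrst,1\nshort,uvwxyz,2\n"

# Truncate each column to its own limit while parsing.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"abc": lambda x: x[:5], "xyz": lambda x: x[:3]},
)
print(df)

# Same idea with a column position (0 is the first column) instead of a name.
df_by_index = pd.read_csv(
    io.StringIO(csv_text),
    converters={0: lambda x: x[:5]},
)
print(df_by_index)
```

Columns without a converter, such as n here, are parsed as usual.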

Solution 4:

You can also simply truncate a single column with:

df['A'] = df['A'].str[:255]
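
A short check of how this behaves, using a made-up column A that includes a missing value. The .str accessor passes missing values through instead of raising:

```python
import pandas as pd

df = pd.DataFrame({"A": ["a" * 300, "short", None]})
df["A"] = df["A"].str[:255]

# Lengths after truncation; the missing entry stays missing.
lengths = df["A"].str.len()
print(lengths)
```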
