Set Max String Length In Pandas
I want my dataframe to auto-truncate strings that are longer than a certain length. Basically: pd.set_option('auto_truncate_string_exceeding_this_length', 255). Any ideas? I have
Solution 1:
I'm not sure you can do this for the whole df, but the following works after loading:
In [21]:
import numpy as np
import pandas as pd
df = pd.DataFrame({"a": ['jasjdhadasd'] * 5, "b": np.arange(5)})
df
Out[21]:
             a  b
0  jasjdhadasd  0
1  jasjdhadasd  1
2  jasjdhadasd  2
3  jasjdhadasd  3
4  jasjdhadasd  4
In [22]:
for col in df:
    # is_string_like is not a pandas function; check the column dtype instead
    if pd.api.types.is_string_dtype(df[col]):
        df[col] = df[col].str.slice(0, 5)
df
Out[22]:
       a  b
0  jasjd  0
1  jasjd  1
2  jasjd  2
3  jasjd  3
4  jasjd  4
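The loop above can be wrapped in a small reusable helper. This is only a sketch: the name truncate_strings is made up for illustration and is not a pandas function, and it assumes the string columns hold nothing but strings. It returns a copy and leaves the original dataframe untouched.

```python
import pandas as pd
from pandas.api.types import is_string_dtype


def truncate_strings(df, max_len):
    """Return a copy of df with every string column cut to max_len characters.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    for col in out.columns:
        # Only touch columns that pandas recognises as holding strings
        if is_string_dtype(out[col]):
            out[col] = out[col].str.slice(0, max_len)
    return out


df = pd.DataFrame({"a": ["jasjdhadasd"] * 5, "b": range(5)})
short = truncate_strings(df, 5)
print(short)
```

Calling it right after read_csv gives roughly the effect asked for in the question, although the full strings are still read into memory first.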
EDIT
I think if you specified the dtypes in the args to read_csv, then you could set the max length:
df = pd.read_csv('file.csv', dtype=(np.str, maxlen))
I will try this and confirm shortly
UPDATE
Sadly, you cannot specify the length. Passing the arg dtype=(str, 5) raises an error:
NotImplementedError: the dtype <U5 is not supported for parsing
Solution 2:
pd.set_option('display.max_colwidth', 255)
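As far as I can tell, display.max_colwidth only controls how wide a cell is when the dataframe is printed; it does not change the stored values. A quick check, using a made-up 300-character string and a deliberately small width of 20:

```python
import pandas as pd

df = pd.DataFrame({"a": ["x" * 300]})

# Temporarily limit the printed column width to 20 characters
with pd.option_context("display.max_colwidth", 20):
    shown = repr(df)

print(shown)                  # the cell is abbreviated in the printed output
print(len(df["a"].iloc[0]))   # the stored string keeps its full length
```

So this option helps if the goal is tidier output, but the data itself still needs one of the other approaches to be truncated.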
Solution 3:
You can use read_csv converters. Let's say you want to truncate a column named abc; you can pass a dictionary mapping the column to a function, like:
def auto_truncate(val):
    return val[:255]

df = pd.read_csv('file.csv', converters={'abc': auto_truncate})
If you have columns with different lengths:
df = pd.read_csv('file.csv', converters={'abc': lambda x: x[:255], 'xyz': lambda x: x[:512]})
Make sure the column type is string. A column index can also be used instead of a name in the converters dict.
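Here is a self-contained sketch of the converters approach that runs without a file on disk. The CSV text, the short limits (5 and 7 instead of 255 and 512) and the make_truncator helper are all made up for illustration.

```python
import io

import pandas as pd

# Small in-memory CSV standing in for 'file.csv'
csv_text = "abc,xyz,n\nhello world,another long value,1\nshort,tiny,2\n"


def make_truncator(max_len):
    """Build a converter that keeps the first max_len characters of a cell."""
    def truncate(val):
        # Converters receive each raw cell value as a string
        return val[:max_len]
    return truncate


# Keyed by column name
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"abc": make_truncator(5), "xyz": make_truncator(7)},
)

# Keyed by column position (0 is the first column)
df_by_index = pd.read_csv(
    io.StringIO(csv_text),
    converters={0: make_truncator(5)},
)

print(df)
```

Columns without a converter, such as n here, are parsed as usual.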
Solution 4:
You can also simply truncate a single column with
df['A'] = df['A'].str[:255]
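For what it's worth, the .str accessor skips missing values, so a column with gaps can be sliced the same way. A small sketch with made-up data and a limit of 3 instead of 255:

```python
import pandas as pd

df = pd.DataFrame({"A": ["abcdef", None, "xy"]})

# Slice every string to at most 3 characters; missing entries stay missing
df["A"] = df["A"].str[:3]
print(df)
```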