
Ignore NaN values in guess-datetime function, but raise ValueError when date string cannot be converted

I'm using a function to convert different string datetime formats to the same datetime format. I want the code to raise an error (ValueError) when a date string cannot be converted, while ignoring NaN values.

Solution 1:

Given the described setup, you could check for type str, which returns False for np.nan. I took the liberty of modifying the function slightly so you can simply apply it:

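import datetime
import pandas as pd

def guess_date(string):
    # Anything that is not a string (e.g. np.nan) becomes NaT.
    if not isinstance(string, str):
        return pd.NaT
    for fmt in ["%Y/%m/%d", "%Y-%m-%d", "%d%m%Y", "%d%b%Y"]:
        try:
            return datetime.datetime.strptime(string, fmt).date()
        except ValueError:
            continue
    else:
        # Reached only if no format matched.
        raise ValueError(f"incompatible string {string}")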


df2['Date'].apply(guess_date)
# 0    2016-01-01
# 1    2019-03-25
# 2           NaT
# 3    2018-01-01
# 4    2017-01-01
# 5           NaT
# 6    2013-01-01
# 7    2016-01-01
# 8    2019-01-01
# 9    2014-01-01
# Name: Date, dtype: object

Note though that this is the same result you get from

pd.to_datetime(df2['Date']).dt.date

which is probably more efficient (note that pandas 2.0 and later need format="mixed" for a column that mixes formats like this one). So the function mainly serves to flag "undefined" formats.
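If the row-by-row function turns out to be slow, one possible vectorized variant is to try each format with pd.to_datetime and errors="coerce", then raise for any non-missing string that no format matched. This is only a minimal sketch: to_dates and FORMATS are hypothetical names, and the sample data is made up for illustration.

```python
import datetime

import numpy as np
import pandas as pd

FORMATS = ["%Y/%m/%d", "%Y-%m-%d", "%d%m%Y", "%d%b%Y"]


def to_dates(series, formats=FORMATS):
    """Parse a Series of date strings, keeping NaN as NaT (hypothetical helper)."""
    parsed = None
    for fmt in formats:
        # errors="coerce" turns non-matching strings (and NaN) into NaT.
        attempt = pd.to_datetime(series, format=fmt, errors="coerce")
        # Keep earlier matches; fill remaining gaps from this format.
        parsed = attempt if parsed is None else parsed.fillna(attempt)
    # Values that were present in the input but matched none of the formats.
    unparsed = series[series.notna() & parsed.isna()]
    if not unparsed.empty:
        raise ValueError(f"incompatible strings: {unparsed.tolist()}")
    return parsed.dt.date


# Made-up sample data for illustration.
sample = pd.Series(["2016-01-01", "25Mar2019", np.nan, "2018/01/01"])
dates = to_dates(sample)

try:
    to_dates(pd.Series(["2016-01-01", "not a date"]))
    raised = False
except ValueError:
    raised = True
```

The trade-off is that every format is tried against the whole column, so this only pays off when the list of formats is short compared to the number of rows.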

Solution 2:

To be honest, I would refactor the code eventually. But here is a quick fix to your code so that it accepts NaN values:

import pandas as pd
import numpy as np
import datetime

df = pd.DataFrame({"ID": [12,96,73,84,87,64,11,34],
                 "Date": ['2016-01-01', '25Mar2019', '2018/01/01', '2017-01-01', '2013-01-01', '2016-01-01', '2019-01-01', '2014-01-01']})
print(df)

df2 = pd.DataFrame({"ID": [12,96,20,73,84,26,87,64,11,34],
                 "Date": ['2016-01-01', '25Mar2019', np.nan, '2018/01/01', '2017-01-01', np.nan, '2013-01-01', '2016-01-01', '2019-01-01', '2014-01-01']})
print(df2)

def guess_date(string):
    # Pass missing values (NaN, None) through unchanged.
    if pd.isnull(string):
        return string
    for fmt in ["%Y/%m/%d", "%Y-%m-%d", "%d%m%Y", "%d%b%Y"]:
        try:
            return datetime.datetime.strptime(string, fmt).date()
        except ValueError:
            continue
    # No format matched.
    raise ValueError(string)

# Convert each row in place (relies on the default 0..n-1 index).
for i in range(len(df.Date)):
    df.loc[i, 'Date'] = guess_date(df.loc[i, 'Date'])
print(df.Date)

for i in range(len(df2.Date)):
    df2.loc[i, 'Date'] = guess_date(df2.loc[i, 'Date'])
print(df2.Date)
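To check both requirements at once (NaN passes through, an unknown format raises), a small self-contained test might look like the following. It is a sketch that assumes the pd.isnull-based function above; the FORMATS constant and the sample Series good and bad are made up for illustration.

```python
import datetime

import numpy as np
import pandas as pd

FORMATS = ["%Y/%m/%d", "%Y-%m-%d", "%d%m%Y", "%d%b%Y"]


def guess_date(value):
    # Missing values (NaN, None, NaT) are passed through unchanged.
    if pd.isnull(value):
        return value
    for fmt in FORMATS:
        try:
            return datetime.datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    # No format matched: fail loudly instead of silently skipping the value.
    raise ValueError(value)


# Hypothetical sample data, not taken from the question.
good = pd.Series(["2016-01-01", np.nan, "25Mar2019"], dtype=object)
bad = pd.Series(["2016-01-01", "not a date"], dtype=object)

# NaN survives the conversion; the other entries become datetime.date objects.
converted = good.apply(guess_date)

# An unparseable string aborts the whole apply with a ValueError.
try:
    bad.apply(guess_date)
    raised = False
except ValueError:
    raised = True
```

Using Series.apply here instead of an index loop avoids depending on the index being 0..n-1.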
