datetime_nan_data_check
Data check that inspects each datetime column in the input and issues an error if any contain NaN values.
Module Contents
Classes Summary
DateTimeNaNDataCheck | Check each datetime column in the input and issue an error if any contain NaN values.
Attributes Summary
error_contains_nan
Contents
class evalml.data_checks.datetime_nan_data_check.DateTimeNaNDataCheck
Check each datetime column in the input and issue an error if any contain NaN values.
Methods
name | Return a name describing the data check.
validate | Check if any datetime columns contain NaN values.
name(cls)
Return a name describing the data check.
validate(self, X, y=None)
Check if any datetime columns contain NaN values.
- Parameters
X (pd.DataFrame, np.ndarray) – Features.
y (pd.Series, np.ndarray) – Ignored. Defaults to None.
- Returns
dict with a DataCheckError if NaN values are present in datetime columns.
- Return type
dict
Examples
>>> import pandas as pd
>>> import numpy as np
>>> from evalml.data_checks import DateTimeNaNDataCheck
>>> dates = [['2-1-21', '3-1-21'],
...          ['2-2-21', '3-2-21'],
...          ['2-3-21', '3-3-21'],
...          ['2-4-21', '3-4-21']]
>>> df = pd.DataFrame(dates, columns=['index', "days"])
>>> dt_nan_dc = DateTimeNaNDataCheck()
>>> assert dt_nan_dc.validate(df) == {'warnings': [], 'errors': [], 'actions': []}
>>> dates[0][0] = np.datetime64('NaT')
>>> df = pd.DataFrame(dates, columns=['index', "days"])
>>> assert dt_nan_dc.validate(df) == {
...     'warnings': [],
...     'errors': [{'message': 'Input datetime column(s) (index) contains NaN values. Please impute NaN values or drop these rows or columns.',
...                 'data_check_name': 'DateTimeNaNDataCheck',
...                 'level': 'error',
...                 'details': {'columns': ['index'], 'rows': None},
...                 'code': 'DATETIME_HAS_NAN'}],
...     'actions': []}
>>> dates[0][1] = None
>>> df = pd.DataFrame(dates, columns=['index', "days"])
>>> assert dt_nan_dc.validate(df) == {
...     'warnings': [],
...     'errors': [{'message': 'Input datetime column(s) (index, days) contains NaN values. Please impute NaN values or drop these rows or columns.',
...                 'data_check_name': 'DateTimeNaNDataCheck',
...                 'level': 'error',
...                 'details': {'columns': ['index', 'days'], 'rows': None},
...                 'code': 'DATETIME_HAS_NAN'}],
...     'actions': []}
>>> dates[0][1] = pd.NA
>>> df = pd.DataFrame(dates, columns=['index', "days"])
>>> assert dt_nan_dc.validate(df) == {
...     'warnings': [],
...     'errors': [{'message': 'Input datetime column(s) (index, days) contains NaN values. Please impute NaN values or drop these rows or columns.',
...                 'data_check_name': 'DateTimeNaNDataCheck',
...                 'level': 'error',
...                 'details': {'columns': ['index', 'days'], 'rows': None},
...                 'code': 'DATETIME_HAS_NAN'}],
...     'actions': []}
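The message suggests imputing the NaN values or dropping the affected rows or columns. The following is a minimal sketch of the row-dropping option, assuming the result has the shape shown in the examples above. The `result` dict is copied by hand from those examples rather than produced by running the check, `flagged_datetime_columns` is a hypothetical helper that is not part of evalml, and the DataFrame is stand-in data with one missing value per column.

```python
import pandas as pd

# Result copied from the documented example (both columns flagged);
# in real use this would come from DateTimeNaNDataCheck().validate(df).
result = {
    "warnings": [],
    "errors": [
        {
            "message": "Input datetime column(s) (index, days) contains NaN values. "
                       "Please impute NaN values or drop these rows or columns.",
            "data_check_name": "DateTimeNaNDataCheck",
            "level": "error",
            "details": {"columns": ["index", "days"], "rows": None},
            "code": "DATETIME_HAS_NAN",
        }
    ],
    "actions": [],
}


def flagged_datetime_columns(result):
    """Hypothetical helper: collect columns named by DATETIME_HAS_NAN errors."""
    columns = []
    for error in result["errors"]:
        if error.get("code") == "DATETIME_HAS_NAN":
            for col in error["details"]["columns"]:
                if col not in columns:  # keep first-seen order, no duplicates
                    columns.append(col)
    return columns


# Stand-in data: the first row has a missing datetime in each column.
df = pd.DataFrame(
    {
        "index": pd.to_datetime([None, "2021-02-02", "2021-02-03", "2021-02-04"]),
        "days": pd.to_datetime([None, "2021-03-02", "2021-03-03", "2021-03-04"]),
    }
)

cols = flagged_datetime_columns(result)
# Drop every row that has a missing value in any flagged column.
cleaned = df.dropna(subset=cols)
```

Dropping rows is only one of the remedies the message names; imputing the missing datetimes or dropping the flagged columns may suit a given dataset better.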
evalml.data_checks.datetime_nan_data_check.error_contains_nan = Input datetime column(s) ({}) contains NaN values. Please impute NaN values or drop these rows...
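The `{}` placeholder in this template is where the offending column names appear in the final message. The sketch below shows one way such a template could be filled. It assumes the full message text seen in the `validate` examples (the attribute listing above is truncated) and joins the names with a comma and a space, as those examples suggest. `format_nan_error` is a hypothetical helper and not part of evalml.

```python
# Assumed full template, taken from the messages in the validate() examples;
# the attribute listing itself is truncated.
error_contains_nan = (
    "Input datetime column(s) ({}) contains NaN values. "
    "Please impute NaN values or drop these rows or columns."
)


def format_nan_error(columns):
    """Hypothetical helper: fill the placeholder with comma-separated names."""
    return error_contains_nan.format(", ".join(str(c) for c in columns))


message = format_nan_error(["index", "days"])
```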