evalml.data_checks.InvalidTargetDataCheck.validate

InvalidTargetDataCheck.validate(X, y)

Checks if the target data contains missing or invalid values.

Parameters
  • X (pd.DataFrame, np.ndarray) – Features. Ignored.

  • y (pd.Series, np.ndarray) – Target data to check for invalid values.

Returns

Dictionary with DataCheckErrors (under the "errors" key) if any invalid values are found in the target data.

Return type

dict (DataCheckError)
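
A caller will usually want to inspect the "errors" entries of the returned dictionary. The snippet below is a minimal sketch of that, assuming the dictionary has the shape shown in the Example section; `error_codes` is a hypothetical helper and not part of EvalML, and `sample` is a hand-written stand-in for a real return value.

```python
def error_codes(result):
    """Hypothetical helper: collect the 'code' of every entry under 'errors'."""
    return [entry["code"] for entry in result.get("errors", [])]


# Hand-written stand-in for a validate() result (assumed shape, see Example).
sample = {
    "errors": [{"message": "2 row(s) (50.0%) of target values are null",
                "data_check_name": "InvalidTargetDataCheck",
                "level": "error",
                "code": "TARGET_HAS_NULL",
                "details": {"num_null_rows": 2, "pct_null_rows": 50}}],
    "warnings": [],
    "actions": [],
}

codes = error_codes(sample)
```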

Example

>>> import pandas as pd
>>> from evalml.data_checks import InvalidTargetDataCheck
>>> X = pd.DataFrame({"col": [1, 2, 3, 1]})
>>> y = pd.Series([0, 1, None, None])
>>> target_check = InvalidTargetDataCheck('binary', 'Log Loss Binary')
>>> assert target_check.validate(X, y) == {
...     "errors": [{"message": "2 row(s) (50.0%) of target values are null",
...                 "data_check_name": "InvalidTargetDataCheck",
...                 "level": "error",
...                 "code": "TARGET_HAS_NULL",
...                 "details": {"num_null_rows": 2, "pct_null_rows": 50}}],
...     "warnings": [],
...     "actions": [{"code": "IMPUTE_COL",
...                  "metadata": {"column": None, "impute_strategy": "most_frequent", "is_target": True}}]}
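
The null count and percentage in the message above can be reproduced with plain pandas. The following is a minimal sketch of that arithmetic only, not EvalML's implementation: `summarize_null_target` is a hypothetical helper, it covers just the null-value case, and it leaves "warnings" and "actions" empty.

```python
import pandas as pd


def summarize_null_target(y):
    """Hypothetical helper: build a TARGET_HAS_NULL-style entry for null targets."""
    y = pd.Series(y)
    num_null = int(y.isnull().sum())
    if num_null == 0:
        # No null values: report nothing.
        return {"errors": [], "warnings": [], "actions": []}
    pct_null = num_null / len(y) * 100
    return {
        "errors": [{
            "message": f"{num_null} row(s) ({pct_null}%) of target values are null",
            "data_check_name": "InvalidTargetDataCheck",
            "level": "error",
            "code": "TARGET_HAS_NULL",
            "details": {"num_null_rows": num_null, "pct_null_rows": pct_null},
        }],
        "warnings": [],
        "actions": [],  # the sketch does not suggest an imputation action
    }


result = summarize_null_target([0, 1, None, None])
```

For the target used in the example, two of the four rows are null, so the sketch reports 2 null rows and 50 percent.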