ts_parameters_data_check

Data check that checks whether the time series parameters are compatible with the data size.

Module Contents

Classes Summary

TimeSeriesParametersDataCheck

Checks whether the time series parameters are compatible with data splitting.

Contents

class evalml.data_checks.ts_parameters_data_check.TimeSeriesParametersDataCheck(problem_configuration, n_splits)

Checks whether the time series parameters are compatible with data splitting.

If gap + max_delay + forecast_horizon > X.shape[0] // (n_splits + 1), then the feature engineering window is larger than the smallest split. The pipeline would then try to create features from data that does not exist, which causes errors.
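The condition above can be worked through with plain integers. The following is a minimal sketch of the inequality only, not the library's implementation; the helper name window_exceeds_smallest_split and the sample values are assumptions chosen for illustration.

```python
def window_exceeds_smallest_split(n_rows, gap, max_delay, forecast_horizon, n_splits):
    """Return True when the feature engineering window is larger than the smallest split.

    Hypothetical helper: it mirrors the documented inequality and nothing else.
    """
    window_size = gap + max_delay + forecast_horizon
    smallest_split = n_rows // (n_splits + 1)  # plays the role of X.shape[0] // (n_splits + 1)
    return window_size > smallest_split


# With 100 rows and 3 splits, the smallest split has 100 // 4 = 25 rows.
# A window of 1 + 5 + 3 = 9 rows fits inside it.
print(window_exceeds_smallest_split(100, gap=1, max_delay=5, forecast_horizon=3, n_splits=3))

# A window of 5 + 15 + 10 = 30 rows does not fit.
print(window_exceeds_smallest_split(100, gap=5, max_delay=15, forecast_horizon=10, n_splits=3))
```

The comparison is strict, so a window exactly equal to the smallest split (for example 25 rows here) does not satisfy the condition.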

Parameters
  • problem_configuration (dict) – Dict containing the time series problem configuration parameters, including gap, max_delay, and forecast_horizon.

  • n_splits (int) – Number of time series splits.

Methods

name

Return a name describing the data check.

validate

Check if the time series parameters are compatible with data splitting.

name(cls)

Return a name describing the data check.

validate(self, X, y=None)

Check if the time series parameters are compatible with data splitting.

Parameters
  • X (pd.DataFrame, np.ndarray) – Features.

  • y (pd.Series, np.ndarray) – Ignored. Defaults to None.

Returns

Dict with a DataCheckError if the parameters are too large for the split sizes.

Return type

dict