time_series_regularizer
========================================================================================

.. py:module:: evalml.pipelines.components.transformers.preprocessing.time_series_regularizer

.. autoapi-nested-parse::

   Transformer that regularizes a dataset with an uninferrable offset frequency for time series problems.


Module Contents
---------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer


Contents
~~~~~~~~~~~~~~~~~~~

.. py:class:: TimeSeriesRegularizer(time_index=None, frequency_payload=None, window_length=4, threshold=0.4, random_seed=0, **kwargs)

   Transformer that regularizes an inconsistently spaced datetime column.

   If X is passed in to fit/transform, the column `time_index` will be checked for an inferrable offset frequency. If
   the `time_index` column is perfectly inferrable, then this Transformer will do nothing and return the original X and y.
   If X does not have a perfectly inferrable frequency but one can be estimated, then X and y will be reformatted based
   on the estimated frequency for `time_index`. In the original X and y passed:

   - Missing datetime values will be added and will have their corresponding columns in X and y set to None.
   - Duplicate datetime values will be dropped.
   - Extra datetime values will be dropped.
   - If it can be determined that a duplicate or extra value is misaligned, then it will be repositioned to take the
     place of a missing value.

   This Transformer should be used before the `TimeSeriesImputer` in order to impute the missing values that were
   added to X and y (if passed).

   If used on a multiseries dataset, this Transformer works specifically on unstacked datasets.

   :param time_index: Name of the column containing the datetime information used to order the data, required. Defaults to None.
   :type time_index: string
   :param frequency_payload: Payload returned from Woodwork's infer_frequency function where debug is True.
                             Defaults to None.
   :type frequency_payload: tuple
   :param window_length: The size of the rolling window over which inference is conducted to determine the prevalence
                         of uninferrable frequencies. Lower values make this component more sensitive to recognizing
                         numerous faulty datetime values. Defaults to 4.
   :type window_length: int
   :param threshold: The minimum percentage of windows that need to have been able to infer a frequency. Lower values
                     make this component more sensitive to recognizing numerous faulty datetime values. Defaults to 0.4.
   :type threshold: float
   :param random_seed: Seed for the random number generator. This transformer performs the same regardless of the
                       random seed provided. Defaults to 0.
   :type random_seed: int

   :raises ValueError: If the frequency_payload parameter has not been passed a tuple.

   **Attributes**

   .. list-table::
      :widths: 15 85
      :header-rows: 0

      * - **hyperparameter_ranges**
        - {}
      * - **modifies_features**
        - True
      * - **modifies_target**
        - True
      * - **name**
        - Time Series Regularizer
      * - **training_only**
        - True

   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.clone
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.default_parameters
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.describe
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.fit
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.fit_transform
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.load
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.needs_fitting
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.parameters
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.save
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.transform
      evalml.pipelines.components.transformers.preprocessing.time_series_regularizer.TimeSeriesRegularizer.update_parameters


   .. py:method:: clone(self)

      Constructs a new component with the same parameters and random state.

      :returns: A new instance of this component with identical parameters and random state.


   .. py:method:: default_parameters(cls)

      Returns the default parameters for this component.

      Our convention is that Component.default_parameters == Component().parameters.

      :returns: Default parameters for this component.
      :rtype: dict


   .. py:method:: describe(self, print_name=False, return_dict=False)

      Describe a component and its parameters.

      :param print_name: Whether to print the name of the component.
      :type print_name: bool, optional
      :param return_dict: Whether to return the description as a dictionary in the format {"name": name, "parameters": parameters}.
      :type return_dict: bool, optional

      :returns: A dictionary if return_dict is True, else None.
      :rtype: None or dict


   .. py:method:: fit(self, X, y=None)

      Fits the TimeSeriesRegularizer.

      :param X: The input training data of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target training data of length [n_samples].
      :type y: pd.Series, optional

      :returns: self

      :raises ValueError: If self.time_index is None, if X and y have different lengths, or if `time_index` in X does not
          have an offset frequency that can be estimated.
      :raises TypeError: If the `time_index` column is not of type Datetime.
      :raises KeyError: If the `time_index` column doesn't exist.


   .. py:method:: fit_transform(self, X, y=None)

      Fits on X and transforms X.

      :param X: Data to fit and transform.
      :type X: pd.DataFrame
      :param y: Target data.
      :type y: pd.Series

      :returns: Transformed X.
      :rtype: pd.DataFrame

      :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform.


   .. py:method:: load(file_path)
      :staticmethod:

      Loads component at file path.

      :param file_path: Location to load file.
      :type file_path: str

      :returns: ComponentBase object


   .. py:method:: needs_fitting(self)

      Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances.

      This can be overridden to False for components that do not need to be fit or whose fit methods do nothing.

      :returns: True.


   .. py:method:: parameters(self)
      :property:

      Returns the parameters which were used to initialize the component.


   .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)

      Saves component at file path.

      :param file_path: Location to save file.
      :type file_path: str
      :param pickle_protocol: The pickle data stream format.
      :type pickle_protocol: int


   .. py:method:: transform(self, X, y=None)

      Regularizes a dataframe and target data to an inferrable offset frequency.

      A 'clean' X and y (if y was passed in) are created based on an inferrable offset frequency and matching datetime values
      with the original X and y are imputed into the clean X and y. Datetime values identified as misaligned are shifted
      into their appropriate position.

      :param X: The input training data of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target training data of length [n_samples].
      :type y: pd.Series, optional

      :returns: Data with an inferrable `time_index` offset frequency.
      :rtype: (pd.DataFrame, pd.Series)


   .. py:method:: update_parameters(self, update_dict, reset_fit=True)

      Updates the parameter dictionary of the component.

      :param update_dict: A dict of parameters to update.
      :type update_dict: dict
      :param reset_fit: If True, will set `_is_fitted` to False.
      :type reset_fit: bool, optional
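
The regularization rules listed for TimeSeriesRegularizer (missing datetime values added with None, duplicate and extra datetime values dropped) can be illustrated with a minimal pure-Python sketch. This is not evalml's implementation: the helper name `regularize` is hypothetical, the offset frequency is assumed to be known up front rather than estimated, and the repositioning of misaligned values is deliberately left out.

```python
from datetime import datetime, timedelta


def regularize(timestamps, values, freq):
    """Align (timestamp, value) pairs onto an evenly spaced grid.

    Hypothetical helper, not part of evalml. It assumes `freq` is already
    known and does not attempt to reposition misaligned values.
    """
    start, end = min(timestamps), max(timestamps)

    first_seen = {}
    for ts, value in zip(timestamps, values):
        # Duplicate datetime values are dropped: only the first occurrence is kept.
        first_seen.setdefault(ts, value)

    clean = []
    current = start
    while current <= end:
        # Missing datetime values are added with their value set to None.
        clean.append((current, first_seen.get(current)))
        current += freq

    # Extra datetime values that fall off the grid never match a grid point,
    # so they are absent from the result.
    return clean


day = timedelta(days=1)
stamps = [
    datetime(2023, 1, 1),
    datetime(2023, 1, 2),
    datetime(2023, 1, 2),      # duplicate of the previous row
    datetime(2023, 1, 2, 12),  # extra value, off the daily grid
    datetime(2023, 1, 4),      # 2023-01-03 is missing
]
vals = [10, 20, 21, 99, 40]

regularized = regularize(stamps, vals, day)
for ts, value in regularized:
    print(ts.date(), value)
```

The result contains one row per day from 2023-01-01 to 2023-01-04, with None for the added 2023-01-03 row. Those None entries are the gaps a later imputation step, such as the `TimeSeriesImputer` mentioned above, would be expected to fill.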