polynomial_decomposer
======================================================================================

.. py:module:: evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer

.. autoapi-nested-parse::

   Component that removes trends from time series by fitting a polynomial to the data.


Module Contents
---------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer


Contents
~~~~~~~~~~~~~~~~~~~

.. py:class:: PolynomialDecomposer(time_index: str = None, degree: int = 1, seasonal_period: int = -1, random_seed: int = 0, **kwargs)

   Removes trends and seasonality from time series by fitting a polynomial and moving average to the data.

   sktime's PolynomialTrendForecaster is used to generate the additive trend portion of the target data. A polynomial
   will be fit to the data during fit. That additive polynomial trend will be removed during fit so that statsmodels'
   seasonal_decompose can determine the additive seasonality of the data by using rolling averages over the series'
   inferred periodicity.

   For example, daily time series data will generate rolling averages over the first week of data, normalize out the
   mean and return those 7 averages repeated over the entire length of the given series. Those seven averages,
   repeated as many times as necessary to match the length of the given target data, will be used as the seasonal
   signal of the data.

   :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None.
   :type time_index: str
   :param degree: Degree for the polynomial. If 1, a linear model is fit to the data.
                  If 2, a quadratic model is fit, etc. Defaults to 1.
   :type degree: int
   :param seasonal_period: The number of entries in the time series data that corresponds to one period of a
                           cyclic signal. For instance, if data is known to possess a weekly seasonal signal, and if the data
                           is daily data, seasonal_period should be 7. For daily data with a yearly seasonal signal, seasonal_period
                           should be 365. Defaults to -1, which uses the statsmodels library's freq_to_period function.
                           https://github.com/statsmodels/statsmodels/blob/main/statsmodels/tsa/tsatools.py
   :type seasonal_period: int
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   **Attributes**

   .. list-table::
      :widths: 15 85
      :header-rows: 0

      * - **hyperparameter_ranges**
        - { "degree": Integer(1, 3)}
      * - **modifies_features**
        - False
      * - **modifies_target**
        - True
      * - **name**
        - Polynomial Decomposer
      * - **training_only**
        - False

   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.clone
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.default_parameters
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.describe
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.fit
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.fit_transform
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.get_trend_dataframe
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.inverse_transform
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.load
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.needs_fitting
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.parameters
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.plot_decomposition
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.save
      evalml.pipelines.components.transformers.preprocessing.polynomial_decomposer.PolynomialDecomposer.transform


   .. py:method:: clone(self)

      Constructs a new component with the same parameters and random state.

      :returns: A new instance of this component with identical parameters and random state.


   .. py:method:: default_parameters(cls)

      Returns the default parameters for this component.

      Our convention is that Component.default_parameters == Component().parameters.

      :returns: Default parameters for this component.
      :rtype: dict


   .. py:method:: describe(self, print_name=False, return_dict=False)

      Describe a component and its parameters.

      :param print_name: Whether to print the name of the component.
      :type print_name: bool, optional
      :param return_dict: Whether to return the description as a dictionary in the format {"name": name, "parameters": parameters}.
      :type return_dict: bool, optional

      :returns: Returns dictionary if return_dict is True, else None.
      :rtype: None or dict


   .. py:method:: fit(self, X: pandas.DataFrame, y: pandas.Series = None) -> PolynomialDecomposer

      Fits the PolynomialDecomposer and determines the seasonal signal.

      Currently only fits the polynomial detrender. The seasonality is determined by removing
      the trend from the signal and using statsmodels' seasonal_decompose().
      Both the trend and seasonality are currently assumed to be additive.

      :param X: Conditionally used to build datetime index.
      :type X: pd.DataFrame, optional
      :param y: Target variable to detrend and deseasonalize.
      :type y: pd.Series

      :returns: self

      :raises ValueError: If y is None.
      :raises ValueError: If the target data doesn't have a DatetimeIndex AND there are no datetime features in the features data.


   .. py:method:: fit_transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series]

      Removes fitted trend and seasonality from target variable.

      :param X: Ignored.
      :type X: pd.DataFrame, optional
      :param y: Target variable to detrend and deseasonalize.
      :type y: pd.Series

      :returns: The first element is the input features returned without modification.
                The second element is the target variable y with the fitted trend and seasonality removed.
      :rtype: tuple of pd.DataFrame, pd.Series


   .. py:method:: get_trend_dataframe(self, X: pandas.DataFrame, y: pandas.Series) -> list[pandas.DataFrame]

      Return a list of dataframes with 4 columns: signal, trend, seasonality, residual.

      sktime's PolynomialTrendForecaster is used to generate the trend portion of the target data. statsmodels'
      seasonal_decompose is used to generate the seasonality of the data.

      :param X: Input data with time series data in index.
      :type X: pd.DataFrame
      :param y: Target variable data provided as a Series for univariate problems or
                a DataFrame for multivariate problems.
      :type y: pd.Series or pd.DataFrame

      :returns: Each DataFrame contains the columns "signal", "trend", "seasonality" and "residual",
                with the latter 3 column values being the decomposed elements of the target data. The "signal" column
                is simply the input target signal but reindexed with a datetime index to match the input features.
      :rtype: list of pd.DataFrame

      :raises TypeError: If X does not have time-series data in the index.
      :raises ValueError: If time series index of X does not have an inferred frequency.
      :raises ValueError: If the forecaster associated with the detrender has not been fit yet.
      :raises TypeError: If y is not provided as a pandas Series or DataFrame.


   .. py:method:: inverse_transform(self, y: pandas.Series) -> tuple[pandas.DataFrame, pandas.Series]

      Adds back fitted trend and seasonality to target variable.

      The polynomial trend is added back into the signal, calling the detrender's inverse_transform().
      Then, the seasonality is projected forward and added back into the signal.

      :param y: Target variable.
      :type y: pd.Series

      :returns: The first element is the input features returned without modification.
                The second element is the target variable y with the trend and seasonality added back in.
      :rtype: tuple of pd.DataFrame, pd.Series

      :raises ValueError: If y is None.


   .. py:method:: load(file_path)
      :staticmethod:

      Loads component at file path.

      :param file_path: Location to load file.
      :type file_path: str

      :returns: ComponentBase object


   .. py:method:: needs_fitting(self)

      Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances.

      This can be overridden to False for components that do not need to be fit or whose fit methods do nothing.

      :returns: True.


   .. py:method:: parameters(self)
      :property:

      Returns the parameters which were used to initialize the component.


   .. py:method:: plot_decomposition(self, X: pandas.DataFrame, y: pandas.Series, show=False)

      Plots the decomposition of the target signal.

      :param X: Input data with time series data in index.
      :type X: pd.DataFrame
      :param y: Target variable data provided as a Series for univariate problems or
                a DataFrame for multivariate problems.
      :type y: pd.Series or pd.DataFrame
      :param show: Whether to display the plot or not. Defaults to False.
      :type show: bool

      :returns: The figure and axes that have the decompositions plotted on them.
      :rtype: matplotlib.pyplot.Figure, matplotlib.pyplot.Axes


   .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)

      Saves component at file path.

      :param file_path: Location to save file.
      :type file_path: str
      :param pickle_protocol: The pickle data stream format.
      :type pickle_protocol: int


   .. py:method:: transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series]

      Transforms the target data by removing the polynomial trend and rolling average seasonality.

      Applies the fit polynomial detrender to the target data, removing the additive polynomial trend. Then,
      utilizes the first period's worth of seasonal data determined in the .fit() method to extrapolate
      the seasonal signal of the data to be transformed. This seasonal signal is also assumed to be additive
      and is removed.

      :param X: Conditionally used to build datetime index.
      :type X: pd.DataFrame, optional
      :param y: Target variable to detrend and deseasonalize.
      :type y: pd.Series

      :returns: The input features are returned without modification. The target
                variable y is detrended and deseasonalized.
      :rtype: tuple of pd.DataFrame, pd.Series

      :raises ValueError: If the target data doesn't have a DatetimeIndex AND there are no datetime features in the features data.
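
The additive decomposition described in this module can be sketched in plain Python. This is a minimal illustration under stated assumptions, not evalml's implementation: it assumes a linear trend (the ``degree=1`` case) fitted by ordinary least squares, and it stands in for statsmodels' seasonal_decompose with simple per-phase averages that are centered to zero mean and repeated over the series. The function name ``decompose_additive`` and the synthetic data are hypothetical.

```python
from statistics import fmean


def decompose_additive(y, period):
    """Split y into a linear trend, a repeating seasonal pattern, and a residual.

    Hypothetical helper for illustration only: y is a list of floats sampled
    at regular intervals, and period is the number of samples in one cycle.
    """
    n = len(y)
    if period < 1 or n < 2 or n < period:
        raise ValueError("need period >= 1 and at least max(2, period) samples")

    # 1. Fit a straight line y ~ intercept + slope * t by ordinary least squares.
    t = list(range(n))
    t_mean, y_mean = fmean(t), fmean(y)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sum(
        (ti - t_mean) ** 2 for ti in t
    )
    intercept = y_mean - slope * t_mean
    trend = [intercept + slope * ti for ti in t]

    # 2. Remove the additive trend.
    detrended = [yi - tr for yi, tr in zip(y, trend)]

    # 3. Average the detrended values at each position within the period,
    #    then center the averages so the seasonal pattern has zero mean.
    phase_means = [fmean(detrended[p::period]) for p in range(period)]
    center = fmean(phase_means)
    pattern = [m - center for m in phase_means]

    # 4. Repeat the one-period pattern over the full length of the series.
    seasonal = [pattern[ti % period] for ti in t]

    # 5. Whatever is left after removing trend and seasonality is the residual.
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return trend, seasonal, residual


# Synthetic daily data: linear trend plus a zero-mean weekly pattern.
weekly = [3.0, -1.0, -2.0, 0.0, 1.0, -3.0, 2.0]
y = [2.0 + 0.5 * i + weekly[i % 7] for i in range(28)]
trend, seasonal, residual = decompose_additive(y, period=7)

# Removing trend and seasonality leaves the residual; adding them back
# restores the original signal.
rebuilt = [tr + s + r for tr, s, r in zip(trend, seasonal, residual)]
```

In this sketch, subtracting ``trend`` and ``seasonal`` from ``y`` plays the role that ``transform`` plays for the target, and adding them back plays the role of ``inverse_transform``.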