components ===================================== .. py:module:: evalml.pipelines.components .. autoapi-nested-parse:: EvalML component classes. Subpackages ----------- .. toctree:: :titlesonly: :maxdepth: 3 ensemble/index.rst estimators/index.rst transformers/index.rst Submodules ---------- .. toctree:: :titlesonly: :maxdepth: 1 component_base/index.rst component_base_meta/index.rst utils/index.rst Package Contents ---------------- Classes Summary ~~~~~~~~~~~~~~~ .. autoapisummary:: evalml.pipelines.components.ARIMARegressor evalml.pipelines.components.BaselineClassifier evalml.pipelines.components.BaselineRegressor evalml.pipelines.components.CatBoostClassifier evalml.pipelines.components.CatBoostRegressor evalml.pipelines.components.ComponentBase evalml.pipelines.components.ComponentBaseMeta evalml.pipelines.components.DateTimeFeaturizer evalml.pipelines.components.DecisionTreeClassifier evalml.pipelines.components.DecisionTreeRegressor evalml.pipelines.components.DFSTransformer evalml.pipelines.components.DropColumns evalml.pipelines.components.DropNaNRowsTransformer evalml.pipelines.components.DropNullColumns evalml.pipelines.components.DropRowsTransformer evalml.pipelines.components.ElasticNetClassifier evalml.pipelines.components.ElasticNetRegressor evalml.pipelines.components.EmailFeaturizer evalml.pipelines.components.Estimator evalml.pipelines.components.ExponentialSmoothingRegressor evalml.pipelines.components.ExtraTreesClassifier evalml.pipelines.components.ExtraTreesRegressor evalml.pipelines.components.FeatureSelector evalml.pipelines.components.Imputer evalml.pipelines.components.KNeighborsClassifier evalml.pipelines.components.LabelEncoder evalml.pipelines.components.LightGBMClassifier evalml.pipelines.components.LightGBMRegressor evalml.pipelines.components.LinearDiscriminantAnalysis evalml.pipelines.components.LinearRegressor evalml.pipelines.components.LogisticRegressionClassifier evalml.pipelines.components.LogTransformer evalml.pipelines.components.LSA evalml.pipelines.components.NaturalLanguageFeaturizer evalml.pipelines.components.OneHotEncoder evalml.pipelines.components.OrdinalEncoder evalml.pipelines.components.Oversampler evalml.pipelines.components.PCA evalml.pipelines.components.PerColumnImputer evalml.pipelines.components.PolynomialDecomposer evalml.pipelines.components.ProphetRegressor evalml.pipelines.components.RandomForestClassifier evalml.pipelines.components.RandomForestRegressor evalml.pipelines.components.ReplaceNullableTypes evalml.pipelines.components.RFClassifierRFESelector evalml.pipelines.components.RFClassifierSelectFromModel evalml.pipelines.components.RFRegressorRFESelector evalml.pipelines.components.RFRegressorSelectFromModel evalml.pipelines.components.SelectByType evalml.pipelines.components.SelectColumns evalml.pipelines.components.SimpleImputer evalml.pipelines.components.StackedEnsembleBase evalml.pipelines.components.StackedEnsembleClassifier evalml.pipelines.components.StackedEnsembleRegressor evalml.pipelines.components.StandardScaler evalml.pipelines.components.STLDecomposer evalml.pipelines.components.SVMClassifier evalml.pipelines.components.SVMRegressor evalml.pipelines.components.TargetEncoder evalml.pipelines.components.TargetImputer evalml.pipelines.components.TimeSeriesBaselineEstimator evalml.pipelines.components.TimeSeriesFeaturizer evalml.pipelines.components.TimeSeriesImputer evalml.pipelines.components.TimeSeriesRegularizer evalml.pipelines.components.Transformer evalml.pipelines.components.Undersampler evalml.pipelines.components.URLFeaturizer evalml.pipelines.components.VowpalWabbitBinaryClassifier evalml.pipelines.components.VowpalWabbitMulticlassClassifier evalml.pipelines.components.VowpalWabbitRegressor evalml.pipelines.components.XGBoostClassifier evalml.pipelines.components.XGBoostRegressor Contents ~~~~~~~~~~~~~~~~~~~ .. py:class:: ARIMARegressor(time_index: Optional[Hashable] = None, trend: Optional[str] = None, start_p: int = 2, d: int = 0, start_q: int = 2, max_p: int = 5, max_d: int = 2, max_q: int = 5, seasonal: bool = True, sp: int = 1, n_jobs: int = -1, random_seed: Union[int, float] = 0, maxiter: int = 10, use_covariates: bool = True, **kwargs) Autoregressive Integrated Moving Average Model. The three parameters (p, d, q) are the AR order, the degree of differencing, and the MA order. More information here: https://www.statsmodels.org/devel/generated/statsmodels.tsa.arima.model.ARIMA.html. Currently ARIMARegressor isn't supported via conda install. It's recommended that it be installed via PyPI. :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None. :type time_index: str :param trend: Controls the deterministic trend. Options are ['n', 'c', 't', 'ct'] where 'c' is a constant term, 't' indicates a linear trend, and 'ct' is both. Can also be an iterable when defining a polynomial, such as [1, 1, 0, 1]. :type trend: str :param start_p: Minimum Autoregressive order. Defaults to 2. :type start_p: int :param d: Minimum Differencing degree. Defaults to 0. :type d: int :param start_q: Minimum Moving Average order. Defaults to 2. :type start_q: int :param max_p: Maximum Autoregressive order. Defaults to 5. :type max_p: int :param max_d: Maximum Differencing degree. Defaults to 2. :type max_d: int :param max_q: Maximum Moving Average order. Defaults to 5. :type max_q: int :param seasonal: Whether to fit a seasonal model to ARIMA. Defaults to True. :type seasonal: boolean :param sp: Period for seasonal differencing, specifically the number of periods in each season. If "detect", this model will automatically detect this parameter (given the time series is a standard frequency) and will fall back to 1 (no seasonality) if it cannot be detected. Defaults to 1. :type sp: int or str :param n_jobs: Non-negative integer describing level of parallelism used for pipelines. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "start_p": Integer(1, 3), "d": Integer(0, 2), "start_q": Integer(1, 3), "max_p": Integer(3, 10), "max_d": Integer(2, 5), "max_q": Integer(3, 10), "seasonal": [True, False],} * - **max_cols** - 7 * - **max_rows** - 1000 * - **model_family** - ModelFamily.ARIMA * - **modifies_features** - True * - **modifies_target** - False * - **name** - ARIMA Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ARIMARegressor.clone evalml.pipelines.components.ARIMARegressor.default_parameters evalml.pipelines.components.ARIMARegressor.describe evalml.pipelines.components.ARIMARegressor.feature_importance evalml.pipelines.components.ARIMARegressor.fit evalml.pipelines.components.ARIMARegressor.get_prediction_intervals evalml.pipelines.components.ARIMARegressor.load evalml.pipelines.components.ARIMARegressor.needs_fitting evalml.pipelines.components.ARIMARegressor.parameters evalml.pipelines.components.ARIMARegressor.predict evalml.pipelines.components.ARIMARegressor.predict_proba evalml.pipelines.components.ARIMARegressor.save evalml.pipelines.components.ARIMARegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> numpy.ndarray :property: Returns array of 0's with a length of 1 as feature_importance is not defined for ARIMA regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits ARIMA regressor to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y was not passed in. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: pandas.Series = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted ARIMARegressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Optional. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Not used for ARIMA regressor. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) -> pandas.Series Make predictions using fitted ARIMA regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series :raises ValueError: If X was passed to `fit` but not passed in `predict`. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: BaselineClassifier(strategy='mode', random_seed=0, **kwargs) Classifier that predicts using the specified strategy. This is useful as a simple baseline classifier to compare with other classifiers. :param strategy: Method used to predict. Valid options are "mode", "random" and "random_weighted". Defaults to "mode". :type strategy: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Baseline Classifier * - **supported_problem_types** - [ProblemTypes.BINARY, ProblemTypes.MULTICLASS] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.BaselineClassifier.classes_ evalml.pipelines.components.BaselineClassifier.clone evalml.pipelines.components.BaselineClassifier.default_parameters evalml.pipelines.components.BaselineClassifier.describe evalml.pipelines.components.BaselineClassifier.feature_importance evalml.pipelines.components.BaselineClassifier.fit evalml.pipelines.components.BaselineClassifier.get_prediction_intervals evalml.pipelines.components.BaselineClassifier.load evalml.pipelines.components.BaselineClassifier.needs_fitting evalml.pipelines.components.BaselineClassifier.parameters evalml.pipelines.components.BaselineClassifier.predict evalml.pipelines.components.BaselineClassifier.predict_proba evalml.pipelines.components.BaselineClassifier.save evalml.pipelines.components.BaselineClassifier.update_parameters .. py:method:: classes_(self) :property: Returns class labels. Will return None before fitting. :returns: Class names :rtype: list[str] or list(float) .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline classifiers do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes :rtype: pd.Series .. py:method:: fit(self, X, y=None) Fits baseline classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make prediction probabilities using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: BaselineRegressor(strategy='mean', random_seed=0, **kwargs) Baseline regressor that uses a simple strategy to make predictions. This is useful as a simple baseline regressor to compare with other regressors. :param strategy: Method used to predict. Valid options are "mean", "median". Defaults to "mean". :type strategy: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Baseline Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.BaselineRegressor.clone evalml.pipelines.components.BaselineRegressor.default_parameters evalml.pipelines.components.BaselineRegressor.describe evalml.pipelines.components.BaselineRegressor.feature_importance evalml.pipelines.components.BaselineRegressor.fit evalml.pipelines.components.BaselineRegressor.get_prediction_intervals evalml.pipelines.components.BaselineRegressor.load evalml.pipelines.components.BaselineRegressor.needs_fitting evalml.pipelines.components.BaselineRegressor.parameters evalml.pipelines.components.BaselineRegressor.predict evalml.pipelines.components.BaselineRegressor.predict_proba evalml.pipelines.components.BaselineRegressor.save evalml.pipelines.components.BaselineRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline regressors do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes. :rtype: np.ndarray (float) .. py:method:: fit(self, X, y=None) Fits baseline regression component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input y is None. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the baseline regression strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: CatBoostClassifier(n_estimators=10, eta=0.03, max_depth=6, bootstrap_type=None, silent=True, allow_writing_files=False, random_seed=0, n_jobs=-1, **kwargs) CatBoost Classifier, a classifier that uses gradient-boosting on decision trees. CatBoost is an open-source library and natively supports categorical features. For more information, check out https://catboost.ai/ :param n_estimators: The maximum number of trees to build. Defaults to 10. :type n_estimators: float :param eta: The learning rate. Defaults to 0.03. :type eta: float :param max_depth: The maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param bootstrap_type: Defines the method for sampling the weights of objects. Available methods are 'Bayesian', 'Bernoulli', 'MVS'. Defaults to None. :type bootstrap_type: string :param silent: Whether to use the "silent" logging mode. Defaults to True. :type silent: boolean :param allow_writing_files: Whether to allow writing snapshot files while training. Defaults to False. :type allow_writing_files: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(4, 100), "eta": Real(0.000001, 1), "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.CATBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - CatBoost Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.CatBoostClassifier.clone evalml.pipelines.components.CatBoostClassifier.default_parameters evalml.pipelines.components.CatBoostClassifier.describe evalml.pipelines.components.CatBoostClassifier.feature_importance evalml.pipelines.components.CatBoostClassifier.fit evalml.pipelines.components.CatBoostClassifier.get_prediction_intervals evalml.pipelines.components.CatBoostClassifier.load evalml.pipelines.components.CatBoostClassifier.needs_fitting evalml.pipelines.components.CatBoostClassifier.parameters evalml.pipelines.components.CatBoostClassifier.predict evalml.pipelines.components.CatBoostClassifier.predict_proba evalml.pipelines.components.CatBoostClassifier.save evalml.pipelines.components.CatBoostClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted CatBoost classifier. .. py:method:: fit(self, X, y=None) Fits CatBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make prediction probabilities using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: CatBoostRegressor(n_estimators=10, eta=0.03, max_depth=6, bootstrap_type=None, silent=False, allow_writing_files=False, random_seed=0, n_jobs=-1, **kwargs) CatBoost Regressor, a regressor that uses gradient-boosting on decision trees. CatBoost is an open-source library and natively supports categorical features. For more information, check out https://catboost.ai/ :param n_estimators: The maximum number of trees to build. Defaults to 10. :type n_estimators: float :param eta: The learning rate. Defaults to 0.03. :type eta: float :param max_depth: The maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param bootstrap_type: Defines the method for sampling the weights of objects. Available methods are 'Bayesian', 'Bernoulli', 'MVS'. Defaults to None. :type bootstrap_type: string :param silent: Whether to use the "silent" logging mode. Defaults to True. :type silent: boolean :param allow_writing_files: Whether to allow writing snapshot files while training. Defaults to False. :type allow_writing_files: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(4, 100), "eta": Real(0.000001, 1), "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.CATBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - CatBoost Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.CatBoostRegressor.clone evalml.pipelines.components.CatBoostRegressor.default_parameters evalml.pipelines.components.CatBoostRegressor.describe evalml.pipelines.components.CatBoostRegressor.feature_importance evalml.pipelines.components.CatBoostRegressor.fit evalml.pipelines.components.CatBoostRegressor.get_prediction_intervals evalml.pipelines.components.CatBoostRegressor.load evalml.pipelines.components.CatBoostRegressor.needs_fitting evalml.pipelines.components.CatBoostRegressor.parameters evalml.pipelines.components.CatBoostRegressor.predict evalml.pipelines.components.CatBoostRegressor.predict_proba evalml.pipelines.components.CatBoostRegressor.save evalml.pipelines.components.CatBoostRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted CatBoost regressor. .. py:method:: fit(self, X, y=None) Fits CatBoost regressor component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted CatBoost regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ComponentBase(parameters=None, component_obj=None, random_seed=0, **kwargs) Base class for all components. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ComponentBase.clone evalml.pipelines.components.ComponentBase.default_parameters evalml.pipelines.components.ComponentBase.describe evalml.pipelines.components.ComponentBase.fit evalml.pipelines.components.ComponentBase.load evalml.pipelines.components.ComponentBase.modifies_features evalml.pipelines.components.ComponentBase.modifies_target evalml.pipelines.components.ComponentBase.name evalml.pipelines.components.ComponentBase.needs_fitting evalml.pipelines.components.ComponentBase.parameters evalml.pipelines.components.ComponentBase.save evalml.pipelines.components.ComponentBase.training_only evalml.pipelines.components.ComponentBase.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: modifies_features(cls) :property: Returns whether this component modifies (subsets or transforms) the features variable during transform. For Estimator objects, this attribute determines if the return value from `predict` or `predict_proba` should be used as features or targets. .. py:method:: modifies_target(cls) :property: Returns whether this component modifies (subsets or transforms) the target variable during transform. For Estimator objects, this attribute determines if the return value from `predict` or `predict_proba` should be used as features or targets. .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: training_only(cls) :property: Returns whether or not this component should be evaluated during training-time only, or during both training and prediction time. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ComponentBaseMeta Metaclass that overrides creating a new component by wrapping methods with validators and setters. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **FIT_METHODS** - ['fit', 'fit_transform'] * - **METHODS_TO_CHECK** - ['predict', 'predict_proba', 'transform', 'inverse_transform', 'get_trend_dataframe'] * - **PROPERTIES_TO_CHECK** - ['feature_importance'] **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ComponentBaseMeta.check_for_fit evalml.pipelines.components.ComponentBaseMeta.register evalml.pipelines.components.ComponentBaseMeta.set_fit .. py:method:: check_for_fit(cls, method) :classmethod: `check_for_fit` wraps a method that validates if `self._is_fitted` is `True`. It raises an exception if `False` and calls and returns the wrapped method if `True`. :param method: Method to wrap. :type method: callable :returns: The wrapped method. :raises ComponentNotYetFittedError: If component is not yet fitted. .. py:method:: register(cls, subclass) Register a virtual subclass of an ABC. Returns the subclass, to allow usage as a class decorator. .. py:method:: set_fit(cls, method) :classmethod: Wrapper for the fit method. .. py:class:: DateTimeFeaturizer(features_to_extract=None, encode_as_categories=False, time_index=None, random_seed=0, **kwargs) Transformer that can automatically extract features from datetime columns. :param features_to_extract: List of features to extract. Valid options include "year", "month", "day_of_week", "hour". Defaults to None. :type features_to_extract: list :param encode_as_categories: Whether day-of-week and month features should be encoded as pandas "category" dtype. This allows OneHotEncoders to encode these features. Defaults to False. :type encode_as_categories: bool :param time_index: Name of the column containing the datetime information used to order the data. Ignored. :type time_index: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - DateTime Featurizer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DateTimeFeaturizer.clone evalml.pipelines.components.DateTimeFeaturizer.default_parameters evalml.pipelines.components.DateTimeFeaturizer.describe evalml.pipelines.components.DateTimeFeaturizer.fit evalml.pipelines.components.DateTimeFeaturizer.fit_transform evalml.pipelines.components.DateTimeFeaturizer.get_feature_names evalml.pipelines.components.DateTimeFeaturizer.load evalml.pipelines.components.DateTimeFeaturizer.needs_fitting evalml.pipelines.components.DateTimeFeaturizer.parameters evalml.pipelines.components.DateTimeFeaturizer.save evalml.pipelines.components.DateTimeFeaturizer.transform evalml.pipelines.components.DateTimeFeaturizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fit the datetime featurizer component. :param X: Input features. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: get_feature_names(self) Gets the categories of each datetime feature. :returns: Dictionary, where each key-value pair is a column name and a dictionary mapping the unique feature values to their integer encoding. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by creating new features using existing DateTime columns, and then dropping those DateTime columns. :param X: Input features. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DecisionTreeClassifier(criterion='gini', max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, random_seed=0, **kwargs) Decision Tree Classifier. :param criterion: The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. Defaults to "gini". :type criterion: {"gini", "entropy"} :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "criterion": ["gini", "entropy"], "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.DECISION_TREE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Decision Tree Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DecisionTreeClassifier.clone evalml.pipelines.components.DecisionTreeClassifier.default_parameters evalml.pipelines.components.DecisionTreeClassifier.describe evalml.pipelines.components.DecisionTreeClassifier.feature_importance evalml.pipelines.components.DecisionTreeClassifier.fit evalml.pipelines.components.DecisionTreeClassifier.get_prediction_intervals evalml.pipelines.components.DecisionTreeClassifier.load evalml.pipelines.components.DecisionTreeClassifier.needs_fitting evalml.pipelines.components.DecisionTreeClassifier.parameters evalml.pipelines.components.DecisionTreeClassifier.predict evalml.pipelines.components.DecisionTreeClassifier.predict_proba evalml.pipelines.components.DecisionTreeClassifier.save evalml.pipelines.components.DecisionTreeClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DecisionTreeRegressor(criterion='squared_error', max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, random_seed=0, **kwargs) Decision Tree Regressor. :param criterion: The function to measure the quality of a split. Supported criteria are: - "squared_error" for the mean squared error, which is equal to variance reduction as feature selection criterion and minimizes the L2 loss using the mean of each terminal node - "friedman_mse", which uses mean squared error with Friedman"s improvement score for potential splits - "absolute_error" for the mean absolute error, which minimizes the L1 loss using the median of each terminal node, - "poisson" which uses reduction in Poisson deviance to find splits. :type criterion: {"squared_error", "friedman_mse", "absolute_error", "poisson"} :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "criterion": ["squared_error", "friedman_mse", "absolute_error"], "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.DECISION_TREE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Decision Tree Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DecisionTreeRegressor.clone evalml.pipelines.components.DecisionTreeRegressor.default_parameters evalml.pipelines.components.DecisionTreeRegressor.describe evalml.pipelines.components.DecisionTreeRegressor.feature_importance evalml.pipelines.components.DecisionTreeRegressor.fit evalml.pipelines.components.DecisionTreeRegressor.get_prediction_intervals evalml.pipelines.components.DecisionTreeRegressor.load evalml.pipelines.components.DecisionTreeRegressor.needs_fitting evalml.pipelines.components.DecisionTreeRegressor.parameters evalml.pipelines.components.DecisionTreeRegressor.predict evalml.pipelines.components.DecisionTreeRegressor.predict_proba evalml.pipelines.components.DecisionTreeRegressor.save evalml.pipelines.components.DecisionTreeRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DFSTransformer(index='index', features=None, random_seed=0, **kwargs) Featuretools DFS component that generates features for the input features. :param index: The name of the column that contains the indices. If no column with this name exists, then featuretools.EntitySet() creates a column with this name to serve as the index column. Defaults to 'index'. :type index: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param features: List of features to run DFS on. Defaults to None. Features will only be computed if the columns used by the feature exist in the input and if the feature itself is not in input. If features is an empty list, no transformation will occur to inputted data. :type features: list **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - DFS Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DFSTransformer.clone evalml.pipelines.components.DFSTransformer.contains_pre_existing_features evalml.pipelines.components.DFSTransformer.default_parameters evalml.pipelines.components.DFSTransformer.describe evalml.pipelines.components.DFSTransformer.fit evalml.pipelines.components.DFSTransformer.fit_transform evalml.pipelines.components.DFSTransformer.load evalml.pipelines.components.DFSTransformer.needs_fitting evalml.pipelines.components.DFSTransformer.parameters evalml.pipelines.components.DFSTransformer.save evalml.pipelines.components.DFSTransformer.transform evalml.pipelines.components.DFSTransformer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: contains_pre_existing_features(dfs_features: Optional[List[featuretools.feature_base.FeatureBase]], input_feature_names: List[str], target: Optional[str] = None) :staticmethod: Determines whether or not features from a DFS Transformer match pipeline input features. :param dfs_features: List of features output from a DFS Transformer. :type dfs_features: Optional[List[FeatureBase]] :param input_feature_names: List of input features into the DFS Transformer. :type input_feature_names: List[str] :param target: The target whose values we are trying to predict. This is used to know which column to ignore if the target column is present in the list of features in the DFS Transformer's parameters. :type target: Optional[str] .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the DFSTransformer Transformer component. :param X: The input data to transform, of shape [n_samples, n_features]. :type X: pd.DataFrame, np.array :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Computes the feature matrix for the input X using featuretools' dfs algorithm. :param X: The input training data to transform. Has shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: Ignored. :type y: pd.Series, optional :returns: Feature matrix :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DropColumns(columns=None, random_seed=0, **kwargs) Drops specified columns in input data. :param columns: List of column names, used to determine which columns to drop. :type columns: list(string) :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Drop Columns Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropColumns.clone evalml.pipelines.components.DropColumns.default_parameters evalml.pipelines.components.DropColumns.describe evalml.pipelines.components.DropColumns.fit evalml.pipelines.components.DropColumns.fit_transform evalml.pipelines.components.DropColumns.load evalml.pipelines.components.DropColumns.parameters evalml.pipelines.components.DropColumns.save evalml.pipelines.components.DropColumns.transform evalml.pipelines.components.DropColumns.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, ignored :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by dropping columns. :param X: Data to transform. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: Transformed X. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DropNaNRowsTransformer(parameters=None, component_obj=None, random_seed=0, **kwargs) Transformer to drop rows with NaN values. :param random_seed: Seed for the random number generator. Is not used by this component. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Drop NaN Rows Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropNaNRowsTransformer.clone evalml.pipelines.components.DropNaNRowsTransformer.default_parameters evalml.pipelines.components.DropNaNRowsTransformer.describe evalml.pipelines.components.DropNaNRowsTransformer.fit evalml.pipelines.components.DropNaNRowsTransformer.fit_transform evalml.pipelines.components.DropNaNRowsTransformer.load evalml.pipelines.components.DropNaNRowsTransformer.needs_fitting evalml.pipelines.components.DropNaNRowsTransformer.parameters evalml.pipelines.components.DropNaNRowsTransformer.save evalml.pipelines.components.DropNaNRowsTransformer.transform evalml.pipelines.components.DropNaNRowsTransformer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data using fitted component. :param X: Features. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Data with NaN rows dropped. :rtype: (pd.DataFrame, pd.Series) .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DropNullColumns(pct_null_threshold=1.0, random_seed=0, **kwargs) Transformer to drop features whose percentage of NaN values exceeds a specified threshold. :param pct_null_threshold: The percentage of NaN values in an input feature to drop. Must be a value between [0, 1] inclusive. If equal to 0.0, will drop columns with any null values. If equal to 1.0, will drop columns with all null values. Defaults to 0.95. :type pct_null_threshold: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Drop Null Columns Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropNullColumns.clone evalml.pipelines.components.DropNullColumns.default_parameters evalml.pipelines.components.DropNullColumns.describe evalml.pipelines.components.DropNullColumns.fit evalml.pipelines.components.DropNullColumns.fit_transform evalml.pipelines.components.DropNullColumns.load evalml.pipelines.components.DropNullColumns.needs_fitting evalml.pipelines.components.DropNullColumns.parameters evalml.pipelines.components.DropNullColumns.save evalml.pipelines.components.DropNullColumns.transform evalml.pipelines.components.DropNullColumns.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by dropping columns that exceed the threshold of null values. :param X: Data to transform :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DropRowsTransformer(indices_to_drop=None, random_seed=0) Transformer to drop rows specified by row indices. :param indices_to_drop: List of indices to drop in the input data. Defaults to None. :type indices_to_drop: list :param random_seed: Seed for the random number generator. Is not used by this component. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Drop Rows Transformer * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropRowsTransformer.clone evalml.pipelines.components.DropRowsTransformer.default_parameters evalml.pipelines.components.DropRowsTransformer.describe evalml.pipelines.components.DropRowsTransformer.fit evalml.pipelines.components.DropRowsTransformer.fit_transform evalml.pipelines.components.DropRowsTransformer.load evalml.pipelines.components.DropRowsTransformer.needs_fitting evalml.pipelines.components.DropRowsTransformer.parameters evalml.pipelines.components.DropRowsTransformer.save evalml.pipelines.components.DropRowsTransformer.transform evalml.pipelines.components.DropRowsTransformer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If indices to drop do not exist in input features or target. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data using fitted component. :param X: Features. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Data with row indices dropped. :rtype: (pd.DataFrame, pd.Series) .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ElasticNetClassifier(penalty='elasticnet', C=1.0, l1_ratio=0.15, multi_class='auto', solver='saga', n_jobs=-1, random_seed=0, **kwargs) Elastic Net Classifier. Uses Logistic Regression with elasticnet penalty as the base estimator. :param penalty: The norm used in penalization. Defaults to "elasticnet". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param l1_ratio: The mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio <1, the penalty is a combination of L1 and L2. Defaults to 0.15. :type l1_ratio: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. "multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "saga". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0.01, 10), "l1_ratio": Real(0, 1)} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Elastic Net Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ElasticNetClassifier.clone evalml.pipelines.components.ElasticNetClassifier.default_parameters evalml.pipelines.components.ElasticNetClassifier.describe evalml.pipelines.components.ElasticNetClassifier.feature_importance evalml.pipelines.components.ElasticNetClassifier.fit evalml.pipelines.components.ElasticNetClassifier.get_prediction_intervals evalml.pipelines.components.ElasticNetClassifier.load evalml.pipelines.components.ElasticNetClassifier.needs_fitting evalml.pipelines.components.ElasticNetClassifier.parameters evalml.pipelines.components.ElasticNetClassifier.predict evalml.pipelines.components.ElasticNetClassifier.predict_proba evalml.pipelines.components.ElasticNetClassifier.save evalml.pipelines.components.ElasticNetClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted ElasticNet classifier. .. py:method:: fit(self, X, y) Fits ElasticNet classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ElasticNetRegressor(alpha=0.0001, l1_ratio=0.15, max_iter=1000, random_seed=0, **kwargs) Elastic Net Regressor. :param alpha: Constant that multiplies the penalty terms. Defaults to 0.0001. :type alpha: float :param l1_ratio: The mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio <1, the penalty is a combination of L1 and L2. Defaults to 0.15. :type l1_ratio: float :param max_iter: The maximum number of iterations. Defaults to 1000. :type max_iter: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "alpha": Real(0, 1), "l1_ratio": Real(0, 1),} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Elastic Net Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ElasticNetRegressor.clone evalml.pipelines.components.ElasticNetRegressor.default_parameters evalml.pipelines.components.ElasticNetRegressor.describe evalml.pipelines.components.ElasticNetRegressor.feature_importance evalml.pipelines.components.ElasticNetRegressor.fit evalml.pipelines.components.ElasticNetRegressor.get_prediction_intervals evalml.pipelines.components.ElasticNetRegressor.load evalml.pipelines.components.ElasticNetRegressor.needs_fitting evalml.pipelines.components.ElasticNetRegressor.parameters evalml.pipelines.components.ElasticNetRegressor.predict evalml.pipelines.components.ElasticNetRegressor.predict_proba evalml.pipelines.components.ElasticNetRegressor.save evalml.pipelines.components.ElasticNetRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted ElasticNet regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: EmailFeaturizer(random_seed=0, **kwargs) Transformer that can automatically extract features from emails. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Email Featurizer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.EmailFeaturizer.clone evalml.pipelines.components.EmailFeaturizer.default_parameters evalml.pipelines.components.EmailFeaturizer.describe evalml.pipelines.components.EmailFeaturizer.fit evalml.pipelines.components.EmailFeaturizer.fit_transform evalml.pipelines.components.EmailFeaturizer.load evalml.pipelines.components.EmailFeaturizer.needs_fitting evalml.pipelines.components.EmailFeaturizer.parameters evalml.pipelines.components.EmailFeaturizer.save evalml.pipelines.components.EmailFeaturizer.transform evalml.pipelines.components.EmailFeaturizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: Estimator(parameters: dict = None, component_obj: Type[evalml.pipelines.components.ComponentBase] = None, random_seed: Union[int, float] = 0, **kwargs) A component that fits and predicts given data. To implement a new Estimator, define your own class which is a subclass of Estimator, including a name and a list of acceptable ranges for any parameters to be tuned during the automl search (hyperparameters). Define an `__init__` method which sets up any necessary state and objects. Make sure your `__init__` only uses standard keyword arguments and calls `super().__init__()` with a parameters dict. You may also override the `fit`, `transform`, `fit_transform` and other methods in this class if appropriate. To see some examples, check out the definitions of any Estimator component subclass. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **model_family** - ModelFamily.NONE * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Estimator.clone evalml.pipelines.components.Estimator.default_parameters evalml.pipelines.components.Estimator.describe evalml.pipelines.components.Estimator.feature_importance evalml.pipelines.components.Estimator.fit evalml.pipelines.components.Estimator.get_prediction_intervals evalml.pipelines.components.Estimator.load evalml.pipelines.components.Estimator.model_family evalml.pipelines.components.Estimator.name evalml.pipelines.components.Estimator.needs_fitting evalml.pipelines.components.Estimator.parameters evalml.pipelines.components.Estimator.predict evalml.pipelines.components.Estimator.predict_proba evalml.pipelines.components.Estimator.save evalml.pipelines.components.Estimator.supported_problem_types evalml.pipelines.components.Estimator.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: model_family(cls) :property: Returns ModelFamily of this component. .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: supported_problem_types(cls) :property: Problem types this estimator supports. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ExponentialSmoothingRegressor(trend: Optional[str] = None, damped_trend: bool = False, seasonal: Optional[str] = None, sp: int = 2, n_jobs: int = -1, random_seed: Union[int, float] = 0, **kwargs) Holt-Winters Exponential Smoothing Forecaster. Currently ExponentialSmoothingRegressor isn't supported via conda install. It's recommended that it be installed via PyPI. :param trend: Type of trend component. Defaults to None. :type trend: str :param damped_trend: If the trend component should be damped. Defaults to False. :type damped_trend: bool :param seasonal: Type of seasonal component. Takes one of {“additive”, None}. Can also be multiplicative if :type seasonal: str :param none of the target data is 0: :param but AutoMLSearch wiill not tune for this. Defaults to None.: :param sp: The number of seasonal periods to consider. Defaults to 2. :type sp: int :param n_jobs: Non-negative integer describing level of parallelism used for pipelines. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "trend": [None, "additive"], "damped_trend": [True, False], "seasonal": [None, "additive"], "sp": Integer(2, 8),} * - **model_family** - ModelFamily.EXPONENTIAL_SMOOTHING * - **modifies_features** - True * - **modifies_target** - False * - **name** - Exponential Smoothing Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExponentialSmoothingRegressor.clone evalml.pipelines.components.ExponentialSmoothingRegressor.default_parameters evalml.pipelines.components.ExponentialSmoothingRegressor.describe evalml.pipelines.components.ExponentialSmoothingRegressor.feature_importance evalml.pipelines.components.ExponentialSmoothingRegressor.fit evalml.pipelines.components.ExponentialSmoothingRegressor.get_prediction_intervals evalml.pipelines.components.ExponentialSmoothingRegressor.load evalml.pipelines.components.ExponentialSmoothingRegressor.needs_fitting evalml.pipelines.components.ExponentialSmoothingRegressor.parameters evalml.pipelines.components.ExponentialSmoothingRegressor.predict evalml.pipelines.components.ExponentialSmoothingRegressor.predict_proba evalml.pipelines.components.ExponentialSmoothingRegressor.save evalml.pipelines.components.ExponentialSmoothingRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns array of 0's with a length of 1 as feature_importance is not defined for Exponential Smoothing regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits Exponential Smoothing Regressor to data. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y was not passed in. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted ExponentialSmoothingRegressor. Calculates the prediction intervals by using a simulation of the time series following a specified state space model. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Optional. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: List[float] :param predictions: Not used for Exponential Smoothing regressor. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) -> pandas.Series Make predictions using fitted Exponential Smoothing regressor. :param X: Data of shape [n_samples, n_features]. Ignored except to set forecast horizon. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ExtraTreesClassifier(n_estimators=100, max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, n_jobs=-1, random_seed=0, **kwargs) Extra Trees Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. :type min_samples_split: int or float :param Defaults to 2.: :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.EXTRA_TREES * - **modifies_features** - True * - **modifies_target** - False * - **name** - Extra Trees Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExtraTreesClassifier.clone evalml.pipelines.components.ExtraTreesClassifier.default_parameters evalml.pipelines.components.ExtraTreesClassifier.describe evalml.pipelines.components.ExtraTreesClassifier.feature_importance evalml.pipelines.components.ExtraTreesClassifier.fit evalml.pipelines.components.ExtraTreesClassifier.get_prediction_intervals evalml.pipelines.components.ExtraTreesClassifier.load evalml.pipelines.components.ExtraTreesClassifier.needs_fitting evalml.pipelines.components.ExtraTreesClassifier.parameters evalml.pipelines.components.ExtraTreesClassifier.predict evalml.pipelines.components.ExtraTreesClassifier.predict_proba evalml.pipelines.components.ExtraTreesClassifier.save evalml.pipelines.components.ExtraTreesClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ExtraTreesRegressor(n_estimators: int = 100, max_features: str = 'auto', max_depth: int = 6, min_samples_split: int = 2, min_weight_fraction_leaf: float = 0.0, n_jobs: int = -1, random_seed: Union[int, float] = 0, **kwargs) Extra Trees Regressor. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. :type min_samples_split: int or float :param Defaults to 2.: :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.EXTRA_TREES * - **modifies_features** - True * - **modifies_target** - False * - **name** - Extra Trees Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExtraTreesRegressor.clone evalml.pipelines.components.ExtraTreesRegressor.default_parameters evalml.pipelines.components.ExtraTreesRegressor.describe evalml.pipelines.components.ExtraTreesRegressor.feature_importance evalml.pipelines.components.ExtraTreesRegressor.fit evalml.pipelines.components.ExtraTreesRegressor.get_prediction_intervals evalml.pipelines.components.ExtraTreesRegressor.load evalml.pipelines.components.ExtraTreesRegressor.needs_fitting evalml.pipelines.components.ExtraTreesRegressor.parameters evalml.pipelines.components.ExtraTreesRegressor.predict evalml.pipelines.components.ExtraTreesRegressor.predict_proba evalml.pipelines.components.ExtraTreesRegressor.save evalml.pipelines.components.ExtraTreesRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted ExtraTreesRegressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Optional. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: FeatureSelector(parameters=None, component_obj=None, random_seed=0, **kwargs) Selects top features based on importance weights. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.FeatureSelector.clone evalml.pipelines.components.FeatureSelector.default_parameters evalml.pipelines.components.FeatureSelector.describe evalml.pipelines.components.FeatureSelector.fit evalml.pipelines.components.FeatureSelector.fit_transform evalml.pipelines.components.FeatureSelector.get_names evalml.pipelines.components.FeatureSelector.load evalml.pipelines.components.FeatureSelector.name evalml.pipelines.components.FeatureSelector.needs_fitting evalml.pipelines.components.FeatureSelector.parameters evalml.pipelines.components.FeatureSelector.save evalml.pipelines.components.FeatureSelector.transform evalml.pipelines.components.FeatureSelector.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise an MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: Imputer(categorical_impute_strategy='most_frequent', categorical_fill_value=None, numeric_impute_strategy='mean', numeric_fill_value=None, boolean_impute_strategy='most_frequent', boolean_fill_value=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy. :param categorical_impute_strategy: Impute strategy to use for string, object, boolean, categorical dtypes. Valid values include "most_frequent" and "constant". :type categorical_impute_strategy: string :param numeric_impute_strategy: Impute strategy to use for numeric columns. Valid values include "mean", "median", "most_frequent", and "constant". :type numeric_impute_strategy: string :param boolean_impute_strategy: Impute strategy to use for boolean columns. Valid values include "most_frequent" and "constant". :type boolean_impute_strategy: string :param categorical_fill_value: When categorical_impute_strategy == "constant", fill_value is used to replace missing data. The default value of None will fill with the string "missing_value". :type categorical_fill_value: string :param numeric_fill_value: When numeric_impute_strategy == "constant", fill_value is used to replace missing data. The default value of None will fill with 0. :type numeric_fill_value: int, float :param boolean_fill_value: When boolean_impute_strategy == "constant", fill_value is used to replace missing data. The default value of None will fill with True. :type boolean_fill_value: bool :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "categorical_impute_strategy": ["most_frequent"], "numeric_impute_strategy": ["mean", "median", "most_frequent", "knn"], "boolean_impute_strategy": ["most_frequent"]} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Imputer.clone evalml.pipelines.components.Imputer.default_parameters evalml.pipelines.components.Imputer.describe evalml.pipelines.components.Imputer.fit evalml.pipelines.components.Imputer.fit_transform evalml.pipelines.components.Imputer.load evalml.pipelines.components.Imputer.needs_fitting evalml.pipelines.components.Imputer.parameters evalml.pipelines.components.Imputer.save evalml.pipelines.components.Imputer.transform evalml.pipelines.components.Imputer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame, np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by imputing missing values. :param X: Data to transform :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, random_seed=0, **kwargs) K-Nearest Neighbors Classifier. :param n_neighbors: Number of neighbors to use by default. Defaults to 5. :type n_neighbors: int :param weights: Weight function used in prediction. Can be: - ‘uniform’ : uniform weights. All points in each neighborhood are weighted equally. - ‘distance’ : weight points by the inverse of their distance. in this case, closer neighbors of a query point will have a greater influence than neighbors which are further away. - [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights. Defaults to "uniform". :type weights: {‘uniform’, ‘distance’} or callable :param algorithm: Algorithm used to compute the nearest neighbors: - ‘ball_tree’ will use BallTree - ‘kd_tree’ will use KDTree - ‘brute’ will use a brute-force search. ‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to fit method. Defaults to "auto". Note: fitting on sparse input will override the setting of this parameter, using brute force. :type algorithm: {‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’} :param leaf_size: Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem. Defaults to 30. :type leaf_size: int :param p: Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used. Defaults to 2. :type p: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_neighbors": Integer(2, 12), "weights": ["uniform", "distance"], "algorithm": ["auto", "ball_tree", "kd_tree", "brute"], "leaf_size": Integer(10, 30), "p": Integer(1, 5),} * - **model_family** - ModelFamily.K_NEIGHBORS * - **modifies_features** - True * - **modifies_target** - False * - **name** - KNN Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.KNeighborsClassifier.clone evalml.pipelines.components.KNeighborsClassifier.default_parameters evalml.pipelines.components.KNeighborsClassifier.describe evalml.pipelines.components.KNeighborsClassifier.feature_importance evalml.pipelines.components.KNeighborsClassifier.fit evalml.pipelines.components.KNeighborsClassifier.get_prediction_intervals evalml.pipelines.components.KNeighborsClassifier.load evalml.pipelines.components.KNeighborsClassifier.needs_fitting evalml.pipelines.components.KNeighborsClassifier.parameters evalml.pipelines.components.KNeighborsClassifier.predict evalml.pipelines.components.KNeighborsClassifier.predict_proba evalml.pipelines.components.KNeighborsClassifier.save evalml.pipelines.components.KNeighborsClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns array of 0's matching the input number of features as feature_importance is not defined for KNN classifiers. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LabelEncoder(positive_label=None, random_seed=0, **kwargs) A transformer that encodes target labels using values between 0 and num_classes - 1. :param positive_label: The label for the class that should be treated as positive (1) for binary classification problems. Ignored for multiclass problems. Defaults to None. :type positive_label: int, str :param random_seed: Seed for the random number generator. Defaults to 0. Ignored. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Label Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LabelEncoder.clone evalml.pipelines.components.LabelEncoder.default_parameters evalml.pipelines.components.LabelEncoder.describe evalml.pipelines.components.LabelEncoder.fit evalml.pipelines.components.LabelEncoder.fit_transform evalml.pipelines.components.LabelEncoder.inverse_transform evalml.pipelines.components.LabelEncoder.load evalml.pipelines.components.LabelEncoder.needs_fitting evalml.pipelines.components.LabelEncoder.parameters evalml.pipelines.components.LabelEncoder.save evalml.pipelines.components.LabelEncoder.transform evalml.pipelines.components.LabelEncoder.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the label encoder. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input `y` is None. .. py:method:: fit_transform(self, X, y) Fit and transform data using the label encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: The original features and an encoded version of the target. :rtype: pd.DataFrame, pd.Series .. py:method:: inverse_transform(self, y) Decodes the target data. :param y: Target data. :type y: pd.Series :returns: The decoded version of the target. :rtype: pd.Series :raises ValueError: If input `y` is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform the target using the fitted label encoder. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: The original features and an encoded version of the target. :rtype: pd.DataFrame, pd.Series :raises ValueError: If input `y` is None. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LightGBMClassifier(boosting_type='gbdt', learning_rate=0.1, n_estimators=100, max_depth=0, num_leaves=31, min_child_samples=20, bagging_fraction=0.9, bagging_freq=0, n_jobs=-1, random_seed=0, **kwargs) LightGBM Classifier. :param boosting_type: Type of boosting to use. Defaults to "gbdt". - 'gbdt' uses traditional Gradient Boosting Decision Tree - "dart", uses Dropouts meet Multiple Additive Regression Trees - "goss", uses Gradient-based One-Side Sampling - "rf", uses Random Forest :type boosting_type: string :param learning_rate: Boosting learning rate. Defaults to 0.1. :type learning_rate: float :param n_estimators: Number of boosted trees to fit. Defaults to 100. :type n_estimators: int :param max_depth: Maximum tree depth for base learners, <=0 means no limit. Defaults to 0. :type max_depth: int :param num_leaves: Maximum tree leaves for base learners. Defaults to 31. :type num_leaves: int :param min_child_samples: Minimum number of data needed in a child (leaf). Defaults to 20. :type min_child_samples: int :param bagging_fraction: LightGBM will randomly select a subset of features on each iteration (tree) without resampling if this is smaller than 1.0. For example, if set to 0.8, LightGBM will select 80% of features before training each tree. This can be used to speed up training and deal with overfitting. Defaults to 0.9. :type bagging_fraction: float :param bagging_freq: Frequency for bagging. 0 means bagging is disabled. k means perform bagging at every k iteration. Every k-th iteration, LightGBM will randomly select bagging_fraction * 100 % of the data to use for the next k iterations. Defaults to 0. :type bagging_freq: int :param n_jobs: Number of threads to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "learning_rate": Real(0.000001, 1), "boosting_type": ["gbdt", "dart", "goss", "rf"], "n_estimators": Integer(10, 100), "max_depth": Integer(0, 10), "num_leaves": Integer(2, 100), "min_child_samples": Integer(1, 100), "bagging_fraction": Real(0.000001, 1), "bagging_freq": Integer(0, 1),} * - **model_family** - ModelFamily.LIGHTGBM * - **modifies_features** - True * - **modifies_target** - False * - **name** - LightGBM Classifier * - **SEED_MAX** - SEED_BOUNDS.max_bound * - **SEED_MIN** - 0 * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LightGBMClassifier.clone evalml.pipelines.components.LightGBMClassifier.default_parameters evalml.pipelines.components.LightGBMClassifier.describe evalml.pipelines.components.LightGBMClassifier.feature_importance evalml.pipelines.components.LightGBMClassifier.fit evalml.pipelines.components.LightGBMClassifier.get_prediction_intervals evalml.pipelines.components.LightGBMClassifier.load evalml.pipelines.components.LightGBMClassifier.needs_fitting evalml.pipelines.components.LightGBMClassifier.parameters evalml.pipelines.components.LightGBMClassifier.predict evalml.pipelines.components.LightGBMClassifier.predict_proba evalml.pipelines.components.LightGBMClassifier.save evalml.pipelines.components.LightGBMClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits LightGBM classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted LightGBM classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make prediction probabilities using the fitted LightGBM classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LightGBMRegressor(boosting_type='gbdt', learning_rate=0.1, n_estimators=20, max_depth=0, num_leaves=31, min_child_samples=20, bagging_fraction=0.9, bagging_freq=0, n_jobs=-1, random_seed=0, **kwargs) LightGBM Regressor. :param boosting_type: Type of boosting to use. Defaults to "gbdt". - 'gbdt' uses traditional Gradient Boosting Decision Tree - "dart", uses Dropouts meet Multiple Additive Regression Trees - "goss", uses Gradient-based One-Side Sampling - "rf", uses Random Forest :type boosting_type: string :param learning_rate: Boosting learning rate. Defaults to 0.1. :type learning_rate: float :param n_estimators: Number of boosted trees to fit. Defaults to 100. :type n_estimators: int :param max_depth: Maximum tree depth for base learners, <=0 means no limit. Defaults to 0. :type max_depth: int :param num_leaves: Maximum tree leaves for base learners. Defaults to 31. :type num_leaves: int :param min_child_samples: Minimum number of data needed in a child (leaf). Defaults to 20. :type min_child_samples: int :param bagging_fraction: LightGBM will randomly select a subset of features on each iteration (tree) without resampling if this is smaller than 1.0. For example, if set to 0.8, LightGBM will select 80% of features before training each tree. This can be used to speed up training and deal with overfitting. Defaults to 0.9. :type bagging_fraction: float :param bagging_freq: Frequency for bagging. 0 means bagging is disabled. k means perform bagging at every k iteration. Every k-th iteration, LightGBM will randomly select bagging_fraction * 100 % of the data to use for the next k iterations. Defaults to 0. :type bagging_freq: int :param n_jobs: Number of threads to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "learning_rate": Real(0.000001, 1), "boosting_type": ["gbdt", "dart", "goss", "rf"], "n_estimators": Integer(10, 100), "max_depth": Integer(0, 10), "num_leaves": Integer(2, 100), "min_child_samples": Integer(1, 100), "bagging_fraction": Real(0.000001, 1), "bagging_freq": Integer(0, 1),} * - **model_family** - ModelFamily.LIGHTGBM * - **modifies_features** - True * - **modifies_target** - False * - **name** - LightGBM Regressor * - **SEED_MAX** - SEED_BOUNDS.max_bound * - **SEED_MIN** - 0 * - **supported_problem_types** - [ProblemTypes.REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LightGBMRegressor.clone evalml.pipelines.components.LightGBMRegressor.default_parameters evalml.pipelines.components.LightGBMRegressor.describe evalml.pipelines.components.LightGBMRegressor.feature_importance evalml.pipelines.components.LightGBMRegressor.fit evalml.pipelines.components.LightGBMRegressor.get_prediction_intervals evalml.pipelines.components.LightGBMRegressor.load evalml.pipelines.components.LightGBMRegressor.needs_fitting evalml.pipelines.components.LightGBMRegressor.parameters evalml.pipelines.components.LightGBMRegressor.predict evalml.pipelines.components.LightGBMRegressor.predict_proba evalml.pipelines.components.LightGBMRegressor.save evalml.pipelines.components.LightGBMRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits LightGBM regressor to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using fitted LightGBM regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LinearDiscriminantAnalysis(n_components=None, random_seed=0, **kwargs) Reduces the number of features by using Linear Discriminant Analysis. :param n_components: The number of features to maintain after computation. Defaults to None. :type n_components: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Linear Discriminant Analysis Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LinearDiscriminantAnalysis.clone evalml.pipelines.components.LinearDiscriminantAnalysis.default_parameters evalml.pipelines.components.LinearDiscriminantAnalysis.describe evalml.pipelines.components.LinearDiscriminantAnalysis.fit evalml.pipelines.components.LinearDiscriminantAnalysis.fit_transform evalml.pipelines.components.LinearDiscriminantAnalysis.load evalml.pipelines.components.LinearDiscriminantAnalysis.needs_fitting evalml.pipelines.components.LinearDiscriminantAnalysis.parameters evalml.pipelines.components.LinearDiscriminantAnalysis.save evalml.pipelines.components.LinearDiscriminantAnalysis.transform evalml.pipelines.components.LinearDiscriminantAnalysis.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If input data is not all numeric. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using the fitted LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LinearRegressor(fit_intercept=True, n_jobs=-1, random_seed=0, **kwargs) Linear Regressor. :param fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered). Defaults to True. :type fit_intercept: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "fit_intercept": [True, False],} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Linear Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LinearRegressor.clone evalml.pipelines.components.LinearRegressor.default_parameters evalml.pipelines.components.LinearRegressor.describe evalml.pipelines.components.LinearRegressor.feature_importance evalml.pipelines.components.LinearRegressor.fit evalml.pipelines.components.LinearRegressor.get_prediction_intervals evalml.pipelines.components.LinearRegressor.load evalml.pipelines.components.LinearRegressor.needs_fitting evalml.pipelines.components.LinearRegressor.parameters evalml.pipelines.components.LinearRegressor.predict evalml.pipelines.components.LinearRegressor.predict_proba evalml.pipelines.components.LinearRegressor.save evalml.pipelines.components.LinearRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted linear regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LogisticRegressionClassifier(penalty='l2', C=1.0, multi_class='auto', solver='lbfgs', n_jobs=-1, random_seed=0, **kwargs) Logistic Regression Classifier. :param penalty: The norm used in penalization. Defaults to "l2". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. "multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "lbfgs". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "penalty": ["l2"], "C": Real(0.01, 10),} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Logistic Regression Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LogisticRegressionClassifier.clone evalml.pipelines.components.LogisticRegressionClassifier.default_parameters evalml.pipelines.components.LogisticRegressionClassifier.describe evalml.pipelines.components.LogisticRegressionClassifier.feature_importance evalml.pipelines.components.LogisticRegressionClassifier.fit evalml.pipelines.components.LogisticRegressionClassifier.get_prediction_intervals evalml.pipelines.components.LogisticRegressionClassifier.load evalml.pipelines.components.LogisticRegressionClassifier.needs_fitting evalml.pipelines.components.LogisticRegressionClassifier.parameters evalml.pipelines.components.LogisticRegressionClassifier.predict evalml.pipelines.components.LogisticRegressionClassifier.predict_proba evalml.pipelines.components.LogisticRegressionClassifier.save evalml.pipelines.components.LogisticRegressionClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted logistic regression classifier. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LogTransformer(random_seed=0) Applies a log transformation to the target data. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Log Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LogTransformer.clone evalml.pipelines.components.LogTransformer.default_parameters evalml.pipelines.components.LogTransformer.describe evalml.pipelines.components.LogTransformer.fit evalml.pipelines.components.LogTransformer.fit_transform evalml.pipelines.components.LogTransformer.inverse_transform evalml.pipelines.components.LogTransformer.load evalml.pipelines.components.LogTransformer.needs_fitting evalml.pipelines.components.LogTransformer.parameters evalml.pipelines.components.LogTransformer.save evalml.pipelines.components.LogTransformer.transform evalml.pipelines.components.LogTransformer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the LogTransformer. :param X: Ignored. :type X: pd.DataFrame or np.ndarray :param y: Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Log transforms the target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to log transform. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is log transformed. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: inverse_transform(self, y) Apply exponential to target data. :param y: Target variable. :type y: pd.Series :returns: Target with exponential applied. :rtype: pd.Series .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Log transforms the target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target data to log transform. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is log transformed. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LSA(random_seed=0, **kwargs) Transformer to calculate the Latent Semantic Analysis Values of text input. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - LSA Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LSA.clone evalml.pipelines.components.LSA.default_parameters evalml.pipelines.components.LSA.describe evalml.pipelines.components.LSA.fit evalml.pipelines.components.LSA.fit_transform evalml.pipelines.components.LSA.load evalml.pipelines.components.LSA.needs_fitting evalml.pipelines.components.LSA.parameters evalml.pipelines.components.LSA.save evalml.pipelines.components.LSA.transform evalml.pipelines.components.LSA.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the input data. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by applying the LSA pipeline. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X. The original column is removed and replaced with two columns of the format `LSA(original_column_name)[feature_number]`, where `feature_number` is 0 or 1. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: NaturalLanguageFeaturizer(random_seed=0, **kwargs) Transformer that can automatically featurize text columns using featuretools' nlp_primitives. Since models cannot handle non-numeric data, any text must be broken down into features that provide useful information about that text. This component splits each text column into several informative features: Diversity Score, Mean Characters per Word, Polarity Score, LSA (Latent Semantic Analysis), Number of Characters, and Number of Words. Calling transform on this component will replace any text columns in the given dataset with these numeric columns. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Natural Language Featurizer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.NaturalLanguageFeaturizer.clone evalml.pipelines.components.NaturalLanguageFeaturizer.default_parameters evalml.pipelines.components.NaturalLanguageFeaturizer.describe evalml.pipelines.components.NaturalLanguageFeaturizer.fit evalml.pipelines.components.NaturalLanguageFeaturizer.fit_transform evalml.pipelines.components.NaturalLanguageFeaturizer.load evalml.pipelines.components.NaturalLanguageFeaturizer.needs_fitting evalml.pipelines.components.NaturalLanguageFeaturizer.parameters evalml.pipelines.components.NaturalLanguageFeaturizer.save evalml.pipelines.components.NaturalLanguageFeaturizer.transform evalml.pipelines.components.NaturalLanguageFeaturizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by creating new features using existing text columns. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: OneHotEncoder(top_n=10, features_to_encode=None, categories=None, drop='if_binary', handle_unknown='ignore', handle_missing='error', random_seed=0, **kwargs) A transformer that encodes categorical features in a one-hot numeric array. :param top_n: Number of categories per column to encode. If None, all categories will be encoded. Otherwise, the `n` most frequent will be encoded and all others will be dropped. Defaults to 10. :type top_n: int :param features_to_encode: List of columns to encode. All other columns will remain untouched. If None, all appropriate columns will be encoded. Defaults to None. :type features_to_encode: list[str] :param categories: A two dimensional list of categories, where `categories[i]` is a list of the categories for the column at index `i`. This can also be `None`, or `"auto"` if `top_n` is not None. Defaults to None. :type categories: list :param drop: Method ("first" or "if_binary") to use to drop one category per feature. Can also be a list specifying which categories to drop for each feature. Defaults to 'if_binary'. :type drop: string, list :param handle_unknown: Whether to ignore or error for unknown categories for a feature encountered during `fit` or `transform`. If either `top_n` or `categories` is used to limit the number of categories per column, this must be "ignore". Defaults to "ignore". :type handle_unknown: string :param handle_missing: Options for how to handle missing (NaN) values encountered during `fit` or `transform`. If this is set to "as_category" and NaN values are within the `n` most frequent, "nan" values will be encoded as their own column. If this is set to "error", any missing values encountered will raise an error. Defaults to "error". :type handle_missing: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - One Hot Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.OneHotEncoder.categories evalml.pipelines.components.OneHotEncoder.clone evalml.pipelines.components.OneHotEncoder.default_parameters evalml.pipelines.components.OneHotEncoder.describe evalml.pipelines.components.OneHotEncoder.fit evalml.pipelines.components.OneHotEncoder.fit_transform evalml.pipelines.components.OneHotEncoder.get_feature_names evalml.pipelines.components.OneHotEncoder.load evalml.pipelines.components.OneHotEncoder.needs_fitting evalml.pipelines.components.OneHotEncoder.parameters evalml.pipelines.components.OneHotEncoder.save evalml.pipelines.components.OneHotEncoder.transform evalml.pipelines.components.OneHotEncoder.update_parameters .. py:method:: categories(self, feature_name) Returns a list of the unique categories to be encoded for the particular feature, in order. :param feature_name: The name of any feature provided to one-hot encoder during fit. :type feature_name: str :returns: The unique categories, in the same dtype as they were provided during fit. :rtype: np.ndarray :raises ValueError: If feature was not provided to one-hot encoder as a training feature. .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the one-hot encoder component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If encoding a column failed. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: get_feature_names(self) Return feature names for the categorical features after fitting. Feature names are formatted as {column name}_{category name}. In the event of a duplicate name, an integer will be added at the end of the feature name to distinguish it. For example, consider a dataframe with a column called "A" and category "x_y" and another column called "A_x" with "y". In this example, the feature names would be "A_x_y" and "A_x_y_1". :returns: The feature names after encoding, provided in the same order as input_features. :rtype: np.ndarray .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) One-hot encode the input data. :param X: Features to one-hot encode. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series :returns: Transformed data, where each categorical feature has been encoded into numerical columns using one-hot encoding. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: OrdinalEncoder(features_to_encode=None, categories=None, handle_unknown='error', unknown_value=None, encoded_missing_value=None, random_seed=0, **kwargs) A transformer that encodes ordinal features as an array of ordinal integers representing the relative order of categories. :param features_to_encode: List of columns to encode. All other columns will remain untouched. If None, all appropriate columns will be encoded. Defaults to None. The order of columns does not matter. :type features_to_encode: list[str] :param categories: A dictionary mapping column names to their categories in the dataframes passed in at fit and transform. The order of categories specified for a column does not matter. Any category found in the data that is not present in categories will be handled as an unknown value. To not have unknown values raise an error, set handle_unknown to "use_encoded_value". Defaults to None. :type categories: dict[str, list[str]] :param handle_unknown: Whether to ignore or error for unknown categories for a feature encountered during `fit` or `transform`. When set to "error", an error will be raised when an unknown category is found. When set to "use_encoded_value", unknown categories will be encoded as the value given for the parameter unknown_value. Defaults to "error." :type handle_unknown: "error" or "use_encoded_value" :param unknown_value: The value to use for unknown categories seen during fit or transform. Required when the parameter handle_unknown is set to "use_encoded_value." The value has to be distinct from the values used to encode any of the categories in fit. Defaults to None. :type unknown_value: int or np.nan :param encoded_missing_value: The value to use for missing (null) values seen during fit or transform. Defaults to np.nan. :type encoded_missing_value: int or np.nan :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Ordinal Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.OrdinalEncoder.categories evalml.pipelines.components.OrdinalEncoder.clone evalml.pipelines.components.OrdinalEncoder.default_parameters evalml.pipelines.components.OrdinalEncoder.describe evalml.pipelines.components.OrdinalEncoder.fit evalml.pipelines.components.OrdinalEncoder.fit_transform evalml.pipelines.components.OrdinalEncoder.get_feature_names evalml.pipelines.components.OrdinalEncoder.load evalml.pipelines.components.OrdinalEncoder.needs_fitting evalml.pipelines.components.OrdinalEncoder.parameters evalml.pipelines.components.OrdinalEncoder.save evalml.pipelines.components.OrdinalEncoder.transform evalml.pipelines.components.OrdinalEncoder.update_parameters .. py:method:: categories(self, feature_name) Returns a list of the unique categories to be encoded for the particular feature, in order. :param feature_name: The name of any feature provided to ordinal encoder during fit. :type feature_name: str :returns: The unique categories, in the same dtype as they were provided during fit. :rtype: np.ndarray :raises ValueError: If feature was not provided to ordinal encoder as a training feature. .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the ordinal encoder component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If encoding a column failed. :raises TypeError: If non-Ordinal columns are specified in features_to_encode. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: get_feature_names(self) Return feature names for the ordinal features after fitting. Feature names are formatted as {column name}_ordinal_encoding. :returns: The feature names after encoding, provided in the same order as input_features. :rtype: np.ndarray .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Ordinally encode the input data. :param X: Features to encode. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series :returns: Transformed data, where each ordinal feature has been encoded into a numerical column where ordinal integers represent the relative order of categories. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: Oversampler(sampling_ratio=0.25, sampling_ratio_dict=None, k_neighbors_default=5, n_jobs=-1, random_seed=0, **kwargs) SMOTE Oversampler component. Will automatically select whether to use SMOTE, SMOTEN, or SMOTENC based on inputs to the component. :param sampling_ratio: This is the goal ratio of the minority to majority class, with range (0, 1]. A value of 0.25 means we want a 1:4 ratio of the minority to majority class after oversampling. We will create the a sampling dictionary using this ratio, with the keys corresponding to the class and the values responding to the number of samples. Defaults to 0.25. :type sampling_ratio: float :param sampling_ratio_dict: A dictionary specifying the desired balanced ratio for each target value. For instance, in a binary case where class 1 is the minority, we could specify: `sampling_ratio_dict={0: 0.5, 1: 1}`, which means we would undersample class 0 to have twice the number of samples as class 1 (minority:majority ratio = 0.5), and don't sample class 1. Overrides sampling_ratio if provided. Defaults to None. :type sampling_ratio_dict: dict :param k_neighbors_default: The number of nearest neighbors used to construct synthetic samples. This is the default value used, but the actual k_neighbors value might be smaller if there are less samples. Defaults to 5. :type k_neighbors_default: int :param n_jobs: The number of CPU cores to use. Defaults to -1. :type n_jobs: int :param random_seed: The seed to use for random sampling. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **modifies_features** - True * - **modifies_target** - True * - **name** - Oversampler * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Oversampler.clone evalml.pipelines.components.Oversampler.default_parameters evalml.pipelines.components.Oversampler.describe evalml.pipelines.components.Oversampler.fit evalml.pipelines.components.Oversampler.fit_transform evalml.pipelines.components.Oversampler.load evalml.pipelines.components.Oversampler.needs_fitting evalml.pipelines.components.Oversampler.parameters evalml.pipelines.components.Oversampler.save evalml.pipelines.components.Oversampler.transform evalml.pipelines.components.Oversampler.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits oversampler to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y) Fit and transform data using the sampler component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms the input data by Oversampling the data. :param X: Training features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: Transformed features and target. :rtype: pd.DataFrame, pd.Series .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: PCA(variance=0.95, n_components=None, random_seed=0, **kwargs) Reduces the number of features by using Principal Component Analysis (PCA). :param variance: The percentage of the original data variance that should be preserved when reducing the number of features. Defaults to 0.95. :type variance: float :param n_components: The number of features to maintain after computing SVD. Defaults to None, but will override variance variable if set. :type n_components: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - Real(0.25, 1)}:type: {"variance" * - **modifies_features** - True * - **modifies_target** - False * - **name** - PCA Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PCA.clone evalml.pipelines.components.PCA.default_parameters evalml.pipelines.components.PCA.describe evalml.pipelines.components.PCA.fit evalml.pipelines.components.PCA.fit_transform evalml.pipelines.components.PCA.load evalml.pipelines.components.PCA.needs_fitting evalml.pipelines.components.PCA.parameters evalml.pipelines.components.PCA.save evalml.pipelines.components.PCA.transform evalml.pipelines.components.PCA.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the PCA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If input data is not all numeric. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the PCA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using fitted PCA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: PerColumnImputer(impute_strategies=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy per column. :param impute_strategies: Column and {"impute_strategy": strategy, "fill_value":value} pairings. Valid values for impute strategy include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. Defaults to None, which uses "most_frequent" for all columns. When impute_strategy == "constant", fill_value is used to replace missing data. When None, uses 0 when imputing numerical data and "missing_value" for strings or object data types. :type impute_strategies: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Per Column Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PerColumnImputer.clone evalml.pipelines.components.PerColumnImputer.default_parameters evalml.pipelines.components.PerColumnImputer.describe evalml.pipelines.components.PerColumnImputer.fit evalml.pipelines.components.PerColumnImputer.fit_transform evalml.pipelines.components.PerColumnImputer.load evalml.pipelines.components.PerColumnImputer.needs_fitting evalml.pipelines.components.PerColumnImputer.parameters evalml.pipelines.components.PerColumnImputer.save evalml.pipelines.components.PerColumnImputer.transform evalml.pipelines.components.PerColumnImputer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputers on input data. :param X: The input training data of shape [n_samples, n_features] to fit. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by imputing missing values. :param X: The input training data of shape [n_samples, n_features] to transform. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: PolynomialDecomposer(time_index: str = None, degree: int = 1, period: int = -1, random_seed: int = 0, **kwargs) Removes trends and seasonality from time series by fitting a polynomial and moving average to the data. Scikit-learn's PolynomialForecaster is used to generate the additive trend portion of the target data. A polynomial will be fit to the data during fit. That additive polynomial trend will be removed during fit so that statsmodel's seasonal_decompose can determine the addititve seasonality of the data by using rolling averages over the series' inferred periodicity. For example, daily time series data will generate rolling averages over the first week of data, normalize out the mean and return those 7 averages repeated over the entire length of the given series. Those seven averages, repeated as many times as necessary to match the length of the given target data, will be used as the seasonal signal of the data. :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None. :type time_index: str :param degree: Degree for the polynomial. If 1, linear model is fit to the data. If 2, quadratic model is fit, etc. Defaults to 1. :type degree: int :param period: The number of entries in the time series data that corresponds to one period of a cyclic signal. For instance, if data is known to possess a weekly seasonal signal, and if the data is daily data, period should be 7. For daily data with a yearly seasonal signal, period should be 365. Defaults to -1, which uses the statsmodels libarary's freq_to_period function. https://github.com/statsmodels/statsmodels/blob/main/statsmodels/tsa/tsatools.py :type period: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "degree": Integer(1, 3)} * - **invalid_frequencies** - [] * - **modifies_features** - False * - **modifies_target** - True * - **name** - Polynomial Decomposer * - **needs_fitting** - True * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PolynomialDecomposer.clone evalml.pipelines.components.PolynomialDecomposer.default_parameters evalml.pipelines.components.PolynomialDecomposer.describe evalml.pipelines.components.PolynomialDecomposer.determine_periodicity evalml.pipelines.components.PolynomialDecomposer.fit evalml.pipelines.components.PolynomialDecomposer.fit_transform evalml.pipelines.components.PolynomialDecomposer.get_trend_dataframe evalml.pipelines.components.PolynomialDecomposer.inverse_transform evalml.pipelines.components.PolynomialDecomposer.is_freq_valid evalml.pipelines.components.PolynomialDecomposer.load evalml.pipelines.components.PolynomialDecomposer.parameters evalml.pipelines.components.PolynomialDecomposer.plot_decomposition evalml.pipelines.components.PolynomialDecomposer.save evalml.pipelines.components.PolynomialDecomposer.set_period evalml.pipelines.components.PolynomialDecomposer.transform evalml.pipelines.components.PolynomialDecomposer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: determine_periodicity(cls, X: pandas.DataFrame, y: pandas.Series, acf_threshold: float = 0.01, rel_max_order: int = 5) :classmethod: Function that uses autocorrelative methods to determine the likely most signficant period of the seasonal signal. :param X: The feature data of the time series problem. :type X: pandas.DataFrame :param y: The target data of a time series problem. :type y: pandas.Series :param acf_threshold: The threshold for the autocorrelation function to determine the period. Any values below the threshold are considered to be 0 and will not be considered for the period. Defaults to 0.01. :type acf_threshold: float :param rel_max_order: The order of the relative maximum to determine the period. Defaults to 5. :type rel_max_order: int :returns: The integer number of entries in time series data over which the seasonal part of the target data repeats. If the time series data is in days, then this is the number of days that it takes the target's seasonal signal to repeat. Note: the target data can contain multiple seasonal signals. This function will only return the stronger. E.g. if the target has both weekly and yearly seasonality, the function may return either "7" or "365", depending on which seasonality is more strongly autocorrelated. If no period is detected, returns None. :rtype: int .. py:method:: fit(self, X: pandas.DataFrame, y: pandas.Series = None) -> PolynomialDecomposer Fits the PolynomialDecomposer and determine the seasonal signal. Currently only fits the polynomial detrender. The seasonality is determined by removing the trend from the signal and using statsmodels' seasonal_decompose(). Both the trend and seasonality are currently assumed to be additive. :param X: Conditionally used to build datetime index. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: self :raises NotImplementedError: If the input data has a frequency of "month-begin". This isn't supported by statsmodels decompose as the freqstr "MS" is misinterpreted as milliseconds. :raises ValueError: If y is None. :raises ValueError: If target data doesn't have DatetimeIndex AND no Datetime features in features data .. py:method:: fit_transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series] Removes fitted trend and seasonality from target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: The first element are the input features returned without modification. The second element is the target variable y with the fitted trend removed. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: get_trend_dataframe(self, X: pandas.DataFrame, y: pandas.Series) -> list[pandas.DataFrame] Return a list of dataframes with 4 columns: signal, trend, seasonality, residual. Scikit-learn's PolynomialForecaster is used to generate the trend portion of the target data. statsmodel's seasonal_decompose is used to generate the seasonality of the data. :param X: Input data with time series data in index. :type X: pd.DataFrame :param y: Target variable data provided as a Series for univariate problems or a DataFrame for multivariate problems. :type y: pd.Series or pd.DataFrame :returns: Each DataFrame contains the columns "signal", "trend", "seasonality" and "residual," with the latter 3 column values being the decomposed elements of the target data. The "signal" column is simply the input target signal but reindexed with a datetime index to match the input features. :rtype: list of pd.DataFrame :raises TypeError: If X does not have time-series data in the index. :raises ValueError: If time series index of X does not have an inferred frequency. :raises ValueError: If the forecaster associated with the detrender has not been fit yet. :raises TypeError: If y is not provided as a pandas Series or DataFrame. .. py:method:: inverse_transform(self, y_t: pandas.Series) -> tuple[pandas.DataFrame, pandas.Series] Adds back fitted trend and seasonality to target variable. The polynomial trend is added back into the signal, calling the detrender's inverse_transform(). Then, the seasonality is projected forward to and added back into the signal. :param y_t: Target variable. :type y_t: pd.Series :returns: The first element are the input features returned without modification. The second element is the target variable y with the trend and seasonality added back in. :rtype: tuple of pd.DataFrame, pd.Series :raises ValueError: If y is None. .. py:method:: is_freq_valid(cls, freq: str) :classmethod: Determines if the given string represents a valid frequency for this decomposer. :param freq: A frequency to validate. See the pandas docs at https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases for options. :type freq: str :returns: boolean representing whether the frequency is valid or not. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: plot_decomposition(self, X: pandas.DataFrame, y: pandas.Series, show: bool = False) -> tuple[matplotlib.pyplot.Figure, list] Plots the decomposition of the target signal. :param X: Input data with time series data in index. :type X: pd.DataFrame :param y: Target variable data provided as a Series for univariate problems or a DataFrame for multivariate problems. :type y: pd.Series or pd.DataFrame :param show: Whether to display the plot or not. Defaults to False. :type show: bool :returns: The figure and axes that have the decompositions plotted on them :rtype: matplotlib.pyplot.Figure, list[matplotlib.pyplot.Axes] .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: set_period(self, X: pandas.DataFrame, y: pandas.Series, acf_threshold: float = 0.01, rel_max_order: int = 5) Function to set the component's seasonal period based on the target's seasonality. :param X: The feature data of the time series problem. :type X: pandas.DataFrame :param y: The target data of a time series problem. :type y: pandas.Series :param acf_threshold: The threshold for the autocorrelation function to determine the period. Any values below the threshold are considered to be 0 and will not be considered for the period. Defaults to 0.01. :type acf_threshold: float :param rel_max_order: The order of the relative maximum to determine the period. Defaults to 5. :type rel_max_order: int .. py:method:: transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series] Transforms the target data by removing the polynomial trend and rolling average seasonality. Applies the fit polynomial detrender to the target data, removing the additive polynomial trend. Then, utilizes the first period's worth of seasonal data determined in the .fit() function to extrapolate the seasonal signal of the data to be transformed. This seasonal signal is also assumed to be additive and is removed. :param X: Conditionally used to build datetime index. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is detrended and deseasonalized. :rtype: tuple of pd.DataFrame, pd.Series :raises ValueError: If target data doesn't have DatetimeIndex AND no Datetime features in features data .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ProphetRegressor(time_index: Optional[Hashable] = None, changepoint_prior_scale: float = 0.05, seasonality_prior_scale: int = 10, holidays_prior_scale: int = 10, seasonality_mode: str = 'additive', stan_backend: str = 'CMDSTANPY', interval_width: float = 0.95, random_seed: Union[int, float] = 0, **kwargs) Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. More information here: https://facebook.github.io/prophet/ :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None. :type time_index: str :param changepoint_prior_scale: Determines the strength of the sparse prior for fitting on rate changes. Increasing this value increases the flexibility of the trend. Defaults to 0.05. :type changepoint_prior_scale: float :param seasonality_prior_scale: Similar to changepoint_prior_scale. Adjusts the extent to which the seasonality model will fit the data. Defaults to 10. :type seasonality_prior_scale: int :param holidays_prior_scale: Similar to changepoint_prior_scale. Adjusts the extent to which holidays will fit the data. Defaults to 10. :type holidays_prior_scale: int :param seasonality_mode: Determines how this component fits the seasonality. Options are "additive" and "multiplicative". Defaults to "additive". :type seasonality_mode: str :param stan_backend: Determines the backend that should be used to run Prophet. Options are "CMDSTANPY" and "PYSTAN". Defaults to "CMDSTANPY". :type stan_backend: str :param interval_width: Determines the confidence of the prediction interval range when calling `get_prediction_intervals`. Accepts values in the range (0,1). Defaults to 0.95. :type interval_width: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "changepoint_prior_scale": Real(0.001, 0.5), "seasonality_prior_scale": Real(0.01, 10), "holidays_prior_scale": Real(0.01, 10), "seasonality_mode": ["additive", "multiplicative"],} * - **model_family** - ModelFamily.PROPHET * - **modifies_features** - True * - **modifies_target** - False * - **name** - Prophet Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ProphetRegressor.build_prophet_df evalml.pipelines.components.ProphetRegressor.clone evalml.pipelines.components.ProphetRegressor.default_parameters evalml.pipelines.components.ProphetRegressor.describe evalml.pipelines.components.ProphetRegressor.feature_importance evalml.pipelines.components.ProphetRegressor.fit evalml.pipelines.components.ProphetRegressor.get_params evalml.pipelines.components.ProphetRegressor.get_prediction_intervals evalml.pipelines.components.ProphetRegressor.load evalml.pipelines.components.ProphetRegressor.needs_fitting evalml.pipelines.components.ProphetRegressor.parameters evalml.pipelines.components.ProphetRegressor.predict evalml.pipelines.components.ProphetRegressor.predict_proba evalml.pipelines.components.ProphetRegressor.save evalml.pipelines.components.ProphetRegressor.update_parameters .. py:method:: build_prophet_df(X: pandas.DataFrame, y: Optional[pandas.Series] = None, time_index: str = 'ds') -> pandas.DataFrame :staticmethod: Build the Prophet data to pass fit and predict on. .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) -> dict Returns the default parameters for this component. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> numpy.ndarray :property: Returns array of 0's with len(1) as feature_importance is not defined for Prophet regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits Prophet regressor component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_params(self) -> dict Get parameters for the Prophet regressor. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted ProphetRegressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: List[float] :param predictions: Not used for Prophet estimator. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) -> pandas.Series Make predictions using fitted Prophet regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RandomForestClassifier(n_estimators=100, max_depth=6, n_jobs=-1, random_seed=0, **kwargs) Random Forest Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_depth": Integer(1, 10),} * - **model_family** - ModelFamily.RANDOM_FOREST * - **modifies_features** - True * - **modifies_target** - False * - **name** - Random Forest Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RandomForestClassifier.clone evalml.pipelines.components.RandomForestClassifier.default_parameters evalml.pipelines.components.RandomForestClassifier.describe evalml.pipelines.components.RandomForestClassifier.feature_importance evalml.pipelines.components.RandomForestClassifier.fit evalml.pipelines.components.RandomForestClassifier.get_prediction_intervals evalml.pipelines.components.RandomForestClassifier.load evalml.pipelines.components.RandomForestClassifier.needs_fitting evalml.pipelines.components.RandomForestClassifier.parameters evalml.pipelines.components.RandomForestClassifier.predict evalml.pipelines.components.RandomForestClassifier.predict_proba evalml.pipelines.components.RandomForestClassifier.save evalml.pipelines.components.RandomForestClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RandomForestRegressor(n_estimators: int = 100, max_depth: int = 6, n_jobs: int = -1, random_seed: Union[int, float] = 0, **kwargs) Random Forest Regressor. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_depth": Integer(1, 32),} * - **model_family** - ModelFamily.RANDOM_FOREST * - **modifies_features** - True * - **modifies_target** - False * - **name** - Random Forest Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RandomForestRegressor.clone evalml.pipelines.components.RandomForestRegressor.default_parameters evalml.pipelines.components.RandomForestRegressor.describe evalml.pipelines.components.RandomForestRegressor.feature_importance evalml.pipelines.components.RandomForestRegressor.fit evalml.pipelines.components.RandomForestRegressor.get_prediction_intervals evalml.pipelines.components.RandomForestRegressor.load evalml.pipelines.components.RandomForestRegressor.needs_fitting evalml.pipelines.components.RandomForestRegressor.parameters evalml.pipelines.components.RandomForestRegressor.predict evalml.pipelines.components.RandomForestRegressor.predict_proba evalml.pipelines.components.RandomForestRegressor.save evalml.pipelines.components.RandomForestRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted RandomForestRegressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Optional. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ReplaceNullableTypes(random_seed=0, **kwargs) Transformer to replace features with the new nullable dtypes with a dtype that is compatible in EvalML. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **modifies_features** - True * - **modifies_target** - {} * - **name** - Replace Nullable Types Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ReplaceNullableTypes.clone evalml.pipelines.components.ReplaceNullableTypes.default_parameters evalml.pipelines.components.ReplaceNullableTypes.describe evalml.pipelines.components.ReplaceNullableTypes.fit evalml.pipelines.components.ReplaceNullableTypes.fit_transform evalml.pipelines.components.ReplaceNullableTypes.load evalml.pipelines.components.ReplaceNullableTypes.needs_fitting evalml.pipelines.components.ReplaceNullableTypes.parameters evalml.pipelines.components.ReplaceNullableTypes.save evalml.pipelines.components.ReplaceNullableTypes.transform evalml.pipelines.components.ReplaceNullableTypes.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Substitutes non-nullable types for the new pandas nullable types in the data and target data. :param X: Input features. :type X: pd.DataFrame, optional :param y: Target data. :type y: pd.Series :returns: The input features and target data with the non-nullable types set. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data by replacing columns that contain nullable types with the appropriate replacement type. "float64" for nullable integers and "category" for nullable booleans. :param X: Data to transform :type X: pd.DataFrame :param y: Target data to transform :type y: pd.Series, optional :returns: Transformed X pd.Series: Transformed y :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RFClassifierRFESelector(step=0.2, min_features_to_select=1, cv=None, scoring=None, n_jobs=-1, n_estimators=10, max_depth=None, random_seed=0, **kwargs) Selects relevant features using recursive feature elimination with a Random Forest Classifier. :param step: The number of features to eliminate in each iteration. If an integer is specified this will represent the number of features to eliminate. If a float is specified this represents the percentage of features to eliminate each iteration. The last iteration may drop fewer than this number of features in order to satisfy the min_features_to_select constraint. Defaults to 0.2. :type step: int, float :param min_features_to_select: The minimum number of features to return. Defaults to 1. :type min_features_to_select: int :param cv: Number of folds to use for the cross-validation splitting strategy. Defaults to None which will use 5 folds. :type cv: int or None :param scoring: A string or scorer callable object to specify the scoring method. :type scoring: str, callable or None :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: int :param max_depth: Maximum tree depth for base learners. Defaults to None. :type max_depth: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "step": Real(0.05, 0.25)} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RFE Selector with RF Classifier * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFClassifierRFESelector.clone evalml.pipelines.components.RFClassifierRFESelector.default_parameters evalml.pipelines.components.RFClassifierRFESelector.describe evalml.pipelines.components.RFClassifierRFESelector.fit evalml.pipelines.components.RFClassifierRFESelector.fit_transform evalml.pipelines.components.RFClassifierRFESelector.get_names evalml.pipelines.components.RFClassifierRFESelector.load evalml.pipelines.components.RFClassifierRFESelector.needs_fitting evalml.pipelines.components.RFClassifierRFESelector.parameters evalml.pipelines.components.RFClassifierRFESelector.save evalml.pipelines.components.RFClassifierRFESelector.transform evalml.pipelines.components.RFClassifierRFESelector.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise an MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RFClassifierSelectFromModel(number_features=None, n_estimators=10, max_depth=None, percent_features=0.5, threshold='median', n_jobs=-1, random_seed=0, **kwargs) Selects top features based on importance weights using a Random Forest classifier. :param number_features: The maximum number of features to select. If both percent_features and number_features are specified, take the greater number of features. Defaults to None. :type number_features: int :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: int :param max_depth: Maximum tree depth for base learners. Defaults to None. :type max_depth: int :param percent_features: Percentage of features to use. If both percent_features and number_features are specified, take the greater number of features. Defaults to 0.5. :type percent_features: float :param threshold: The threshold value to use for feature selection. Features whose importance is greater or equal are kept while the others are discarded. If "median", then the threshold value is the median of the feature importances. A scaling factor (e.g., "1.25*mean") may also be used. Defaults to median. :type threshold: string or float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "percent_features": Real(0.01, 1), "threshold": ["mean", "median"],} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RF Classifier Select From Model * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFClassifierSelectFromModel.clone evalml.pipelines.components.RFClassifierSelectFromModel.default_parameters evalml.pipelines.components.RFClassifierSelectFromModel.describe evalml.pipelines.components.RFClassifierSelectFromModel.fit evalml.pipelines.components.RFClassifierSelectFromModel.fit_transform evalml.pipelines.components.RFClassifierSelectFromModel.get_names evalml.pipelines.components.RFClassifierSelectFromModel.load evalml.pipelines.components.RFClassifierSelectFromModel.needs_fitting evalml.pipelines.components.RFClassifierSelectFromModel.parameters evalml.pipelines.components.RFClassifierSelectFromModel.save evalml.pipelines.components.RFClassifierSelectFromModel.transform evalml.pipelines.components.RFClassifierSelectFromModel.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise an MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RFRegressorRFESelector(step=0.2, min_features_to_select=1, cv=None, scoring=None, n_jobs=-1, n_estimators=10, max_depth=None, random_seed=0, **kwargs) Selects relevant features using recursive feature elimination with a Random Forest Regressor. :param step: The number of features to eliminate in each iteration. If an integer is specified this will represent the number of features to eliminate. If a float is specified this represents the percentage of features to eliminate each iteration. The last iteration may drop fewer than this number of features in order to satisfy the min_features_to_select constraint. Defaults to 0.2. :type step: int, float :param min_features_to_select: The minimum number of features to return. Defaults to 1. :type min_features_to_select: int :param cv: Number of folds to use for the cross-validation splitting strategy. Defaults to None which will use 5 folds. :type cv: int or None :param scoring: A string or scorer callable object to specify the scoring method. :type scoring: str, callable or None :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: int :param max_depth: Maximum tree depth for base learners. Defaults to None. :type max_depth: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "step": Real(0.05, 0.25)} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RFE Selector with RF Regressor * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFRegressorRFESelector.clone evalml.pipelines.components.RFRegressorRFESelector.default_parameters evalml.pipelines.components.RFRegressorRFESelector.describe evalml.pipelines.components.RFRegressorRFESelector.fit evalml.pipelines.components.RFRegressorRFESelector.fit_transform evalml.pipelines.components.RFRegressorRFESelector.get_names evalml.pipelines.components.RFRegressorRFESelector.load evalml.pipelines.components.RFRegressorRFESelector.needs_fitting evalml.pipelines.components.RFRegressorRFESelector.parameters evalml.pipelines.components.RFRegressorRFESelector.save evalml.pipelines.components.RFRegressorRFESelector.transform evalml.pipelines.components.RFRegressorRFESelector.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise an MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RFRegressorSelectFromModel(number_features=None, n_estimators=10, max_depth=None, percent_features=0.5, threshold='median', n_jobs=-1, random_seed=0, **kwargs) Selects top features based on importance weights using a Random Forest regressor. :param number_features: The maximum number of features to select. If both percent_features and number_features are specified, take the greater number of features. Defaults to 0.5. :type number_features: int :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: int :param max_depth: Maximum tree depth for base learners. Defaults to None. :type max_depth: int :param percent_features: Percentage of features to use. If both percent_features and number_features are specified, take the greater number of features. Defaults to 0.5. :type percent_features: float :param threshold: The threshold value to use for feature selection. Features whose importance is greater or equal are kept while the others are discarded. If "median", then the threshold value is the median of the feature importances. A scaling factor (e.g., "1.25*mean") may also be used. Defaults to median. :type threshold: string or float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "percent_features": Real(0.01, 1), "threshold": ["mean", "median"],} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RF Regressor Select From Model * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFRegressorSelectFromModel.clone evalml.pipelines.components.RFRegressorSelectFromModel.default_parameters evalml.pipelines.components.RFRegressorSelectFromModel.describe evalml.pipelines.components.RFRegressorSelectFromModel.fit evalml.pipelines.components.RFRegressorSelectFromModel.fit_transform evalml.pipelines.components.RFRegressorSelectFromModel.get_names evalml.pipelines.components.RFRegressorSelectFromModel.load evalml.pipelines.components.RFRegressorSelectFromModel.needs_fitting evalml.pipelines.components.RFRegressorSelectFromModel.parameters evalml.pipelines.components.RFRegressorSelectFromModel.save evalml.pipelines.components.RFRegressorSelectFromModel.transform evalml.pipelines.components.RFRegressorSelectFromModel.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise an MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SelectByType(column_types=None, exclude=False, random_seed=0, **kwargs) Selects columns by specified Woodwork logical type or semantic tag in input data. :param column_types: List of Woodwork types or tags, used to determine which columns to select or exclude. :type column_types: string, ww.LogicalType, list(string), list(ww.LogicalType) :param exclude: If true, exclude the column_types instead of including them. Defaults to False. :type exclude: bool :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Select Columns By Type Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SelectByType.clone evalml.pipelines.components.SelectByType.default_parameters evalml.pipelines.components.SelectByType.describe evalml.pipelines.components.SelectByType.fit evalml.pipelines.components.SelectByType.fit_transform evalml.pipelines.components.SelectByType.load evalml.pipelines.components.SelectByType.parameters evalml.pipelines.components.SelectByType.save evalml.pipelines.components.SelectByType.transform evalml.pipelines.components.SelectByType.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, ignored :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by selecting columns. :param X: Data to transform. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: Transformed X. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SelectColumns(columns=None, random_seed=0, **kwargs) Selects specified columns in input data. :param columns: List of column names, used to determine which columns to select. If columns are not present, they will not be selected. :type columns: list(string) :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Select Columns Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SelectColumns.clone evalml.pipelines.components.SelectColumns.default_parameters evalml.pipelines.components.SelectColumns.describe evalml.pipelines.components.SelectColumns.fit evalml.pipelines.components.SelectColumns.fit_transform evalml.pipelines.components.SelectColumns.load evalml.pipelines.components.SelectColumns.parameters evalml.pipelines.components.SelectColumns.save evalml.pipelines.components.SelectColumns.transform evalml.pipelines.components.SelectColumns.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using fitted column selector component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SimpleImputer(impute_strategy='most_frequent', fill_value=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy. Natural language columns are ignored. :param impute_strategy: Impute strategy to use. Valid values include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. :type impute_strategy: string :param fill_value: When impute_strategy == "constant", fill_value is used to replace missing data. Defaults to 0 when imputing numerical data and "missing_value" for strings or object data types. :type fill_value: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "impute_strategy": ["mean", "median", "most_frequent"]} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Simple Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SimpleImputer.clone evalml.pipelines.components.SimpleImputer.default_parameters evalml.pipelines.components.SimpleImputer.describe evalml.pipelines.components.SimpleImputer.fit evalml.pipelines.components.SimpleImputer.fit_transform evalml.pipelines.components.SimpleImputer.load evalml.pipelines.components.SimpleImputer.needs_fitting evalml.pipelines.components.SimpleImputer.parameters evalml.pipelines.components.SimpleImputer.save evalml.pipelines.components.SimpleImputer.transform evalml.pipelines.components.SimpleImputer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: the input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: the target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises ValueError: if the SimpleImputer receives a dataframe with both Boolean and Categorical data. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input by imputing missing values. 'None' and np.nan values are treated as the same. :param X: Data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: StackedEnsembleBase(final_estimator=None, n_jobs=-1, random_seed=0, **kwargs) Stacked Ensemble Base Class. :param final_estimator: The estimator used to combine the base estimators. :type final_estimator: Estimator or subclass :param n_jobs: Integer describing level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs greater than -1, (n_cpus + 1 + n_jobs) are used. Defaults to -1. - Note: there could be some multi-process errors thrown for values of `n_jobs != 1`. If this is the case, please use `n_jobs = 1`. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **model_family** - ModelFamily.ENSEMBLE * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.StackedEnsembleBase.clone evalml.pipelines.components.StackedEnsembleBase.default_parameters evalml.pipelines.components.StackedEnsembleBase.describe evalml.pipelines.components.StackedEnsembleBase.feature_importance evalml.pipelines.components.StackedEnsembleBase.fit evalml.pipelines.components.StackedEnsembleBase.get_prediction_intervals evalml.pipelines.components.StackedEnsembleBase.load evalml.pipelines.components.StackedEnsembleBase.name evalml.pipelines.components.StackedEnsembleBase.needs_fitting evalml.pipelines.components.StackedEnsembleBase.parameters evalml.pipelines.components.StackedEnsembleBase.predict evalml.pipelines.components.StackedEnsembleBase.predict_proba evalml.pipelines.components.StackedEnsembleBase.save evalml.pipelines.components.StackedEnsembleBase.supported_problem_types evalml.pipelines.components.StackedEnsembleBase.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for stacked ensemble classes. :returns: default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Not implemented for StackedEnsembleClassifier and StackedEnsembleRegressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: supported_problem_types(cls) :property: Problem types this estimator supports. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: StackedEnsembleClassifier(final_estimator=None, n_jobs=-1, random_seed=0, **kwargs) Stacked Ensemble Classifier. :param final_estimator: The classifier used to combine the base estimators. If None, uses ElasticNetClassifier. :type final_estimator: Estimator or subclass :param n_jobs: Integer describing level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Defaults to -1. - Note: there could be some multi-process errors thrown for values of `n_jobs != 1`. If this is the case, please use `n_jobs = 1`. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> from evalml.pipelines.component_graph import ComponentGraph >>> from evalml.pipelines.components.estimators.classifiers.decision_tree_classifier import DecisionTreeClassifier >>> from evalml.pipelines.components.estimators.classifiers.elasticnet_classifier import ElasticNetClassifier ... >>> component_graph = { ... "Decision Tree": [DecisionTreeClassifier(random_seed=3), "X", "y"], ... "Decision Tree B": [DecisionTreeClassifier(random_seed=4), "X", "y"], ... "Stacked Ensemble": [ ... StackedEnsembleClassifier(n_jobs=1, final_estimator=DecisionTreeClassifier()), ... "Decision Tree.x", ... "Decision Tree B.x", ... "y", ... ], ... } ... >>> cg = ComponentGraph(component_graph) >>> assert cg.default_parameters == { ... 'Decision Tree Classifier': {'criterion': 'gini', ... 'max_features': 'auto', ... 'max_depth': 6, ... 'min_samples_split': 2, ... 'min_weight_fraction_leaf': 0.0}, ... 'Stacked Ensemble Classifier': {'final_estimator': ElasticNetClassifier, ... 'n_jobs': -1}} **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.ENSEMBLE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Stacked Ensemble Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.StackedEnsembleClassifier.clone evalml.pipelines.components.StackedEnsembleClassifier.default_parameters evalml.pipelines.components.StackedEnsembleClassifier.describe evalml.pipelines.components.StackedEnsembleClassifier.feature_importance evalml.pipelines.components.StackedEnsembleClassifier.fit evalml.pipelines.components.StackedEnsembleClassifier.get_prediction_intervals evalml.pipelines.components.StackedEnsembleClassifier.load evalml.pipelines.components.StackedEnsembleClassifier.needs_fitting evalml.pipelines.components.StackedEnsembleClassifier.parameters evalml.pipelines.components.StackedEnsembleClassifier.predict evalml.pipelines.components.StackedEnsembleClassifier.predict_proba evalml.pipelines.components.StackedEnsembleClassifier.save evalml.pipelines.components.StackedEnsembleClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for stacked ensemble classes. :returns: default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Not implemented for StackedEnsembleClassifier and StackedEnsembleRegressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: StackedEnsembleRegressor(final_estimator=None, n_jobs=-1, random_seed=0, **kwargs) Stacked Ensemble Regressor. :param final_estimator: The regressor used to combine the base estimators. If None, uses ElasticNetRegressor. :type final_estimator: Estimator or subclass :param n_jobs: Integer describing level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs greater than -1, (n_cpus + 1 + n_jobs) are used. Defaults to -1. - Note: there could be some multi-process errors thrown for values of `n_jobs != 1`. If this is the case, please use `n_jobs = 1`. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> from evalml.pipelines.component_graph import ComponentGraph >>> from evalml.pipelines.components.estimators.regressors.rf_regressor import RandomForestRegressor >>> from evalml.pipelines.components.estimators.regressors.elasticnet_regressor import ElasticNetRegressor ... >>> component_graph = { ... "Random Forest": [RandomForestRegressor(random_seed=3), "X", "y"], ... "Random Forest B": [RandomForestRegressor(random_seed=4), "X", "y"], ... "Stacked Ensemble": [ ... StackedEnsembleRegressor(n_jobs=1, final_estimator=RandomForestRegressor()), ... "Random Forest.x", ... "Random Forest B.x", ... "y", ... ], ... } ... >>> cg = ComponentGraph(component_graph) >>> assert cg.default_parameters == { ... 'Random Forest Regressor': {'n_estimators': 100, ... 'max_depth': 6, ... 'n_jobs': -1}, ... 'Stacked Ensemble Regressor': {'final_estimator': ElasticNetRegressor, ... 'n_jobs': -1}} **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.ENSEMBLE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Stacked Ensemble Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.StackedEnsembleRegressor.clone evalml.pipelines.components.StackedEnsembleRegressor.default_parameters evalml.pipelines.components.StackedEnsembleRegressor.describe evalml.pipelines.components.StackedEnsembleRegressor.feature_importance evalml.pipelines.components.StackedEnsembleRegressor.fit evalml.pipelines.components.StackedEnsembleRegressor.get_prediction_intervals evalml.pipelines.components.StackedEnsembleRegressor.load evalml.pipelines.components.StackedEnsembleRegressor.needs_fitting evalml.pipelines.components.StackedEnsembleRegressor.parameters evalml.pipelines.components.StackedEnsembleRegressor.predict evalml.pipelines.components.StackedEnsembleRegressor.predict_proba evalml.pipelines.components.StackedEnsembleRegressor.save evalml.pipelines.components.StackedEnsembleRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for stacked ensemble classes. :returns: default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Not implemented for StackedEnsembleClassifier and StackedEnsembleRegressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: StandardScaler(random_seed=0, **kwargs) A transformer that standardizes input features by removing the mean and scaling to unit variance. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Standard Scaler * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.StandardScaler.clone evalml.pipelines.components.StandardScaler.default_parameters evalml.pipelines.components.StandardScaler.describe evalml.pipelines.components.StandardScaler.fit evalml.pipelines.components.StandardScaler.fit_transform evalml.pipelines.components.StandardScaler.load evalml.pipelines.components.StandardScaler.needs_fitting evalml.pipelines.components.StandardScaler.parameters evalml.pipelines.components.StandardScaler.save evalml.pipelines.components.StandardScaler.transform evalml.pipelines.components.StandardScaler.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the standard scalar on the given data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the standard scaler component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using the fitted standard scaler. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: STLDecomposer(time_index: str = None, degree: int = 1, period: int = None, seasonal_smoother: int = 7, random_seed: int = 0, **kwargs) Removes trends and seasonality from time series using the STL algorithm. https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.STL.html :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None. :type time_index: str :param degree: Not currently used. STL 3x "degree-like" values. None are able to be set at this time. Defaults to 1. :type degree: int :param period: The number of entries in the time series data that corresponds to one period of a cyclic signal. For instance, if data is known to possess a weekly seasonal signal, and if the data is daily data, the period should likely be 7. For daily data with a yearly seasonal signal, the period should likely be 365. If None, statsmodels will infer the period based on the frequency. Defaults to None. :type period: int :param seasonal_smoother: The length of the seasonal smoother used by the underlying STL algorithm. For compatibility, must be odd. If an even number is provided, the next, highest odd number will be used. Defaults to 7. :type seasonal_smoother: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **invalid_frequencies** - [] * - **modifies_features** - False * - **modifies_target** - True * - **name** - STL Decomposer * - **needs_fitting** - True * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.STLDecomposer.clone evalml.pipelines.components.STLDecomposer.default_parameters evalml.pipelines.components.STLDecomposer.describe evalml.pipelines.components.STLDecomposer.determine_periodicity evalml.pipelines.components.STLDecomposer.fit evalml.pipelines.components.STLDecomposer.fit_transform evalml.pipelines.components.STLDecomposer.get_trend_dataframe evalml.pipelines.components.STLDecomposer.get_trend_prediction_intervals evalml.pipelines.components.STLDecomposer.inverse_transform evalml.pipelines.components.STLDecomposer.is_freq_valid evalml.pipelines.components.STLDecomposer.load evalml.pipelines.components.STLDecomposer.parameters evalml.pipelines.components.STLDecomposer.plot_decomposition evalml.pipelines.components.STLDecomposer.save evalml.pipelines.components.STLDecomposer.set_period evalml.pipelines.components.STLDecomposer.transform evalml.pipelines.components.STLDecomposer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: determine_periodicity(cls, X: pandas.DataFrame, y: pandas.Series, acf_threshold: float = 0.01, rel_max_order: int = 5) :classmethod: Function that uses autocorrelative methods to determine the likely most signficant period of the seasonal signal. :param X: The feature data of the time series problem. :type X: pandas.DataFrame :param y: The target data of a time series problem. :type y: pandas.Series :param acf_threshold: The threshold for the autocorrelation function to determine the period. Any values below the threshold are considered to be 0 and will not be considered for the period. Defaults to 0.01. :type acf_threshold: float :param rel_max_order: The order of the relative maximum to determine the period. Defaults to 5. :type rel_max_order: int :returns: The integer number of entries in time series data over which the seasonal part of the target data repeats. If the time series data is in days, then this is the number of days that it takes the target's seasonal signal to repeat. Note: the target data can contain multiple seasonal signals. This function will only return the stronger. E.g. if the target has both weekly and yearly seasonality, the function may return either "7" or "365", depending on which seasonality is more strongly autocorrelated. If no period is detected, returns None. :rtype: int .. py:method:: fit(self, X: pandas.DataFrame, y: pandas.Series = None) -> STLDecomposer Fits the STLDecomposer and determine the seasonal signal. Instantiates a statsmodels STL decompose object with the component's stored parameters and fits it. Since the statsmodels object does not fit the sklearn api, it is not saved during __init__() in _component_obj and will be re-instantiated each time fit is called. To emulate the sklearn API, when the STL decomposer is fit, the full seasonal component, a single period sample of the seasonal component, the full trend-cycle component and the residual are saved. y(t) = S(t) + T(t) + R(t) :param X: Conditionally used to build datetime index. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: self :raises ValueError: If y is None. :raises ValueError: If target data doesn't have DatetimeIndex AND no Datetime features in features data .. py:method:: fit_transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series] Removes fitted trend and seasonality from target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: The first element are the input features returned without modification. The second element is the target variable y with the fitted trend removed. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: get_trend_dataframe(self, X, y) Return a list of dataframes with 4 columns: signal, trend, seasonality, residual. :param X: Input data with time series data in index. :type X: pd.DataFrame :param y: Target variable data provided as a Series for univariate problems or a DataFrame for multivariate problems. :type y: pd.Series or pd.DataFrame :returns: Each DataFrame contains the columns "signal", "trend", "seasonality" and "residual," with the latter 3 column values being the decomposed elements of the target data. The "signal" column is simply the input target signal but reindexed with a datetime index to match the input features. :rtype: list of pd.DataFrame :raises TypeError: If X does not have time-series data in the index. :raises ValueError: If time series index of X does not have an inferred frequency. :raises ValueError: If the forecaster associated with the detrender has not been fit yet. :raises TypeError: If y is not provided as a pandas Series or DataFrame. .. py:method:: get_trend_prediction_intervals(self, y, coverage=None) Calculate the prediction intervals for the trend data. :param y: Target data. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict of pd.Series .. py:method:: inverse_transform(self, y_t: pandas.Series) -> tuple[pandas.DataFrame, pandas.Series] Adds back fitted trend and seasonality to target variable. The STL trend is projected to cover the entire requested target range, then added back into the signal. Then, the seasonality is projected forward to and added back into the signal. :param y_t: Target variable. :type y_t: pd.Series :returns: The first element are the input features returned without modification. The second element is the target variable y with the trend and seasonality added back in. :rtype: tuple of pd.DataFrame, pd.Series :raises ValueError: If y is None. .. py:method:: is_freq_valid(cls, freq: str) :classmethod: Determines if the given string represents a valid frequency for this decomposer. :param freq: A frequency to validate. See the pandas docs at https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases for options. :type freq: str :returns: boolean representing whether the frequency is valid or not. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: plot_decomposition(self, X: pandas.DataFrame, y: pandas.Series, show: bool = False) -> tuple[matplotlib.pyplot.Figure, list] Plots the decomposition of the target signal. :param X: Input data with time series data in index. :type X: pd.DataFrame :param y: Target variable data provided as a Series for univariate problems or a DataFrame for multivariate problems. :type y: pd.Series or pd.DataFrame :param show: Whether to display the plot or not. Defaults to False. :type show: bool :returns: The figure and axes that have the decompositions plotted on them :rtype: matplotlib.pyplot.Figure, list[matplotlib.pyplot.Axes] .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: set_period(self, X: pandas.DataFrame, y: pandas.Series, acf_threshold: float = 0.01, rel_max_order: int = 5) Function to set the component's seasonal period based on the target's seasonality. :param X: The feature data of the time series problem. :type X: pandas.DataFrame :param y: The target data of a time series problem. :type y: pandas.Series :param acf_threshold: The threshold for the autocorrelation function to determine the period. Any values below the threshold are considered to be 0 and will not be considered for the period. Defaults to 0.01. :type acf_threshold: float :param rel_max_order: The order of the relative maximum to determine the period. Defaults to 5. :type rel_max_order: int .. py:method:: transform(self, X: pandas.DataFrame, y: pandas.Series = None) -> tuple[pandas.DataFrame, pandas.Series] Transforms the target data by removing the STL trend and seasonality. Uses an ARIMA model to project forward the addititve trend and removes it. Then, utilizes the first period's worth of seasonal data determined in the .fit() function to extrapolate the seasonal signal of the data to be transformed. This seasonal signal is also assumed to be additive and is removed. :param X: Conditionally used to build datetime index. :type X: pd.DataFrame, optional :param y: Target variable to detrend and deseasonalize. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is detrended and deseasonalized. :rtype: tuple of pd.DataFrame, pd.Series :raises ValueError: If target data doesn't have DatetimeIndex AND no Datetime features in features data .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SVMClassifier(C=1.0, kernel='rbf', gamma='auto', probability=True, random_seed=0, **kwargs) Support Vector Machine Classifier. :param C: The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty. Defaults to 1.0. :type C: float :param kernel: Specifies the kernel type to be used in the algorithm. Defaults to "rbf". :type kernel: {"poly", "rbf", "sigmoid"} :param gamma: Kernel coefficient for "rbf", "poly" and "sigmoid". Defaults to "auto". - If gamma='scale' is passed then it uses 1 / (n_features * X.var()) as value of gamma - If "auto" (default), uses 1 / n_features :type gamma: {"scale", "auto"} or float :param probability: Whether to enable probability estimates. Defaults to True. :type probability: boolean :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0, 10), "kernel": ["poly", "rbf", "sigmoid"], "gamma": ["scale", "auto"],} * - **model_family** - ModelFamily.SVM * - **modifies_features** - True * - **modifies_target** - False * - **name** - SVM Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SVMClassifier.clone evalml.pipelines.components.SVMClassifier.default_parameters evalml.pipelines.components.SVMClassifier.describe evalml.pipelines.components.SVMClassifier.feature_importance evalml.pipelines.components.SVMClassifier.fit evalml.pipelines.components.SVMClassifier.get_prediction_intervals evalml.pipelines.components.SVMClassifier.load evalml.pipelines.components.SVMClassifier.needs_fitting evalml.pipelines.components.SVMClassifier.parameters evalml.pipelines.components.SVMClassifier.predict evalml.pipelines.components.SVMClassifier.predict_proba evalml.pipelines.components.SVMClassifier.save evalml.pipelines.components.SVMClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance only works with linear kernels. If the kernel isn't linear, we return a numpy array of zeros. :returns: Feature importance of fitted SVM classifier or a numpy array of zeroes if the kernel is not linear. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SVMRegressor(C=1.0, kernel='rbf', gamma='auto', random_seed=0, **kwargs) Support Vector Machine Regressor. :param C: The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty. Defaults to 1.0. :type C: float :param kernel: Specifies the kernel type to be used in the algorithm. Defaults to "rbf". :type kernel: {"poly", "rbf", "sigmoid"} :param gamma: Kernel coefficient for "rbf", "poly" and "sigmoid". Defaults to "auto". - If gamma='scale' is passed then it uses 1 / (n_features * X.var()) as value of gamma - If "auto" (default), uses 1 / n_features :type gamma: {"scale", "auto"} or float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0, 10), "kernel": ["poly", "rbf", "sigmoid"], "gamma": ["scale", "auto"],} * - **model_family** - ModelFamily.SVM * - **modifies_features** - True * - **modifies_target** - False * - **name** - SVM Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SVMRegressor.clone evalml.pipelines.components.SVMRegressor.default_parameters evalml.pipelines.components.SVMRegressor.describe evalml.pipelines.components.SVMRegressor.feature_importance evalml.pipelines.components.SVMRegressor.fit evalml.pipelines.components.SVMRegressor.get_prediction_intervals evalml.pipelines.components.SVMRegressor.load evalml.pipelines.components.SVMRegressor.needs_fitting evalml.pipelines.components.SVMRegressor.parameters evalml.pipelines.components.SVMRegressor.predict evalml.pipelines.components.SVMRegressor.predict_proba evalml.pipelines.components.SVMRegressor.save evalml.pipelines.components.SVMRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted SVM regresor. Only works with linear kernels. If the kernel isn't linear, we return a numpy array of zeros. :returns: The feature importance of the fitted SVM regressor, or an array of zeroes if the kernel is not linear. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TargetEncoder(cols=None, smoothing=1, handle_unknown='value', handle_missing='value', random_seed=0, **kwargs) A transformer that encodes categorical features into target encodings. :param cols: Columns to encode. If None, all string columns will be encoded, otherwise only the columns provided will be encoded. Defaults to None :type cols: list :param smoothing: The smoothing factor to apply. The larger this value is, the more influence the expected target value has on the resulting target encodings. Must be strictly larger than 0. Defaults to 1.0 :type smoothing: float :param handle_unknown: Determines how to handle unknown categories for a feature encountered. Options are 'value', 'error', nd 'return_nan'. Defaults to 'value', which replaces with the target mean :type handle_unknown: string :param handle_missing: Determines how to handle missing values encountered during `fit` or `transform`. Options are 'value', 'error', and 'return_nan'. Defaults to 'value', which replaces with the target mean :type handle_missing: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Target Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TargetEncoder.clone evalml.pipelines.components.TargetEncoder.default_parameters evalml.pipelines.components.TargetEncoder.describe evalml.pipelines.components.TargetEncoder.fit evalml.pipelines.components.TargetEncoder.fit_transform evalml.pipelines.components.TargetEncoder.get_feature_names evalml.pipelines.components.TargetEncoder.load evalml.pipelines.components.TargetEncoder.needs_fitting evalml.pipelines.components.TargetEncoder.parameters evalml.pipelines.components.TargetEncoder.save evalml.pipelines.components.TargetEncoder.transform evalml.pipelines.components.TargetEncoder.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y) Fit and transform data using the target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_feature_names(self) Return feature names for the input features after fitting. :returns: The feature names after encoding. :rtype: np.array .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using the fitted target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TargetImputer(impute_strategy='most_frequent', fill_value=None, random_seed=0, **kwargs) Imputes missing target data according to a specified imputation strategy. :param impute_strategy: Impute strategy to use. Valid values include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. Defaults to "most_frequent". :type impute_strategy: string :param fill_value: When impute_strategy == "constant", fill_value is used to replace missing data. Defaults to None which uses 0 when imputing numerical data and "missing_value" for strings or object data types. :type fill_value: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "impute_strategy": ["mean", "median", "most_frequent"]} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Target Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TargetImputer.clone evalml.pipelines.components.TargetImputer.default_parameters evalml.pipelines.components.TargetImputer.describe evalml.pipelines.components.TargetImputer.fit evalml.pipelines.components.TargetImputer.fit_transform evalml.pipelines.components.TargetImputer.load evalml.pipelines.components.TargetImputer.needs_fitting evalml.pipelines.components.TargetImputer.parameters evalml.pipelines.components.TargetImputer.save evalml.pipelines.components.TargetImputer.transform evalml.pipelines.components.TargetImputer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits imputer to target data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises TypeError: If target is filled with all null values. .. py:method:: fit_transform(self, X, y) Fits on and transforms the input target data. :param X: Features. Ignored. :type X: pd.DataFrame :param y: Target data to impute. :type y: pd.Series :returns: The original X, transformed y :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y) Transforms input target data by imputing missing values. 'None' and np.nan values are treated as the same. :param X: Features. Ignored. :type X: pd.DataFrame :param y: Target data to impute. :type y: pd.Series :returns: The original X, transformed y :rtype: (pd.DataFrame, pd.Series) .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TimeSeriesBaselineEstimator(gap=1, forecast_horizon=1, random_seed=0, **kwargs) Time series estimator that predicts using the naive forecasting approach. This is useful as a simple baseline estimator for time series problems. :param gap: Gap between prediction date and target date and must be a positive integer. If gap is 0, target date will be shifted ahead by 1 time period. Defaults to 1. :type gap: int :param forecast_horizon: Number of time steps the model is expected to predict. :type forecast_horizon: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Time Series Baseline Estimator * - **supported_problem_types** - [ ProblemTypes.TIME_SERIES_REGRESSION, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesBaselineEstimator.clone evalml.pipelines.components.TimeSeriesBaselineEstimator.default_parameters evalml.pipelines.components.TimeSeriesBaselineEstimator.describe evalml.pipelines.components.TimeSeriesBaselineEstimator.feature_importance evalml.pipelines.components.TimeSeriesBaselineEstimator.fit evalml.pipelines.components.TimeSeriesBaselineEstimator.get_prediction_intervals evalml.pipelines.components.TimeSeriesBaselineEstimator.load evalml.pipelines.components.TimeSeriesBaselineEstimator.needs_fitting evalml.pipelines.components.TimeSeriesBaselineEstimator.parameters evalml.pipelines.components.TimeSeriesBaselineEstimator.predict evalml.pipelines.components.TimeSeriesBaselineEstimator.predict_proba evalml.pipelines.components.TimeSeriesBaselineEstimator.save evalml.pipelines.components.TimeSeriesBaselineEstimator.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline estimators do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes. :rtype: np.ndarray (float) .. py:method:: fit(self, X, y=None) Fits time series baseline estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input y is None. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using fitted time series baseline estimator. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises ValueError: If input y is None. .. py:method:: predict_proba(self, X) Make prediction probabilities using fitted time series baseline estimator. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame :raises ValueError: If input y is None. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TimeSeriesFeaturizer(time_index=None, max_delay=2, gap=0, forecast_horizon=1, conf_level=0.05, rolling_window_size=0.25, delay_features=True, delay_target=True, random_seed=0, **kwargs) Transformer that delays input features and target variable for time series problems. This component uses an algorithm based on the autocorrelation values of the target variable to determine which lags to select from the set of all possible lags. The algorithm is based on the idea that the local maxima of the autocorrelation function indicate the lags that have the most impact on the present time. The algorithm computes the autocorrelation values and finds the local maxima, called "peaks", that are significant at the given conf_level. Since lags in the range [0, 10] tend to be predictive but not local maxima, the union of the peaks is taken with the significant lags in the range [0, 10]. At the end, only selected lags in the range [0, max_delay] are used. Parametrizing the algorithm by conf_level lets the AutoMLAlgorithm tune the set of lags chosen so that the chances of finding a good set of lags is higher. Using conf_level value of 1 selects all possible lags. :param time_index: Name of the column containing the datetime information used to order the data. Ignored. :type time_index: str :param max_delay: Maximum number of time units to delay each feature. Defaults to 2. :type max_delay: int :param forecast_horizon: The number of time periods the pipeline is expected to forecast. :type forecast_horizon: int :param conf_level: Float in range (0, 1] that determines the confidence interval size used to select which lags to compute from the set of [1, max_delay]. A delay of 1 will always be computed. If 1, selects all possible lags in the set of [1, max_delay], inclusive. :type conf_level: float :param rolling_window_size: Float in range (0, 1] that determines the size of the window used for rolling features. Size is computed as rolling_window_size * max_delay. :type rolling_window_size: float :param delay_features: Whether to delay the input features. Defaults to True. :type delay_features: bool :param delay_target: Whether to delay the target. Defaults to True. :type delay_target: bool :param gap: The number of time units between when the features are collected and when the target is collected. For example, if you are predicting the next time step's target, gap=1. This is only needed because when gap=0, we need to be sure to start the lagging of the target variable at 1. Defaults to 1. :type gap: int :param random_seed: Seed for the random number generator. This transformer performs the same regardless of the random seed provided. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - Real(0.001, 1.0), "rolling_window_size": Real(0.001, 1.0)}:type: {"conf_level" * - **modifies_features** - True * - **modifies_target** - False * - **name** - Time Series Featurizer * - **needs_fitting** - True * - **target_colname_prefix** - target_delay_{} * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesFeaturizer.clone evalml.pipelines.components.TimeSeriesFeaturizer.default_parameters evalml.pipelines.components.TimeSeriesFeaturizer.describe evalml.pipelines.components.TimeSeriesFeaturizer.fit evalml.pipelines.components.TimeSeriesFeaturizer.fit_transform evalml.pipelines.components.TimeSeriesFeaturizer.load evalml.pipelines.components.TimeSeriesFeaturizer.parameters evalml.pipelines.components.TimeSeriesFeaturizer.save evalml.pipelines.components.TimeSeriesFeaturizer.transform evalml.pipelines.components.TimeSeriesFeaturizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the DelayFeatureTransformer. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises ValueError: if self.time_index is None .. py:method:: fit_transform(self, X, y=None) Fit the component and transform the input data. :param X: Data to transform. :type X: pd.DataFrame :param y: Target. :type y: pd.Series, or None :returns: Transformed X. :rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Computes the delayed values and rolling means for X and y. The chosen delays are determined by the autocorrelation function of the target variable. See the class docstring for more information on how they are chosen. If y is None, all possible lags are chosen. If y is not None, it will also compute the delayed values for the target variable. The rolling means for all numeric features in X and y, if y is numeric, are also returned. :param X: Data to transform. None is expected when only the target variable is being used. :type X: pd.DataFrame or None :param y: Target. :type y: pd.Series, or None :returns: Transformed X. No original features are returned. :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TimeSeriesImputer(categorical_impute_strategy='forwards_fill', numeric_impute_strategy='interpolate', target_impute_strategy='forwards_fill', random_seed=0, **kwargs) Imputes missing data according to a specified timeseries-specific imputation strategy. This Transformer should be used after the `TimeSeriesRegularizer` in order to impute the missing values that were added to X and y (if passed). :param categorical_impute_strategy: Impute strategy to use for string, object, boolean, categorical dtypes. Valid values include "backwards_fill" and "forwards_fill". Defaults to "forwards_fill". :type categorical_impute_strategy: string :param numeric_impute_strategy: Impute strategy to use for numeric columns. Valid values include "backwards_fill", "forwards_fill", and "interpolate". Defaults to "interpolate". :type numeric_impute_strategy: string :param target_impute_strategy: Impute strategy to use for the target column. Valid values include "backwards_fill", "forwards_fill", and "interpolate". Defaults to "forwards_fill". :type target_impute_strategy: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :raises ValueError: If categorical_impute_strategy, numeric_impute_strategy, or target_impute_strategy is not one of the valid values. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "categorical_impute_strategy": ["backwards_fill", "forwards_fill"], "numeric_impute_strategy": ["backwards_fill", "forwards_fill", "interpolate"], "target_impute_strategy": ["backwards_fill", "forwards_fill", "interpolate"],} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Time Series Imputer * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesImputer.clone evalml.pipelines.components.TimeSeriesImputer.default_parameters evalml.pipelines.components.TimeSeriesImputer.describe evalml.pipelines.components.TimeSeriesImputer.fit evalml.pipelines.components.TimeSeriesImputer.fit_transform evalml.pipelines.components.TimeSeriesImputer.load evalml.pipelines.components.TimeSeriesImputer.needs_fitting evalml.pipelines.components.TimeSeriesImputer.parameters evalml.pipelines.components.TimeSeriesImputer.save evalml.pipelines.components.TimeSeriesImputer.transform evalml.pipelines.components.TimeSeriesImputer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. If a value is missing at the beginning or end of a column, that value will be imputed using backwards fill or forwards fill as necessary, respectively. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame, np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by imputing missing values using specified timeseries-specific strategies. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: Data to transform. :type X: pd.DataFrame :param y: Optionally, target data to transform. :type y: pd.Series, optional :returns: Transformed X and y :rtype: pd.DataFrame .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: TimeSeriesRegularizer(time_index=None, frequency_payload=None, window_length=4, threshold=0.4, random_seed=0, **kwargs) Transformer that regularizes an inconsistently spaced datetime column. If X is passed in to fit/transform, the column `time_index` will be checked for an inferrable offset frequency. If the `time_index` column is perfectly inferrable then this Transformer will do nothing and return the original X and y. If X does not have a perfectly inferrable frequency but one can be estimated, then X and y will be reformatted based on the estimated frequency for `time_index`. In the original X and y passed: - Missing datetime values will be added and will have their corresponding columns in X and y set to None. - Duplicate datetime values will be dropped. - Extra datetime values will be dropped. - If it can be determined that a duplicate or extra value is misaligned, then it will be repositioned to take the place of a missing value. This Transformer should be used before the `TimeSeriesImputer` in order to impute the missing values that were added to X and y (if passed). :param time_index: Name of the column containing the datetime information used to order the data, required. Defaults to None. :type time_index: string :param frequency_payload: Payload returned from Woodwork's infer_frequency function where debug is True. Defaults to None. :type frequency_payload: tuple :param window_length: The size of the rolling window over which inference is conducted to determine the prevalence of uninferrable frequencies. :type window_length: int :param Lower values make this component more sensitive to recognizing numerous faulty datetime values. Defaults to 5.: :param threshold: The minimum percentage of windows that need to have been able to infer a frequency. Lower values make this component more :type threshold: float :param sensitive to recognizing numerous faulty datetime values. Defaults to 0.8.: :param random_seed: Seed for the random number generator. This transformer performs the same regardless of the random seed provided. :type random_seed: int :param Defaults to 0.: :raises ValueError: if the frequency_payload parameter has not been passed a tuple **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Time Series Regularizer * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesRegularizer.clone evalml.pipelines.components.TimeSeriesRegularizer.default_parameters evalml.pipelines.components.TimeSeriesRegularizer.describe evalml.pipelines.components.TimeSeriesRegularizer.fit evalml.pipelines.components.TimeSeriesRegularizer.fit_transform evalml.pipelines.components.TimeSeriesRegularizer.load evalml.pipelines.components.TimeSeriesRegularizer.needs_fitting evalml.pipelines.components.TimeSeriesRegularizer.parameters evalml.pipelines.components.TimeSeriesRegularizer.save evalml.pipelines.components.TimeSeriesRegularizer.transform evalml.pipelines.components.TimeSeriesRegularizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the TimeSeriesRegularizer. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: if self.time_index is None, if X and y have different lengths, if `time_index` in X does not have an offset frequency that can be estimated :raises TypeError: if the `time_index` column is not of type Datetime :raises KeyError: if the `time_index` column doesn't exist .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Regularizes a dataframe and target data to an inferrable offset frequency. A 'clean' X and y (if y was passed in) are created based on an inferrable offset frequency and matching datetime values with the original X and y are imputed into the clean X and y. Datetime values identified as misaligned are shifted into their appropriate position. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Data with an inferrable `time_index` offset frequency. :rtype: (pd.DataFrame, pd.Series) .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: Transformer(parameters=None, component_obj=None, random_seed=0, **kwargs) A component that may or may not need fitting that transforms data. These components are used before an estimator. To implement a new Transformer, define your own class which is a subclass of Transformer, including a name and a list of acceptable ranges for any parameters to be tuned during the automl search (hyperparameters). Define an `__init__` method which sets up any necessary state and objects. Make sure your `__init__` only uses standard keyword arguments and calls `super().__init__()` with a parameters dict. You may also override the `fit`, `transform`, `fit_transform` and other methods in this class if appropriate. To see some examples, check out the definitions of any Transformer component. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Transformer.clone evalml.pipelines.components.Transformer.default_parameters evalml.pipelines.components.Transformer.describe evalml.pipelines.components.Transformer.fit evalml.pipelines.components.Transformer.fit_transform evalml.pipelines.components.Transformer.load evalml.pipelines.components.Transformer.name evalml.pipelines.components.Transformer.needs_fitting evalml.pipelines.components.Transformer.parameters evalml.pipelines.components.Transformer.save evalml.pipelines.components.Transformer.transform evalml.pipelines.components.Transformer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) :abstractmethod: Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: Undersampler(sampling_ratio=0.25, sampling_ratio_dict=None, min_samples=100, min_percentage=0.1, random_seed=0, **kwargs) Initializes an undersampling transformer to downsample the majority classes in the dataset. This component is only run during training and not during predict. :param sampling_ratio: The smallest minority:majority ratio that is accepted as 'balanced'. For instance, a 1:4 ratio would be represented as 0.25, while a 1:1 ratio is 1.0. Must be between 0 and 1, inclusive. Defaults to 0.25. :type sampling_ratio: float :param sampling_ratio_dict: A dictionary specifying the desired balanced ratio for each target value. For instance, in a binary case where class 1 is the minority, we could specify: `sampling_ratio_dict={0: 0.5, 1: 1}`, which means we would undersample class 0 to have twice the number of samples as class 1 (minority:majority ratio = 0.5), and don't sample class 1. Overrides sampling_ratio if provided. Defaults to None. :type sampling_ratio_dict: dict :param min_samples: The minimum number of samples that we must have for any class, pre or post sampling. If a class must be downsampled, it will not be downsampled past this value. To determine severe imbalance, the minority class must occur less often than this and must have a class ratio below min_percentage. Must be greater than 0. Defaults to 100. :type min_samples: int :param min_percentage: The minimum percentage of the minimum class to total dataset that we tolerate, as long as it is above min_samples. If min_percentage and min_samples are not met, treat this as severely imbalanced, and we will not resample the data. Must be between 0 and 0.5, inclusive. Defaults to 0.1. :type min_percentage: float :param random_seed: The seed to use for random sampling. Defaults to 0. :type random_seed: int :raises ValueError: If sampling_ratio is not in the range (0, 1]. :raises ValueError: If min_sample is not greater than 0. :raises ValueError: If min_percentage is not between 0 and 0.5, inclusive. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Undersampler * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Undersampler.clone evalml.pipelines.components.Undersampler.default_parameters evalml.pipelines.components.Undersampler.describe evalml.pipelines.components.Undersampler.fit evalml.pipelines.components.Undersampler.fit_resample evalml.pipelines.components.Undersampler.fit_transform evalml.pipelines.components.Undersampler.load evalml.pipelines.components.Undersampler.needs_fitting evalml.pipelines.components.Undersampler.parameters evalml.pipelines.components.Undersampler.save evalml.pipelines.components.Undersampler.transform evalml.pipelines.components.Undersampler.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the sampler to the data. :param X: Input features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: fit_resample(self, X, y) Resampling technique for this sampler. :param X: Training data to fit and resample. :type X: pd.DataFrame :param y: Training data targets to fit and resample. :type y: pd.Series :returns: Indices to keep for training data. :rtype: list .. py:method:: fit_transform(self, X, y) Fit and transform data using the sampler component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms the input data by sampling the data. :param X: Training features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: Transformed features and target. :rtype: pd.DataFrame, pd.Series .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: URLFeaturizer(random_seed=0, **kwargs) Transformer that can automatically extract features from URL. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - URL Featurizer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.URLFeaturizer.clone evalml.pipelines.components.URLFeaturizer.default_parameters evalml.pipelines.components.URLFeaturizer.describe evalml.pipelines.components.URLFeaturizer.fit evalml.pipelines.components.URLFeaturizer.fit_transform evalml.pipelines.components.URLFeaturizer.load evalml.pipelines.components.URLFeaturizer.needs_fitting evalml.pipelines.components.URLFeaturizer.parameters evalml.pipelines.components.URLFeaturizer.save evalml.pipelines.components.URLFeaturizer.transform evalml.pipelines.components.URLFeaturizer.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: VowpalWabbitBinaryClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Binary Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param passes: Number of training passes. Defaults to 1. :type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Binary Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.TIME_SERIES_BINARY,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitBinaryClassifier.clone evalml.pipelines.components.VowpalWabbitBinaryClassifier.default_parameters evalml.pipelines.components.VowpalWabbitBinaryClassifier.describe evalml.pipelines.components.VowpalWabbitBinaryClassifier.feature_importance evalml.pipelines.components.VowpalWabbitBinaryClassifier.fit evalml.pipelines.components.VowpalWabbitBinaryClassifier.get_prediction_intervals evalml.pipelines.components.VowpalWabbitBinaryClassifier.load evalml.pipelines.components.VowpalWabbitBinaryClassifier.needs_fitting evalml.pipelines.components.VowpalWabbitBinaryClassifier.parameters evalml.pipelines.components.VowpalWabbitBinaryClassifier.predict evalml.pipelines.components.VowpalWabbitBinaryClassifier.predict_proba evalml.pipelines.components.VowpalWabbitBinaryClassifier.save evalml.pipelines.components.VowpalWabbitBinaryClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: VowpalWabbitMulticlassClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Multiclass Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Multiclass Classifier * - **supported_problem_types** - [ ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitMulticlassClassifier.clone evalml.pipelines.components.VowpalWabbitMulticlassClassifier.default_parameters evalml.pipelines.components.VowpalWabbitMulticlassClassifier.describe evalml.pipelines.components.VowpalWabbitMulticlassClassifier.feature_importance evalml.pipelines.components.VowpalWabbitMulticlassClassifier.fit evalml.pipelines.components.VowpalWabbitMulticlassClassifier.get_prediction_intervals evalml.pipelines.components.VowpalWabbitMulticlassClassifier.load evalml.pipelines.components.VowpalWabbitMulticlassClassifier.needs_fitting evalml.pipelines.components.VowpalWabbitMulticlassClassifier.parameters evalml.pipelines.components.VowpalWabbitMulticlassClassifier.predict evalml.pipelines.components.VowpalWabbitMulticlassClassifier.predict_proba evalml.pipelines.components.VowpalWabbitMulticlassClassifier.save evalml.pipelines.components.VowpalWabbitMulticlassClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: VowpalWabbitRegressor(learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Regressor. :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param passes: Number of training passes. Defaults to 1. :type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitRegressor.clone evalml.pipelines.components.VowpalWabbitRegressor.default_parameters evalml.pipelines.components.VowpalWabbitRegressor.describe evalml.pipelines.components.VowpalWabbitRegressor.feature_importance evalml.pipelines.components.VowpalWabbitRegressor.fit evalml.pipelines.components.VowpalWabbitRegressor.get_prediction_intervals evalml.pipelines.components.VowpalWabbitRegressor.load evalml.pipelines.components.VowpalWabbitRegressor.needs_fitting evalml.pipelines.components.VowpalWabbitRegressor.parameters evalml.pipelines.components.VowpalWabbitRegressor.predict evalml.pipelines.components.VowpalWabbitRegressor.predict_proba evalml.pipelines.components.VowpalWabbitRegressor.save evalml.pipelines.components.VowpalWabbitRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: XGBoostClassifier(eta=0.1, max_depth=6, min_child_weight=1, n_estimators=100, random_seed=0, eval_metric='logloss', n_jobs=12, **kwargs) XGBoost Classifier. :param eta: Boosting learning rate. Defaults to 0.1. :type eta: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param min_child_weight: Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.0 :type min_child_weight: float :param n_estimators: Number of gradient boosted trees. Equivalent to number of boosting rounds. Defaults to 100. :type n_estimators: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to 12. :type n_jobs: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "eta": Real(0.000001, 1), "max_depth": Integer(1, 10), "min_child_weight": Real(1, 10), "n_estimators": Integer(1, 1000),} * - **model_family** - ModelFamily.XGBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - XGBoost Classifier * - **SEED_MAX** - None * - **SEED_MIN** - None * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.XGBoostClassifier.clone evalml.pipelines.components.XGBoostClassifier.default_parameters evalml.pipelines.components.XGBoostClassifier.describe evalml.pipelines.components.XGBoostClassifier.feature_importance evalml.pipelines.components.XGBoostClassifier.fit evalml.pipelines.components.XGBoostClassifier.get_prediction_intervals evalml.pipelines.components.XGBoostClassifier.load evalml.pipelines.components.XGBoostClassifier.needs_fitting evalml.pipelines.components.XGBoostClassifier.parameters evalml.pipelines.components.XGBoostClassifier.predict evalml.pipelines.components.XGBoostClassifier.predict_proba evalml.pipelines.components.XGBoostClassifier.save evalml.pipelines.components.XGBoostClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted XGBoost classifier. .. py:method:: fit(self, X, y=None) Fits XGBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted XGBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make predictions using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: XGBoostRegressor(eta: float = 0.1, max_depth: int = 6, min_child_weight: int = 1, n_estimators: int = 100, random_seed: Union[int, float] = 0, n_jobs: int = 12, **kwargs) XGBoost Regressor. :param eta: Boosting learning rate. Defaults to 0.1. :type eta: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param min_child_weight: Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.0 :type min_child_weight: float :param n_estimators: Number of gradient boosted trees. Equivalent to number of boosting rounds. Defaults to 100. :type n_estimators: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to 12. :type n_jobs: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "eta": Real(0.000001, 1), "max_depth": Integer(1, 20), "min_child_weight": Real(1, 10), "n_estimators": Integer(1, 1000),} * - **model_family** - ModelFamily.XGBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - XGBoost Regressor * - **SEED_MAX** - None * - **SEED_MIN** - None * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.XGBoostRegressor.clone evalml.pipelines.components.XGBoostRegressor.default_parameters evalml.pipelines.components.XGBoostRegressor.describe evalml.pipelines.components.XGBoostRegressor.feature_importance evalml.pipelines.components.XGBoostRegressor.fit evalml.pipelines.components.XGBoostRegressor.get_prediction_intervals evalml.pipelines.components.XGBoostRegressor.load evalml.pipelines.components.XGBoostRegressor.needs_fitting evalml.pipelines.components.XGBoostRegressor.parameters evalml.pipelines.components.XGBoostRegressor.predict evalml.pipelines.components.XGBoostRegressor.predict_proba evalml.pipelines.components.XGBoostRegressor.save evalml.pipelines.components.XGBoostRegressor.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Feature importance of fitted XGBoost regressor. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits XGBoost regressor component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted XGBoostRegressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: List[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using fitted XGBoost regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional