components ===================================== .. py:module:: evalml.pipelines.components .. autoapi-nested-parse:: EvalML component classes. Subpackages ----------- .. toctree:: :titlesonly: :maxdepth: 3 ensemble/index.rst estimators/index.rst transformers/index.rst Submodules ---------- .. toctree:: :titlesonly: :maxdepth: 1 component_base/index.rst component_base_meta/index.rst utils/index.rst Package Contents ---------------- Classes Summary ~~~~~~~~~~~~~~~ .. autoapisummary:: evalml.pipelines.components.ARIMARegressor evalml.pipelines.components.BaselineClassifier evalml.pipelines.components.BaselineRegressor evalml.pipelines.components.CatBoostClassifier evalml.pipelines.components.CatBoostRegressor evalml.pipelines.components.ComponentBase evalml.pipelines.components.ComponentBaseMeta evalml.pipelines.components.DateTimeFeaturizer evalml.pipelines.components.DecisionTreeClassifier evalml.pipelines.components.DecisionTreeRegressor evalml.pipelines.components.DFSTransformer evalml.pipelines.components.DropColumns evalml.pipelines.components.DropNaNRowsTransformer evalml.pipelines.components.DropNullColumns evalml.pipelines.components.DropRowsTransformer evalml.pipelines.components.ElasticNetClassifier evalml.pipelines.components.ElasticNetRegressor evalml.pipelines.components.EmailFeaturizer evalml.pipelines.components.Estimator evalml.pipelines.components.ExponentialSmoothingRegressor evalml.pipelines.components.ExtraTreesClassifier evalml.pipelines.components.ExtraTreesRegressor evalml.pipelines.components.FeatureSelector evalml.pipelines.components.Imputer evalml.pipelines.components.KNeighborsClassifier evalml.pipelines.components.LabelEncoder evalml.pipelines.components.LightGBMClassifier evalml.pipelines.components.LightGBMRegressor evalml.pipelines.components.LinearDiscriminantAnalysis evalml.pipelines.components.LinearRegressor evalml.pipelines.components.LogisticRegressionClassifier evalml.pipelines.components.LogTransformer 
evalml.pipelines.components.LSA evalml.pipelines.components.NaturalLanguageFeaturizer evalml.pipelines.components.OneHotEncoder evalml.pipelines.components.Oversampler evalml.pipelines.components.PCA evalml.pipelines.components.PerColumnImputer evalml.pipelines.components.PolynomialDetrender evalml.pipelines.components.ProphetRegressor evalml.pipelines.components.RandomForestClassifier evalml.pipelines.components.RandomForestRegressor evalml.pipelines.components.ReplaceNullableTypes evalml.pipelines.components.RFClassifierSelectFromModel evalml.pipelines.components.RFRegressorSelectFromModel evalml.pipelines.components.SelectByType evalml.pipelines.components.SelectColumns evalml.pipelines.components.SimpleImputer evalml.pipelines.components.StackedEnsembleClassifier evalml.pipelines.components.StackedEnsembleRegressor evalml.pipelines.components.StandardScaler evalml.pipelines.components.SVMClassifier evalml.pipelines.components.SVMRegressor evalml.pipelines.components.TargetEncoder evalml.pipelines.components.TargetImputer evalml.pipelines.components.TimeSeriesBaselineEstimator evalml.pipelines.components.TimeSeriesFeaturizer evalml.pipelines.components.TimeSeriesImputer evalml.pipelines.components.TimeSeriesRegularizer evalml.pipelines.components.Transformer evalml.pipelines.components.Undersampler evalml.pipelines.components.URLFeaturizer evalml.pipelines.components.VowpalWabbitBinaryClassifier evalml.pipelines.components.VowpalWabbitMulticlassClassifier evalml.pipelines.components.VowpalWabbitRegressor evalml.pipelines.components.XGBoostClassifier evalml.pipelines.components.XGBoostRegressor Contents ~~~~~~~~~~~~~~~~~~~ .. py:class:: ARIMARegressor(time_index=None, trend=None, start_p=2, d=0, start_q=2, max_p=5, max_d=2, max_q=5, seasonal=True, sp=1, n_jobs=-1, random_seed=0, maxiter=10, use_covariates=True, **kwargs) Autoregressive Integrated Moving Average Model. The three parameters (p, d, q) are the AR order, the degree of differencing, and the MA order. 
More information here: https://www.statsmodels.org/devel/generated/statsmodels.tsa.arima.model.ARIMA.html. Currently ARIMARegressor isn't supported via conda install. It's recommended that it be installed via PyPI. :param time_index: Specifies the name of the column in X that provides the datetime objects. Defaults to None. :type time_index: str :param trend: Controls the deterministic trend. Options are ['n', 'c', 't', 'ct'] where 'n' is no trend, 'c' is a constant term, 't' indicates a linear trend, and 'ct' is both. Can also be an iterable when defining a polynomial, such as [1, 1, 0, 1]. Defaults to None. :type trend: str :param start_p: Minimum Autoregressive order. Defaults to 2. :type start_p: int :param d: Minimum Differencing degree. Defaults to 0. :type d: int :param start_q: Minimum Moving Average order. Defaults to 2. :type start_q: int :param max_p: Maximum Autoregressive order. Defaults to 5. :type max_p: int :param max_d: Maximum Differencing degree. Defaults to 2. :type max_d: int :param max_q: Maximum Moving Average order. Defaults to 5. :type max_q: int :param seasonal: Whether to fit a seasonal model to ARIMA. Defaults to True. :type seasonal: boolean :param sp: Period for seasonal differencing, specifically the number of periods in each season. If "detect", this model will automatically detect this parameter (provided the time series has a standard frequency) and will fall back to 1 (no seasonality) if it cannot be detected. Defaults to 1. :type sp: int or str :param n_jobs: Integer describing level of parallelism used for pipelines. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. 
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "start_p": Integer(1, 3), "d": Integer(0, 2), "start_q": Integer(1, 3), "max_p": Integer(3, 10), "max_d": Integer(2, 5), "max_q": Integer(3, 10), "seasonal": [True, False],} * - **model_family** - ModelFamily.ARIMA * - **modifies_features** - True * - **modifies_target** - False * - **name** - ARIMA Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ARIMARegressor.clone evalml.pipelines.components.ARIMARegressor.default_parameters evalml.pipelines.components.ARIMARegressor.describe evalml.pipelines.components.ARIMARegressor.feature_importance evalml.pipelines.components.ARIMARegressor.fit evalml.pipelines.components.ARIMARegressor.load evalml.pipelines.components.ARIMARegressor.needs_fitting evalml.pipelines.components.ARIMARegressor.parameters evalml.pipelines.components.ARIMARegressor.predict evalml.pipelines.components.ARIMARegressor.predict_proba evalml.pipelines.components.ARIMARegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. 
py:method:: feature_importance(self) :property: Returns array of 0's with a length of 1 as feature_importance is not defined for ARIMA regressor. .. py:method:: fit(self, X, y=None) Fits ARIMA regressor to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y was not passed in. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X, y=None) Make predictions using fitted ARIMA regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series :raises ValueError: If X was passed to `fit` but not passed in `predict`. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. 
py:class:: BaselineClassifier(strategy='mode', random_seed=0, **kwargs) Classifier that predicts using the specified strategy. This is useful as a simple baseline classifier to compare with other classifiers. :param strategy: Method used to predict. Valid options are "mode", "random" and "random_weighted". Defaults to "mode". :type strategy: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Baseline Classifier * - **supported_problem_types** - [ProblemTypes.BINARY, ProblemTypes.MULTICLASS] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.BaselineClassifier.classes_ evalml.pipelines.components.BaselineClassifier.clone evalml.pipelines.components.BaselineClassifier.default_parameters evalml.pipelines.components.BaselineClassifier.describe evalml.pipelines.components.BaselineClassifier.feature_importance evalml.pipelines.components.BaselineClassifier.fit evalml.pipelines.components.BaselineClassifier.load evalml.pipelines.components.BaselineClassifier.needs_fitting evalml.pipelines.components.BaselineClassifier.parameters evalml.pipelines.components.BaselineClassifier.predict evalml.pipelines.components.BaselineClassifier.predict_proba evalml.pipelines.components.BaselineClassifier.save .. py:method:: classes_(self) :property: Returns class labels. Will return None before fitting. :returns: Class names :rtype: list[str] or list[float] .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. 
Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline classifiers do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes :rtype: pd.Series .. py:method:: fit(self, X, y=None) Fits baseline classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. 
py:method:: predict_proba(self, X) Make prediction probabilities using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: BaselineRegressor(strategy='mean', random_seed=0, **kwargs) Baseline regressor that uses a simple strategy to make predictions. This is useful as a simple baseline regressor to compare with other regressors. :param strategy: Method used to predict. Valid options are "mean", "median". Defaults to "mean". :type strategy: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Baseline Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.BaselineRegressor.clone evalml.pipelines.components.BaselineRegressor.default_parameters evalml.pipelines.components.BaselineRegressor.describe evalml.pipelines.components.BaselineRegressor.feature_importance evalml.pipelines.components.BaselineRegressor.fit evalml.pipelines.components.BaselineRegressor.load evalml.pipelines.components.BaselineRegressor.needs_fitting evalml.pipelines.components.BaselineRegressor.parameters evalml.pipelines.components.BaselineRegressor.predict evalml.pipelines.components.BaselineRegressor.predict_proba evalml.pipelines.components.BaselineRegressor.save .. 
py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline regressors do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes. :rtype: np.ndarray (float) .. py:method:: fit(self, X, y=None) Fits baseline regression component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input y is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. 
py:method:: predict(self, X) Make predictions using the baseline regression strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: CatBoostClassifier(n_estimators=10, eta=0.03, max_depth=6, bootstrap_type=None, silent=True, allow_writing_files=False, random_seed=0, n_jobs=-1, **kwargs) CatBoost Classifier, a classifier that uses gradient-boosting on decision trees. CatBoost is an open-source library and natively supports categorical features. For more information, check out https://catboost.ai/ :param n_estimators: The maximum number of trees to build. Defaults to 10. :type n_estimators: int :param eta: The learning rate. Defaults to 0.03. :type eta: float :param max_depth: The maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param bootstrap_type: Defines the method for sampling the weights of objects. Available methods are 'Bayesian', 'Bernoulli', 'MVS'. Defaults to None. :type bootstrap_type: string :param silent: Whether to use the "silent" logging mode. Defaults to True. :type silent: boolean :param allow_writing_files: Whether to allow writing snapshot files while training. Defaults to False. :type allow_writing_files: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. 
:type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(4, 100), "eta": Real(0.000001, 1), "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.CATBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - CatBoost Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.CatBoostClassifier.clone evalml.pipelines.components.CatBoostClassifier.default_parameters evalml.pipelines.components.CatBoostClassifier.describe evalml.pipelines.components.CatBoostClassifier.feature_importance evalml.pipelines.components.CatBoostClassifier.fit evalml.pipelines.components.CatBoostClassifier.load evalml.pipelines.components.CatBoostClassifier.needs_fitting evalml.pipelines.components.CatBoostClassifier.parameters evalml.pipelines.components.CatBoostClassifier.predict evalml.pipelines.components.CatBoostClassifier.predict_proba evalml.pipelines.components.CatBoostClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted CatBoost classifier. .. py:method:: fit(self, X, y=None) Fits CatBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. 
:type pickle_protocol: int .. py:class:: CatBoostRegressor(n_estimators=10, eta=0.03, max_depth=6, bootstrap_type=None, silent=False, allow_writing_files=False, random_seed=0, n_jobs=-1, **kwargs) CatBoost Regressor, a regressor that uses gradient-boosting on decision trees. CatBoost is an open-source library and natively supports categorical features. For more information, check out https://catboost.ai/ :param n_estimators: The maximum number of trees to build. Defaults to 10. :type n_estimators: int :param eta: The learning rate. Defaults to 0.03. :type eta: float :param max_depth: The maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param bootstrap_type: Defines the method for sampling the weights of objects. Available methods are 'Bayesian', 'Bernoulli', 'MVS'. Defaults to None. :type bootstrap_type: string :param silent: Whether to use the "silent" logging mode. Defaults to False. :type silent: boolean :param allow_writing_files: Whether to allow writing snapshot files while training. Defaults to False. :type allow_writing_files: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(4, 100), "eta": Real(0.000001, 1), "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.CATBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - CatBoost Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.CatBoostRegressor.clone evalml.pipelines.components.CatBoostRegressor.default_parameters evalml.pipelines.components.CatBoostRegressor.describe evalml.pipelines.components.CatBoostRegressor.feature_importance evalml.pipelines.components.CatBoostRegressor.fit evalml.pipelines.components.CatBoostRegressor.load evalml.pipelines.components.CatBoostRegressor.needs_fitting evalml.pipelines.components.CatBoostRegressor.parameters evalml.pipelines.components.CatBoostRegressor.predict evalml.pipelines.components.CatBoostRegressor.predict_proba evalml.pipelines.components.CatBoostRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted CatBoost regressor. .. py:method:: fit(self, X, y=None) Fits CatBoost regressor component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. 
:type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: ComponentBase(parameters=None, component_obj=None, random_seed=0, **kwargs) Base class for all components. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.ComponentBase.clone evalml.pipelines.components.ComponentBase.default_parameters evalml.pipelines.components.ComponentBase.describe evalml.pipelines.components.ComponentBase.fit evalml.pipelines.components.ComponentBase.load evalml.pipelines.components.ComponentBase.modifies_features evalml.pipelines.components.ComponentBase.modifies_target evalml.pipelines.components.ComponentBase.name evalml.pipelines.components.ComponentBase.needs_fitting evalml.pipelines.components.ComponentBase.parameters evalml.pipelines.components.ComponentBase.save evalml.pipelines.components.ComponentBase.training_only .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. 
:type file_path: str :returns: ComponentBase object .. py:method:: modifies_features(cls) :property: Returns whether this component modifies (subsets or transforms) the features variable during transform. For Estimator objects, this attribute determines if the return value from `predict` or `predict_proba` should be used as features or targets. .. py:method:: modifies_target(cls) :property: Returns whether this component modifies (subsets or transforms) the target variable during transform. For Estimator objects, this attribute determines if the return value from `predict` or `predict_proba` should be used as features or targets. .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: training_only(cls) :property: Returns whether or not this component should be evaluated during training-time only, or during both training and prediction time. .. py:class:: ComponentBaseMeta Metaclass that overrides creating a new component by wrapping methods with validators and setters. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **FIT_METHODS** - ['fit', 'fit_transform'] * - **METHODS_TO_CHECK** - ['predict', 'predict_proba', 'transform', 'inverse_transform'] * - **PROPERTIES_TO_CHECK** - ['feature_importance'] **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.ComponentBaseMeta.check_for_fit evalml.pipelines.components.ComponentBaseMeta.register evalml.pipelines.components.ComponentBaseMeta.set_fit .. py:method:: check_for_fit(cls, method) :classmethod: Wraps a method with a check that `self._is_fitted` is `True`. The wrapper raises an exception if it is `False`; otherwise it calls the wrapped method and returns its result. :param method: Method to wrap. :type method: callable :returns: The wrapped method. :raises ComponentNotYetFittedError: If component is not yet fitted. .. py:method:: register(cls, subclass) Register a virtual subclass of an ABC. Returns the subclass, to allow usage as a class decorator. .. py:method:: set_fit(cls, method) :classmethod: Wrapper for the fit method. .. py:class:: DateTimeFeaturizer(features_to_extract=None, encode_as_categories=False, time_index=None, random_seed=0, **kwargs) Transformer that can automatically extract features from datetime columns. :param features_to_extract: List of features to extract. Valid options include "year", "month", "day_of_week", "hour". Defaults to None. :type features_to_extract: list :param encode_as_categories: Whether day-of-week and month features should be encoded as pandas "category" dtype. This allows OneHotEncoders to encode these features. Defaults to False. :type encode_as_categories: bool :param time_index: Name of the column containing the datetime information used to order the data. Ignored. :type time_index: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - DateTime Featurizer * - **training_only** - False **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.DateTimeFeaturizer.clone evalml.pipelines.components.DateTimeFeaturizer.default_parameters evalml.pipelines.components.DateTimeFeaturizer.describe evalml.pipelines.components.DateTimeFeaturizer.fit evalml.pipelines.components.DateTimeFeaturizer.fit_transform evalml.pipelines.components.DateTimeFeaturizer.get_feature_names evalml.pipelines.components.DateTimeFeaturizer.load evalml.pipelines.components.DateTimeFeaturizer.needs_fitting evalml.pipelines.components.DateTimeFeaturizer.parameters evalml.pipelines.components.DateTimeFeaturizer.save evalml.pipelines.components.DateTimeFeaturizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fit the datetime featurizer component. :param X: Input features. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. 
:rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: get_feature_names(self) Gets the categories of each datetime feature. :returns: Dictionary, where each key-value pair is a column name and a dictionary mapping the unique feature values to their integer encoding. :rtype: dict .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by creating new features using existing DateTime columns, and then dropping those DateTime columns. :param X: Input features. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: DecisionTreeClassifier(criterion='gini', max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, random_seed=0, **kwargs) Decision Tree Classifier. :param criterion: The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. Defaults to "gini". 
:type criterion: {"gini", "entropy"} :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** ..
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "criterion": ["gini", "entropy"], "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.DECISION_TREE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Decision Tree Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DecisionTreeClassifier.clone evalml.pipelines.components.DecisionTreeClassifier.default_parameters evalml.pipelines.components.DecisionTreeClassifier.describe evalml.pipelines.components.DecisionTreeClassifier.feature_importance evalml.pipelines.components.DecisionTreeClassifier.fit evalml.pipelines.components.DecisionTreeClassifier.load evalml.pipelines.components.DecisionTreeClassifier.needs_fitting evalml.pipelines.components.DecisionTreeClassifier.parameters evalml.pipelines.components.DecisionTreeClassifier.predict evalml.pipelines.components.DecisionTreeClassifier.predict_proba evalml.pipelines.components.DecisionTreeClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. 
:rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: DecisionTreeRegressor(criterion='mse', max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, random_seed=0, **kwargs) Decision Tree Regressor. :param criterion: The function to measure the quality of a split. Supported criteria are: - "mse" for the mean squared error, which is equal to variance reduction as a feature selection criterion and minimizes the L2 loss using the mean of each terminal node. - "friedman_mse", which uses mean squared error with Friedman's improvement score for potential splits. - "mae" for the mean absolute error, which minimizes the L1 loss using the median of each terminal node. - "poisson", which uses reduction in Poisson deviance to find splits. Defaults to "mse". :type criterion: {"mse", "friedman_mse", "mae", "poisson"} :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6.
:type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "criterion": ["mse", "friedman_mse", "mae"], "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.DECISION_TREE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Decision Tree Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DecisionTreeRegressor.clone evalml.pipelines.components.DecisionTreeRegressor.default_parameters evalml.pipelines.components.DecisionTreeRegressor.describe evalml.pipelines.components.DecisionTreeRegressor.feature_importance evalml.pipelines.components.DecisionTreeRegressor.fit evalml.pipelines.components.DecisionTreeRegressor.load evalml.pipelines.components.DecisionTreeRegressor.needs_fitting evalml.pipelines.components.DecisionTreeRegressor.parameters evalml.pipelines.components.DecisionTreeRegressor.predict evalml.pipelines.components.DecisionTreeRegressor.predict_proba evalml.pipelines.components.DecisionTreeRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. 
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. 
:param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: DFSTransformer(index='index', features=None, random_seed=0, **kwargs) Featuretools DFS component that generates features for the input features. :param index: The name of the column that contains the indices. If no column with this name exists, then featuretools.EntitySet() creates a column with this name to serve as the index column. Defaults to 'index'. :type index: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param features: List of features to run DFS on. Defaults to None. Features will only be computed if the columns used by the feature exist in the input and if the feature itself is not in input. :type features: list **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - DFS Transformer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.DFSTransformer.clone evalml.pipelines.components.DFSTransformer.default_parameters evalml.pipelines.components.DFSTransformer.describe evalml.pipelines.components.DFSTransformer.fit evalml.pipelines.components.DFSTransformer.fit_transform evalml.pipelines.components.DFSTransformer.load evalml.pipelines.components.DFSTransformer.needs_fitting evalml.pipelines.components.DFSTransformer.parameters evalml.pipelines.components.DFSTransformer.save evalml.pipelines.components.DFSTransformer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the DFSTransformer Transformer component. :param X: The input data to transform, of shape [n_samples, n_features]. :type X: pd.DataFrame, np.array :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. 
:rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Computes the feature matrix for the input X using featuretools' dfs algorithm. :param X: The input training data to transform. Has shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: Ignored. :type y: pd.Series, optional :returns: Feature matrix :rtype: pd.DataFrame .. py:class:: DropColumns(columns=None, random_seed=0, **kwargs) Drops specified columns in input data. :param columns: List of column names, used to determine which columns to drop. :type columns: list(string) :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Drop Columns Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.DropColumns.clone evalml.pipelines.components.DropColumns.default_parameters evalml.pipelines.components.DropColumns.describe evalml.pipelines.components.DropColumns.fit evalml.pipelines.components.DropColumns.fit_transform evalml.pipelines.components.DropColumns.load evalml.pipelines.components.DropColumns.parameters evalml.pipelines.components.DropColumns.save evalml.pipelines.components.DropColumns.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, ignored :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. 
:param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by dropping columns. :param X: Data to transform. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: Transformed X. :rtype: pd.DataFrame .. py:class:: DropNaNRowsTransformer(parameters=None, component_obj=None, random_seed=0, **kwargs) Transformer to drop rows with NaN values. :param random_seed: Seed for the random number generator. Is not used by this component. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Drop NaN Rows Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropNaNRowsTransformer.clone evalml.pipelines.components.DropNaNRowsTransformer.default_parameters evalml.pipelines.components.DropNaNRowsTransformer.describe evalml.pipelines.components.DropNaNRowsTransformer.fit evalml.pipelines.components.DropNaNRowsTransformer.fit_transform evalml.pipelines.components.DropNaNRowsTransformer.load evalml.pipelines.components.DropNaNRowsTransformer.needs_fitting evalml.pipelines.components.DropNaNRowsTransformer.parameters evalml.pipelines.components.DropNaNRowsTransformer.save evalml.pipelines.components.DropNaNRowsTransformer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. 
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data using fitted component. :param X: Features. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Data with NaN rows dropped. :rtype: (pd.DataFrame, pd.Series) .. py:class:: DropNullColumns(pct_null_threshold=1.0, random_seed=0, **kwargs) Transformer to drop features whose percentage of NaN values exceeds a specified threshold. :param pct_null_threshold: The threshold fraction of NaN values used to decide whether an input feature is dropped. Must be between 0 and 1, inclusive. If equal to 0.0, columns with any null values are dropped. If equal to 1.0, only columns in which all values are null are dropped. Defaults to 1.0. :type pct_null_threshold: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Drop Null Columns Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropNullColumns.clone evalml.pipelines.components.DropNullColumns.default_parameters evalml.pipelines.components.DropNullColumns.describe evalml.pipelines.components.DropNullColumns.fit evalml.pipelines.components.DropNullColumns.fit_transform evalml.pipelines.components.DropNullColumns.load evalml.pipelines.components.DropNullColumns.needs_fitting evalml.pipelines.components.DropNullColumns.parameters evalml.pipelines.components.DropNullColumns.save evalml.pipelines.components.DropNullColumns.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state.
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by dropping columns that exceed the threshold of null values. :param X: Data to transform :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: DropRowsTransformer(indices_to_drop=None, random_seed=0) Transformer to drop rows specified by row indices. :param indices_to_drop: List of indices to drop in the input data. Defaults to None. :type indices_to_drop: list :param random_seed: Seed for the random number generator. Is not used by this component. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Drop Rows Transformer * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.DropRowsTransformer.clone evalml.pipelines.components.DropRowsTransformer.default_parameters evalml.pipelines.components.DropRowsTransformer.describe evalml.pipelines.components.DropRowsTransformer.fit evalml.pipelines.components.DropRowsTransformer.fit_transform evalml.pipelines.components.DropRowsTransformer.load evalml.pipelines.components.DropRowsTransformer.needs_fitting evalml.pipelines.components.DropRowsTransformer.parameters evalml.pipelines.components.DropRowsTransformer.save evalml.pipelines.components.DropRowsTransformer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. 
py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If indices to drop do not exist in input features or target. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data using fitted component. :param X: Features. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Data with row indices dropped. :rtype: (pd.DataFrame, pd.Series) .. py:class:: ElasticNetClassifier(penalty='elasticnet', C=1.0, l1_ratio=0.15, multi_class='auto', solver='saga', n_jobs=-1, random_seed=0, **kwargs) Elastic Net Classifier. Uses Logistic Regression with elasticnet penalty as the base estimator. :param penalty: The norm used in penalization. Defaults to "elasticnet". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param l1_ratio: The mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio <1, the penalty is a combination of L1 and L2. Defaults to 0.15. :type l1_ratio: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. "multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. 
For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "saga". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to fit the model. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0.01, 10), "l1_ratio": Real(0, 1)} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Elastic Net Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ElasticNetClassifier.clone evalml.pipelines.components.ElasticNetClassifier.default_parameters evalml.pipelines.components.ElasticNetClassifier.describe evalml.pipelines.components.ElasticNetClassifier.feature_importance evalml.pipelines.components.ElasticNetClassifier.fit evalml.pipelines.components.ElasticNetClassifier.load evalml.pipelines.components.ElasticNetClassifier.needs_fitting evalml.pipelines.components.ElasticNetClassifier.parameters evalml.pipelines.components.ElasticNetClassifier.predict evalml.pipelines.components.ElasticNetClassifier.predict_proba evalml.pipelines.components.ElasticNetClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state.
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted ElasticNet classifier. .. py:method:: fit(self, X, y) Fits ElasticNet classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. 
py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: ElasticNetRegressor(alpha=0.0001, l1_ratio=0.15, max_iter=1000, normalize=False, random_seed=0, **kwargs) Elastic Net Regressor. :param alpha: Constant that multiplies the penalty terms. Defaults to 0.0001. :type alpha: float :param l1_ratio: The mixing parameter, with 0 <= l1_ratio <= 1. Setting l1_ratio=0 gives a pure L2 penalty, while setting l1_ratio=1 gives a pure L1 penalty. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2. Defaults to 0.15. :type l1_ratio: float :param max_iter: The maximum number of iterations. Defaults to 1000. :type max_iter: int :param normalize: If True, the regressors will be normalized before regression by subtracting the mean and dividing by the l2-norm. Defaults to False. :type normalize: bool :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "alpha": Real(0, 1), "l1_ratio": Real(0, 1),} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Elastic Net Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.ElasticNetRegressor.clone evalml.pipelines.components.ElasticNetRegressor.default_parameters evalml.pipelines.components.ElasticNetRegressor.describe evalml.pipelines.components.ElasticNetRegressor.feature_importance evalml.pipelines.components.ElasticNetRegressor.fit evalml.pipelines.components.ElasticNetRegressor.load evalml.pipelines.components.ElasticNetRegressor.needs_fitting evalml.pipelines.components.ElasticNetRegressor.parameters evalml.pipelines.components.ElasticNetRegressor.predict evalml.pipelines.components.ElasticNetRegressor.predict_proba evalml.pipelines.components.ElasticNetRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted ElasticNet regressor. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. 
:type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: EmailFeaturizer(random_seed=0, **kwargs) Transformer that can automatically extract features from emails. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Email Featurizer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.EmailFeaturizer.clone evalml.pipelines.components.EmailFeaturizer.default_parameters evalml.pipelines.components.EmailFeaturizer.describe evalml.pipelines.components.EmailFeaturizer.fit evalml.pipelines.components.EmailFeaturizer.fit_transform evalml.pipelines.components.EmailFeaturizer.load evalml.pipelines.components.EmailFeaturizer.needs_fitting evalml.pipelines.components.EmailFeaturizer.parameters evalml.pipelines.components.EmailFeaturizer.save evalml.pipelines.components.EmailFeaturizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. 
:rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:class:: Estimator(parameters=None, component_obj=None, random_seed=0, **kwargs) A component that fits and predicts given data. To implement a new Estimator, define your own class which is a subclass of Estimator, including a name and a list of acceptable ranges for any parameters to be tuned during the automl search (hyperparameters). Define an `__init__` method which sets up any necessary state and objects. Make sure your `__init__` only uses standard keyword arguments and calls `super().__init__()` with a parameters dict. You may also override the `fit`, `transform`, `fit_transform` and other methods in this class if appropriate. 
To see some examples, check out the definitions of any Estimator component subclass. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **model_family** - ModelFamily.NONE * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Estimator.clone evalml.pipelines.components.Estimator.default_parameters evalml.pipelines.components.Estimator.describe evalml.pipelines.components.Estimator.feature_importance evalml.pipelines.components.Estimator.fit evalml.pipelines.components.Estimator.load evalml.pipelines.components.Estimator.model_family evalml.pipelines.components.Estimator.name evalml.pipelines.components.Estimator.needs_fitting evalml.pipelines.components.Estimator.parameters evalml.pipelines.components.Estimator.predict evalml.pipelines.components.Estimator.predict_proba evalml.pipelines.components.Estimator.save evalml.pipelines.components.Estimator.supported_problem_types .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: model_family(cls) :property: Returns ModelFamily of this component. .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. 
py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: supported_problem_types(cls) :property: Problem types this estimator supports. .. py:class:: ExponentialSmoothingRegressor(trend=None, damped_trend=False, seasonal=None, sp=2, n_jobs=-1, random_seed=0, **kwargs) Holt-Winters Exponential Smoothing Forecaster. Currently ExponentialSmoothingRegressor isn't supported via conda install. It's recommended that it be installed via PyPI. :param trend: Type of trend component. Defaults to None. :type trend: str :param damped_trend: If the trend component should be damped. Defaults to False. :type damped_trend: bool :param seasonal: Type of seasonal component. Takes one of {"additive", None}. Can also be multiplicative if none of the target data is 0, but AutoMLSearch will not tune for this. Defaults to None. :type seasonal: str :param sp: The number of seasonal periods to consider. Defaults to 2. :type sp: int :param n_jobs: Non-negative integer describing level of parallelism used for pipelines. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. 
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "trend": [None, "additive"], "damped_trend": [True, False], "seasonal": [None, "additive"], "sp": Integer(2, 8),} * - **model_family** - ModelFamily.EXPONENTIAL_SMOOTHING * - **modifies_features** - True * - **modifies_target** - False * - **name** - Exponential Smoothing Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExponentialSmoothingRegressor.clone evalml.pipelines.components.ExponentialSmoothingRegressor.default_parameters evalml.pipelines.components.ExponentialSmoothingRegressor.describe evalml.pipelines.components.ExponentialSmoothingRegressor.feature_importance evalml.pipelines.components.ExponentialSmoothingRegressor.fit evalml.pipelines.components.ExponentialSmoothingRegressor.load evalml.pipelines.components.ExponentialSmoothingRegressor.needs_fitting evalml.pipelines.components.ExponentialSmoothingRegressor.parameters evalml.pipelines.components.ExponentialSmoothingRegressor.predict evalml.pipelines.components.ExponentialSmoothingRegressor.predict_proba evalml.pipelines.components.ExponentialSmoothingRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns array of 0's with a length of 1 as feature_importance is not defined for Exponential Smoothing regressor. .. py:method:: fit(self, X, y=None) Fits Exponential Smoothing Regressor to data. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y was not passed in. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X, y=None) Make predictions using fitted Exponential Smoothing regressor. :param X: Data of shape [n_samples, n_features]. Ignored except to set forecast horizon. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: ExtraTreesClassifier(n_estimators=100, max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, n_jobs=-1, random_seed=0, **kwargs) Extra Trees Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: int :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes.
Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.EXTRA_TREES * - **modifies_features** - True * - **modifies_target** - False * - **name** - Extra Trees Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExtraTreesClassifier.clone evalml.pipelines.components.ExtraTreesClassifier.default_parameters evalml.pipelines.components.ExtraTreesClassifier.describe evalml.pipelines.components.ExtraTreesClassifier.feature_importance evalml.pipelines.components.ExtraTreesClassifier.fit evalml.pipelines.components.ExtraTreesClassifier.load evalml.pipelines.components.ExtraTreesClassifier.needs_fitting evalml.pipelines.components.ExtraTreesClassifier.parameters evalml.pipelines.components.ExtraTreesClassifier.predict evalml.pipelines.components.ExtraTreesClassifier.predict_proba evalml.pipelines.components.ExtraTreesClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. 
:rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: ExtraTreesRegressor(n_estimators=100, max_features='auto', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, n_jobs=-1, random_seed=0, **kwargs) Extra Trees Regressor. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: int :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "auto", then max_features=sqrt(n_features). - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if that requires inspecting more than max_features features. Defaults to "auto". :type max_features: int, float or {"auto", "sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) is the minimum number of samples for each split. Defaults to 2.
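The rules listed for max_features and min_samples_split can be sketched in plain Python. This is a minimal illustration of the arithmetic only, not EvalML's or the underlying estimator's code. The helper names ``resolve_max_features`` and ``resolve_min_samples_split`` are hypothetical, the "auto" alias is deliberately left out, and clamping the result to at least one feature is an assumption.

```python
import math


def resolve_max_features(max_features, n_features):
    """Turn a max_features setting into a concrete feature count (hypothetical helper)."""
    if max_features is None:
        return n_features
    if isinstance(max_features, str):
        if max_features == "sqrt":
            # Assumption: clamp to at least one feature.
            return max(1, int(math.sqrt(n_features)))
        if max_features == "log2":
            return max(1, int(math.log2(n_features)))
        raise ValueError(f"Unsupported max_features value: {max_features!r}")
    if isinstance(max_features, int):
        return max_features
    # A float is treated as a fraction of the available features.
    return int(max_features * n_features)


def resolve_min_samples_split(min_samples_split, n_samples):
    """Turn a min_samples_split setting into a concrete sample count (hypothetical helper)."""
    if isinstance(min_samples_split, int):
        return min_samples_split
    # A float is treated as a fraction of the samples, rounded up.
    return math.ceil(min_samples_split * n_samples)


counts = {
    "sqrt": resolve_max_features("sqrt", 16),
    "log2": resolve_max_features("log2", 16),
    "fraction": resolve_max_features(0.5, 10),
    "all": resolve_max_features(None, 10),
}
split_threshold = resolve_min_samples_split(0.1, 25)
```

With 25 samples, a fractional min_samples_split of 0.1 rounds up to 3 samples, while an integer setting is used as given.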
:type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_features": ["auto", "sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.EXTRA_TREES * - **modifies_features** - True * - **modifies_target** - False * - **name** - Extra Trees Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.ExtraTreesRegressor.clone evalml.pipelines.components.ExtraTreesRegressor.default_parameters evalml.pipelines.components.ExtraTreesRegressor.describe evalml.pipelines.components.ExtraTreesRegressor.feature_importance evalml.pipelines.components.ExtraTreesRegressor.fit evalml.pipelines.components.ExtraTreesRegressor.load evalml.pipelines.components.ExtraTreesRegressor.needs_fitting evalml.pipelines.components.ExtraTreesRegressor.parameters evalml.pipelines.components.ExtraTreesRegressor.predict evalml.pipelines.components.ExtraTreesRegressor.predict_proba evalml.pipelines.components.ExtraTreesRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters.
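The convention stated above can be illustrated with a toy class. This is a sketch under stated assumptions: ``ToyComponent`` is a hypothetical stand-in and not an EvalML class, and ``default_parameters`` is written here as a plain classmethod that is called explicitly, which may differ from how the library exposes it.

```python
class ToyComponent:
    """Hypothetical stand-in for a component; not part of EvalML."""

    def __init__(self, n_estimators=100, max_depth=6, random_seed=0):
        self.random_seed = random_seed
        # Only the constructor arguments that configure the model are stored.
        self._parameters = {"n_estimators": n_estimators, "max_depth": max_depth}

    @property
    def parameters(self):
        """Return a copy of the parameters used to initialize the component."""
        return dict(self._parameters)

    @classmethod
    def default_parameters(cls):
        """Defaults are whatever an argument-free instance reports."""
        return cls().parameters


defaults = ToyComponent.default_parameters()
customized = ToyComponent(n_estimators=10).parameters
```

Deriving the defaults from an argument-free instance keeps the two views from drifting apart when a constructor default changes.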
:returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. 
py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: FeatureSelector(parameters=None, component_obj=None, random_seed=0, **kwargs) Selects top features based on importance weights. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.FeatureSelector.clone evalml.pipelines.components.FeatureSelector.default_parameters evalml.pipelines.components.FeatureSelector.describe evalml.pipelines.components.FeatureSelector.fit evalml.pipelines.components.FeatureSelector.fit_transform evalml.pipelines.components.FeatureSelector.get_names evalml.pipelines.components.FeatureSelector.load evalml.pipelines.components.FeatureSelector.name evalml.pipelines.components.FeatureSelector.needs_fitting evalml.pipelines.components.FeatureSelector.parameters evalml.pipelines.components.FeatureSelector.save evalml.pipelines.components.FeatureSelector.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. 
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. 
This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, a MethodPropertyNotFoundError is raised. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform. .. py:class:: Imputer(categorical_impute_strategy='most_frequent', categorical_fill_value=None, numeric_impute_strategy='mean', numeric_fill_value=None, boolean_impute_strategy='most_frequent', boolean_fill_value=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy. :param categorical_impute_strategy: Impute strategy to use for string, object, boolean, categorical dtypes. Valid values include "most_frequent" and "constant". :type categorical_impute_strategy: string :param numeric_impute_strategy: Impute strategy to use for numeric columns. Valid values include "mean", "median", "most_frequent", and "constant". :type numeric_impute_strategy: string :param boolean_impute_strategy: Impute strategy to use for boolean columns. Valid values include "most_frequent" and "constant". :type boolean_impute_strategy: string :param categorical_fill_value: When categorical_impute_strategy == "constant", fill_value is used to replace missing data.
The default value of None will fill with the string "missing_value". :type categorical_fill_value: string :param numeric_fill_value: When numeric_impute_strategy == "constant", fill_value is used to replace missing data. The default value of None will fill with 0. :type numeric_fill_value: int, float :param boolean_fill_value: When boolean_impute_strategy == "constant", fill_value is used to replace missing data. The default value of None will fill with True. :type boolean_fill_value: bool :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "categorical_impute_strategy": ["most_frequent"], "numeric_impute_strategy": ["mean", "median", "most_frequent", "knn"], "boolean_impute_strategy": ["most_frequent", "knn"]} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Imputer.clone evalml.pipelines.components.Imputer.default_parameters evalml.pipelines.components.Imputer.describe evalml.pipelines.components.Imputer.fit evalml.pipelines.components.Imputer.fit_transform evalml.pipelines.components.Imputer.load evalml.pipelines.components.Imputer.needs_fitting evalml.pipelines.components.Imputer.parameters evalml.pipelines.components.Imputer.save evalml.pipelines.components.Imputer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. 
py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame, np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. 
py:method:: transform(self, X, y=None) Transforms data X by imputing missing values. :param X: Data to transform :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, random_seed=0, **kwargs) K-Nearest Neighbors Classifier. :param n_neighbors: Number of neighbors to use by default. Defaults to 5. :type n_neighbors: int :param weights: Weight function used in prediction. Can be: - "uniform" : uniform weights. All points in each neighborhood are weighted equally. - "distance" : weight points by the inverse of their distance. In this case, closer neighbors of a query point will have a greater influence than neighbors which are further away. - [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights. Defaults to "uniform". :type weights: {"uniform", "distance"} or callable :param algorithm: Algorithm used to compute the nearest neighbors: - "ball_tree" will use BallTree - "kd_tree" will use KDTree - "brute" will use a brute-force search. "auto" will attempt to decide the most appropriate algorithm based on the values passed to the fit method. Defaults to "auto". Note: fitting on sparse input will override the setting of this parameter, using brute force. :type algorithm: {"auto", "ball_tree", "kd_tree", "brute"} :param leaf_size: Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem. Defaults to 30. :type leaf_size: int :param p: Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1); when p = 2, it is equivalent to euclidean_distance (l2). For arbitrary p, minkowski_distance (l_p) is used. Defaults to 2.
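The effect of the power parameter p can be seen in a small worked example. This is a minimal sketch of the Minkowski formula in plain Python, not the distance routine the classifier uses internally; the function name ``minkowski_distance`` is chosen here for illustration.

```python
def minkowski_distance(a, b, p=2):
    """Minkowski distance between two equal-length points, for p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    # Sum the p-th powers of the coordinate differences, then take the p-th root.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


origin, point = [0.0, 0.0], [3.0, 4.0]
manhattan = minkowski_distance(origin, point, p=1)
euclidean = minkowski_distance(origin, point, p=2)
```

For the points (0, 0) and (3, 4), p = 1 sums the coordinate differences (3 + 4), while p = 2 gives the straight-line distance, so larger p puts more weight on the largest coordinate difference.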
:type p: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_neighbors": Integer(2, 12), "weights": ["uniform", "distance"], "algorithm": ["auto", "ball_tree", "kd_tree", "brute"], "leaf_size": Integer(10, 30), "p": Integer(1, 5),} * - **model_family** - ModelFamily.K_NEIGHBORS * - **modifies_features** - True * - **modifies_target** - False * - **name** - KNN Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.KNeighborsClassifier.clone evalml.pipelines.components.KNeighborsClassifier.default_parameters evalml.pipelines.components.KNeighborsClassifier.describe evalml.pipelines.components.KNeighborsClassifier.feature_importance evalml.pipelines.components.KNeighborsClassifier.fit evalml.pipelines.components.KNeighborsClassifier.load evalml.pipelines.components.KNeighborsClassifier.needs_fitting evalml.pipelines.components.KNeighborsClassifier.parameters evalml.pipelines.components.KNeighborsClassifier.predict evalml.pipelines.components.KNeighborsClassifier.predict_proba evalml.pipelines.components.KNeighborsClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns array of 0's matching the input number of features as feature_importance is not defined for KNN classifiers. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: LabelEncoder(positive_label=None, random_seed=0, **kwargs) A transformer that encodes target labels using values between 0 and num_classes - 1. :param positive_label: The label for the class that should be treated as positive (1) for binary classification problems. Ignored for multiclass problems. Defaults to None. :type positive_label: int, str :param random_seed: Seed for the random number generator. Defaults to 0. Ignored. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Label Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LabelEncoder.clone evalml.pipelines.components.LabelEncoder.default_parameters evalml.pipelines.components.LabelEncoder.describe evalml.pipelines.components.LabelEncoder.fit evalml.pipelines.components.LabelEncoder.fit_transform evalml.pipelines.components.LabelEncoder.inverse_transform evalml.pipelines.components.LabelEncoder.load evalml.pipelines.components.LabelEncoder.needs_fitting evalml.pipelines.components.LabelEncoder.parameters evalml.pipelines.components.LabelEncoder.save evalml.pipelines.components.LabelEncoder.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. 
py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the label encoder. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input `y` is None. .. py:method:: fit_transform(self, X, y) Fit and transform data using the label encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: The original features and an encoded version of the target. :rtype: pd.DataFrame, pd.Series .. py:method:: inverse_transform(self, y) Decodes the target data. :param y: Target data. :type y: pd.Series :returns: The decoded version of the target. :rtype: pd.Series :raises ValueError: If input `y` is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. 
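A save followed by a load is a pickle round trip. The sketch below uses the standard-library ``pickle`` module as a stand-in for cloudpickle, and the helpers ``save_object`` and ``load_object`` are hypothetical names, not EvalML functions; it only illustrates how a protocol argument and a file path fit together.

```python
import os
import pickle
import tempfile


def save_object(obj, file_path, pickle_protocol=pickle.DEFAULT_PROTOCOL):
    """Serialize obj to file_path using the given pickle protocol (hypothetical helper)."""
    with open(file_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle_protocol)


def load_object(file_path):
    """Read back an object previously written by save_object (hypothetical helper)."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


# A plain dictionary stands in for a fitted component.
original = {"name": "Label Encoder", "parameters": {"positive_label": None}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "component.pkl")
    save_object(original, path)
    restored = load_object(path)
```

The restored object compares equal to the original but is a separate instance.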
:param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform the target using the fitted label encoder. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: The original features and an encoded version of the target. :rtype: pd.DataFrame, pd.Series :raises ValueError: If input `y` is None. .. py:class:: LightGBMClassifier(boosting_type='gbdt', learning_rate=0.1, n_estimators=100, max_depth=0, num_leaves=31, min_child_samples=20, bagging_fraction=0.9, bagging_freq=0, n_jobs=-1, random_seed=0, **kwargs) LightGBM Classifier. :param boosting_type: Type of boosting to use. Defaults to "gbdt". - "gbdt" uses traditional Gradient Boosting Decision Tree - "dart" uses Dropouts meet Multiple Additive Regression Trees - "goss" uses Gradient-based One-Side Sampling - "rf" uses Random Forest :type boosting_type: string :param learning_rate: Boosting learning rate. Defaults to 0.1. :type learning_rate: float :param n_estimators: Number of boosted trees to fit. Defaults to 100. :type n_estimators: int :param max_depth: Maximum tree depth for base learners, <=0 means no limit. Defaults to 0. :type max_depth: int :param num_leaves: Maximum tree leaves for base learners. Defaults to 31. :type num_leaves: int :param min_child_samples: Minimum number of data needed in a child (leaf). Defaults to 20. :type min_child_samples: int :param bagging_fraction: LightGBM will randomly select a subset of the training data, without resampling, when bagging is performed and this is smaller than 1.0. For example, if set to 0.8, LightGBM will select 80% of the data. This can be used to speed up training and deal with overfitting. Defaults to 0.9. :type bagging_fraction: float :param bagging_freq: Frequency for bagging.
0 means bagging is disabled. k means perform bagging at every k iteration. Every k-th iteration, LightGBM will randomly select bagging_fraction * 100 % of the data to use for the next k iterations. Defaults to 0. :type bagging_freq: int :param n_jobs: Number of threads to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "learning_rate": Real(0.000001, 1), "boosting_type": ["gbdt", "dart", "goss", "rf"], "n_estimators": Integer(10, 100), "max_depth": Integer(0, 10), "num_leaves": Integer(2, 100), "min_child_samples": Integer(1, 100), "bagging_fraction": Real(0.000001, 1), "bagging_freq": Integer(0, 1),} * - **model_family** - ModelFamily.LIGHTGBM * - **modifies_features** - True * - **modifies_target** - False * - **name** - LightGBM Classifier * - **SEED_MAX** - SEED_BOUNDS.max_bound * - **SEED_MIN** - 0 * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LightGBMClassifier.clone evalml.pipelines.components.LightGBMClassifier.default_parameters evalml.pipelines.components.LightGBMClassifier.describe evalml.pipelines.components.LightGBMClassifier.feature_importance evalml.pipelines.components.LightGBMClassifier.fit evalml.pipelines.components.LightGBMClassifier.load evalml.pipelines.components.LightGBMClassifier.needs_fitting evalml.pipelines.components.LightGBMClassifier.parameters evalml.pipelines.components.LightGBMClassifier.predict evalml.pipelines.components.LightGBMClassifier.predict_proba evalml.pipelines.components.LightGBMClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. 
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits LightGBM classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted LightGBM classifier. 
:param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Compute predicted probabilities using the fitted LightGBM classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: LightGBMRegressor(boosting_type='gbdt', learning_rate=0.1, n_estimators=20, max_depth=0, num_leaves=31, min_child_samples=20, bagging_fraction=0.9, bagging_freq=0, n_jobs=-1, random_seed=0, **kwargs) LightGBM Regressor. :param boosting_type: Type of boosting to use. Defaults to "gbdt". - "gbdt" uses traditional Gradient Boosting Decision Tree - "dart" uses Dropouts meet Multiple Additive Regression Trees - "goss" uses Gradient-based One-Side Sampling - "rf" uses Random Forest :type boosting_type: string :param learning_rate: Boosting learning rate. Defaults to 0.1. :type learning_rate: float :param n_estimators: Number of boosted trees to fit. Defaults to 20. :type n_estimators: int :param max_depth: Maximum tree depth for base learners, <=0 means no limit. Defaults to 0. :type max_depth: int :param num_leaves: Maximum tree leaves for base learners. Defaults to 31. :type num_leaves: int :param min_child_samples: Minimum number of data points needed in a child (leaf). Defaults to 20. :type min_child_samples: int :param bagging_fraction: LightGBM will randomly select a subset of the data (rows), without resampling, if this is smaller than 1.0. For example, if set to 0.8, LightGBM will select 80% of the data each time bagging is performed. This can be used to speed up training and deal with overfitting. Defaults to 0.9.
:type bagging_fraction: float :param bagging_freq: Frequency for bagging. 0 means bagging is disabled. k means bagging is performed every k iterations. Every k-th iteration, LightGBM will randomly select bagging_fraction * 100 % of the data to use for the next k iterations. Defaults to 0. :type bagging_freq: int :param n_jobs: Number of threads to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "learning_rate": Real(0.000001, 1), "boosting_type": ["gbdt", "dart", "goss", "rf"], "n_estimators": Integer(10, 100), "max_depth": Integer(0, 10), "num_leaves": Integer(2, 100), "min_child_samples": Integer(1, 100), "bagging_fraction": Real(0.000001, 1), "bagging_freq": Integer(0, 1),} * - **model_family** - ModelFamily.LIGHTGBM * - **modifies_features** - True * - **modifies_target** - False * - **name** - LightGBM Regressor * - **SEED_MAX** - SEED_BOUNDS.max_bound * - **SEED_MIN** - 0 * - **supported_problem_types** - [ProblemTypes.REGRESSION] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LightGBMRegressor.clone evalml.pipelines.components.LightGBMRegressor.default_parameters evalml.pipelines.components.LightGBMRegressor.describe evalml.pipelines.components.LightGBMRegressor.feature_importance evalml.pipelines.components.LightGBMRegressor.fit evalml.pipelines.components.LightGBMRegressor.load evalml.pipelines.components.LightGBMRegressor.needs_fitting evalml.pipelines.components.LightGBMRegressor.parameters evalml.pipelines.components.LightGBMRegressor.predict evalml.pipelines.components.LightGBMRegressor.predict_proba evalml.pipelines.components.LightGBMRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state.
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits LightGBM regressor to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using fitted LightGBM regressor. 
:param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: LinearDiscriminantAnalysis(n_components=None, random_seed=0, **kwargs) Reduces the number of features by using Linear Discriminant Analysis. :param n_components: The number of features to maintain after computation. Defaults to None. :type n_components: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Linear Discriminant Analysis Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.LinearDiscriminantAnalysis.clone evalml.pipelines.components.LinearDiscriminantAnalysis.default_parameters evalml.pipelines.components.LinearDiscriminantAnalysis.describe evalml.pipelines.components.LinearDiscriminantAnalysis.fit evalml.pipelines.components.LinearDiscriminantAnalysis.fit_transform evalml.pipelines.components.LinearDiscriminantAnalysis.load evalml.pipelines.components.LinearDiscriminantAnalysis.needs_fitting evalml.pipelines.components.LinearDiscriminantAnalysis.parameters evalml.pipelines.components.LinearDiscriminantAnalysis.save evalml.pipelines.components.LinearDiscriminantAnalysis.transform .. 
py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If input data is not all numeric. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. 
py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using the fitted LDA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:class:: LinearRegressor(fit_intercept=True, normalize=False, n_jobs=-1, random_seed=0, **kwargs) Linear Regressor. :param fit_intercept: Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (i.e. data is expected to be centered). Defaults to True. :type fit_intercept: boolean :param normalize: If True, the regressors will be normalized before regression by subtracting the mean and dividing by the l2-norm. This parameter is ignored when fit_intercept is set to False. Defaults to False. :type normalize: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "fit_intercept": [True, False], "normalize": [True, False]} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Linear Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.LinearRegressor.clone evalml.pipelines.components.LinearRegressor.default_parameters evalml.pipelines.components.LinearRegressor.describe evalml.pipelines.components.LinearRegressor.feature_importance evalml.pipelines.components.LinearRegressor.fit evalml.pipelines.components.LinearRegressor.load evalml.pipelines.components.LinearRegressor.needs_fitting evalml.pipelines.components.LinearRegressor.parameters evalml.pipelines.components.LinearRegressor.predict evalml.pipelines.components.LinearRegressor.predict_proba evalml.pipelines.components.LinearRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted linear regressor. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. 
:type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: LogisticRegressionClassifier(penalty='l2', C=1.0, multi_class='auto', solver='lbfgs', n_jobs=-1, random_seed=0, **kwargs) Logistic Regression Classifier. :param penalty: The norm used in penalization. Defaults to "l2". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimized is the multinomial loss fit across the entire probability distribution, even when the data is binary.
"multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "lbfgs". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to fit the model. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "penalty": ["l2"], "C": Real(0.01, 10),} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Logistic Regression Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.LogisticRegressionClassifier.clone evalml.pipelines.components.LogisticRegressionClassifier.default_parameters evalml.pipelines.components.LogisticRegressionClassifier.describe evalml.pipelines.components.LogisticRegressionClassifier.feature_importance evalml.pipelines.components.LogisticRegressionClassifier.fit evalml.pipelines.components.LogisticRegressionClassifier.load evalml.pipelines.components.LogisticRegressionClassifier.needs_fitting evalml.pipelines.components.LogisticRegressionClassifier.parameters evalml.pipelines.components.LogisticRegressionClassifier.predict evalml.pipelines.components.LogisticRegressionClassifier.predict_proba evalml.pipelines.components.LogisticRegressionClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted logistic regression classifier. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. 
py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: LogTransformer(random_seed=0) Applies a log transformation to the target data. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Log Transformer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.LogTransformer.clone evalml.pipelines.components.LogTransformer.default_parameters evalml.pipelines.components.LogTransformer.describe evalml.pipelines.components.LogTransformer.fit evalml.pipelines.components.LogTransformer.fit_transform evalml.pipelines.components.LogTransformer.inverse_transform evalml.pipelines.components.LogTransformer.load evalml.pipelines.components.LogTransformer.needs_fitting evalml.pipelines.components.LogTransformer.parameters evalml.pipelines.components.LogTransformer.save evalml.pipelines.components.LogTransformer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the LogTransformer. :param X: Ignored. :type X: pd.DataFrame or np.ndarray :param y: Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Log transforms the target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to log transform. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is log transformed. :rtype: tuple of pd.DataFrame, pd.Series .. 
py:method:: inverse_transform(self, y) Apply exponential to target data. :param y: Target variable. :type y: pd.Series :returns: Target with exponential applied. :rtype: pd.Series .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Log transforms the target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target data to log transform. :type y: pd.Series :returns: The input features are returned without modification. The target variable y is log transformed. :rtype: tuple of pd.DataFrame, pd.Series .. py:class:: LSA(random_seed=0, **kwargs) Transformer to calculate the Latent Semantic Analysis Values of text input. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - LSA Transformer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.LSA.clone evalml.pipelines.components.LSA.default_parameters evalml.pipelines.components.LSA.describe evalml.pipelines.components.LSA.fit evalml.pipelines.components.LSA.fit_transform evalml.pipelines.components.LSA.load evalml.pipelines.components.LSA.needs_fitting evalml.pipelines.components.LSA.parameters evalml.pipelines.components.LSA.save evalml.pipelines.components.LSA.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the input data. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. 
py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by applying the LSA pipeline. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X. The original column is removed and replaced with two columns of the format `LSA(original_column_name)[feature_number]`, where `feature_number` is 0 or 1. :rtype: pd.DataFrame .. py:class:: NaturalLanguageFeaturizer(random_seed=0, **kwargs) Transformer that can automatically featurize text columns using featuretools' nlp_primitives. Since models cannot handle non-numeric data, any text must be broken down into features that provide useful information about that text. This component splits each text column into several informative features: Diversity Score, Mean Characters per Word, Polarity Score, LSA (Latent Semantic Analysis), Number of Characters, and Number of Words. Calling transform on this component will replace any text columns in the given dataset with these numeric columns. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. 
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Natural Language Featurizer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.NaturalLanguageFeaturizer.clone evalml.pipelines.components.NaturalLanguageFeaturizer.default_parameters evalml.pipelines.components.NaturalLanguageFeaturizer.describe evalml.pipelines.components.NaturalLanguageFeaturizer.fit evalml.pipelines.components.NaturalLanguageFeaturizer.fit_transform evalml.pipelines.components.NaturalLanguageFeaturizer.load evalml.pipelines.components.NaturalLanguageFeaturizer.needs_fitting evalml.pipelines.components.NaturalLanguageFeaturizer.parameters evalml.pipelines.components.NaturalLanguageFeaturizer.save evalml.pipelines.components.NaturalLanguageFeaturizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. 
:param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by creating new features using existing text columns. :param X: The data to transform. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: OneHotEncoder(top_n=10, features_to_encode=None, categories=None, drop='if_binary', handle_unknown='ignore', handle_missing='error', random_seed=0, **kwargs) A transformer that encodes categorical features in a one-hot numeric array. :param top_n: Number of categories per column to encode. If None, all categories will be encoded. 
Otherwise, the `n` most frequent will be encoded and all others will be dropped. Defaults to 10. :type top_n: int :param features_to_encode: List of columns to encode. All other columns will remain untouched. If None, all appropriate columns will be encoded. Defaults to None. :type features_to_encode: list[str] :param categories: A two dimensional list of categories, where `categories[i]` is a list of the categories for the column at index `i`. This can also be `None`, or `"auto"` if `top_n` is not None. Defaults to None. :type categories: list :param drop: Method ("first" or "if_binary") to use to drop one category per feature. Can also be a list specifying which categories to drop for each feature. Defaults to 'if_binary'. :type drop: string, list :param handle_unknown: Whether to ignore or error for unknown categories for a feature encountered during `fit` or `transform`. If either `top_n` or `categories` is used to limit the number of categories per column, this must be "ignore". Defaults to "ignore". :type handle_unknown: string :param handle_missing: Options for how to handle missing (NaN) values encountered during `fit` or `transform`. If this is set to "as_category" and NaN values are within the `n` most frequent, "nan" values will be encoded as their own column. If this is set to "error", any missing values encountered will raise an error. Defaults to "error". :type handle_missing: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - One Hot Encoder * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.OneHotEncoder.categories evalml.pipelines.components.OneHotEncoder.clone evalml.pipelines.components.OneHotEncoder.default_parameters evalml.pipelines.components.OneHotEncoder.describe evalml.pipelines.components.OneHotEncoder.fit evalml.pipelines.components.OneHotEncoder.fit_transform evalml.pipelines.components.OneHotEncoder.get_feature_names evalml.pipelines.components.OneHotEncoder.load evalml.pipelines.components.OneHotEncoder.needs_fitting evalml.pipelines.components.OneHotEncoder.parameters evalml.pipelines.components.OneHotEncoder.save evalml.pipelines.components.OneHotEncoder.transform .. py:method:: categories(self, feature_name) Returns a list of the unique categories to be encoded for the particular feature, in order. :param feature_name: The name of any feature provided to one-hot encoder during fit. :type feature_name: str :returns: The unique categories, in the same dtype as they were provided during fit. :rtype: np.ndarray :raises ValueError: If feature was not provided to one-hot encoder as a training feature. .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. 
py:method:: fit(self, X, y=None) Fits the one-hot encoder component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If encoding a column failed. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: get_feature_names(self) Return feature names for the categorical features after fitting. Feature names are formatted as {column name}_{category name}. In the event of a duplicate name, an integer will be added at the end of the feature name to distinguish it. For example, consider a dataframe with a column called "A" and category "x_y" and another column called "A_x" with "y". In this example, the feature names would be "A_x_y" and "A_x_y_1". :returns: The feature names after encoding, provided in the same order as input_features. :rtype: np.ndarray .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. 
:type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) One-hot encode the input data. :param X: Features to one-hot encode. :type X: pd.DataFrame :param y: Ignored. :type y: pd.Series :returns: Transformed data, where each categorical feature has been encoded into numerical columns using one-hot encoding. :rtype: pd.DataFrame .. py:class:: Oversampler(sampling_ratio=0.25, sampling_ratio_dict=None, k_neighbors_default=5, n_jobs=-1, random_seed=0, **kwargs) SMOTE Oversampler component. Will automatically select whether to use SMOTE, SMOTEN, or SMOTENC based on inputs to the component. :param sampling_ratio: This is the goal ratio of the minority to majority class, with range (0, 1]. A value of 0.25 means we want a 1:4 ratio of the minority to majority class after oversampling. We will create a sampling dictionary using this ratio, with the keys corresponding to the class and the values corresponding to the number of samples. Defaults to 0.25. :type sampling_ratio: float :param sampling_ratio_dict: A dictionary specifying the desired balanced ratio for each target value. For instance, in a binary case where class 1 is the minority, we could specify: `sampling_ratio_dict={0: 0.5, 1: 1}`, which means we would undersample class 0 to have twice the number of samples as class 1 (minority:majority ratio = 0.5), and would not sample class 1. Overrides sampling_ratio if provided. Defaults to None. :type sampling_ratio_dict: dict :param k_neighbors_default: The number of nearest neighbors used to construct synthetic samples. This is the default value used, but the actual k_neighbors value might be smaller if there are fewer samples. Defaults to 5. :type k_neighbors_default: int :param n_jobs: The number of CPU cores to use. Defaults to -1. :type n_jobs: int :param random_seed: The seed to use for random sampling. Defaults to 0. :type random_seed: int **Attributes** ..
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Oversampler * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.Oversampler.clone evalml.pipelines.components.Oversampler.default_parameters evalml.pipelines.components.Oversampler.describe evalml.pipelines.components.Oversampler.fit evalml.pipelines.components.Oversampler.fit_transform evalml.pipelines.components.Oversampler.load evalml.pipelines.components.Oversampler.needs_fitting evalml.pipelines.components.Oversampler.parameters evalml.pipelines.components.Oversampler.save evalml.pipelines.components.Oversampler.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits oversampler to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y) Fit and transform data using the sampler component. 
:param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms the input data by sampling the data. :param X: Training features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: Transformed features and target. :rtype: pd.DataFrame, pd.Series .. py:class:: PCA(variance=0.95, n_components=None, random_seed=0, **kwargs) Reduces the number of features by using Principal Component Analysis (PCA). :param variance: The percentage of the original data variance that should be preserved when reducing the number of features. Defaults to 0.95. :type variance: float :param n_components: The number of features to maintain after computing SVD. Defaults to None, but will override variance variable if set. :type n_components: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. 
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {"variance": Real(0.25, 1)} * - **modifies_features** - True * - **modifies_target** - False * - **name** - PCA Transformer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PCA.clone evalml.pipelines.components.PCA.default_parameters evalml.pipelines.components.PCA.describe evalml.pipelines.components.PCA.fit evalml.pipelines.components.PCA.fit_transform evalml.pipelines.components.PCA.load evalml.pipelines.components.PCA.needs_fitting evalml.pipelines.components.PCA.parameters evalml.pipelines.components.PCA.save evalml.pipelines.components.PCA.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the PCA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises ValueError: If input data is not all numeric. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the PCA component.
:param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using fitted PCA component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame :raises ValueError: If input data is not all numeric. .. py:class:: PerColumnImputer(impute_strategies=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy per column. :param impute_strategies: Column and {"impute_strategy": strategy, "fill_value":value} pairings. Valid values for impute strategy include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. Defaults to None, which uses "most_frequent" for all columns. 
When impute_strategy == "constant", fill_value is used to replace missing data. When None, uses 0 when imputing numerical data and "missing_value" for strings or object data types. :type impute_strategies: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Per Column Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PerColumnImputer.clone evalml.pipelines.components.PerColumnImputer.default_parameters evalml.pipelines.components.PerColumnImputer.describe evalml.pipelines.components.PerColumnImputer.fit evalml.pipelines.components.PerColumnImputer.fit_transform evalml.pipelines.components.PerColumnImputer.load evalml.pipelines.components.PerColumnImputer.needs_fitting evalml.pipelines.components.PerColumnImputer.parameters evalml.pipelines.components.PerColumnImputer.save evalml.pipelines.components.PerColumnImputer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. 
py:method:: fit(self, X, y=None) Fits imputers on input data. :param X: The input training data of shape [n_samples, n_features] to fit. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. Ignored. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by imputing missing values. :param X: The input training data of shape [n_samples, n_features] to transform. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: PolynomialDetrender(degree=1, random_seed=0, **kwargs) Removes trends from time series by fitting a polynomial to the data. :param degree: Degree for the polynomial. 
If 1, linear model is fit to the data. If 2, quadratic model is fit, etc. Defaults to 1. :type degree: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "degree": Integer(1, 3)} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Polynomial Detrender * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.PolynomialDetrender.clone evalml.pipelines.components.PolynomialDetrender.default_parameters evalml.pipelines.components.PolynomialDetrender.describe evalml.pipelines.components.PolynomialDetrender.fit evalml.pipelines.components.PolynomialDetrender.fit_transform evalml.pipelines.components.PolynomialDetrender.inverse_transform evalml.pipelines.components.PolynomialDetrender.load evalml.pipelines.components.PolynomialDetrender.needs_fitting evalml.pipelines.components.PolynomialDetrender.parameters evalml.pipelines.components.PolynomialDetrender.save evalml.pipelines.components.PolynomialDetrender.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. 
py:method:: fit(self, X, y=None) Fits the PolynomialDetrender. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to detrend. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: fit_transform(self, X, y=None) Removes fitted trend from target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to detrend. :type y: pd.Series :returns: The first element is the input features, returned without modification. The second element is the target variable y with the fitted trend removed. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: inverse_transform(self, y) Adds back fitted trend to target variable. :param y: Target variable. :type y: pd.Series :returns: The first element is the input features, returned without modification. The second element is the target variable y with the trend added back. :rtype: tuple of pd.DataFrame, pd.Series :raises ValueError: If y is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Removes fitted trend from target variable. :param X: Ignored. :type X: pd.DataFrame, optional :param y: Target variable to detrend.
:type y: pd.Series :returns: The input features are returned without modification. The target variable y is detrended. :rtype: tuple of pd.DataFrame, pd.Series .. py:class:: ProphetRegressor(time_index=None, changepoint_prior_scale=0.05, seasonality_prior_scale=10, holidays_prior_scale=10, seasonality_mode='additive', random_seed=0, stan_backend='CMDSTANPY', **kwargs) Prophet is a procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. More information here: https://facebook.github.io/prophet/ **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "changepoint_prior_scale": Real(0.001, 0.5), "seasonality_prior_scale": Real(0.01, 10), "holidays_prior_scale": Real(0.01, 10), "seasonality_mode": ["additive", "multiplicative"],} * - **model_family** - ModelFamily.PROPHET * - **modifies_features** - True * - **modifies_target** - False * - **name** - Prophet Regressor * - **supported_problem_types** - [ProblemTypes.TIME_SERIES_REGRESSION] * - **training_only** - False **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.ProphetRegressor.build_prophet_df evalml.pipelines.components.ProphetRegressor.clone evalml.pipelines.components.ProphetRegressor.default_parameters evalml.pipelines.components.ProphetRegressor.describe evalml.pipelines.components.ProphetRegressor.feature_importance evalml.pipelines.components.ProphetRegressor.fit evalml.pipelines.components.ProphetRegressor.get_params evalml.pipelines.components.ProphetRegressor.load evalml.pipelines.components.ProphetRegressor.needs_fitting evalml.pipelines.components.ProphetRegressor.parameters evalml.pipelines.components.ProphetRegressor.predict evalml.pipelines.components.ProphetRegressor.predict_proba evalml.pipelines.components.ProphetRegressor.save .. py:method:: build_prophet_df(X, y=None, time_index='ds') :staticmethod: Builds the Prophet data to pass to fit and predict. .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns an array of zeros of length 1, as feature_importance is not defined for the Prophet regressor. .. py:method:: fit(self, X, y=None) Fits Prophet regressor component to data. :param X: The input training data of shape [n_samples, n_features].
:type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_params(self) Get parameters for the Prophet regressor. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X, y=None) Make predictions using fitted Prophet regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: RandomForestClassifier(n_estimators=100, max_depth=6, n_jobs=-1, random_seed=0, **kwargs) Random Forest Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: int :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1.
:type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_depth": Integer(1, 10),} * - **model_family** - ModelFamily.RANDOM_FOREST * - **modifies_features** - True * - **modifies_target** - False * - **name** - Random Forest Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RandomForestClassifier.clone evalml.pipelines.components.RandomForestClassifier.default_parameters evalml.pipelines.components.RandomForestClassifier.describe evalml.pipelines.components.RandomForestClassifier.feature_importance evalml.pipelines.components.RandomForestClassifier.fit evalml.pipelines.components.RandomForestClassifier.load evalml.pipelines.components.RandomForestClassifier.needs_fitting evalml.pipelines.components.RandomForestClassifier.parameters evalml.pipelines.components.RandomForestClassifier.predict evalml.pipelines.components.RandomForestClassifier.predict_proba evalml.pipelines.components.RandomForestClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. 
:rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: RandomForestRegressor(n_estimators=100, max_depth=6, n_jobs=-1, random_seed=0, **kwargs) Random Forest Regressor. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_depth": Integer(1, 32),} * - **model_family** - ModelFamily.RANDOM_FOREST * - **modifies_features** - True * - **modifies_target** - False * - **name** - Random Forest Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.RandomForestRegressor.clone evalml.pipelines.components.RandomForestRegressor.default_parameters evalml.pipelines.components.RandomForestRegressor.describe evalml.pipelines.components.RandomForestRegressor.feature_importance evalml.pipelines.components.RandomForestRegressor.fit evalml.pipelines.components.RandomForestRegressor.load evalml.pipelines.components.RandomForestRegressor.needs_fitting evalml.pipelines.components.RandomForestRegressor.parameters evalml.pipelines.components.RandomForestRegressor.predict evalml.pipelines.components.RandomForestRegressor.predict_proba evalml.pipelines.components.RandomForestRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. 
:type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: ReplaceNullableTypes(random_seed=0, **kwargs) Transformer to replace features with the new nullable dtypes with a dtype that is compatible in EvalML. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **modifies_features** - True * - **modifies_target** - {} * - **name** - Replace Nullable Types Transformer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.ReplaceNullableTypes.clone evalml.pipelines.components.ReplaceNullableTypes.default_parameters evalml.pipelines.components.ReplaceNullableTypes.describe evalml.pipelines.components.ReplaceNullableTypes.fit evalml.pipelines.components.ReplaceNullableTypes.fit_transform evalml.pipelines.components.ReplaceNullableTypes.load evalml.pipelines.components.ReplaceNullableTypes.needs_fitting evalml.pipelines.components.ReplaceNullableTypes.parameters evalml.pipelines.components.ReplaceNullableTypes.save evalml.pipelines.components.ReplaceNullableTypes.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Substitutes non-nullable types for the new pandas nullable types in the data and target data. :param X: Input features. :type X: pd.DataFrame, optional :param y: Target data. 
:type y: pd.Series :returns: The input features and target data with the non-nullable types set. :rtype: tuple of pd.DataFrame, pd.Series .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data by replacing columns that contain nullable types with the appropriate replacement type: "float64" for nullable integers and "category" for nullable booleans. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data to transform. :type y: pd.Series, optional :returns: Transformed X and transformed y. :rtype: tuple of pd.DataFrame, pd.Series .. py:class:: RFClassifierSelectFromModel(number_features=None, n_estimators=10, max_depth=None, percent_features=0.5, threshold='median', n_jobs=-1, random_seed=0, **kwargs) Selects top features based on importance weights using a Random Forest classifier. :param number_features: The maximum number of features to select. If both percent_features and number_features are specified, take the greater number of features. Defaults to None. :type number_features: int :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: float :param max_depth: Maximum tree depth for base learners.
Defaults to None. :type max_depth: int :param percent_features: Percentage of features to use. If both percent_features and number_features are specified, take the greater number of features. Defaults to 0.5. :type percent_features: float :param threshold: The threshold value to use for feature selection. Features whose importance is greater than or equal to the threshold are kept while the others are discarded. If "median", then the threshold value is the median of the feature importances. A scaling factor (e.g., "1.25*mean") may also be used. Defaults to "median". :type threshold: string or float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "percent_features": Real(0.01, 1), "threshold": ["mean", "median"],} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RF Classifier Select From Model * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFClassifierSelectFromModel.clone evalml.pipelines.components.RFClassifierSelectFromModel.default_parameters evalml.pipelines.components.RFClassifierSelectFromModel.describe evalml.pipelines.components.RFClassifierSelectFromModel.fit evalml.pipelines.components.RFClassifierSelectFromModel.fit_transform evalml.pipelines.components.RFClassifierSelectFromModel.get_names evalml.pipelines.components.RFClassifierSelectFromModel.load evalml.pipelines.components.RFClassifierSelectFromModel.needs_fitting evalml.pipelines.components.RFClassifierSelectFromModel.parameters evalml.pipelines.components.RFClassifierSelectFromModel.save evalml.pipelines.components.RFClassifierSelectFromModel.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state.
:returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. 
This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise a MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform. .. py:class:: RFRegressorSelectFromModel(number_features=None, n_estimators=10, max_depth=None, percent_features=0.5, threshold='median', n_jobs=-1, random_seed=0, **kwargs) Selects top features based on importance weights using a Random Forest regressor. :param number_features: The maximum number of features to select. If both percent_features and number_features are specified, take the greater number of features. Defaults to None. :type number_features: int :param n_estimators: The number of trees in the forest. Defaults to 10. :type n_estimators: float :param max_depth: Maximum tree depth for base learners. Defaults to None. :type max_depth: int :param percent_features: Percentage of features to use. If both percent_features and number_features are specified, take the greater number of features. Defaults to 0.5. :type percent_features: float :param threshold: The threshold value to use for feature selection.
Features whose importance is greater than or equal to the threshold are kept while the others are discarded. If "median", then the threshold value is the median of the feature importances. A scaling factor (e.g., "1.25*mean") may also be used. Defaults to "median". :type threshold: string or float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "percent_features": Real(0.01, 1), "threshold": ["mean", "median"],} * - **modifies_features** - True * - **modifies_target** - False * - **name** - RF Regressor Select From Model * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.RFRegressorSelectFromModel.clone evalml.pipelines.components.RFRegressorSelectFromModel.default_parameters evalml.pipelines.components.RFRegressorSelectFromModel.describe evalml.pipelines.components.RFRegressorSelectFromModel.fit evalml.pipelines.components.RFRegressorSelectFromModel.fit_transform evalml.pipelines.components.RFRegressorSelectFromModel.get_names evalml.pipelines.components.RFRegressorSelectFromModel.load evalml.pipelines.components.RFRegressorSelectFromModel.needs_fitting evalml.pipelines.components.RFRegressorSelectFromModel.parameters evalml.pipelines.components.RFRegressorSelectFromModel.save evalml.pipelines.components.RFRegressorSelectFromModel.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict ..
py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the feature selector. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_names(self) Get names of selected features. :returns: List of the names of features selected. :rtype: list[str] .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. 
:type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input data by selecting features. If the component_obj does not have a transform method, will raise a MethodPropertyNotFoundError exception. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If feature selector does not have a transform method or a component_obj that implements transform. .. py:class:: SelectByType(column_types=None, exclude=False, random_seed=0, **kwargs) Selects columns by specified Woodwork logical type or semantic tag in input data. :param column_types: List of Woodwork types or tags, used to determine which columns to select or exclude. :type column_types: string, ww.LogicalType, list(string), list(ww.LogicalType) :param exclude: If true, exclude the column_types instead of including them. Defaults to False. :type exclude: bool :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Select Columns By Type Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SelectByType.clone evalml.pipelines.components.SelectByType.default_parameters evalml.pipelines.components.SelectByType.describe evalml.pipelines.components.SelectByType.fit evalml.pipelines.components.SelectByType.fit_transform evalml.pipelines.components.SelectByType.load evalml.pipelines.components.SelectByType.parameters evalml.pipelines.components.SelectByType.save evalml.pipelines.components.SelectByType.transform ..
py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, ignored :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. 
py:method:: transform(self, X, y=None) Transforms data X by selecting columns. :param X: Data to transform. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: Transformed X. :rtype: pd.DataFrame .. py:class:: SelectColumns(columns=None, random_seed=0, **kwargs) Selects specified columns in input data. :param columns: List of column names, used to determine which columns to select. If columns are not present, they will not be selected. :type columns: list(string) :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Select Columns Transformer * - **needs_fitting** - False * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SelectColumns.clone evalml.pipelines.components.SelectColumns.default_parameters evalml.pipelines.components.SelectColumns.describe evalml.pipelines.components.SelectColumns.fit evalml.pipelines.components.SelectColumns.fit_transform evalml.pipelines.components.SelectColumns.load evalml.pipelines.components.SelectColumns.parameters evalml.pipelines.components.SelectColumns.save evalml.pipelines.components.SelectColumns.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the transformer by checking if column names are present in the dataset. :param X: Data to check. :type X: pd.DataFrame :param y: Targets. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using fitted column selector component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:class:: SimpleImputer(impute_strategy='most_frequent', fill_value=None, random_seed=0, **kwargs) Imputes missing data according to a specified imputation strategy. Natural language columns are ignored. 
:param impute_strategy: Impute strategy to use. Valid values include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. :type impute_strategy: string :param fill_value: When impute_strategy == "constant", fill_value is used to replace missing data. Defaults to 0 when imputing numerical data and "missing_value" for strings or object data types. :type fill_value: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "impute_strategy": ["mean", "median", "most_frequent"]} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Simple Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SimpleImputer.clone evalml.pipelines.components.SimpleImputer.default_parameters evalml.pipelines.components.SimpleImputer.describe evalml.pipelines.components.SimpleImputer.fit evalml.pipelines.components.SimpleImputer.fit_transform evalml.pipelines.components.SimpleImputer.load evalml.pipelines.components.SimpleImputer.needs_fitting evalml.pipelines.components.SimpleImputer.parameters evalml.pipelines.components.SimpleImputer.save evalml.pipelines.components.SimpleImputer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: the input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: the target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises ValueError: if the SimpleImputer receives a dataframe with both Boolean and Categorical data. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms input by imputing missing values. 'None' and np.nan values are treated as the same. :param X: Data to transform. 
:type X: pd.DataFrame :param y: Ignored. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame .. py:class:: StackedEnsembleClassifier(final_estimator=None, n_jobs=-1, random_seed=0, **kwargs) Stacked Ensemble Classifier. :param final_estimator: The classifier used to combine the base estimators. If None, uses ElasticNetClassifier. :type final_estimator: Estimator or subclass :param n_jobs: Integer describing level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Defaults to -1. - Note: there could be some multi-process errors thrown for values of `n_jobs != 1`. If this is the case, please use `n_jobs = 1`. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> from evalml.pipelines.component_graph import ComponentGraph >>> from evalml.pipelines.components.estimators.classifiers.decision_tree_classifier import DecisionTreeClassifier >>> from evalml.pipelines.components.estimators.classifiers.elasticnet_classifier import ElasticNetClassifier ... >>> component_graph = { ... "Decision Tree": [DecisionTreeClassifier(random_seed=3), "X", "y"], ... "Decision Tree B": [DecisionTreeClassifier(random_seed=4), "X", "y"], ... "Stacked Ensemble": [ ... StackedEnsembleClassifier(n_jobs=1, final_estimator=DecisionTreeClassifier()), ... "Decision Tree.x", ... "Decision Tree B.x", ... "y", ... ], ... } ... >>> cg = ComponentGraph(component_graph) >>> assert cg.default_parameters == { ... 'Decision Tree Classifier': {'criterion': 'gini', ... 'max_features': 'auto', ... 'max_depth': 6, ... 'min_samples_split': 2, ... 'min_weight_fraction_leaf': 0.0}, ... 'Stacked Ensemble Classifier': {'final_estimator': ElasticNetClassifier, ... 'n_jobs': -1}} **Attributes** .. 
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.ENSEMBLE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Stacked Ensemble Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.StackedEnsembleClassifier.clone evalml.pipelines.components.StackedEnsembleClassifier.default_parameters evalml.pipelines.components.StackedEnsembleClassifier.describe evalml.pipelines.components.StackedEnsembleClassifier.feature_importance evalml.pipelines.components.StackedEnsembleClassifier.fit evalml.pipelines.components.StackedEnsembleClassifier.load evalml.pipelines.components.StackedEnsembleClassifier.needs_fitting evalml.pipelines.components.StackedEnsembleClassifier.parameters evalml.pipelines.components.StackedEnsembleClassifier.predict evalml.pipelines.components.StackedEnsembleClassifier.predict_proba evalml.pipelines.components.StackedEnsembleClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for stacked ensemble classes. :returns: default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. 
py:method:: feature_importance(self) :property: Not implemented for StackedEnsembleClassifier and StackedEnsembleRegressor. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: StackedEnsembleRegressor(final_estimator=None, n_jobs=-1, random_seed=0, **kwargs) Stacked Ensemble Regressor. 
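The ``n_jobs`` convention used by the stacked ensemble components (``None`` and 1 are equivalent, -1 uses all CPUs, and values below -1 use ``n_cpus + 1 + n_jobs``) can be sketched in plain Python. This is only an illustration of that arithmetic; ``effective_workers`` is a hypothetical helper, not part of evalml.

```python
import os


def effective_workers(n_jobs, n_cpus=None):
    """Hypothetical helper: resolve an n_jobs value to a worker count."""
    if n_cpus is None:
        # Assumption: fall back to one worker if the CPU count is unknown.
        n_cpus = os.cpu_count() or 1
    if n_jobs is None:
        # None and 1 are treated as equivalent.
        return 1
    if n_jobs == 0:
        raise ValueError("n_jobs must not be 0")
    if n_jobs < 0:
        # -1 -> all CPUs, -2 -> all but one, and so on (never fewer than 1).
        return max(n_cpus + 1 + n_jobs, 1)
    return n_jobs


print(effective_workers(-1, n_cpus=8))  # all CPUs
print(effective_workers(-2, n_cpus=8))  # all but one CPU
```

If multi-process errors appear for ``n_jobs != 1``, passing ``n_jobs=1`` corresponds to the single-worker case above.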
:param final_estimator: The regressor used to combine the base estimators. If None, uses ElasticNetRegressor. :type final_estimator: Estimator or subclass :param n_jobs: Integer describing level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Defaults to -1. - Note: there could be some multi-process errors thrown for values of `n_jobs != 1`. If this is the case, please use `n_jobs = 1`. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> from evalml.pipelines.component_graph import ComponentGraph >>> from evalml.pipelines.components.estimators.regressors.rf_regressor import RandomForestRegressor >>> from evalml.pipelines.components.estimators.regressors.elasticnet_regressor import ElasticNetRegressor ... >>> component_graph = { ... "Random Forest": [RandomForestRegressor(random_seed=3), "X", "y"], ... "Random Forest B": [RandomForestRegressor(random_seed=4), "X", "y"], ... "Stacked Ensemble": [ ... StackedEnsembleRegressor(n_jobs=1, final_estimator=RandomForestRegressor()), ... "Random Forest.x", ... "Random Forest B.x", ... "y", ... ], ... } ... >>> cg = ComponentGraph(component_graph) >>> assert cg.default_parameters == { ... 'Random Forest Regressor': {'n_estimators': 100, ... 'max_depth': 6, ... 'n_jobs': -1}, ... 'Stacked Ensemble Regressor': {'final_estimator': ElasticNetRegressor, ... 'n_jobs': -1}} **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.ENSEMBLE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Stacked Ensemble Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.StackedEnsembleRegressor.clone evalml.pipelines.components.StackedEnsembleRegressor.default_parameters evalml.pipelines.components.StackedEnsembleRegressor.describe evalml.pipelines.components.StackedEnsembleRegressor.feature_importance evalml.pipelines.components.StackedEnsembleRegressor.fit evalml.pipelines.components.StackedEnsembleRegressor.load evalml.pipelines.components.StackedEnsembleRegressor.needs_fitting evalml.pipelines.components.StackedEnsembleRegressor.parameters evalml.pipelines.components.StackedEnsembleRegressor.predict evalml.pipelines.components.StackedEnsembleRegressor.predict_proba evalml.pipelines.components.StackedEnsembleRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for stacked ensemble classes. :returns: default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Not implemented for StackedEnsembleClassifier and StackedEnsembleRegressor. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. 
:param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: StandardScaler(random_seed=0, **kwargs) A transformer that standardizes input features by removing the mean and scaling to unit variance. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Standard Scaler * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.StandardScaler.clone evalml.pipelines.components.StandardScaler.default_parameters evalml.pipelines.components.StandardScaler.describe evalml.pipelines.components.StandardScaler.fit evalml.pipelines.components.StandardScaler.fit_transform evalml.pipelines.components.StandardScaler.load evalml.pipelines.components.StandardScaler.needs_fitting evalml.pipelines.components.StandardScaler.parameters evalml.pipelines.components.StandardScaler.save evalml.pipelines.components.StandardScaler.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the standard scaler on the given data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y=None) Fit and transform data using the standard scaler component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data.
:rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transform data using the fitted standard scaler. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:class:: SVMClassifier(C=1.0, kernel='rbf', gamma='auto', probability=True, random_seed=0, **kwargs) Support Vector Machine Classifier. :param C: The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty. Defaults to 1.0. :type C: float :param kernel: Specifies the kernel type to be used in the algorithm. Defaults to "rbf". :type kernel: {"poly", "rbf", "sigmoid"} :param gamma: Kernel coefficient for "rbf", "poly" and "sigmoid". Defaults to "auto". - If gamma='scale' is passed then it uses 1 / (n_features * X.var()) as value of gamma - If "auto" (default), uses 1 / n_features :type gamma: {"scale", "auto"} or float :param probability: Whether to enable probability estimates. Defaults to True. 
:type probability: boolean :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0, 10), "kernel": ["poly", "rbf", "sigmoid"], "gamma": ["scale", "auto"],} * - **model_family** - ModelFamily.SVM * - **modifies_features** - True * - **modifies_target** - False * - **name** - SVM Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.SVMClassifier.clone evalml.pipelines.components.SVMClassifier.default_parameters evalml.pipelines.components.SVMClassifier.describe evalml.pipelines.components.SVMClassifier.feature_importance evalml.pipelines.components.SVMClassifier.fit evalml.pipelines.components.SVMClassifier.load evalml.pipelines.components.SVMClassifier.needs_fitting evalml.pipelines.components.SVMClassifier.parameters evalml.pipelines.components.SVMClassifier.predict evalml.pipelines.components.SVMClassifier.predict_proba evalml.pipelines.components.SVMClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance only works with linear kernels. If the kernel isn't linear, we return a numpy array of zeros. :returns: Feature importance of fitted SVM classifier or a numpy array of zeroes if the kernel is not linear. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. 
:rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: SVMRegressor(C=1.0, kernel='rbf', gamma='auto', random_seed=0, **kwargs) Support Vector Machine Regressor. :param C: The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty. Defaults to 1.0. :type C: float :param kernel: Specifies the kernel type to be used in the algorithm. Defaults to "rbf". :type kernel: {"poly", "rbf", "sigmoid"} :param gamma: Kernel coefficient for "rbf", "poly" and "sigmoid". Defaults to "auto". - If gamma='scale' is passed then it uses 1 / (n_features * X.var()) as value of gamma - If "auto" (default), uses 1 / n_features :type gamma: {"scale", "auto"} or float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0, 10), "kernel": ["poly", "rbf", "sigmoid"], "gamma": ["scale", "auto"],} * - **model_family** - ModelFamily.SVM * - **modifies_features** - True * - **modifies_target** - False * - **name** - SVM Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.SVMRegressor.clone evalml.pipelines.components.SVMRegressor.default_parameters evalml.pipelines.components.SVMRegressor.describe evalml.pipelines.components.SVMRegressor.feature_importance evalml.pipelines.components.SVMRegressor.fit evalml.pipelines.components.SVMRegressor.load evalml.pipelines.components.SVMRegressor.needs_fitting evalml.pipelines.components.SVMRegressor.parameters evalml.pipelines.components.SVMRegressor.predict evalml.pipelines.components.SVMRegressor.predict_proba evalml.pipelines.components.SVMRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted SVM regressor. Only works with linear kernels. If the kernel isn't linear, we return a numpy array of zeros. :returns: The feature importance of the fitted SVM regressor, or an array of zeroes if the kernel is not linear. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples].
:type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: TargetEncoder(cols=None, smoothing=1, handle_unknown='value', handle_missing='value', random_seed=0, **kwargs) A transformer that encodes categorical features into target encodings. :param cols: Columns to encode. If None, all string columns will be encoded, otherwise only the columns provided will be encoded. Defaults to None :type cols: list :param smoothing: The smoothing factor to apply. 
The larger this value is, the more influence the expected target value has on the resulting target encodings. Must be strictly larger than 0. Defaults to 1.0. :type smoothing: float :param handle_unknown: Determines how to handle unknown categories for a feature encountered. Options are 'value', 'error', and 'return_nan'. Defaults to 'value', which replaces with the target mean. :type handle_unknown: string :param handle_missing: Determines how to handle missing values encountered during `fit` or `transform`. Options are 'value', 'error', and 'return_nan'. Defaults to 'value', which replaces with the target mean. :type handle_missing: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Target Encoder * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TargetEncoder.clone evalml.pipelines.components.TargetEncoder.default_parameters evalml.pipelines.components.TargetEncoder.describe evalml.pipelines.components.TargetEncoder.fit evalml.pipelines.components.TargetEncoder.fit_transform evalml.pipelines.components.TargetEncoder.get_feature_names evalml.pipelines.components.TargetEncoder.load evalml.pipelines.components.TargetEncoder.needs_fitting evalml.pipelines.components.TargetEncoder.parameters evalml.pipelines.components.TargetEncoder.save evalml.pipelines.components.TargetEncoder.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component.
:rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: fit_transform(self, X, y) Fit and transform data using the target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:method:: get_feature_names(self) Return feature names for the input features after fitting. :returns: The feature names after encoding. :rtype: np.array .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. 
py:method:: transform(self, X, y=None) Transform data using the fitted target encoder. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: pd.DataFrame .. py:class:: TargetImputer(impute_strategy='most_frequent', fill_value=None, random_seed=0, **kwargs) Imputes missing target data according to a specified imputation strategy. :param impute_strategy: Impute strategy to use. Valid values include "mean", "median", "most_frequent", "constant" for numerical data, and "most_frequent", "constant" for object data types. Defaults to "most_frequent". :type impute_strategy: string :param fill_value: When impute_strategy == "constant", fill_value is used to replace missing data. Defaults to None which uses 0 when imputing numerical data and "missing_value" for strings or object data types. :type fill_value: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "impute_strategy": ["mean", "median", "most_frequent"]} * - **modifies_features** - False * - **modifies_target** - True * - **name** - Target Imputer * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TargetImputer.clone evalml.pipelines.components.TargetImputer.default_parameters evalml.pipelines.components.TargetImputer.describe evalml.pipelines.components.TargetImputer.fit evalml.pipelines.components.TargetImputer.fit_transform evalml.pipelines.components.TargetImputer.load evalml.pipelines.components.TargetImputer.needs_fitting evalml.pipelines.components.TargetImputer.parameters evalml.pipelines.components.TargetImputer.save evalml.pipelines.components.TargetImputer.transform .. 
py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits imputer to target data. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: The input training data of shape [n_samples, n_features]. Ignored. :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self :raises TypeError: If target is filled with all null values. .. py:method:: fit_transform(self, X, y) Fits on and transforms the input target data. :param X: Features. Ignored. :type X: pd.DataFrame :param y: Target data to impute. :type y: pd.Series :returns: The original X, transformed y :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. 
py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y) Transforms input target data by imputing missing values. 'None' and np.nan values are treated as the same. :param X: Features. Ignored. :type X: pd.DataFrame :param y: Target data to impute. :type y: pd.Series :returns: The original X, transformed y :rtype: (pd.DataFrame, pd.Series) .. py:class:: TimeSeriesBaselineEstimator(gap=1, forecast_horizon=1, random_seed=0, **kwargs) Time series estimator that predicts using the naive forecasting approach. This is useful as a simple baseline estimator for time series problems. :param gap: Gap between the prediction date and the target date. Must be a positive integer. If gap is 0, target date will be shifted ahead by 1 time period. Defaults to 1. :type gap: int :param forecast_horizon: Number of time steps the model is expected to predict. :type forecast_horizon: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Time Series Baseline Estimator * - **supported_problem_types** - [ ProblemTypes.TIME_SERIES_REGRESSION, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesBaselineEstimator.clone evalml.pipelines.components.TimeSeriesBaselineEstimator.default_parameters evalml.pipelines.components.TimeSeriesBaselineEstimator.describe evalml.pipelines.components.TimeSeriesBaselineEstimator.feature_importance evalml.pipelines.components.TimeSeriesBaselineEstimator.fit evalml.pipelines.components.TimeSeriesBaselineEstimator.load evalml.pipelines.components.TimeSeriesBaselineEstimator.needs_fitting evalml.pipelines.components.TimeSeriesBaselineEstimator.parameters evalml.pipelines.components.TimeSeriesBaselineEstimator.predict evalml.pipelines.components.TimeSeriesBaselineEstimator.predict_proba evalml.pipelines.components.TimeSeriesBaselineEstimator.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline estimators do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes. :rtype: np.ndarray (float) .. py:method:: fit(self, X, y=None) Fits time series baseline estimator to data. 
:param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If input y is None. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using fitted time series baseline estimator. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises ValueError: If input y is None. .. py:method:: predict_proba(self, X) Make prediction probabilities using fitted time series baseline estimator. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame :raises ValueError: If input y is None. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: TimeSeriesFeaturizer(time_index=None, max_delay=2, gap=0, forecast_horizon=1, conf_level=0.05, rolling_window_size=0.25, delay_features=True, delay_target=True, random_seed=0, **kwargs) Transformer that delays input features and target variable for time series problems. 
This component uses an algorithm based on the autocorrelation values of the target variable to determine which lags to select from the set of all possible lags. The algorithm is based on the idea that the local maxima of the autocorrelation function indicate the lags that have the most impact on the present time. The algorithm computes the autocorrelation values and finds the local maxima, called "peaks", that are significant at the given conf_level. Since lags in the range [0, 10] tend to be predictive but not local maxima, the union of the peaks is taken with the significant lags in the range [0, 10]. At the end, only selected lags in the range [0, max_delay] are used. Parametrizing the algorithm by conf_level lets the AutoMLAlgorithm tune the set of lags chosen so that the chance of finding a good set of lags is higher. Using a conf_level value of 1 selects all possible lags. :param time_index: Name of the column containing the datetime information used to order the data. Ignored. :type time_index: str :param max_delay: Maximum number of time units to delay each feature. Defaults to 2. :type max_delay: int :param forecast_horizon: The number of time periods the pipeline is expected to forecast. Defaults to 1. :type forecast_horizon: int :param conf_level: Float in range (0, 1] that determines the confidence interval size used to select which lags to compute from the set of [1, max_delay]. A delay of 1 will always be computed. If 1, selects all possible lags in the set of [1, max_delay], inclusive. Defaults to 0.05. :type conf_level: float :param rolling_window_size: Float in range (0, 1] that determines the size of the window used for rolling features. Size is computed as rolling_window_size * max_delay. Defaults to 0.25. :type rolling_window_size: float :param delay_features: Whether to delay the input features. Defaults to True. :type delay_features: bool :param delay_target: Whether to delay the target. Defaults to True.
:type delay_target: bool :param gap: The number of time units between when the features are collected and when the target is collected. For example, if you are predicting the next time step's target, gap=1. This is only needed because when gap=0, we need to be sure to start the lagging of the target variable at 1. Defaults to 0. :type gap: int :param random_seed: Seed for the random number generator. This transformer performs the same regardless of the random seed provided. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {"conf_level": Real(0.001, 1.0), "rolling_window_size": Real(0.001, 1.0)} * - **modifies_features** - True * - **modifies_target** - False * - **name** - Time Series Featurizer * - **needs_fitting** - True * - **target_colname_prefix** - target_delay_{} * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesFeaturizer.clone evalml.pipelines.components.TimeSeriesFeaturizer.default_parameters evalml.pipelines.components.TimeSeriesFeaturizer.describe evalml.pipelines.components.TimeSeriesFeaturizer.fit evalml.pipelines.components.TimeSeriesFeaturizer.fit_transform evalml.pipelines.components.TimeSeriesFeaturizer.load evalml.pipelines.components.TimeSeriesFeaturizer.parameters evalml.pipelines.components.TimeSeriesFeaturizer.save evalml.pipelines.components.TimeSeriesFeaturizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters.
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the TimeSeriesFeaturizer. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises ValueError: If self.time_index is None. .. py:method:: fit_transform(self, X, y=None) Fit the component and transform the input data. :param X: Data to transform. :type X: pd.DataFrame :param y: Target. :type y: pd.Series, or None :returns: Transformed X. :rtype: pd.DataFrame .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Computes the delayed values and rolling means for X and y. The chosen delays are determined by the autocorrelation function of the target variable. See the class docstring for more information on how they are chosen. If y is None, all possible lags are chosen. If y is not None, it will also compute the delayed values for the target variable. The rolling means for all numeric features in X and y, if y is numeric, are also returned. :param X: Data to transform. None is expected when only the target variable is being used.
:type X: pd.DataFrame or None :param y: Target. :type y: pd.Series, or None :returns: Transformed X. No original features are returned. :rtype: pd.DataFrame .. py:class:: TimeSeriesImputer(categorical_impute_strategy='forwards_fill', numeric_impute_strategy='interpolate', target_impute_strategy='forwards_fill', random_seed=0, **kwargs) Imputes missing data according to a specified timeseries-specific imputation strategy. This Transformer should be used after the `TimeSeriesRegularizer` in order to impute the missing values that were added to X and y (if passed). :param categorical_impute_strategy: Impute strategy to use for string, object, boolean, categorical dtypes. Valid values include "backwards_fill" and "forwards_fill". Defaults to "forwards_fill". :type categorical_impute_strategy: string :param numeric_impute_strategy: Impute strategy to use for numeric columns. Valid values include "backwards_fill", "forwards_fill", and "interpolate". Defaults to "interpolate". :type numeric_impute_strategy: string :param target_impute_strategy: Impute strategy to use for the target column. Valid values include "backwards_fill", "forwards_fill", and "interpolate". Defaults to "forwards_fill". :type target_impute_strategy: string :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :raises ValueError: If categorical_impute_strategy, numeric_impute_strategy, or target_impute_strategy is not one of the valid values. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "categorical_impute_strategy": ["backwards_fill", "forwards_fill"], "numeric_impute_strategy": ["backwards_fill", "forwards_fill", "interpolate"], "target_impute_strategy": ["backwards_fill", "forwards_fill", "interpolate"],} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Time Series Imputer * - **training_only** - True **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesImputer.clone evalml.pipelines.components.TimeSeriesImputer.default_parameters evalml.pipelines.components.TimeSeriesImputer.describe evalml.pipelines.components.TimeSeriesImputer.fit evalml.pipelines.components.TimeSeriesImputer.fit_transform evalml.pipelines.components.TimeSeriesImputer.load evalml.pipelines.components.TimeSeriesImputer.needs_fitting evalml.pipelines.components.TimeSeriesImputer.parameters evalml.pipelines.components.TimeSeriesImputer.save evalml.pipelines.components.TimeSeriesImputer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits imputer to data. 'None' values are converted to np.nan before imputation and are treated as the same. If a value is missing at the beginning or end of a column, that value will be imputed using backwards fill or forwards fill as necessary, respectively. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame, np.ndarray :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self .. 
py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X by imputing missing values using specified timeseries-specific strategies. 'None' values are converted to np.nan before imputation and are treated as the same. :param X: Data to transform. :type X: pd.DataFrame :param y: Optionally, target data to transform. :type y: pd.Series, optional :returns: Transformed X and y :rtype: pd.DataFrame .. py:class:: TimeSeriesRegularizer(time_index=None, frequency_payload=None, window_length=4, threshold=0.4, random_seed=0, **kwargs) Transformer that regularizes an inconsistently spaced datetime column. If X is passed in to fit/transform, the column `time_index` will be checked for an inferrable offset frequency. 
If the `time_index` column is perfectly inferrable, then this Transformer will do nothing and return the original X and y. If X does not have a perfectly inferrable frequency but one can be estimated, then X and y will be reformatted based on the estimated frequency for `time_index`. In the original X and y passed: - Missing datetime values will be added and will have their corresponding columns in X and y set to None. - Duplicate datetime values will be dropped. - Extra datetime values will be dropped. - If it can be determined that a duplicate or extra value is misaligned, then it will be repositioned to take the place of a missing value. This Transformer should be used before the `TimeSeriesImputer` in order to impute the missing values that were added to X and y (if passed). :param time_index: Name of the column containing the datetime information used to order the data, required. Defaults to None. :type time_index: string :param frequency_payload: Payload returned from Woodwork's infer_frequency function where debug is True. Defaults to None. :type frequency_payload: tuple :param window_length: The size of the rolling window over which inference is conducted to determine the prevalence of uninferrable frequencies. Lower values make this component more sensitive to recognizing numerous faulty datetime values. Defaults to 4. :type window_length: int :param threshold: The minimum percentage of windows that need to have been able to infer a frequency. Lower values make this component more sensitive to recognizing numerous faulty datetime values. Defaults to 0.4. :type threshold: float :param random_seed: Seed for the random number generator. This transformer performs the same regardless of the random seed provided. Defaults to 0. :type random_seed: int :raises ValueError: If the frequency_payload parameter has not been passed a tuple. **Attributes** ..
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Time Series Regularizer * - **training_only** - True **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.TimeSeriesRegularizer.clone evalml.pipelines.components.TimeSeriesRegularizer.default_parameters evalml.pipelines.components.TimeSeriesRegularizer.describe evalml.pipelines.components.TimeSeriesRegularizer.fit evalml.pipelines.components.TimeSeriesRegularizer.fit_transform evalml.pipelines.components.TimeSeriesRegularizer.load evalml.pipelines.components.TimeSeriesRegularizer.needs_fitting evalml.pipelines.components.TimeSeriesRegularizer.parameters evalml.pipelines.components.TimeSeriesRegularizer.save evalml.pipelines.components.TimeSeriesRegularizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits the TimeSeriesRegularizer. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. 
:type y: pd.Series, optional :returns: self :raises ValueError: if self.time_index is None, if X and y have different lengths, if `time_index` in X does not have an offset frequency that can be estimated :raises TypeError: if the `time_index` column is not of type Datetime :raises KeyError: if the `time_index` column doesn't exist .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Regularizes a dataframe and target data to an inferrable offset frequency. A 'clean' X and y (if y was passed in) are created based on an inferrable offset frequency and matching datetime values with the original X and y are imputed into the clean X and y. Datetime values identified as misaligned are shifted into their appropriate position. :param X: The input training data of shape [n_samples, n_features]. 
:type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Data with an inferrable `time_index` offset frequency. :rtype: (pd.DataFrame, pd.Series) .. py:class:: Transformer(parameters=None, component_obj=None, random_seed=0, **kwargs) A component that may or may not need fitting that transforms data. These components are used before an estimator. To implement a new Transformer, define your own class which is a subclass of Transformer, including a name and a list of acceptable ranges for any parameters to be tuned during the automl search (hyperparameters). Define an `__init__` method which sets up any necessary state and objects. Make sure your `__init__` only uses standard keyword arguments and calls `super().__init__()` with a parameters dict. You may also override the `fit`, `transform`, `fit_transform` and other methods in this class if appropriate. To see some examples, check out the definitions of any Transformer component. :param parameters: Dictionary of parameters for the component. Defaults to None. :type parameters: dict :param component_obj: Third-party objects useful in component implementation. Defaults to None. :type component_obj: obj :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **modifies_features** - True * - **modifies_target** - False * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.Transformer.clone evalml.pipelines.components.Transformer.default_parameters evalml.pipelines.components.Transformer.describe evalml.pipelines.components.Transformer.fit evalml.pipelines.components.Transformer.fit_transform evalml.pipelines.components.Transformer.load evalml.pipelines.components.Transformer.name evalml.pipelines.components.Transformer.needs_fitting evalml.pipelines.components.Transformer.parameters evalml.pipelines.components.Transformer.save evalml.pipelines.components.Transformer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. 
:rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: name(cls) :property: Returns string name of this component. .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) :abstractmethod: Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:class:: Undersampler(sampling_ratio=0.25, sampling_ratio_dict=None, min_samples=100, min_percentage=0.1, random_seed=0, **kwargs) Initializes an undersampling transformer to downsample the majority classes in the dataset. This component is only run during training and not during predict. :param sampling_ratio: The smallest minority:majority ratio that is accepted as 'balanced'. For instance, a 1:4 ratio would be represented as 0.25, while a 1:1 ratio is 1.0. Must be between 0 and 1, inclusive. Defaults to 0.25. 
:type sampling_ratio: float :param sampling_ratio_dict: A dictionary specifying the desired balanced ratio for each target value. For instance, in a binary case where class 1 is the minority, we could specify: `sampling_ratio_dict={0: 0.5, 1: 1}`, which means we would undersample class 0 to have twice the number of samples as class 1 (minority:majority ratio = 0.5), and would not resample class 1. Overrides sampling_ratio if provided. Defaults to None. :type sampling_ratio_dict: dict :param min_samples: The minimum number of samples that we must have for any class, pre or post sampling. If a class must be downsampled, it will not be downsampled past this value. To determine severe imbalance, the minority class must occur less often than this and must have a class ratio below min_percentage. Must be greater than 0. Defaults to 100. :type min_samples: int :param min_percentage: The minimum percentage of the minority class to total dataset that we tolerate, as long as it is above min_samples. If min_percentage and min_samples are not met, treat this as severely imbalanced, and we will not resample the data. Must be between 0 and 0.5, inclusive. Defaults to 0.1. :type min_percentage: float :param random_seed: The seed to use for random sampling. Defaults to 0. :type random_seed: int :raises ValueError: If sampling_ratio is not in the range (0, 1]. :raises ValueError: If min_samples is not greater than 0. :raises ValueError: If min_percentage is not between 0 and 0.5, inclusive. **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - True * - **name** - Undersampler * - **training_only** - True **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.Undersampler.clone evalml.pipelines.components.Undersampler.default_parameters evalml.pipelines.components.Undersampler.describe evalml.pipelines.components.Undersampler.fit evalml.pipelines.components.Undersampler.fit_resample evalml.pipelines.components.Undersampler.fit_transform evalml.pipelines.components.Undersampler.load evalml.pipelines.components.Undersampler.needs_fitting evalml.pipelines.components.Undersampler.parameters evalml.pipelines.components.Undersampler.save evalml.pipelines.components.Undersampler.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y) Fits the sampler to the data. :param X: Input features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: fit_resample(self, X, y) Resampling technique for this sampler. :param X: Training data to fit and resample. :type X: pd.DataFrame :param y: Training data targets to fit and resample. :type y: pd.Series :returns: Indices to keep for training data. :rtype: list .. 
py:method:: fit_transform(self, X, y) Fit and transform data using the sampler component. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: Transformed data. :rtype: (pd.DataFrame, pd.Series) .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms the input data by sampling the data. :param X: Training features. :type X: pd.DataFrame :param y: Target. :type y: pd.Series :returns: Transformed features and target. :rtype: pd.DataFrame, pd.Series .. py:class:: URLFeaturizer(random_seed=0, **kwargs) Transformer that can automatically extract features from URL. :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **modifies_features** - True * - **modifies_target** - False * - **name** - URL Featurizer * - **training_only** - False **Methods** .. 
autoapisummary:: :nosignatures: evalml.pipelines.components.URLFeaturizer.clone evalml.pipelines.components.URLFeaturizer.default_parameters evalml.pipelines.components.URLFeaturizer.describe evalml.pipelines.components.URLFeaturizer.fit evalml.pipelines.components.URLFeaturizer.fit_transform evalml.pipelines.components.URLFeaturizer.load evalml.pipelines.components.URLFeaturizer.needs_fitting evalml.pipelines.components.URLFeaturizer.parameters evalml.pipelines.components.URLFeaturizer.save evalml.pipelines.components.URLFeaturizer.transform .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: fit(self, X, y=None) Fits component to data. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame :param y: The target training data of length [n_samples] :type y: pd.Series, optional :returns: self :raises MethodPropertyNotFoundError: If component does not have a fit method or a component_obj that implements fit. .. py:method:: fit_transform(self, X, y=None) Fits on X and transforms X. :param X: Data to fit and transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series :returns: Transformed X. 
:rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: transform(self, X, y=None) Transforms data X. :param X: Data to transform. :type X: pd.DataFrame :param y: Target data. :type y: pd.Series, optional :returns: Transformed X :rtype: pd.DataFrame :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform. .. py:class:: VowpalWabbitBinaryClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Binary Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. 
:type power_t: float :param passes: Number of training passes. Defaults to 1. :type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Binary Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.TIME_SERIES_BINARY,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitBinaryClassifier.clone evalml.pipelines.components.VowpalWabbitBinaryClassifier.default_parameters evalml.pipelines.components.VowpalWabbitBinaryClassifier.describe evalml.pipelines.components.VowpalWabbitBinaryClassifier.feature_importance evalml.pipelines.components.VowpalWabbitBinaryClassifier.fit evalml.pipelines.components.VowpalWabbitBinaryClassifier.load evalml.pipelines.components.VowpalWabbitBinaryClassifier.needs_fitting evalml.pipelines.components.VowpalWabbitBinaryClassifier.parameters evalml.pipelines.components.VowpalWabbitBinaryClassifier.predict evalml.pipelines.components.VowpalWabbitBinaryClassifier.predict_proba evalml.pipelines.components.VowpalWabbitBinaryClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. 
py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: VowpalWabbitMulticlassClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Multiclass Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param passes: Number of training passes. Defaults to 1. :type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Multiclass Classifier * - **supported_problem_types** - [ ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitMulticlassClassifier.clone evalml.pipelines.components.VowpalWabbitMulticlassClassifier.default_parameters evalml.pipelines.components.VowpalWabbitMulticlassClassifier.describe evalml.pipelines.components.VowpalWabbitMulticlassClassifier.feature_importance evalml.pipelines.components.VowpalWabbitMulticlassClassifier.fit evalml.pipelines.components.VowpalWabbitMulticlassClassifier.load evalml.pipelines.components.VowpalWabbitMulticlassClassifier.needs_fitting evalml.pipelines.components.VowpalWabbitMulticlassClassifier.parameters evalml.pipelines.components.VowpalWabbitMulticlassClassifier.predict evalml.pipelines.components.VowpalWabbitMulticlassClassifier.predict_proba evalml.pipelines.components.VowpalWabbitMulticlassClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. 
:type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: VowpalWabbitRegressor(learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Regressor. :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param passes: Number of training passes. Defaults to 1. 
:type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Regressor * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.VowpalWabbitRegressor.clone evalml.pipelines.components.VowpalWabbitRegressor.default_parameters evalml.pipelines.components.VowpalWabbitRegressor.describe evalml.pipelines.components.VowpalWabbitRegressor.feature_importance evalml.pipelines.components.VowpalWabbitRegressor.fit evalml.pipelines.components.VowpalWabbitRegressor.load evalml.pipelines.components.VowpalWabbitRegressor.needs_fitting evalml.pipelines.components.VowpalWabbitRegressor.parameters evalml.pipelines.components.VowpalWabbitRegressor.predict evalml.pipelines.components.VowpalWabbitRegressor.predict_proba evalml.pipelines.components.VowpalWabbitRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit regressor. .. py:method:: fit(self, X, y=None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. 
:param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: XGBoostClassifier(eta=0.1, max_depth=6, min_child_weight=1, n_estimators=100, random_seed=0, eval_metric='logloss', n_jobs=12, **kwargs) XGBoost Classifier. :param eta: Boosting learning rate. Defaults to 0.1. :type eta: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param min_child_weight: Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.0. :type min_child_weight: float :param n_estimators: Number of gradient boosted trees. Equivalent to number of boosting rounds. Defaults to 100. :type n_estimators: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param eval_metric: Evaluation metric for XGBoost to use during training. Defaults to "logloss". :type eval_metric: str :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to 12. :type n_jobs: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "eta": Real(0.000001, 1), "max_depth": Integer(1, 10), "min_child_weight": Real(1, 10), "n_estimators": Integer(1, 1000),} * - **model_family** - ModelFamily.XGBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - XGBoost Classifier * - **SEED_MAX** - None * - **SEED_MIN** - None * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** ..
autoapisummary:: :nosignatures: evalml.pipelines.components.XGBoostClassifier.clone evalml.pipelines.components.XGBoostClassifier.default_parameters evalml.pipelines.components.XGBoostClassifier.describe evalml.pipelines.components.XGBoostClassifier.feature_importance evalml.pipelines.components.XGBoostClassifier.fit evalml.pipelines.components.XGBoostClassifier.load evalml.pipelines.components.XGBoostClassifier.needs_fitting evalml.pipelines.components.XGBoostClassifier.parameters evalml.pipelines.components.XGBoostClassifier.predict evalml.pipelines.components.XGBoostClassifier.predict_proba evalml.pipelines.components.XGBoostClassifier.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted XGBoost classifier. .. py:method:: fit(self, X, y=None) Fits XGBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. 
:type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted XGBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make probability estimates using the fitted XGBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probabilities. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:class:: XGBoostRegressor(eta=0.1, max_depth=6, min_child_weight=1, n_estimators=100, random_seed=0, n_jobs=12, **kwargs) XGBoost Regressor. :param eta: Boosting learning rate. Defaults to 0.1. :type eta: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param min_child_weight: Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.0. :type min_child_weight: float :param n_estimators: Number of gradient boosted trees. Equivalent to number of boosting rounds. Defaults to 100. :type n_estimators: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to 12. :type n_jobs: int **Attributes** ..
list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "eta": Real(0.000001, 1), "max_depth": Integer(1, 20), "min_child_weight": Real(1, 10), "n_estimators": Integer(1, 1000),} * - **model_family** - ModelFamily.XGBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - XGBoost Regressor * - **SEED_MAX** - None * - **SEED_MIN** - None * - **supported_problem_types** - [ ProblemTypes.REGRESSION, ProblemTypes.TIME_SERIES_REGRESSION,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.XGBoostRegressor.clone evalml.pipelines.components.XGBoostRegressor.default_parameters evalml.pipelines.components.XGBoostRegressor.describe evalml.pipelines.components.XGBoostRegressor.feature_importance evalml.pipelines.components.XGBoostRegressor.fit evalml.pipelines.components.XGBoostRegressor.load evalml.pipelines.components.XGBoostRegressor.needs_fitting evalml.pipelines.components.XGBoostRegressor.parameters evalml.pipelines.components.XGBoostRegressor.predict evalml.pipelines.components.XGBoostRegressor.predict_proba evalml.pipelines.components.XGBoostRegressor.save .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. 
:param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted XGBoost regressor. .. py:method:: fit(self, X, y=None) Fits XGBoost regressor component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using fitted XGBoost regressor. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. 
:type pickle_protocol: int
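The component methods documented above share a few conventions: ``clone`` returns a new instance with the same parameters and random state, ``default_parameters`` equals the parameters of a default-constructed component, and ``describe(return_dict=True)`` returns ``{"name": name, "parameters": parameters}``. The sketch below illustrates those conventions only. ``ToyComponent`` is a hypothetical pure-Python stand-in, not an evalml class, and it models ``default_parameters`` as a classmethod purely for illustration; the real components may expose it differently.

```python
import copy


class ToyComponent:
    """Hypothetical stand-in for a component; not an evalml class."""

    name = "Toy Component"

    def __init__(self, eta=0.1, max_depth=6, random_seed=0):
        # Keep the initialization parameters so they can be reported later.
        self._parameters = {"eta": eta, "max_depth": max_depth}
        self.random_seed = random_seed

    @property
    def parameters(self):
        # Return a copy so callers cannot mutate the stored parameters.
        return copy.deepcopy(self._parameters)

    @classmethod
    def default_parameters(cls):
        # Convention: defaults are the parameters of a default-constructed instance.
        return cls().parameters

    def clone(self):
        # New instance with identical parameters and random seed.
        return type(self)(**self.parameters, random_seed=self.random_seed)

    def describe(self, print_name=False, return_dict=False):
        if print_name:
            print(self.name)
        if return_dict:
            return {"name": self.name, "parameters": self.parameters}
        return None


component = ToyComponent(eta=0.3)
duplicate = component.clone()
description = component.describe(return_dict=True)
```

Here ``duplicate`` is a separate object carrying the same parameters as ``component``, and ``description`` follows the dictionary format given in the ``describe`` documentation.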