classifiers ============================================================ .. py:module:: evalml.pipelines.components.estimators.classifiers .. autoapi-nested-parse:: Classification model components. Submodules ---------- .. toctree:: :titlesonly: :maxdepth: 1 baseline_classifier/index.rst catboost_classifier/index.rst decision_tree_classifier/index.rst elasticnet_classifier/index.rst et_classifier/index.rst kneighbors_classifier/index.rst lightgbm_classifier/index.rst logistic_regression_classifier/index.rst rf_classifier/index.rst svm_classifier/index.rst vowpal_wabbit_classifiers/index.rst xgboost_classifier/index.rst Package Contents ---------------- Classes Summary ~~~~~~~~~~~~~~~ .. autoapisummary:: evalml.pipelines.components.estimators.classifiers.BaselineClassifier evalml.pipelines.components.estimators.classifiers.CatBoostClassifier evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier evalml.pipelines.components.estimators.classifiers.LightGBMClassifier evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier evalml.pipelines.components.estimators.classifiers.RandomForestClassifier evalml.pipelines.components.estimators.classifiers.SVMClassifier evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier evalml.pipelines.components.estimators.classifiers.XGBoostClassifier Contents ~~~~~~~~~~~~~~~~~~~ .. py:class:: BaselineClassifier(strategy='mode', random_seed=0, **kwargs) Classifier that predicts using the specified strategy. This is useful as a simple baseline classifier to compare with other classifiers. :param strategy: Method used to predict. Valid options are "mode", "random" and "random_weighted". Defaults to "mode". :type strategy: str :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - {} * - **model_family** - ModelFamily.BASELINE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Baseline Classifier * - **supported_problem_types** - [ProblemTypes.BINARY, ProblemTypes.MULTICLASS] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.BaselineClassifier.classes_ evalml.pipelines.components.estimators.classifiers.BaselineClassifier.clone evalml.pipelines.components.estimators.classifiers.BaselineClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.BaselineClassifier.describe evalml.pipelines.components.estimators.classifiers.BaselineClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.BaselineClassifier.fit evalml.pipelines.components.estimators.classifiers.BaselineClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.BaselineClassifier.load evalml.pipelines.components.estimators.classifiers.BaselineClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.BaselineClassifier.parameters evalml.pipelines.components.estimators.classifiers.BaselineClassifier.predict evalml.pipelines.components.estimators.classifiers.BaselineClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.BaselineClassifier.save evalml.pipelines.components.estimators.classifiers.BaselineClassifier.update_parameters .. py:method:: classes_(self) :property: Returns class labels. Will return None before fitting. :returns: Class names :rtype: list[str] or list(float) .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns importance associated with each feature. Since baseline classifiers do not use input features to calculate predictions, returns an array of zeroes. :returns: An array of zeroes :rtype: pd.Series .. py:method:: fit(self, X, y=None) Fits baseline classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self :raises ValueError: If y is None. .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make prediction probabilities using the baseline classification strategy. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: CatBoostClassifier(n_estimators=10, eta=0.03, max_depth=6, bootstrap_type=None, silent=True, allow_writing_files=False, random_seed=0, n_jobs=-1, **kwargs) CatBoost Classifier, a classifier that uses gradient-boosting on decision trees. CatBoost is an open-source library and natively supports categorical features. For more information, check out https://catboost.ai/ :param n_estimators: The maximum number of trees to build. Defaults to 10. :type n_estimators: float :param eta: The learning rate. Defaults to 0.03. :type eta: float :param max_depth: The maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param bootstrap_type: Defines the method for sampling the weights of objects. Available methods are 'Bayesian', 'Bernoulli', 'MVS'. Defaults to None. :type bootstrap_type: string :param silent: Whether to use the "silent" logging mode. Defaults to True. :type silent: boolean :param allow_writing_files: Whether to allow writing snapshot files while training. Defaults to False. :type allow_writing_files: boolean :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(4, 100), "eta": Real(0.000001, 1), "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.CATBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - CatBoost Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.clone evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.describe evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.fit evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.load evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.parameters evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.predict evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.save evalml.pipelines.components.estimators.classifiers.CatBoostClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted CatBoost classifier. .. py:method:: fit(self, X, y=None) Fits CatBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X) Make prediction probabilities using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: DecisionTreeClassifier(criterion='gini', max_features='sqrt', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, random_seed=0, **kwargs) Decision Tree Classifier. :param criterion: The function to measure the quality of a split. Supported criteria are "gini" for the Gini impurity and "entropy" for the information gain. Defaults to "gini". :type criterion: {"gini", "entropy"} :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. :type max_features: int, float or {"sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. Defaults to 2. :type min_samples_split: int or float :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "criterion": ["gini", "entropy"], "max_features": ["sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.DECISION_TREE * - **modifies_features** - True * - **modifies_target** - False * - **name** - Decision Tree Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.clone evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.describe evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.fit evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.load evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.parameters evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.predict evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.save evalml.pipelines.components.estimators.classifiers.DecisionTreeClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ElasticNetClassifier(penalty='elasticnet', C=1.0, l1_ratio=0.15, multi_class='auto', solver='saga', n_jobs=-1, random_seed=0, **kwargs) Elastic Net Classifier. Uses Logistic Regression with elasticnet penalty as the base estimator. :param penalty: The norm used in penalization. Defaults to "elasticnet". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param l1_ratio: The mixing parameter, with 0 <= l1_ratio <= 1. Only used if penalty='elasticnet'. Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio <1, the penalty is a combination of L1 and L2. Defaults to 0.15. :type l1_ratio: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. "multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "saga". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0.01, 10), "l1_ratio": Real(0, 1)} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Elastic Net Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.clone evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.describe evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.fit evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.load evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.parameters evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.predict evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.save evalml.pipelines.components.estimators.classifiers.ElasticNetClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted ElasticNet classifier. .. py:method:: fit(self, X, y) Fits ElasticNet classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: ExtraTreesClassifier(n_estimators=100, max_features='sqrt', max_depth=6, min_samples_split=2, min_weight_fraction_leaf=0.0, n_jobs=-1, random_seed=0, **kwargs) Extra Trees Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_features: The number of features to consider when looking for the best split: - If int, then consider max_features features at each split. - If float, then max_features is a fraction and int(max_features * n_features) features are considered at each split. - If "sqrt", then max_features=sqrt(n_features). - If "log2", then max_features=log2(n_features). - If None, then max_features = n_features. The search for a split does not stop until at least one valid partition of the node samples is found, even if it requires to effectively inspect more than max_features features. :type max_features: int, float or {"sqrt", "log2"} :param max_depth: The maximum depth of the tree. Defaults to 6. :type max_depth: int :param min_samples_split: The minimum number of samples required to split an internal node: - If int, then consider min_samples_split as the minimum number. - If float, then min_samples_split is a fraction and ceil(min_samples_split * n_samples) are the minimum number of samples for each split. :type min_samples_split: int or float :param Defaults to 2.: :param min_weight_fraction_leaf: The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Defaults to 0.0. :type min_weight_fraction_leaf: float :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_features": ["sqrt", "log2"], "max_depth": Integer(4, 10),} * - **model_family** - ModelFamily.EXTRA_TREES * - **modifies_features** - True * - **modifies_target** - False * - **name** - Extra Trees Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.clone evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.describe evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.fit evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.load evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.parameters evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.predict evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.save evalml.pipelines.components.estimators.classifiers.ExtraTreesClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, random_seed=0, **kwargs) K-Nearest Neighbors Classifier. :param n_neighbors: Number of neighbors to use by default. Defaults to 5. :type n_neighbors: int :param weights: Weight function used in prediction. Can be: - ‘uniform’ : uniform weights. All points in each neighborhood are weighted equally. - ‘distance’ : weight points by the inverse of their distance. in this case, closer neighbors of a query point will have a greater influence than neighbors which are further away. - [callable] : a user-defined function which accepts an array of distances, and returns an array of the same shape containing the weights. Defaults to "uniform". :type weights: {‘uniform’, ‘distance’} or callable :param algorithm: Algorithm used to compute the nearest neighbors: - ‘ball_tree’ will use BallTree - ‘kd_tree’ will use KDTree - ‘brute’ will use a brute-force search. ‘auto’ will attempt to decide the most appropriate algorithm based on the values passed to fit method. Defaults to "auto". Note: fitting on sparse input will override the setting of this parameter, using brute force. :type algorithm: {‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’} :param leaf_size: Leaf size passed to BallTree or KDTree. This can affect the speed of the construction and query, as well as the memory required to store the tree. The optimal value depends on the nature of the problem. Defaults to 30. :type leaf_size: int :param p: Power parameter for the Minkowski metric. When p = 1, this is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2. For arbitrary p, minkowski_distance (l_p) is used. Defaults to 2. :type p: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_neighbors": Integer(2, 12), "weights": ["uniform", "distance"], "algorithm": ["auto", "ball_tree", "kd_tree", "brute"], "leaf_size": Integer(10, 30), "p": Integer(1, 5),} * - **model_family** - ModelFamily.K_NEIGHBORS * - **modifies_features** - True * - **modifies_target** - False * - **name** - KNN Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.clone evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.describe evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.fit evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.load evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.parameters evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.predict evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.save evalml.pipelines.components.estimators.classifiers.KNeighborsClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Returns array of 0's matching the input number of features as feature_importance is not defined for KNN classifiers. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LightGBMClassifier(boosting_type='gbdt', learning_rate=0.1, n_estimators=100, max_depth=0, num_leaves=31, min_child_samples=20, bagging_fraction=0.9, bagging_freq=0, n_jobs=-1, random_seed=0, **kwargs) LightGBM Classifier. :param boosting_type: Type of boosting to use. Defaults to "gbdt". - 'gbdt' uses traditional Gradient Boosting Decision Tree - "dart", uses Dropouts meet Multiple Additive Regression Trees - "goss", uses Gradient-based One-Side Sampling - "rf", uses Random Forest :type boosting_type: string :param learning_rate: Boosting learning rate. Defaults to 0.1. :type learning_rate: float :param n_estimators: Number of boosted trees to fit. Defaults to 100. :type n_estimators: int :param max_depth: Maximum tree depth for base learners, <=0 means no limit. Defaults to 0. :type max_depth: int :param num_leaves: Maximum tree leaves for base learners. Defaults to 31. :type num_leaves: int :param min_child_samples: Minimum number of data needed in a child (leaf). Defaults to 20. :type min_child_samples: int :param bagging_fraction: LightGBM will randomly select a subset of features on each iteration (tree) without resampling if this is smaller than 1.0. For example, if set to 0.8, LightGBM will select 80% of features before training each tree. This can be used to speed up training and deal with overfitting. Defaults to 0.9. :type bagging_fraction: float :param bagging_freq: Frequency for bagging. 0 means bagging is disabled. k means perform bagging at every k iteration. Every k-th iteration, LightGBM will randomly select bagging_fraction * 100 % of the data to use for the next k iterations. Defaults to 0. :type bagging_freq: int :param n_jobs: Number of threads to run in parallel. -1 uses all threads. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "learning_rate": Real(0.000001, 1), "boosting_type": ["gbdt", "dart", "goss", "rf"], "n_estimators": Integer(10, 100), "max_depth": Integer(0, 10), "num_leaves": Integer(2, 100), "min_child_samples": Integer(1, 100), "bagging_fraction": Real(0.000001, 1), "bagging_freq": Integer(0, 1),} * - **model_family** - ModelFamily.LIGHTGBM * - **modifies_features** - True * - **modifies_target** - False * - **name** - LightGBM Classifier * - **SEED_MAX** - SEED_BOUNDS.max_bound * - **SEED_MIN** - 0 * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.clone evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.describe evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.fit evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.load evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.parameters evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.predict evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.save evalml.pipelines.components.estimators.classifiers.LightGBMClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X, y=None) Fits LightGBM classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted LightGBM classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make prediction probabilities using the fitted LightGBM classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted probability values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: LogisticRegressionClassifier(penalty='l2', C=1.0, multi_class='auto', solver='lbfgs', n_jobs=-1, random_seed=0, **kwargs) Logistic Regression Classifier. :param penalty: The norm used in penalization. Defaults to "l2". :type penalty: {"l1", "l2", "elasticnet", "none"} :param C: Inverse of regularization strength. Must be a positive float. Defaults to 1.0. :type C: float :param multi_class: If the option chosen is "ovr", then a binary problem is fit for each label. For "multinomial" the loss minimised is the multinomial loss fit across the entire probability distribution, even when the data is binary. "multinomial" is unavailable when solver="liblinear". "auto" selects "ovr" if the data is binary, or if solver="liblinear", and otherwise selects "multinomial". Defaults to "auto". :type multi_class: {"auto", "ovr", "multinomial"} :param solver: Algorithm to use in the optimization problem. For small datasets, "liblinear" is a good choice, whereas "sag" and "saga" are faster for large ones. For multiclass problems, only "newton-cg", "sag", "saga" and "lbfgs" handle multinomial loss; "liblinear" is limited to one-versus-rest schemes. - "newton-cg", "lbfgs", "sag" and "saga" handle L2 or no penalty - "liblinear" and "saga" also handle L1 penalty - "saga" also supports "elasticnet" penalty - "liblinear" does not support setting penalty='none' Defaults to "lbfgs". :type solver: {"newton-cg", "lbfgs", "liblinear", "sag", "saga"} :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to -1. :type n_jobs: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "penalty": ["l2"], "C": Real(0.01, 10),} * - **model_family** - ModelFamily.LINEAR_MODEL * - **modifies_features** - True * - **modifies_target** - False * - **name** - Logistic Regression Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.clone evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.describe evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.fit evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.load evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.parameters evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.predict evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.save evalml.pipelines.components.estimators.classifiers.LogisticRegressionClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for fitted logistic regression classifier. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: RandomForestClassifier(n_estimators=100, max_depth=6, n_jobs=-1, random_seed=0, **kwargs) Random Forest Classifier. :param n_estimators: The number of trees in the forest. Defaults to 100. :type n_estimators: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param n_jobs: Number of jobs to run in parallel. -1 uses all processes. Defaults to -1. :type n_jobs: int or None :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "n_estimators": Integer(10, 1000), "max_depth": Integer(1, 10),} * - **model_family** - ModelFamily.RANDOM_FOREST * - **modifies_features** - True * - **modifies_target** - False * - **name** - Random Forest Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.clone evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.describe evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.fit evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.load evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.parameters evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.predict evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.save evalml.pipelines.components.estimators.classifiers.RandomForestClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) -> pandas.Series :property: Returns importance associated with each feature. :returns: Importance associated with each feature. :rtype: np.ndarray :raises MethodPropertyNotFoundError: If estimator does not have a feature_importance method or a component_obj that implements feature_importance. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: SVMClassifier(C=1.0, kernel='rbf', gamma='auto', probability=True, random_seed=0, **kwargs) Support Vector Machine Classifier. :param C: The regularization parameter. The strength of the regularization is inversely proportional to C. Must be strictly positive. The penalty is a squared l2 penalty. Defaults to 1.0. :type C: float :param kernel: Specifies the kernel type to be used in the algorithm. Defaults to "rbf". :type kernel: {"poly", "rbf", "sigmoid"} :param gamma: Kernel coefficient for "rbf", "poly" and "sigmoid". Defaults to "auto". - If gamma='scale' is passed then it uses 1 / (n_features * X.var()) as value of gamma - If "auto" (default), uses 1 / n_features :type gamma: {"scale", "auto"} or float :param probability: Whether to enable probability estimates. Defaults to True. :type probability: boolean :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "C": Real(0, 10), "kernel": ["poly", "rbf", "sigmoid"], "gamma": ["scale", "auto"],} * - **model_family** - ModelFamily.SVM * - **modifies_features** - True * - **modifies_target** - False * - **name** - SVM Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.SVMClassifier.clone evalml.pipelines.components.estimators.classifiers.SVMClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.SVMClassifier.describe evalml.pipelines.components.estimators.classifiers.SVMClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.SVMClassifier.fit evalml.pipelines.components.estimators.classifiers.SVMClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.SVMClassifier.load evalml.pipelines.components.estimators.classifiers.SVMClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.SVMClassifier.parameters evalml.pipelines.components.estimators.classifiers.SVMClassifier.predict evalml.pipelines.components.estimators.classifiers.SVMClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.SVMClassifier.save evalml.pipelines.components.estimators.classifiers.SVMClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance only works with linear kernels. If the kernel isn't linear, we return a numpy array of zeros. :returns: Feature importance of fitted SVM classifier or a numpy array of zeroes if the kernel is not linear. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: VowpalWabbitBinaryClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Binary Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param passes: Number of training passes. Defaults to 1. :type passes: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Binary Classifier * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.TIME_SERIES_BINARY,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.clone evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.describe evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.fit evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.load evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.parameters evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.predict evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.save evalml.pipelines.components.estimators.classifiers.VowpalWabbitBinaryClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: VowpalWabbitMulticlassClassifier(loss_function='logistic', learning_rate=0.5, decay_learning_rate=1.0, power_t=0.5, passes=1, random_seed=0, **kwargs) Vowpal Wabbit Multiclass Classifier. :param loss_function: Specifies the loss function to use. One of {"squared", "classic", "hinge", "logistic", "quantile"}. Defaults to "logistic". :type loss_function: str :param learning_rate: Boosting learning rate. Defaults to 0.5. :type learning_rate: float :param decay_learning_rate: Decay factor for learning_rate. Defaults to 1.0. :type decay_learning_rate: float :param power_t: Power on learning rate decay. Defaults to 0.5. :type power_t: float :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - None * - **model_family** - ModelFamily.VOWPAL_WABBIT * - **modifies_features** - True * - **modifies_target** - False * - **name** - Vowpal Wabbit Multiclass Classifier * - **supported_problem_types** - [ ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.clone evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.describe evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.fit evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.load evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.parameters evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.predict evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.save evalml.pipelines.components.estimators.classifiers.VowpalWabbitMulticlassClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance for Vowpal Wabbit classifiers. This is not implemented. .. py:method:: fit(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None) Fits estimator to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series, optional :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X: pandas.DataFrame) -> pandas.Series Make predictions using selected features. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict method or a component_obj that implements predict. .. py:method:: predict_proba(self, X: pandas.DataFrame) -> pandas.Series Make probability estimates for labels. :param X: Features. :type X: pd.DataFrame :returns: Probability estimates. :rtype: pd.Series :raises MethodPropertyNotFoundError: If estimator does not have a predict_proba method or a component_obj that implements predict_proba. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional .. py:class:: XGBoostClassifier(eta=0.1, max_depth=6, min_child_weight=1, n_estimators=100, random_seed=0, eval_metric='logloss', n_jobs=12, **kwargs) XGBoost Classifier. :param eta: Boosting learning rate. Defaults to 0.1. :type eta: float :param max_depth: Maximum tree depth for base learners. Defaults to 6. :type max_depth: int :param min_child_weight: Minimum sum of instance weight (hessian) needed in a child. Defaults to 1.0 :type min_child_weight: float :param n_estimators: Number of gradient boosted trees. Equivalent to number of boosting rounds. Defaults to 100. :type n_estimators: int :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :param n_jobs: Number of parallel threads used to run xgboost. Note that creating thread contention will significantly slow down the algorithm. Defaults to 12. :type n_jobs: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **hyperparameter_ranges** - { "eta": Real(0.000001, 1), "max_depth": Integer(1, 10), "min_child_weight": Real(1, 10), "n_estimators": Integer(1, 1000),} * - **model_family** - ModelFamily.XGBOOST * - **modifies_features** - True * - **modifies_target** - False * - **name** - XGBoost Classifier * - **SEED_MAX** - None * - **SEED_MIN** - None * - **supported_problem_types** - [ ProblemTypes.BINARY, ProblemTypes.MULTICLASS, ProblemTypes.TIME_SERIES_BINARY, ProblemTypes.TIME_SERIES_MULTICLASS,] * - **training_only** - False **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.clone evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.default_parameters evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.describe evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.feature_importance evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.fit evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.get_prediction_intervals evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.load evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.needs_fitting evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.parameters evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.predict evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.predict_proba evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.save evalml.pipelines.components.estimators.classifiers.XGBoostClassifier.update_parameters .. py:method:: clone(self) Constructs a new component with the same parameters and random state. :returns: A new instance of this component with identical parameters and random state. .. py:method:: default_parameters(cls) Returns the default parameters for this component. Our convention is that Component.default_parameters == Component().parameters. :returns: Default parameters for this component. :rtype: dict .. py:method:: describe(self, print_name=False, return_dict=False) Describe a component and its parameters. :param print_name: whether to print name of component :type print_name: bool, optional :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters} :type return_dict: bool, optional :returns: Returns dictionary if return_dict is True, else None. :rtype: None or dict .. py:method:: feature_importance(self) :property: Feature importance of fitted XGBoost classifier. .. py:method:: fit(self, X, y=None) Fits XGBoost classifier component to data. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target training data of length [n_samples]. :type y: pd.Series :returns: self .. py:method:: get_prediction_intervals(self, X: pandas.DataFrame, y: Optional[pandas.Series] = None, coverage: List[float] = None, predictions: pandas.Series = None) -> Dict[str, pandas.Series] Find the prediction intervals using the fitted regressor. This function takes the predictions of the fitted estimator and calculates the rolling standard deviation across all predictions using a window size of 5. The lower and upper predictions are determined by taking the percent point (quantile) function of the lower tail probability at each bound multiplied by the rolling standard deviation. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Target data. Ignored. :type y: pd.Series :param coverage: A list of floats between the values 0 and 1 that the upper and lower bounds of the prediction interval should be calculated for. :type coverage: list[float] :param predictions: Optional list of predictions to use. If None, will generate predictions using `X`. :type predictions: pd.Series :returns: Prediction intervals, keys are in the format {coverage}_lower or {coverage}_upper. :rtype: dict :raises MethodPropertyNotFoundError: If the estimator does not support Time Series Regression as a problem type. .. py:method:: load(file_path) :staticmethod: Loads component at file path. :param file_path: Location to load file. :type file_path: str :returns: ComponentBase object .. py:method:: needs_fitting(self) Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances. This can be overridden to False for components that do not need to be fit or whose fit methods do nothing. :returns: True. .. py:method:: parameters(self) :property: Returns the parameters which were used to initialize the component. .. py:method:: predict(self, X) Make predictions using the fitted XGBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: predict_proba(self, X) Make predictions using the fitted CatBoost classifier. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame :returns: Predicted values. :rtype: pd.DataFrame .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves component at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: update_parameters(self, update_dict, reset_fit=True) Updates the parameter dictionary of the component. :param update_dict: A dict of parameters to update. :type update_dict: dict :param reset_fit: If True, will set `_is_fitted` to False. :type reset_fit: bool, optional