time_series_classification_pipelines =============================================================== .. py:module:: evalml.pipelines.time_series_classification_pipelines .. autoapi-nested-parse:: Pipeline base class for time-series classification problems. Module Contents --------------- Classes Summary ~~~~~~~~~~~~~~~ .. autoapisummary:: evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline Contents ~~~~~~~~~~~~~~~~~~~ .. py:class:: TimeSeriesBinaryClassificationPipeline(component_graph, parameters=None, custom_name=None, random_seed=0) Pipeline base class for time series binary classification problems. :param component_graph: List of components in order. Accepts strings or ComponentBase subclasses in the list. Note that when duplicate components are specified in a list, the duplicate component names will be modified with the component's index in the list. For example, the component graph [Imputer, One Hot Encoder, Imputer, Logistic Regression Classifier] will have names ["Imputer", "One Hot Encoder", "Imputer_2", "Logistic Regression Classifier"] :type component_graph: list or dict :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary {} implies using all default values for component parameters. Pipeline-level parameters such as time_index, gap, and max_delay must be specified with the "pipeline" key. For example: Pipeline(parameters={"pipeline": {"time_index": "Date", "max_delay": 4, "gap": 2}}). :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> pipeline = TimeSeriesBinaryClassificationPipeline(component_graph=["Simple Imputer", "Logistic Regression Classifier"], ... 
parameters={"Logistic Regression Classifier": {"penalty": "elasticnet", ... "solver": "liblinear"}, ... "pipeline": {"gap": 1, "max_delay": 1, "forecast_horizon": 1, "time_index": "date"}}, ... custom_name="My TimeSeriesBinary Pipeline") ... >>> assert pipeline.custom_name == "My TimeSeriesBinary Pipeline" >>> assert pipeline.component_graph.component_dict.keys() == {'Simple Imputer', 'Logistic Regression Classifier'} ... >>> assert pipeline.parameters == { ... 'Simple Imputer': {'impute_strategy': 'most_frequent', 'fill_value': None}, ... 'Logistic Regression Classifier': {'penalty': 'elasticnet', ... 'C': 1.0, ... 'n_jobs': -1, ... 'multi_class': 'auto', ... 'solver': 'liblinear'}, ... 'pipeline': {'gap': 1, 'max_delay': 1, 'forecast_horizon': 1, 'time_index': "date"}} **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **problem_type** - None **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.can_tune_threshold_with_objective evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.classes_ evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.clone evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.create_objectives evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.custom_name evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.dates_needed_for_prediction evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.dates_needed_for_prediction_range evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.describe evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.feature_importance 
evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.fit evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.fit_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.get_component evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.get_hyperparameter_ranges evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.graph evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.graph_dict evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.graph_feature_importance evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.inverse_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.load evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.model_family evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.name evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.new evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.optimize_threshold evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.parameters evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.predict evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.predict_in_sample evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.predict_proba evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.predict_proba_in_sample 
evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.save evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.score evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.summary evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.threshold evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesBinaryClassificationPipeline.transform_all_but_final .. py:method:: can_tune_threshold_with_objective(self, objective) Determine whether the threshold of a binary classification pipeline can be tuned. :param objective: Primary AutoMLSearch objective. :type objective: ObjectiveBase :returns: True if the pipeline threshold can be tuned. :rtype: bool .. py:method:: classes_(self) :property: Gets the class names for the pipeline. Returns None before the pipeline is fit. .. py:method:: clone(self) Constructs a new pipeline with the same components, parameters, and random seed. :returns: A new instance of this pipeline with identical components, parameters, and random seed. .. py:method:: create_objectives(objectives) :staticmethod: Create objective instances from a list of strings or objective classes. .. py:method:: custom_name(self) :property: Custom name of the pipeline. .. py:method:: dates_needed_for_prediction(self, date) Return dates needed to forecast the given date in the future. :param date: Date to forecast in the future. :type date: pd.Timestamp :returns: Range of dates needed to forecast the given date. :rtype: dates_needed (tuple(pd.Timestamp)) .. py:method:: dates_needed_for_prediction_range(self, start_date, end_date) Return dates needed to forecast the given date range in the future. :param start_date: Start date of range to forecast in the future. 
:type start_date: pd.Timestamp :param end_date: End date of range to forecast in the future. :type end_date: pd.Timestamp :returns: Range of dates needed to forecast the given date range. :rtype: dates_needed (tuple(pd.Timestamp)) :raises ValueError: If start_date doesn't come before end_date. .. py:method:: describe(self, return_dict=False) Outputs pipeline details including component parameters. :param return_dict: If True, return dictionary of information about pipeline. Defaults to False. :type return_dict: bool :returns: Dictionary of all component parameters if return_dict is True, else None. :rtype: dict .. py:method:: feature_importance(self) :property: Importance associated with each feature. Features dropped by the feature selection are excluded. :returns: Feature names and their corresponding importance :rtype: pd.DataFrame .. py:method:: fit(self, X, y) Fit a time series classification model. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training labels of length [n_samples] :type y: pd.Series, np.ndarray :returns: self :raises ValueError: If the number of unique classes in y is not appropriate for the type of pipeline. .. py:method:: fit_transform(self, X, y) Fit and transform all components in the component graph, if all components are Transformers. :param X: Input features of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target data of length [n_samples]. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame :raises ValueError: If final component is an Estimator. .. py:method:: get_component(self, name) Returns component by name. :param name: Name of component. :type name: str :returns: Component to return :rtype: Component .. py:method:: get_hyperparameter_ranges(self, custom_hyperparameters) Returns hyperparameter ranges from all components as a dictionary. :param custom_hyperparameters: Custom hyperparameters for the pipeline. 
:type custom_hyperparameters: dict :returns: Dictionary of hyperparameter ranges for each component in the pipeline. :rtype: dict .. py:method:: graph(self, filepath=None) Generate an image representing the pipeline graph. :param filepath: Path to where the graph should be saved. If set to None (as by default), the graph will not be saved. :type filepath: str, optional :returns: Graph object that can be directly displayed in Jupyter notebooks. :rtype: graphviz.Digraph :raises RuntimeError: If graphviz is not installed. :raises ValueError: If path is not writeable. .. py:method:: graph_dict(self) Generates a dictionary with nodes consisting of the component names and parameters, and edges detailing component relationships. This dictionary is JSON serializable in most cases. x_edges specifies from which component feature data is being passed. y_edges specifies from which component target data is being passed. This can be used to build graphs across a variety of visualization tools. Template: {"Nodes": {"component_name": {"Name": class_name, "Parameters": parameters_attributes}, ...}}, "x_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...], "y_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...]} :returns: A dictionary representing the DAG structure. :rtype: dag_dict (dict) .. py:method:: graph_feature_importance(self, importance_threshold=0) Generate a bar graph of the pipeline's feature importance. :param importance_threshold: If provided, graph features with a permutation importance whose absolute value is larger than importance_threshold. Defaults to zero. :type importance_threshold: float, optional :returns: A bar graph showing features and their corresponding importance. :rtype: plotly.Figure :raises ValueError: If importance threshold is not valid. .. 
py:method:: inverse_transform(self, y) Apply component inverse_transform methods to estimator predictions in reverse order. Components that implement inverse_transform are PolynomialDecomposer, LogTransformer, LabelEncoder (tbd). :param y: Final component features. :type y: pd.Series :returns: The inverse transform of the target. :rtype: pd.Series .. py:method:: load(file_path: Union[str, io.BytesIO]) :staticmethod: Loads pipeline at file path. :param file_path: load filepath or a BytesIO object. :type file_path: str|BytesIO :returns: PipelineBase object .. py:method:: model_family(self) :property: Returns model family of this pipeline. .. py:method:: name(self) :property: Name of the pipeline. .. py:method:: new(self, parameters, random_seed=0) Constructs a new instance of the pipeline with the same component graph but with a different set of parameters. Not to be confused with Python's __new__ method. :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary or None implies using all default values for component parameters. Defaults to None. :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :returns: A new instance of this pipeline with identical components. .. py:method:: optimize_threshold(self, X, y, y_pred_proba, objective) Optimize the pipeline threshold given the objective to use. Only used for binary problems with objectives whose thresholds can be tuned. :param X: Input features. :type X: pd.DataFrame :param y: Input target values. :type y: pd.Series :param y_pred_proba: The predicted probabilities of the target outputted by the pipeline. :type y_pred_proba: pd.Series :param objective: The objective to threshold with. Must have a tunable threshold. :type objective: ObjectiveBase :raises ValueError: If objective is not optimizable. .. 
py:method:: parameters(self) :property: Parameter dictionary for this pipeline. :returns: Dictionary of all component parameters. :rtype: dict .. py:method:: predict(self, X, objective=None, X_train=None, y_train=None) Predict on future data where target is not known. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame, or np.ndarray :param objective: The objective to use to make predictions. :type objective: Object or string :param X_train: Training data. :type X_train: pd.DataFrame or np.ndarray or None :param y_train: Training labels. :type y_train: pd.Series or None :raises ValueError: If X_train and/or y_train are None or if final component is not an Estimator. :returns: Predictions. .. py:method:: predict_in_sample(self, X, y, X_train, y_train, objective=None) Predict on future data where the target is known, e.g. cross validation. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: Future target of shape [n_samples]. :type y: pd.Series :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series :param objective: Objective used to threshold predicted probabilities, optional. Defaults to None. :type objective: ObjectiveBase, str :returns: Estimated labels. :rtype: pd.Series :raises ValueError: If objective is not defined for time-series binary classification problems. .. py:method:: predict_proba(self, X, X_train=None, y_train=None) Predict on future data where the target is unknown. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. 
:type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If final component is not an Estimator. .. py:method:: predict_proba_in_sample(self, X_holdout, y_holdout, X_train, y_train) Predict on future data where the target is known, e.g. cross validation. :param X_holdout: Future data of shape [n_samples, n_features]. :type X_holdout: pd.DataFrame or np.ndarray :param y_holdout: Future target of shape [n_samples]. :type y_holdout: pd.Series, np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If the final component is not an Estimator. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves pipeline at file path. :param file_path: Location to save file. :type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: score(self, X, y, objectives, X_train=None, y_train=None) Evaluate model performance on current and additional objectives. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: True labels of length [n_samples]. :type y: pd.Series :param objectives: Non-empty list of objectives to score on. :type objectives: list :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Ordered dictionary of objective scores. :rtype: dict .. py:method:: summary(self) :property: A short summary of the pipeline structure, describing the list of components used. 
Example: Logistic Regression Classifier w/ Simple Imputer + One Hot Encoder :returns: A string describing the pipeline structure. .. py:method:: threshold(self) :property: Threshold used to make a prediction. Defaults to None. .. py:method:: transform(self, X, y=None) Transform the input. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame, or np.ndarray :param y: The target data of length [n_samples]. Defaults to None. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame .. py:method:: transform_all_but_final(self, X, y=None, X_train=None, y_train=None, calculating_residuals=False) Transforms the data by applying all pre-processing components. :param X: Input data to the pipeline to transform. :type X: pd.DataFrame :param y: Targets corresponding to the pipeline targets. :type y: pd.Series :param X_train: Training data used to generate features from past observations. :type X_train: pd.DataFrame :param y_train: Training targets used to generate features from past observations. :type y_train: pd.Series :param calculating_residuals: Whether we're calling predict_in_sample to calculate the residuals. This means the X and y arguments are not future data, but actually the train data. :type calculating_residuals: bool :returns: New transformed features. :rtype: pd.DataFrame .. py:class:: TimeSeriesClassificationPipeline(component_graph, parameters=None, custom_name=None, random_seed=0) Pipeline base class for time series classification problems. :param component_graph: ComponentGraph instance, list of components in order, or dictionary of components. Accepts strings or ComponentBase subclasses in the list. Note that when duplicate components are specified in a list, the duplicate component names will be modified with the component's index in the list. 
For example, the component graph [Imputer, One Hot Encoder, Imputer, Logistic Regression Classifier] will have names ["Imputer", "One Hot Encoder", "Imputer_2", "Logistic Regression Classifier"] :type component_graph: ComponentGraph, list, dict :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary {} implies using all default values for component parameters. Pipeline-level parameters such as time_index, gap, and max_delay must be specified with the "pipeline" key. For example: Pipeline(parameters={"pipeline": {"time_index": "Date", "max_delay": 4, "gap": 2}}). :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **problem_type** - None **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.can_tune_threshold_with_objective evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.classes_ evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.clone evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.create_objectives evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.custom_name evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.dates_needed_for_prediction evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.dates_needed_for_prediction_range evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.describe evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.feature_importance evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.fit 
evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.fit_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.get_component evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.get_hyperparameter_ranges evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.graph evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.graph_dict evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.graph_feature_importance evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.inverse_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.load evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.model_family evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.name evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.new evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.parameters evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.predict evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.predict_in_sample evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.predict_proba evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.predict_proba_in_sample evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.save evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.score evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.summary evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.transform 
evalml.pipelines.time_series_classification_pipelines.TimeSeriesClassificationPipeline.transform_all_but_final .. py:method:: can_tune_threshold_with_objective(self, objective) Determine whether the threshold of a binary classification pipeline can be tuned. :param objective: Primary AutoMLSearch objective. :type objective: ObjectiveBase :returns: True if the pipeline threshold can be tuned. :rtype: bool .. py:method:: classes_(self) :property: Gets the class names for the pipeline. Returns None before the pipeline is fit. .. py:method:: clone(self) Constructs a new pipeline with the same components, parameters, and random seed. :returns: A new instance of this pipeline with identical components, parameters, and random seed. .. py:method:: create_objectives(objectives) :staticmethod: Create objective instances from a list of strings or objective classes. .. py:method:: custom_name(self) :property: Custom name of the pipeline. .. py:method:: dates_needed_for_prediction(self, date) Return dates needed to forecast the given date in the future. :param date: Date to forecast in the future. :type date: pd.Timestamp :returns: Range of dates needed to forecast the given date. :rtype: dates_needed (tuple(pd.Timestamp)) .. py:method:: dates_needed_for_prediction_range(self, start_date, end_date) Return dates needed to forecast the given date range in the future. :param start_date: Start date of range to forecast in the future. :type start_date: pd.Timestamp :param end_date: End date of range to forecast in the future. :type end_date: pd.Timestamp :returns: Range of dates needed to forecast the given date range. :rtype: dates_needed (tuple(pd.Timestamp)) :raises ValueError: If start_date doesn't come before end_date. .. py:method:: describe(self, return_dict=False) Outputs pipeline details including component parameters. :param return_dict: If True, return dictionary of information about pipeline. Defaults to False. 
:type return_dict: bool :returns: Dictionary of all component parameters if return_dict is True, else None. :rtype: dict .. py:method:: feature_importance(self) :property: Importance associated with each feature. Features dropped by the feature selection are excluded. :returns: Feature names and their corresponding importance :rtype: pd.DataFrame .. py:method:: fit(self, X, y) Fit a time series classification model. :param X: The input training data of shape [n_samples, n_features] :type X: pd.DataFrame or np.ndarray :param y: The target training labels of length [n_samples] :type y: pd.Series, np.ndarray :returns: self :raises ValueError: If the number of unique classes in y is not appropriate for the type of pipeline. .. py:method:: fit_transform(self, X, y) Fit and transform all components in the component graph, if all components are Transformers. :param X: Input features of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target data of length [n_samples]. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame :raises ValueError: If final component is an Estimator. .. py:method:: get_component(self, name) Returns component by name. :param name: Name of component. :type name: str :returns: Component to return :rtype: Component .. py:method:: get_hyperparameter_ranges(self, custom_hyperparameters) Returns hyperparameter ranges from all components as a dictionary. :param custom_hyperparameters: Custom hyperparameters for the pipeline. :type custom_hyperparameters: dict :returns: Dictionary of hyperparameter ranges for each component in the pipeline. :rtype: dict .. py:method:: graph(self, filepath=None) Generate an image representing the pipeline graph. :param filepath: Path to where the graph should be saved. If set to None (as by default), the graph will not be saved. :type filepath: str, optional :returns: Graph object that can be directly displayed in Jupyter notebooks. 
:rtype: graphviz.Digraph :raises RuntimeError: If graphviz is not installed. :raises ValueError: If path is not writeable. .. py:method:: graph_dict(self) Generates a dictionary with nodes consisting of the component names and parameters, and edges detailing component relationships. This dictionary is JSON serializable in most cases. x_edges specifies from which component feature data is being passed. y_edges specifies from which component target data is being passed. This can be used to build graphs across a variety of visualization tools. Template: {"Nodes": {"component_name": {"Name": class_name, "Parameters": parameters_attributes}, ...}}, "x_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...], "y_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...]} :returns: A dictionary representing the DAG structure. :rtype: dag_dict (dict) .. py:method:: graph_feature_importance(self, importance_threshold=0) Generate a bar graph of the pipeline's feature importance. :param importance_threshold: If provided, graph features with a permutation importance whose absolute value is larger than importance_threshold. Defaults to zero. :type importance_threshold: float, optional :returns: A bar graph showing features and their corresponding importance. :rtype: plotly.Figure :raises ValueError: If importance threshold is not valid. .. py:method:: inverse_transform(self, y) Apply component inverse_transform methods to estimator predictions in reverse order. Components that implement inverse_transform are PolynomialDecomposer, LogTransformer, LabelEncoder (tbd). :param y: Final component features. :type y: pd.Series :returns: The inverse transform of the target. :rtype: pd.Series .. py:method:: load(file_path: Union[str, io.BytesIO]) :staticmethod: Loads pipeline at file path. :param file_path: load filepath or a BytesIO object. :type file_path: str|BytesIO :returns: PipelineBase object .. 
py:method:: model_family(self) :property: Returns model family of this pipeline. .. py:method:: name(self) :property: Name of the pipeline. .. py:method:: new(self, parameters, random_seed=0) Constructs a new instance of the pipeline with the same component graph but with a different set of parameters. Not to be confused with Python's __new__ method. :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary or None implies using all default values for component parameters. Defaults to None. :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :returns: A new instance of this pipeline with identical components. .. py:method:: parameters(self) :property: Parameter dictionary for this pipeline. :returns: Dictionary of all component parameters. :rtype: dict .. py:method:: predict(self, X, objective=None, X_train=None, y_train=None) Predict on future data where target is not known. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame, or np.ndarray :param objective: The objective to use to make predictions. :type objective: Object or string :param X_train: Training data. :type X_train: pd.DataFrame or np.ndarray or None :param y_train: Training labels. :type y_train: pd.Series or None :raises ValueError: If X_train and/or y_train are None or if final component is not an Estimator. :returns: Predictions. .. py:method:: predict_in_sample(self, X, y, X_train, y_train, objective=None) Predict on future data where the target is known, e.g. cross validation. Note: we cast y as ints first to address boolean values that may be returned from calculating predictions which we would not be able to otherwise transform if we originally had integer targets. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: Future target of shape [n_samples]. 
:type y: pd.Series, np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :param objective: Objective used to threshold predicted probabilities, optional. :type objective: ObjectiveBase, str, None :returns: Estimated labels. :rtype: pd.Series :raises ValueError: If final component is not an Estimator. .. py:method:: predict_proba(self, X, X_train=None, y_train=None) Predict on future data where the target is unknown. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If final component is not an Estimator. .. py:method:: predict_proba_in_sample(self, X_holdout, y_holdout, X_train, y_train) Predict on future data where the target is known, e.g. cross validation. :param X_holdout: Future data of shape [n_samples, n_features]. :type X_holdout: pd.DataFrame or np.ndarray :param y_holdout: Future target of shape [n_samples]. :type y_holdout: pd.Series, np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If the final component is not an Estimator. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves pipeline at file path. :param file_path: Location to save file. 
:type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: score(self, X, y, objectives, X_train=None, y_train=None) Evaluate model performance on current and additional objectives. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: True labels of length [n_samples]. :type y: pd.Series :param objectives: Non-empty list of objectives to score on. :type objectives: list :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Ordered dictionary of objective scores. :rtype: dict .. py:method:: summary(self) :property: A short summary of the pipeline structure, describing the list of components used. Example: Logistic Regression Classifier w/ Simple Imputer + One Hot Encoder :returns: A string describing the pipeline structure. .. py:method:: transform(self, X, y=None) Transform the input. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: The target data of length [n_samples]. Defaults to None. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame .. py:method:: transform_all_but_final(self, X, y=None, X_train=None, y_train=None, calculating_residuals=False) Transforms the data by applying all pre-processing components. :param X: Input data to the pipeline to transform. :type X: pd.DataFrame :param y: Targets corresponding to the pipeline targets. :type y: pd.Series :param X_train: Training data used to generate features from past observations. :type X_train: pd.DataFrame :param y_train: Training targets used to generate features from past observations. :type y_train: pd.Series :param calculating_residuals: Whether we're calling predict_in_sample to calculate the residuals. 
This means the X and y arguments are not future data, but actually the train data. :type calculating_residuals: bool :returns: New transformed features. :rtype: pd.DataFrame .. py:class:: TimeSeriesMulticlassClassificationPipeline(component_graph, parameters=None, custom_name=None, random_seed=0) Pipeline base class for time series multiclass classification problems. :param component_graph: List of components in order. Accepts strings or ComponentBase subclasses in the list. Note that when duplicate components are specified in a list, the duplicate component names will be modified with the component's index in the list. For example, the component graph [Imputer, One Hot Encoder, Imputer, Logistic Regression Classifier] will have names ["Imputer", "One Hot Encoder", "Imputer_2", "Logistic Regression Classifier"] :type component_graph: list or dict :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary {} implies using all default values for component parameters. Pipeline-level parameters such as time_index, gap, and max_delay must be specified with the "pipeline" key. For example: Pipeline(parameters={"pipeline": {"time_index": "Date", "max_delay": 4, "gap": 2}}). :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int .. rubric:: Example >>> pipeline = TimeSeriesMulticlassClassificationPipeline(component_graph=["Simple Imputer", "Logistic Regression Classifier"], ... parameters={"Logistic Regression Classifier": {"penalty": "elasticnet", ... "solver": "liblinear"}, ... "pipeline": {"gap": 1, "max_delay": 1, "forecast_horizon": 1, "time_index": "date"}}, ... custom_name="My TimeSeriesMulticlass Pipeline") >>> assert pipeline.custom_name == "My TimeSeriesMulticlass Pipeline" >>> assert pipeline.component_graph.component_dict.keys() == {'Simple Imputer', 'Logistic Regression Classifier'} >>> assert pipeline.parameters == { ... 
'Simple Imputer': {'impute_strategy': 'most_frequent', 'fill_value': None}, ... 'Logistic Regression Classifier': {'penalty': 'elasticnet', ... 'C': 1.0, ... 'n_jobs': -1, ... 'multi_class': 'auto', ... 'solver': 'liblinear'}, ... 'pipeline': {'gap': 1, 'max_delay': 1, 'forecast_horizon': 1, 'time_index': "date"}} **Attributes** .. list-table:: :widths: 15 85 :header-rows: 0 * - **problem_type** - ProblemTypes.TIME_SERIES_MULTICLASS **Methods** .. autoapisummary:: :nosignatures: evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.can_tune_threshold_with_objective evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.classes_ evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.clone evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.create_objectives evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.custom_name evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.dates_needed_for_prediction evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.dates_needed_for_prediction_range evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.describe evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.feature_importance evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.fit evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.fit_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.get_component evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.get_hyperparameter_ranges 
evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.graph evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.graph_dict evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.graph_feature_importance evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.inverse_transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.load evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.model_family evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.name evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.new evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.parameters evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.predict evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.predict_in_sample evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.predict_proba evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.predict_proba_in_sample evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.save evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.score evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.summary evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.transform evalml.pipelines.time_series_classification_pipelines.TimeSeriesMulticlassClassificationPipeline.transform_all_but_final .. 
py:method:: can_tune_threshold_with_objective(self, objective) Determine whether the threshold of a binary classification pipeline can be tuned. :param objective: Primary AutoMLSearch objective. :type objective: ObjectiveBase :returns: True if the pipeline threshold can be tuned. :rtype: bool .. py:method:: classes_(self) :property: Gets the class names for the pipeline. Will return None before pipeline is fit. .. py:method:: clone(self) Constructs a new pipeline with the same components, parameters, and random seed. :returns: A new instance of this pipeline with identical components, parameters, and random seed. .. py:method:: create_objectives(objectives) :staticmethod: Create objective instances from a list of strings or objective classes. .. py:method:: custom_name(self) :property: Custom name of the pipeline. .. py:method:: dates_needed_for_prediction(self, date) Return dates needed to forecast the given date in the future. :param date: Date to forecast in the future. :type date: pd.Timestamp :returns: Range of dates needed to forecast the given date. :rtype: dates_needed (tuple(pd.Timestamp)) .. py:method:: dates_needed_for_prediction_range(self, start_date, end_date) Return dates needed to forecast the given date range in the future. :param start_date: Start date of range to forecast in the future. :type start_date: pd.Timestamp :param end_date: End date of range to forecast in the future. :type end_date: pd.Timestamp :returns: Range of dates needed to forecast the given date range. :rtype: dates_needed (tuple(pd.Timestamp)) :raises ValueError: If start_date doesn't come before end_date. .. py:method:: describe(self, return_dict=False) Outputs pipeline details including component parameters. :param return_dict: If True, return dictionary of information about pipeline. Defaults to False. :type return_dict: bool :returns: Dictionary of all component parameters if return_dict is True, else None. :rtype: dict .. 
py:method:: feature_importance(self) :property: Importance associated with each feature. Features dropped by the feature selection are excluded. :returns: Feature names and their corresponding importance. :rtype: pd.DataFrame .. py:method:: fit(self, X, y) Fit a time series classification model. :param X: The input training data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: The target training labels of length [n_samples]. :type y: pd.Series, np.ndarray :returns: self :raises ValueError: If the number of unique classes in y is not appropriate for the type of pipeline. .. py:method:: fit_transform(self, X, y) Fit and transform all components in the component graph, if all components are Transformers. :param X: Input features of shape [n_samples, n_features]. :type X: pd.DataFrame :param y: The target data of length [n_samples]. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame :raises ValueError: If final component is an Estimator. .. py:method:: get_component(self, name) Returns component by name. :param name: Name of component. :type name: str :returns: Component to return. :rtype: Component .. py:method:: get_hyperparameter_ranges(self, custom_hyperparameters) Returns hyperparameter ranges from all components as a dictionary. :param custom_hyperparameters: Custom hyperparameters for the pipeline. :type custom_hyperparameters: dict :returns: Dictionary of hyperparameter ranges for each component in the pipeline. :rtype: dict .. py:method:: graph(self, filepath=None) Generate an image representing the pipeline graph. :param filepath: Path to where the graph should be saved. If set to None (as by default), the graph will not be saved. :type filepath: str, optional :returns: Graph object that can be directly displayed in Jupyter notebooks. :rtype: graphviz.Digraph :raises RuntimeError: If graphviz is not installed. :raises ValueError: If path is not writeable. .. 
py:method:: graph_dict(self) Generates a dictionary with nodes consisting of the component names and parameters, and edges detailing component relationships. This dictionary is JSON serializable in most cases. x_edges specifies from which component feature data is being passed. y_edges specifies from which component target data is being passed. This can be used to build graphs across a variety of visualization tools. Template: {"Nodes": {"component_name": {"Name": class_name, "Parameters": parameters_attributes}, ...}, "x_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...], "y_edges": [[from_component_name, to_component_name], [from_component_name, to_component_name], ...]} :returns: A dictionary representing the DAG structure. :rtype: dag_dict (dict) .. py:method:: graph_feature_importance(self, importance_threshold=0) Generate a bar graph of the pipeline's feature importance. :param importance_threshold: If provided, graph features with a permutation importance whose absolute value is larger than importance_threshold. Defaults to zero. :type importance_threshold: float, optional :returns: A bar graph showing features and their corresponding importance. :rtype: plotly.Figure :raises ValueError: If importance threshold is not valid. .. py:method:: inverse_transform(self, y) Apply component inverse_transform methods to estimator predictions in reverse order. Components that implement inverse_transform are PolynomialDecomposer, LogTransformer, LabelEncoder (tbd). :param y: Final component features. :type y: pd.Series :returns: The inverse transform of the target. :rtype: pd.Series .. py:method:: load(file_path: Union[str, io.BytesIO]) :staticmethod: Loads pipeline at file path. :param file_path: load filepath or a BytesIO object. :type file_path: str|BytesIO :returns: PipelineBase object .. py:method:: model_family(self) :property: Returns model family of this pipeline. .. 
py:method:: name(self) :property: Name of the pipeline. .. py:method:: new(self, parameters, random_seed=0) Constructs a new instance of the pipeline with the same component graph but with a different set of parameters. Not to be confused with Python's __new__ method. :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values. An empty dictionary or None implies using all default values for component parameters. Defaults to None. :type parameters: dict :param random_seed: Seed for the random number generator. Defaults to 0. :type random_seed: int :returns: A new instance of this pipeline with identical components. .. py:method:: parameters(self) :property: Parameter dictionary for this pipeline. :returns: Dictionary of all component parameters. :rtype: dict .. py:method:: predict(self, X, objective=None, X_train=None, y_train=None) Predict on future data where the target is not known. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param objective: The objective to use to make predictions. :type objective: Object or string :param X_train: Training data. :type X_train: pd.DataFrame or np.ndarray or None :param y_train: Training labels. :type y_train: pd.Series or None :raises ValueError: If X_train and/or y_train are None or if final component is not an Estimator. :returns: Predictions. .. py:method:: predict_in_sample(self, X, y, X_train, y_train, objective=None) Predict on future data where the target is known, e.g. cross validation. Note: we cast y as ints first to address boolean values that may be returned from calculating predictions which we would not be able to otherwise transform if we originally had integer targets. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: Future target of shape [n_samples]. 
:type y: pd.Series, np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :param objective: Objective used to threshold predicted probabilities, optional. :type objective: ObjectiveBase, str, None :returns: Estimated labels. :rtype: pd.Series :raises ValueError: If final component is not an Estimator. .. py:method:: predict_proba(self, X, X_train=None, y_train=None) Predict on future data where the target is unknown. :param X: Future data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If final component is not an Estimator. .. py:method:: predict_proba_in_sample(self, X_holdout, y_holdout, X_train, y_train) Predict on future data where the target is known, e.g. cross validation. :param X_holdout: Future data of shape [n_samples, n_features]. :type X_holdout: pd.DataFrame or np.ndarray :param y_holdout: Future target of shape [n_samples]. :type y_holdout: pd.Series, np.ndarray :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Estimated probabilities. :rtype: pd.Series :raises ValueError: If the final component is not an Estimator. .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL) Saves pipeline at file path. :param file_path: Location to save file. 
:type file_path: str :param pickle_protocol: The pickle data stream format. :type pickle_protocol: int .. py:method:: score(self, X, y, objectives, X_train=None, y_train=None) Evaluate model performance on current and additional objectives. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: True labels of length [n_samples]. :type y: pd.Series :param objectives: Non-empty list of objectives to score on. :type objectives: list :param X_train: Data the pipeline was trained on of shape [n_samples_train, n_features]. :type X_train: pd.DataFrame, np.ndarray :param y_train: Targets used to train the pipeline of shape [n_samples_train]. :type y_train: pd.Series, np.ndarray :returns: Ordered dictionary of objective scores. :rtype: dict .. py:method:: summary(self) :property: A short summary of the pipeline structure, describing the list of components used. Example: Logistic Regression Classifier w/ Simple Imputer + One Hot Encoder :returns: A string describing the pipeline structure. .. py:method:: transform(self, X, y=None) Transform the input. :param X: Data of shape [n_samples, n_features]. :type X: pd.DataFrame or np.ndarray :param y: The target data of length [n_samples]. Defaults to None. :type y: pd.Series :returns: Transformed output. :rtype: pd.DataFrame .. py:method:: transform_all_but_final(self, X, y=None, X_train=None, y_train=None, calculating_residuals=False) Transforms the data by applying all pre-processing components. :param X: Input data to the pipeline to transform. :type X: pd.DataFrame :param y: Targets corresponding to the pipeline targets. :type y: pd.Series :param X_train: Training data used to generate features from past observations. :type X_train: pd.DataFrame :param y_train: Training targets used to generate features from past observations. :type y_train: pd.Series :param calculating_residuals: Whether we're calling predict_in_sample to calculate the residuals. 
This means the X and y arguments are not future data, but actually the train data. :type calculating_residuals: bool :returns: New transformed features. :rtype: pd.DataFrame
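
The ``component_graph`` parameter description for this class states that duplicate component names in a list are modified with the component's index in the list. The following is a minimal sketch of that naming rule in plain Python, written only to reproduce the documented example. The helper ``unique_component_names`` is hypothetical and is not part of evalml; the assumption that the index is the 0-based list position is inferred from the ``"Imputer_2"`` example.

```python
from typing import List


def unique_component_names(component_names: List[str]) -> List[str]:
    """Hypothetical helper: suffix a repeated name with its position in the list."""
    seen = set()
    unique = []
    for index, name in enumerate(component_names):
        # Assumption: a name that has already appeared gets "_<index>",
        # where index is its 0-based position in the list.
        if name in seen:
            name = f"{name}_{index}"
        seen.add(name)
        unique.append(name)
    return unique


names = unique_component_names(
    ["Imputer", "One Hot Encoder", "Imputer", "Logistic Regression Classifier"]
)
print(names)
# ['Imputer', 'One Hot Encoder', 'Imputer_2', 'Logistic Regression Classifier']
```

The resulting names are the keys to use in the ``parameters`` dictionary when a graph contains the same component more than once, for example ``{"Imputer_2": {...}}`` for the second imputer in the list above.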