time_series_regression_pipeline

Module Contents

Classes Summary

TimeSeriesRegressionPipeline

Pipeline base class for time series regression problems.

Contents

class evalml.pipelines.time_series_regression_pipeline.TimeSeriesRegressionPipeline(component_graph, parameters=None, custom_name=None, random_seed=0)

Pipeline base class for time series regression problems.

Parameters

component_graph (list or dict) – List of components in order. Accepts strings or ComponentBase subclasses in the list. When duplicate components are specified in a list, the duplicate component names are modified with the component’s index in the list. For example, the component graph [Imputer, One Hot Encoder, Imputer, Logistic Regression Classifier] will have names ["Imputer", "One Hot Encoder", "Imputer_2", "Logistic Regression Classifier"].
parameters (dict) – Dictionary with component names as keys and dictionary of that component’s parameters as values. An empty dictionary {} implies using all default values for component parameters. Pipeline-level parameters such as date_index, gap, and max_delay must be specified with the "pipeline" key. For example: Pipeline(parameters={"pipeline": {"date_index": "Date", "max_delay": 4, "gap": 2}}).
custom_name (str) – Custom name for the pipeline. Defaults to None.
random_seed (int) – Seed for the random number generator. Defaults to 0.
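
The duplicate-naming rule and the nested parameters layout described above can be sketched in plain Python. This is a minimal illustration, not evalml's implementation: the helper name_components is hypothetical, and it assumes the suffix is the zero-based position of the duplicate in the list, which is what the Imputer_2 example suggests.

```python
def name_components(component_names):
    """Return unique names, suffixing repeats with their list index.

    Hypothetical helper; assumes the suffix is the zero-based index.
    """
    seen = set()
    unique_names = []
    for index, name in enumerate(component_names):
        if name in seen:
            # A repeat: disambiguate it with its position in the list.
            unique_names.append(f"{name}_{index}")
        else:
            seen.add(name)
            unique_names.append(name)
    return unique_names


component_graph = [
    "Imputer",
    "One Hot Encoder",
    "Imputer",
    "Logistic Regression Classifier",
]
names = name_components(component_graph)

# Pipeline-level settings go under the "pipeline" key; component settings
# would be keyed by component name. Omitted components use their defaults.
parameters = {
    "pipeline": {"date_index": "Date", "max_delay": 4, "gap": 2},
}
```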

Attributes

problem_type

ProblemTypes.TIME_SERIES_REGRESSION

Methods

`can_tune_threshold_with_objective`	Determine whether the threshold of a binary classification pipeline can be tuned.
`clone`	Constructs a new pipeline with the same components, parameters, and random state.
`compute_estimator_features`	Transforms the data by applying all pre-processing components.
`create_objectives`
`custom_name`	Custom name of the pipeline.
`describe`	Outputs pipeline details including component parameters.
`feature_importance`	Importance associated with each feature. Features dropped by the feature selection are excluded.
`fit`	Fit a time series pipeline.
`get_component`	Returns component by name.
`get_hyperparameter_ranges`	Returns hyperparameter ranges from all components as a dictionary.
`graph`	Generate an image representing the pipeline graph.
`graph_feature_importance`	Generate a bar graph of the pipeline’s feature importance.
`inverse_transform`	Apply component inverse_transform methods to estimator predictions in reverse order.
`load`	Loads pipeline at file path.
`model_family`	Returns model family of this pipeline.
`name`	Name of the pipeline.
`new`	Constructs a new instance of the pipeline with the same component graph but with a different set of parameters.
`parameters`	Parameter dictionary for this pipeline.
`predict`	Make predictions using selected features.
`save`	Saves pipeline at file path.
`score`	Evaluate model performance on current and additional objectives.
`summary`	A short summary of the pipeline structure, describing the list of components used.
`transform`	Transform the input.

can_tune_threshold_with_objective(self, objective)

Determine whether the threshold of a binary classification pipeline can be tuned.

Parameters

objective – Primary AutoMLSearch objective.

clone(self)

Constructs a new pipeline with the same components, parameters, and random state.

Returns: A new instance of this pipeline with identical components, parameters, and random state.

compute_estimator_features(self, X, y=None)

Transforms the data by applying all pre-processing components.

Parameters: X (pd.DataFrame) – Input data to the pipeline to transform.
Returns: New transformed features.
Return type: pd.DataFrame

static create_objectives(objectives)

property custom_name(self): Custom name of the pipeline.

describe(self, return_dict=False)

Outputs pipeline details including component parameters.

Parameters: return_dict (bool) – If True, return dictionary of information about pipeline. Defaults to False.
Returns: Dictionary of all component parameters if return_dict is True, else None
Return type: dict

property feature_importance(self)

Importance associated with each feature. Features dropped by the feature selection are excluded.

Returns: pd.DataFrame including feature names and their corresponding importance

fit(self, X, y)

Fit a time series pipeline.

Parameters

X (pd.DataFrame or np.ndarray) – The input training data of shape [n_samples, n_features].
y (pd.Series or np.ndarray) – The training targets of length [n_samples].

Returns

self

get_component(self, name)

Returns component by name.

Parameters: name (str) – Name of component.
Returns: The component with the given name.
Return type: Component

get_hyperparameter_ranges(self, custom_hyperparameters)

Returns hyperparameter ranges from all components as a dictionary.

Parameters: custom_hyperparameters (dict) – Custom hyperparameters for the pipeline.
Returns: Dictionary of hyperparameter ranges for each component in the pipeline.
Return type: dict

graph(self, filepath=None)

Generate an image representing the pipeline graph.

Parameters: filepath (str, optional) – Path to where the graph should be saved. If set to None (as by default), the graph will not be saved.
Returns: Graph object that can be directly displayed in Jupyter notebooks.
Return type: graphviz.Digraph

graph_feature_importance(self, importance_threshold=0)

Generate a bar graph of the pipeline’s feature importance.

Parameters: importance_threshold (float, optional) – If provided, graph features with a permutation importance whose absolute value is larger than importance_threshold. Defaults to 0.
Returns: plotly.Figure, a bar graph showing features and their corresponding importance
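
The thresholding described for importance_threshold can be sketched without any plotting. This is an illustration under assumptions, not the library's code: select_features_to_graph and the sample importances are hypothetical, and the strict "larger than" comparison is taken literally from the parameter description.

```python
def select_features_to_graph(feature_importance, importance_threshold=0):
    """Keep (feature, importance) pairs whose absolute importance
    is strictly larger than the threshold (assumed comparison)."""
    return [
        (feature, importance)
        for feature, importance in feature_importance
        if abs(importance) > importance_threshold
    ]


# Made-up importances for illustration only.
importances = [
    ("lag_1", 0.62),
    ("lag_2", -0.30),
    ("day_of_week", 0.05),
    ("noise", 0.0),
]
shown = select_features_to_graph(importances, importance_threshold=0.1)
```

Because the comparison uses the absolute value, a strongly negative importance such as lag_2 is kept while small values of either sign are dropped.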

inverse_transform(self, y)

Apply component inverse_transform methods to estimator predictions in reverse order.

Components that implement inverse_transform are PolynomialDetrender, LabelEncoder (tbd).

Parameters: y (pd.Series) – Final component features
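
Why the reverse order matters can be seen with a small stand-alone sketch. The classes Scale, Shift, and PassThrough are hypothetical stand-ins for pipeline components, and skipping components that lack inverse_transform is an assumption made for this example.

```python
class Scale:
    def __init__(self, factor):
        self.factor = factor

    def transform(self, y):
        return [value * self.factor for value in y]

    def inverse_transform(self, y):
        return [value / self.factor for value in y]


class Shift:
    def __init__(self, offset):
        self.offset = offset

    def transform(self, y):
        return [value + self.offset for value in y]

    def inverse_transform(self, y):
        return [value - self.offset for value in y]


class PassThrough:
    """A component that does not implement inverse_transform."""

    def transform(self, y):
        return list(y)


def inverse_transform(components, y):
    # Undo the last transformation first, then work back to the first.
    for component in reversed(components):
        if hasattr(component, "inverse_transform"):
            y = component.inverse_transform(y)
    return y


components = [Scale(2.0), PassThrough(), Shift(3.0)]
original = [1.0, 2.0, 3.0]

transformed = original
for component in components:
    transformed = component.transform(transformed)

restored = inverse_transform(components, transformed)
```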

static load(file_path)

Loads pipeline at file path.

Parameters: file_path (str) – Location of the file to load.
Returns: PipelineBase object

property model_family(self): Returns model family of this pipeline.

property name(self): Name of the pipeline.

new(self, parameters, random_seed=0)

Constructs a new instance of the pipeline with the same component graph but with a different set of parameters. Not to be confused with Python’s __new__ method.

Parameters

parameters (dict) – Dictionary with component names as keys and dictionary of that component’s parameters as values. An empty dictionary or None implies using all default values for component parameters.
random_seed (int) – Seed for the random number generator. Defaults to 0.

Returns

A new instance of this pipeline with identical components.
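
The difference between clone and new can be sketched with a toy class. ToyPipeline is hypothetical and only mirrors the contracts stated in this reference: clone keeps components, parameters, and random seed, while new keeps the component graph and takes fresh parameters and a fresh seed. The component and parameter names are illustrative.

```python
import copy


class ToyPipeline:
    """Hypothetical stand-in used only to illustrate clone() versus new()."""

    def __init__(self, component_graph, parameters=None, random_seed=0):
        self.component_graph = list(component_graph)
        # None or an empty dict both mean "use component defaults".
        self.parameters = copy.deepcopy(parameters) if parameters else {}
        self.random_seed = random_seed

    def clone(self):
        # Same components, same parameters, same random seed.
        return ToyPipeline(self.component_graph, self.parameters, self.random_seed)

    def new(self, parameters, random_seed=0):
        # Same components, but caller-supplied parameters and seed.
        return ToyPipeline(self.component_graph, parameters, random_seed)


original = ToyPipeline(
    ["Imputer", "Linear Regressor"],
    parameters={"Imputer": {"numeric_impute_strategy": "mean"}},
    random_seed=42,
)
duplicate = original.clone()
variant = original.new({"Imputer": {"numeric_impute_strategy": "median"}})
```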

property parameters(self)

Parameter dictionary for this pipeline.

Returns: Dictionary of all component parameters.
Return type: dict

predict(self, X, y=None, objective=None)

Make predictions using selected features.

Parameters

X (pd.DataFrame or np.ndarray) – Data of shape [n_samples, n_features].
y (pd.Series, np.ndarray, or None) – The training targets of length [n_samples].
objective (Object or string) – The objective to use to make predictions.

Returns

Predicted values.

Return type

pd.Series

save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)

Saves pipeline at file path.

Parameters

file_path (str) – Location to save the file.
pickle_protocol (int) – The pickle data stream format.

Returns

None
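
A save and load round trip can be sketched with the standard library. This is an approximation under stated assumptions: the stdlib pickle module stands in for cloudpickle (which the default pickle_protocol refers to), the free functions save and load are hypothetical, and a plain dictionary stands in for a fitted pipeline.

```python
import os
import pickle
import tempfile


def save(obj, file_path, pickle_protocol=pickle.DEFAULT_PROTOCOL):
    """Serialize obj to file_path using the given pickle protocol."""
    with open(file_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle_protocol)


def load(file_path):
    """Deserialize and return the object stored at file_path."""
    with open(file_path, "rb") as f:
        return pickle.load(f)


# A dictionary standing in for a pipeline object.
pipeline_state = {
    "component_graph": ["Imputer", "Linear Regressor"],
    "parameters": {"pipeline": {"date_index": "Date", "max_delay": 4, "gap": 2}},
    "random_seed": 0,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pipeline.pkl")
    save(pipeline_state, path)
    restored = load(path)
```

As with any pickle-based format, files should only be loaded from trusted sources, since unpickling can execute arbitrary code.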

score(self, X, y, objectives)

Evaluate model performance on current and additional objectives.

Parameters

X (pd.DataFrame or np.ndarray) – Data of shape [n_samples, n_features].
y (pd.Series) – True labels of length [n_samples].
objectives (list) – Non-empty list of objectives to score on.

Returns

Ordered dictionary of objective scores.

Return type

dict
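
The shape of the return value, an ordered mapping from objective to score, can be sketched in plain Python. The function below is a hypothetical simplification: it takes precomputed predictions and (name, function) pairs, whereas the pipeline method makes its own predictions from X and accepts a list of objectives. Raising on an empty list is an assumption based on the "non-empty" requirement.

```python
from collections import OrderedDict


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def score(y_true, y_pred, objectives):
    """Return an OrderedDict of scores in the order objectives were given."""
    if not objectives:
        raise ValueError("objectives must be a non-empty list")
    return OrderedDict(
        (name, objective_fn(y_true, y_pred)) for name, objective_fn in objectives
    )


y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
scores = score(
    y_true,
    y_pred,
    [("MAE", mean_absolute_error), ("MSE", mean_squared_error)],
)
```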

property summary(self): A short summary of the pipeline structure, describing the list of components used. Example: Logistic Regression Classifier w/ Simple Imputer + One Hot Encoder
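
The summary format in the example above can be reproduced with a short sketch. The helper summarize is hypothetical, the format is inferred from that single example, and it assumes the estimator is the last component in the list.

```python
def summarize(component_names):
    """Build '<estimator> w/ <component> + <component>' from ordered names.

    Assumes the final entry is the estimator.
    """
    *preprocessing, estimator = component_names
    if not preprocessing:
        return estimator
    return f"{estimator} w/ {' + '.join(preprocessing)}"


text = summarize(
    ["Simple Imputer", "One Hot Encoder", "Logistic Regression Classifier"]
)
```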

transform(self, X, y=None)

Transform the input.

Parameters

X (pd.DataFrame or np.ndarray) – Data of shape [n_samples, n_features].
y (pd.Series) – The target data of length [n_samples]. Defaults to None.

Returns

Transformed output.

Return type

pd.DataFrame
