component_graph

Module Contents

Classes Summary

ComponentGraph

Component graph for a pipeline as a directed acyclic graph (DAG).

Attributes Summary

logger

Contents

class evalml.pipelines.component_graph.ComponentGraph(component_dict=None, random_seed=0)

Component graph for a pipeline as a directed acyclic graph (DAG).

Parameters
  • component_dict (dict) – A dictionary which specifies the components and edges between components that should be used to create the component graph. Defaults to None.

  • random_seed (int) – Seed for the random number generator. Defaults to 0.

Example

>>> from evalml.pipelines.component_graph import ComponentGraph
>>> component_dict = {'imputer': ['Imputer'], 'ohe': ['One Hot Encoder', 'imputer.x'], 'estimator_1': ['Random Forest Classifier', 'ohe.x'], 'estimator_2': ['Decision Tree Classifier', 'ohe.x'], 'final': ['Logistic Regression Classifier', 'estimator_1', 'estimator_2']}
>>> component_graph = ComponentGraph(component_dict)

Methods

compute_final_component_features

Transforms all components except the final one and gathers the data from any number of parents to get all the information that should be fed to the final component.

compute_order

The order that components will be computed or called in.

default_parameters

The default parameter dictionary for this pipeline.

describe

Outputs component graph details including component parameters

fit

Fit each component in the graph

fit_features

Fit all components save the final one, usually an estimator

from_list

Constructs a linear ComponentGraph from a given list, where each component in the list feeds its X transformed output to the next component

generate_order

Regenerates the topologically sorted order of the graph.

get_component

Retrieves a single component object from the graph.

get_estimators

Gets a list of all the estimator components within this graph

get_last_component

Retrieves the component that is computed last in the graph, usually the final estimator.

get_parents

Finds all of the inputs for a given component, including the names of all parent nodes of the given component

graph

Generate an image representing the component graph

instantiate

Instantiates all uninstantiated components within the graph using the given parameters.

inverse_transform

Apply component inverse_transform methods to estimator predictions in reverse order.

linearized_component_graph

Returns a list of (component name, component class) tuples from a pre-initialized component graph defined as either a list or a dictionary.

predict

Make predictions using selected features.

compute_final_component_features(self, X, y=None)

Transforms all components except the final one and gathers the data from any number of parents to get all the information that should be fed to the final component.

Parameters
  • X (pd.DataFrame) – Data of shape [n_samples, n_features]

  • y (pd.Series) – The target training data of length [n_samples]. Defaults to None.

Returns

Transformed values.

Return type

pd.DataFrame

property compute_order(self)

The order that components will be computed or called in.

property default_parameters(self)

The default parameter dictionary for this pipeline.

Returns

Dictionary of all component default parameters.

Return type

dict

describe(self, return_dict=False)

Outputs component graph details including component parameters

Parameters

return_dict (bool) – If True, return a dictionary of information about the component graph. Defaults to False.

Returns

Dictionary of all component parameters if return_dict is True, else None

Return type

dict

fit(self, X, y)

Fit each component in the graph

Parameters
  • X (pd.DataFrame) – The input training data of shape [n_samples, n_features]

  • y (pd.Series) – The target training data of length [n_samples]

fit_features(self, X, y)

Fit all components save the final one, usually an estimator

Parameters
  • X (pd.DataFrame) – The input training data of shape [n_samples, n_features]

  • y (pd.Series) – The target training data of length [n_samples]

Returns

Transformed values.

Return type

pd.DataFrame

classmethod from_list(cls, component_list, random_seed=0)

Constructs a linear ComponentGraph from a given list, where each component in the list feeds its X transformed output to the next component

Parameters
  • component_list (list) – String names or ComponentBase subclasses in an order that represents a valid linear graph.

  • random_seed (int) – Seed for the random number generator. Defaults to 0.
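
To make the idea of a linear graph concrete, the following sketch builds a dictionary in the same shape as the component_dict from the class example, where each entry names the previous component's X output as its only input. It is an illustration only, not evalml's implementation: the helper linear_component_dict is hypothetical, it assumes the names in the list are unique, and it assumes the '.x' edge suffix seen in the class example.

```python
def linear_component_dict(component_names):
    """Build a dict in which each component consumes the previous one's X output."""
    component_dict = {}
    previous = None
    for name in component_names:
        entry = [name]  # first element: the component itself
        if previous is not None:
            # Assumed edge syntax, copied from the class example ('imputer.x').
            entry.append(f"{previous}.x")
        component_dict[name] = entry
        previous = name
    return component_dict


linear = linear_component_dict(
    ["Imputer", "One Hot Encoder", "Random Forest Classifier"]
)
for name, entry in linear.items():
    print(name, entry)
```

The first component has no inputs, and every later component has exactly one, which is what makes the resulting graph a simple chain.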

classmethod generate_order(cls, component_dict)

Regenerates the topologically sorted order of the graph.
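
As a rough illustration of what a topologically sorted order means for a dictionary like the one in the class example, the sketch below applies Kahn's algorithm using only the standard library. It is not evalml's implementation. The helpers parent_names and topological_order are hypothetical, and the sketch assumes that every list element after the first is an edge naming a parent component, optionally followed by a '.x' or '.y' suffix.

```python
from collections import deque


def parent_names(entry):
    """Return the parent component names encoded in one component_dict value."""
    parents = []
    for edge in entry[1:]:  # entry[0] is the component itself
        # Assumption: a trailing '.x' or '.y' only selects which output is used.
        name = edge[:-2] if edge.endswith((".x", ".y")) else edge
        if name not in parents:
            parents.append(name)
    return parents


def topological_order(component_dict):
    """Order component names so that every parent precedes its children."""
    parents = {name: parent_names(entry) for name, entry in component_dict.items()}
    remaining = {name: len(names) for name, names in parents.items()}
    ready = deque(name for name, count in remaining.items() if count == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        # Release every child whose parents have now all been emitted.
        for name, names in parents.items():
            if current in names:
                remaining[name] -= 1
                if remaining[name] == 0:
                    ready.append(name)
    if len(order) != len(component_dict):
        raise ValueError("component_dict contains a cycle or an unknown parent")
    return order


component_dict = {
    "imputer": ["Imputer"],
    "ohe": ["One Hot Encoder", "imputer.x"],
    "estimator_1": ["Random Forest Classifier", "ohe.x"],
    "estimator_2": ["Decision Tree Classifier", "ohe.x"],
    "final": ["Logistic Regression Classifier", "estimator_1", "estimator_2"],
}
order = topological_order(component_dict)
print(order)  # ['imputer', 'ohe', 'estimator_1', 'estimator_2', 'final']
```

Any order in which parents come before children is a valid topological order; this sketch happens to break ties by dictionary insertion order.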

get_component(self, component_name)

Retrieves a single component object from the graph.

Parameters

component_name (str) – Name of the component to retrieve

Returns

ComponentBase object

get_estimators(self)

Gets a list of all the estimator components within this graph

Returns

All estimator objects within the graph

Return type

list

get_last_component(self)

Retrieves the component that is computed last in the graph, usually the final estimator.

Returns

ComponentBase object

get_parents(self, component_name)

Finds all of the inputs for a given component, including the names of all parent nodes of the given component

Parameters

component_name (str) – Name of the child component to look up

Returns

List of inputs to use

Return type

list[str]

graph(self, name=None, graph_format=None)

Generate an image representing the component graph

Parameters
  • name (str) – Name of the graph. Defaults to None.

  • graph_format (str) – File format to save the graph in. Defaults to None.

Returns

Graph object that can be directly displayed in Jupyter notebooks.

Return type

graphviz.Digraph

instantiate(self, parameters)

Instantiates all uninstantiated components within the graph using the given parameters. An error will be raised if a component is already instantiated but the parameters dict contains arguments for that component.

Parameters

parameters (dict) – Dictionary with component names as keys and dictionary of that component’s parameters as values. An empty dictionary {} or None implies using all default values for component parameters.
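
The shape of the parameters dictionary and the rule about already-instantiated components can be sketched in plain Python. Everything below is hypothetical and stands in for the real machinery: Scaler and Clipper are toy classes, instantiate_components is an illustrative helper, and the choice of ValueError is an assumption, since the description above only says that an error is raised.

```python
class Scaler:
    """Hypothetical component with one parameter."""

    def __init__(self, factor=1.0):
        self.factor = factor


class Clipper:
    """Hypothetical component with two parameters."""

    def __init__(self, low=0.0, high=1.0):
        self.low = low
        self.high = high


def instantiate_components(components, parameters=None):
    """Replace component classes with instances built from per-component parameters."""
    parameters = parameters or {}  # None and {} both mean "use all defaults"
    instantiated = {}
    for name, component in components.items():
        component_parameters = parameters.get(name, {})
        if isinstance(component, type):
            # Still a class: build it with its own parameter dictionary.
            instantiated[name] = component(**component_parameters)
        elif component_parameters:
            # Already an instance, yet parameters were supplied for it.
            raise ValueError(
                f"'{name}' is already instantiated but parameters were given for it"
            )
        else:
            instantiated[name] = component
    return instantiated


components = {"scale": Scaler, "clip": Clipper(low=-1.0)}
built = instantiate_components(components, {"scale": {"factor": 2.0}})
print(built["scale"].factor, built["clip"].low)  # 2.0 -1.0
```

Keys of the outer dictionary are component names, and each value holds keyword arguments for that one component, which mirrors the structure described for the parameters argument.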

inverse_transform(self, y)

Apply component inverse_transform methods to estimator predictions in reverse order.

Components that implement inverse_transform are PolynomialDetrender, LabelEncoder (tbd).

Parameters

y (pd.Series) – Final component features
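
Why the order is reversed can be seen in a small sketch with toy target transformers. The classes AddOffset, Multiply and PassThrough and the helper inverse_transform_all are hypothetical and use plain lists rather than pd.Series; they only illustrate the idea of undoing transformations from last to first and skipping components that have no inverse_transform method.

```python
class AddOffset:
    """Hypothetical target transformer that shifts values by a constant."""

    def __init__(self, offset):
        self.offset = offset

    def transform(self, y):
        return [value + self.offset for value in y]

    def inverse_transform(self, y):
        return [value - self.offset for value in y]


class Multiply:
    """Hypothetical target transformer that scales values by a constant."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, y):
        return [value * self.factor for value in y]

    def inverse_transform(self, y):
        return [value / self.factor for value in y]


class PassThrough:
    """Hypothetical component that does not implement inverse_transform."""

    def transform(self, y):
        return list(y)


def inverse_transform_all(components, y):
    """Undo target transformations, visiting components from last to first."""
    for component in reversed(components):
        if hasattr(component, "inverse_transform"):
            y = component.inverse_transform(y)
    return y


components = [AddOffset(3), PassThrough(), Multiply(2)]
transformed = [1.0, 2.0]
for component in components:
    transformed = component.transform(transformed)

restored = inverse_transform_all(components, transformed)
print(transformed, restored)  # [8.0, 10.0] [1.0, 2.0]
```

Applying the inverses in forward order instead would subtract the offset before dividing and would not recover the original values.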

classmethod linearized_component_graph(cls, components)

Return a list of (component name, component class) tuples from a pre-initialized component graph defined as either a list or a dictionary. The component names are guaranteed to be unique.

Parameters

components (list(ComponentBase) or Dict[str, ComponentBase]) – Components in the pipeline.

Returns

list((component name, ComponentBase)) – Tuples with the unique component name as the first element and the component class as the second element. When the input is a list, the components will be returned in the order they appear in the input.
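
One way to picture the uniqueness guarantee is the sketch below, which pairs each entry of a list with a distinct label while keeping the input order. The helper unique_component_names and the numeric suffix scheme are assumptions made for illustration. The naming scheme that evalml actually uses is not stated here, and plain strings stand in for component classes.

```python
def unique_component_names(names):
    """Pair each name with a unique label, keeping the input order."""
    used = set()
    result = []
    for name in names:
        label = name
        suffix = 2
        # Hypothetical scheme: append _2, _3, ... until the label is unused.
        while label in used:
            label = f"{name}_{suffix}"
            suffix += 1
        used.add(label)
        result.append((label, name))
    return result


pairs = unique_component_names(["Imputer", "Imputer", "Estimator"])
print(pairs)
# [('Imputer', 'Imputer'), ('Imputer_2', 'Imputer'), ('Estimator', 'Estimator')]
```

Unique labels matter because the component name doubles as a dictionary key, so two components of the same class would otherwise overwrite one another.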

predict(self, X)

Make predictions using selected features.

Parameters

X (pd.DataFrame) – Data of shape [n_samples, n_features]

Returns

Predicted values.

Return type

pd.Series

evalml.pipelines.component_graph.logger