component_graph
==========================================

.. py:module:: evalml.pipelines.component_graph

.. autoapi-nested-parse::

   Component graph for a pipeline as a directed acyclic graph (DAG).


Module Contents
---------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.pipelines.component_graph.ComponentGraph


Attributes Summary
~~~~~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.pipelines.component_graph.logger


Contents
~~~~~~~~~~~~~~~~~~~
.. py:class:: ComponentGraph(component_dict=None, cached_data=None, random_seed=0)

   Component graph for a pipeline as a directed acyclic graph (DAG).

   :param component_dict: A dictionary which specifies the components and edges between components that should be used to create the component graph. Defaults to None.
   :type component_dict: dict
   :param cached_data: A dictionary of nested cached data. If the hashes and components are in this cache, we skip fitting for these components. Expected to be of format
                       {hash1: {component_name: trained_component, ...}, hash2: {...}, ...}.
                       Defaults to None.
   :type cached_data: dict
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int
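
   The nested layout expected for ``cached_data`` can be illustrated with plain dictionaries. The following is a minimal sketch, not evalml's implementation: ``get_cached_component`` is a hypothetical helper, and the hash keys and stored values are placeholders standing in for real data hashes and trained components.

   .. code-block:: python

      def get_cached_component(cached_data, data_hash, component_name):
          """Return the cached trained component, or None on a cache miss."""
          if not cached_data:  # None or {} means nothing is cached
              return None
          # Outer key: hash of the data; inner key: component name.
          return cached_data.get(data_hash, {}).get(component_name)

      # Placeholder strings stand in for real hashes and trained components.
      cached_data = {
          "hash1": {"Imputer": "trained Imputer"},
          "hash2": {"Imputer": "other trained Imputer"},
      }
      hit = get_cached_component(cached_data, "hash1", "Imputer")
      miss = get_cached_component(cached_data, "hash1", "OHE")

   In this sketch a hit means fitting could be skipped for that component, while a miss (``None``) means the component would still need to be fitted.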

   .. rubric:: Examples

   >>> component_dict = {'Imputer': ['Imputer', 'X', 'y'],
   ...                   'Logistic Regression': ['Logistic Regression Classifier', 'Imputer.x', 'y']}
   >>> component_graph = ComponentGraph(component_dict)
   >>> assert component_graph.compute_order == ['Imputer', 'Logistic Regression']
   ...
   ...
   >>> component_dict = {'Imputer': ['Imputer', 'X', 'y'],
   ...                   'OHE': ['One Hot Encoder', 'Imputer.x', 'y'],
   ...                   'estimator_1': ['Random Forest Classifier', 'OHE.x', 'y'],
   ...                   'estimator_2': ['Decision Tree Classifier', 'OHE.x', 'y'],
   ...                   'final': ['Logistic Regression Classifier', 'estimator_1.x', 'estimator_2.x', 'y']}
   >>> component_graph = ComponentGraph(component_dict)

   The default parameters for every component in the component graph.

   >>> assert component_graph.default_parameters == {
   ...     'Imputer': {'categorical_impute_strategy': 'most_frequent',
   ...                 'numeric_impute_strategy': 'mean',
   ...                 'boolean_impute_strategy': 'most_frequent',
   ...                 'categorical_fill_value': None,
   ...                 'numeric_fill_value': None,
   ...                 'boolean_fill_value': None},
   ...     'One Hot Encoder': {'top_n': 10,
   ...                         'features_to_encode': None,
   ...                         'categories': None,
   ...                         'drop': 'if_binary',
   ...                         'handle_unknown': 'ignore',
   ...                         'handle_missing': 'error'},
   ...     'Random Forest Classifier': {'n_estimators': 100,
   ...                                  'max_depth': 6,
   ...                                  'n_jobs': -1},
   ...     'Decision Tree Classifier': {'criterion': 'gini',
   ...                                  'max_features': 'sqrt',
   ...                                  'max_depth': 6,
   ...                                  'min_samples_split': 2,
   ...                                  'min_weight_fraction_leaf': 0.0},
   ...     'Logistic Regression Classifier': {'penalty': 'l2',
   ...                                        'C': 1.0,
   ...                                        'n_jobs': -1,
   ...                                        'multi_class': 'auto',
   ...                                        'solver': 'lbfgs'}}
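
   In the examples above, each ``component_dict`` value lists a component first, followed by its input edges. The following minimal sketch reads the parent nodes out of such a dictionary. ``parent_nodes`` is a hypothetical helper, not part of evalml, and it assumes that every edge other than ``'X'`` and ``'y'`` has the form ``'<node name>.<output>'``.

   .. code-block:: python

      def parent_nodes(component_dict):
          """Map each node name to the names of the nodes that feed into it."""
          parents = {}
          for name, spec in component_dict.items():
              edges = spec[1:]  # spec[0] is the component itself
              # 'X' and 'y' refer to the original data, so they have no parent node.
              names = [e.rsplit(".", 1)[0] for e in edges if e not in ("X", "y")]
              # Drop duplicates while keeping the original order.
              parents[name] = list(dict.fromkeys(names))
          return parents

      component_dict = {
          "Imputer": ["Imputer", "X", "y"],
          "OHE": ["One Hot Encoder", "Imputer.x", "y"],
          "estimator_1": ["Random Forest Classifier", "OHE.x", "y"],
          "estimator_2": ["Decision Tree Classifier", "OHE.x", "y"],
          "final": ["Logistic Regression Classifier", "estimator_1.x", "estimator_2.x", "y"],
      }
      parents = parent_nodes(component_dict)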


   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.pipelines.component_graph.ComponentGraph.compute_order
      evalml.pipelines.component_graph.ComponentGraph.default_parameters
      evalml.pipelines.component_graph.ComponentGraph.describe
      evalml.pipelines.component_graph.ComponentGraph.fit
      evalml.pipelines.component_graph.ComponentGraph.fit_and_transform_all_but_final
      evalml.pipelines.component_graph.ComponentGraph.fit_transform
      evalml.pipelines.component_graph.ComponentGraph.generate_order
      evalml.pipelines.component_graph.ComponentGraph.get_component
      evalml.pipelines.component_graph.ComponentGraph.get_component_input_logical_types
      evalml.pipelines.component_graph.ComponentGraph.get_estimators
      evalml.pipelines.component_graph.ComponentGraph.get_inputs
      evalml.pipelines.component_graph.ComponentGraph.get_last_component
      evalml.pipelines.component_graph.ComponentGraph.graph
      evalml.pipelines.component_graph.ComponentGraph.has_dfs
      evalml.pipelines.component_graph.ComponentGraph.instantiate
      evalml.pipelines.component_graph.ComponentGraph.inverse_transform
      evalml.pipelines.component_graph.ComponentGraph.last_component_input_logical_types
      evalml.pipelines.component_graph.ComponentGraph.predict
      evalml.pipelines.component_graph.ComponentGraph.transform
      evalml.pipelines.component_graph.ComponentGraph.transform_all_but_final

   .. py:method:: compute_order(self)
      :property:

      The order that components will be computed or called in.


   .. py:method:: default_parameters(self)
      :property:

      The default parameter dictionary for this pipeline.

      :returns: Dictionary of all component default parameters.
      :rtype: dict


   .. py:method:: describe(self, return_dict=False)

      Outputs component graph details including component parameters.

      :param return_dict: If True, return dictionary of information about component graph. Defaults to False.
      :type return_dict: bool

      :returns: Dictionary of all component parameters if return_dict is True, else None
      :rtype: dict

      :raises ValueError: If the component graph is not instantiated.


   .. py:method:: fit(self, X, y)

      Fit each component in the graph.

      :param X: The input training data of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target training data of length [n_samples].
      :type y: pd.Series

      :returns: self


   .. py:method:: fit_and_transform_all_but_final(self, X, y)

      Fit and transform all components save the final one, usually an estimator.

      :param X: The input training data of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target training data of length [n_samples].
      :type y: pd.Series

      :returns: Transformed features and target.
      :rtype: Tuple (pd.DataFrame, pd.Series)


   .. py:method:: fit_transform(self, X, y)

      Fit and transform all components in the component graph, if all components are Transformers.

      :param X: Input features of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target data of length [n_samples].
      :type y: pd.Series

      :returns: Transformed output.
      :rtype: pd.DataFrame

      :raises ValueError: If final component is an Estimator.


   .. py:method:: generate_order(cls, component_dict)
      :classmethod:

      Regenerate the topologically sorted order of the graph.
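
      A topological order places every node after all of the nodes it depends on. The following is a minimal sketch of that idea using the standard-library ``graphlib`` module; evalml's own implementation may differ. ``topological_order`` and the ``parents`` mapping are hypothetical names used only for illustration.

      .. code-block:: python

         from graphlib import TopologicalSorter  # standard library, Python 3.9+

         def topological_order(parents):
             """Order nodes so that every node appears after all of its parents."""
             # TopologicalSorter expects a mapping of node -> predecessors.
             return list(TopologicalSorter(parents).static_order())

         # Hypothetical parent mapping for a five-node graph.
         parents = {
             "Imputer": [],
             "OHE": ["Imputer"],
             "estimator_1": ["OHE"],
             "estimator_2": ["OHE"],
             "final": ["estimator_1", "estimator_2"],
         }
         order = topological_order(parents)

      The relative order of ``estimator_1`` and ``estimator_2`` is not fixed by the graph, since neither depends on the other. If the mapping contains a cycle, ``static_order`` raises ``graphlib.CycleError``.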


   .. py:method:: get_component(self, component_name)

      Retrieves a single component object from the graph.

      :param component_name: Name of the component to retrieve
      :type component_name: str

      :returns: ComponentBase object

      :raises ValueError: If the component is not in the graph.


   .. py:method:: get_component_input_logical_types(self, component_name)

      Get the logical types that are passed to the given component.

      :param component_name: Name of component in the graph
      :type component_name: str

      :returns: Mapping of feature name to logical type instance.
      :rtype: dict

      :raises ValueError: If the component is not in the graph.
      :raises ValueError: If the component graph has not been fitted.


   .. py:method:: get_estimators(self)

      Gets a list of all the estimator components within this graph.

      :returns: All estimator objects within the graph.
      :rtype: list

      :raises ValueError: If the component graph is not yet instantiated.


   .. py:method:: get_inputs(self, component_name)

      Retrieves all inputs for a given component.

      :param component_name: Name of the component to look up.
      :type component_name: str

      :returns: List of inputs for the component to use.
      :rtype: list[str]

      :raises ValueError: If the component is not in the graph.


   .. py:method:: get_last_component(self)

      Retrieves the component that is computed last in the graph, usually the final estimator.

      :returns: ComponentBase object

      :raises ValueError: If the component graph has no edges.


   .. py:method:: graph(self, name=None, graph_format=None)

      Generate an image representing the component graph.

      :param name: Name of the graph. Defaults to None.
      :type name: str
      :param graph_format: File format in which to save the graph. Defaults to None.
      :type graph_format: str

      :returns: Graph object that can be directly displayed in Jupyter notebooks.
      :rtype: graphviz.Digraph

      :raises RuntimeError: If graphviz is not installed.


   .. py:method:: has_dfs(self)
      :property:

      Whether this component graph contains a DFSTransformer or not.


   .. py:method:: instantiate(self, parameters=None)

      Instantiates all uninstantiated components within the graph using the given parameters. An error will be raised if a component is already instantiated but the parameters dict contains arguments for that component.

      :param parameters: Dictionary with component names as keys and dictionary of that component's parameters as values.
                         An empty dictionary {} or None implies using all default values for component parameters. If a component
                         in the component graph is already instantiated, it will not use any of its parameters defined in this dictionary. Defaults to None.
      :type parameters: dict

      :returns: self

      :raises ValueError: If component graph is already instantiated or if a component errored while instantiating.
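
      The shape of the ``parameters`` argument can be illustrated with plain dictionaries. The following is a minimal sketch under stated assumptions, not evalml's implementation: ``resolve_parameters`` is a hypothetical helper, and it assumes that values supplied for a component override that component's defaults key by key.

      .. code-block:: python

         def resolve_parameters(default_parameters, parameters=None):
             """Combine per-component defaults with user-supplied overrides."""
             parameters = parameters or {}  # None and {} both mean "all defaults"
             return {
                 name: {**defaults, **parameters.get(name, {})}
                 for name, defaults in default_parameters.items()
             }

         # Illustrative defaults for a single component.
         defaults = {
             "Imputer": {"numeric_impute_strategy": "mean", "numeric_fill_value": None},
         }
         all_defaults = resolve_parameters(defaults)
         overridden = resolve_parameters(
             defaults, {"Imputer": {"numeric_impute_strategy": "median"}}
         )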


   .. py:method:: inverse_transform(self, y)

      Apply component inverse_transform methods to estimator predictions in reverse order.

      Components that implement inverse_transform are PolynomialDecomposer, LogTransformer, LabelEncoder (tbd).

      :param y: Final component features.
      :type y: pd.Series

      :returns: The target with inverse transformation applied.
      :rtype: pd.Series
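
      The reverse-order behavior described above can be sketched without evalml. In this minimal sketch, ``ShiftStep``, ``ScaleStep``, ``PassThroughStep`` and ``inverse_transform_all`` are hypothetical stand-ins, and plain lists stand in for ``pd.Series``. Steps without an ``inverse_transform`` method are skipped.

      .. code-block:: python

         class ShiftStep:
             def transform(self, y):
                 return [v + 3.0 for v in y]

             def inverse_transform(self, y):
                 return [v - 3.0 for v in y]

         class ScaleStep:
             def transform(self, y):
                 return [v * 2.0 for v in y]

             def inverse_transform(self, y):
                 return [v / 2.0 for v in y]

         class PassThroughStep:
             """A step with no inverse_transform; it is skipped on the way back."""
             def transform(self, y):
                 return list(y)

         def inverse_transform_all(steps, y):
             """Undo the steps in reverse order, skipping those without an inverse."""
             for step in reversed(steps):
                 inverse = getattr(step, "inverse_transform", None)
                 if inverse is not None:
                     y = inverse(y)
             return y

         steps = [ShiftStep(), PassThroughStep(), ScaleStep()]
         y = [1.0, 2.0]
         transformed = y
         for step in steps:  # forward pass: shift, pass through, then scale
             transformed = step.transform(transformed)
         recovered = inverse_transform_all(steps, transformed)

      The order matters: undoing the shift before the scale would not recover the original values.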


   .. py:method:: last_component_input_logical_types(self)
      :property:

      Get the logical types that are passed to the last component in the pipeline.

      :returns: Mapping of feature name to logical type instance.
      :rtype: dict

      :raises ValueError: If the component is not in the graph.
      :raises ValueError: If the component graph has not been fitted.


   .. py:method:: predict(self, X)

      Make predictions using selected features.

      :param X: Input features of shape [n_samples, n_features].
      :type X: pd.DataFrame

      :returns: Predicted values.
      :rtype: pd.Series

      :raises ValueError: If final component is not an Estimator.


   .. py:method:: transform(self, X, y=None)

      Transform the input using the component graph.

      :param X: Input features of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target data of length [n_samples]. Defaults to None.
      :type y: pd.Series

      :returns: Transformed output.
      :rtype: pd.DataFrame

      :raises ValueError: If final component is not a Transformer.


   .. py:method:: transform_all_but_final(self, X, y=None)

      Transform all components save the final one, and gather the data from any number of parents to get all the information that should be fed to the final component.

      :param X: Data of shape [n_samples, n_features].
      :type X: pd.DataFrame
      :param y: The target training data of length [n_samples]. Defaults to None.
      :type y: pd.Series

      :returns: Transformed values.
      :rtype: pd.DataFrame


.. py:data:: logger