permutation_importance
===========================================================

.. py:module:: evalml.model_understanding.permutation_importance

.. autoapi-nested-parse::

   Permutation importance methods.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::
   :nosignatures:

   evalml.model_understanding.permutation_importance.calculate_permutation_importance
   evalml.model_understanding.permutation_importance.calculate_permutation_importance_one_column
   evalml.model_understanding.permutation_importance.graph_permutation_importance


Contents
~~~~~~~~~~~~~~~~~~~
.. py:function:: calculate_permutation_importance(pipeline, X, y, objective, n_repeats=5, n_jobs=None, random_seed=0)

   Calculates permutation importance for features.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param n_repeats: Number of times to permute a feature. Defaults to 5.
   :type n_repeats: int
   :param n_jobs: Integer describing the level of parallelism used for pipelines.
                  None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) CPUs are used. Defaults to None.
   :type n_jobs: int or None
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   :returns: Mean feature importance scores over a number of shuffles.
   :rtype: pd.DataFrame

   :raises ValueError: If objective cannot be used with the given pipeline.
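
   The ``n_jobs`` rule described above can be sketched in plain Python. This is a minimal illustration, not the library's code: ``resolve_n_jobs`` and its ``n_cpus`` argument are hypothetical names, and rejecting ``0`` and clamping the result to at least one worker are assumptions that the parameter description does not state.

   .. code-block:: python

      import os


      def resolve_n_jobs(n_jobs, n_cpus=None):
          """Map an n_jobs setting to a worker count (hypothetical helper)."""
          if n_cpus is None:
              n_cpus = os.cpu_count() or 1
          if n_jobs is None:
              return 1  # None and 1 are equivalent
          if n_jobs == 0:
              # Assumption: 0 is treated as invalid.
              raise ValueError("n_jobs must not be 0")
          if n_jobs < 0:
              # -1 uses all CPUs; below -1 uses n_cpus + 1 + n_jobs.
              # Assumption: never drop below one worker.
              return max(1, n_cpus + 1 + n_jobs)
          return n_jobs

   With eight CPUs, ``resolve_n_jobs(-1, n_cpus=8)`` gives 8 and ``resolve_n_jobs(-2, n_cpus=8)`` gives 7.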


.. py:function:: calculate_permutation_importance_one_column(pipeline, X, y, col_name, objective, n_repeats=5, fast=True, precomputed_features=None, random_seed=0)

   Calculates permutation importance for one column in the original dataframe.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param col_name: The column in X to calculate permutation importance for.
   :type col_name: str, int
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param n_repeats: Number of times to permute a feature. Defaults to 5.
   :type n_repeats: int
   :param fast: Whether to use the fast method of calculating the permutation importance or not. Defaults to True.
   :type fast: bool
   :param precomputed_features: Precomputed features necessary to calculate permutation importance using the fast method. Defaults to None.
   :type precomputed_features: pd.DataFrame
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   :returns: Mean feature importance score for the column over a number of shuffles.
   :rtype: float

   :raises ValueError: If pipeline does not support fast permutation importance calculation.
   :raises ValueError: If precomputed_features is None.
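
   The general idea of a per-column permutation importance can be sketched with the standard library alone. This illustrates the technique and is not evalml's implementation: ``permutation_importance_one_column``, ``score_fn``, and the toy data are hypothetical, and the sketch assumes a score where higher is better, so the importance is the mean drop in score after shuffling the column.

   .. code-block:: python

      import random
      from statistics import mean


      def permutation_importance_one_column(score_fn, rows, y, col,
                                            n_repeats=5, random_seed=0):
          """Mean drop in score after shuffling one column n_repeats times.

          rows is a list of dicts; score_fn(rows, y) returns a float
          where higher is better (an assumption of this sketch).
          """
          rng = random.Random(random_seed)
          baseline = score_fn(rows, y)
          drops = []
          for _ in range(n_repeats):
              # Shuffle only the chosen column; other columns stay in place.
              shuffled = [row[col] for row in rows]
              rng.shuffle(shuffled)
              permuted = [{**row, col: value}
                          for row, value in zip(rows, shuffled)]
              drops.append(baseline - score_fn(permuted, y))
          return mean(drops)


      def neg_mae(rows, y):
          """Toy 'model + objective': predict column "a", score by negative MAE."""
          return -mean(abs(row["a"] - target) for row, target in zip(rows, y))


      rows = [{"a": float(i), "b": float(i % 2)} for i in range(8)]
      y = [row["a"] for row in rows]

      important = permutation_importance_one_column(neg_mae, rows, y, "a")
      unused = permutation_importance_one_column(neg_mae, rows, y, "b")

   In this toy setup the score depends only on column ``a``, so shuffling ``b`` leaves the score unchanged and its importance is zero, while shuffling ``a`` lowers the score.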


.. py:function:: graph_permutation_importance(pipeline, X, y, objective, importance_threshold=0)

   Generate a bar graph of the pipeline's permutation importance.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param importance_threshold: If provided, only graph features whose permutation importance has an absolute value larger than importance_threshold. Defaults to 0.
   :type importance_threshold: float, optional

   :returns: A bar graph showing features and their respective permutation importance.
   :rtype: plotly.Figure

   :raises ValueError: If importance_threshold is less than 0.
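
   The thresholding behavior described above can be sketched as a plain filter. This is a hedged illustration only: ``filter_by_importance`` is a hypothetical helper, importances are represented as a simple dict, and the strict comparison follows the wording of the parameter description, which may differ from the comparison the library actually performs.

   .. code-block:: python

      def filter_by_importance(importances, importance_threshold=0):
          """Keep features whose absolute importance exceeds the threshold."""
          if importance_threshold < 0:
              raise ValueError(
                  "importance_threshold must be greater than or equal to 0"
              )
          # Assumption: strict comparison, per the parameter description.
          return {
              name: value
              for name, value in importances.items()
              if abs(value) > importance_threshold
          }

   Because the absolute value is compared, a feature with a negative importance such as ``-0.2`` is kept under a threshold of ``0.1``.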