permutation_importance
===========================================================

.. py:module:: evalml.model_understanding.permutation_importance

.. autoapi-nested-parse::

   Permutation importance methods.



Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::
   :nosignatures:

   evalml.model_understanding.permutation_importance.calculate_permutation_importance
   evalml.model_understanding.permutation_importance.calculate_permutation_importance_one_column
   evalml.model_understanding.permutation_importance.graph_permutation_importance


Contents
~~~~~~~~~~~~~~~~~~~

.. py:function:: calculate_permutation_importance(pipeline, X, y, objective, n_repeats=5, n_jobs=None, random_seed=0)

   Calculates permutation importance for features.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param n_repeats: Number of times to permute a feature. Defaults to 5.
   :type n_repeats: int
   :param n_jobs: Integer describing the level of parallelism used for pipelines. None and 1 are equivalent. If set to -1, all CPUs are used. For n_jobs below -1, (n_cpus + 1 + n_jobs) are used. Defaults to None.
   :type n_jobs: int or None
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   :returns: Mean feature importance scores over a number of shuffles.
   :rtype: pd.DataFrame

   :raises ValueError: If objective cannot be used with the given pipeline.


.. py:function:: calculate_permutation_importance_one_column(pipeline, X, y, col_name, objective, n_repeats=5, fast=True, precomputed_features=None, random_seed=0)

   Calculates permutation importance for one column in the original dataframe.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param col_name: The column in X to calculate permutation importance for.
   :type col_name: str, int
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param n_repeats: Number of times to permute a feature. Defaults to 5.
   :type n_repeats: int
   :param fast: Whether to use the fast method of calculating the permutation importance or not. Defaults to True.
   :type fast: bool
   :param precomputed_features: Precomputed features necessary to calculate permutation importance using the fast method. Defaults to None.
   :type precomputed_features: pd.DataFrame
   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   :returns: Mean feature importance scores over a number of shuffles.
   :rtype: float

   :raises ValueError: If pipeline does not support fast permutation importance calculation.
   :raises ValueError: If precomputed_features is None.


.. py:function:: graph_permutation_importance(pipeline, X, y, objective, importance_threshold=0)

   Generate a bar graph of the pipeline's permutation importance.

   :param pipeline: Fitted pipeline.
   :type pipeline: PipelineBase or subclass
   :param X: The input data used to score and compute permutation importance.
   :type X: pd.DataFrame
   :param y: The target data.
   :type y: pd.Series
   :param objective: Objective to score on.
   :type objective: str, ObjectiveBase
   :param importance_threshold: If provided, graph features with a permutation importance whose absolute value is larger than importance_threshold. Defaults to 0.
   :type importance_threshold: float, optional

   :returns: plotly.Figure, a bar graph showing features and their respective permutation importance.

   :raises ValueError: If importance_threshold is not greater than or equal to 0.
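
The functions above return a mean importance score over ``n_repeats`` shuffles. As a rough illustration of that idea, the following is a minimal pure-Python sketch, not evalml's implementation. The names ``permutation_importance_sketch``, ``score``, ``rows``, and ``targets`` are hypothetical, and the sketch assumes a score where greater is better, with importance taken as the baseline score minus the score on the shuffled column.

```python
import random
from statistics import mean


def permutation_importance_sketch(score_fn, rows, targets, col_name,
                                  n_repeats=5, random_seed=0):
    """Mean drop in score after shuffling one column, over n_repeats shuffles.

    Hypothetical helper for illustration only; it is not evalml code.
    Assumes score_fn returns a value where greater is better.
    """
    rng = random.Random(random_seed)
    baseline = score_fn(rows, targets)
    drops = []
    for _ in range(n_repeats):
        # Shuffle the values of one column while leaving the other columns intact.
        shuffled = [row[col_name] for row in rows]
        rng.shuffle(shuffled)
        permuted = [{**row, col_name: value} for row, value in zip(rows, shuffled)]
        drops.append(baseline - score_fn(permuted, targets))
    return mean(drops)


def score(rows, targets):
    """Toy stand-in for a fitted model plus objective.

    Predicts 2 * signal and ignores noise; returns the negative mean
    absolute error, so greater is better.
    """
    errors = [abs(2 * row["signal"] - target) for row, target in zip(rows, targets)]
    return -mean(errors)


# Toy data: the target depends only on the "signal" column.
rows = [{"signal": i, "noise": (7 * i) % 5} for i in range(8)]
targets = [2 * row["signal"] for row in rows]

signal_importance = permutation_importance_sketch(score, rows, targets, "signal")
noise_importance = permutation_importance_sketch(score, rows, targets, "noise")
```

In this toy setup the ``noise`` column gets an importance of zero because the stand-in model never reads it, while shuffling ``signal`` lowers the score. A feature with zero importance would not pass a filter that keeps only importances whose absolute value is larger than a threshold, which is how ``importance_threshold`` is described above.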