feature_explanations#

Human-Readable Pipeline Explanations.

Module Contents#

Functions#

get_influential_features

Finds the most influential features as well as any detrimental features from a dataframe of feature importances.

readable_explanation

Outputs a human-readable explanation of trained pipeline behavior.

Contents#

evalml.model_understanding.feature_explanations.get_influential_features(imp_df, max_features=5, min_importance_threshold=0.05, linear_importance=False)[source]#

Finds the most influential features as well as any detrimental features from a dataframe of feature importances.

Parameters
  • imp_df (pd.DataFrame) – DataFrame containing feature names and associated importances.

  • max_features (int) – The maximum number of features to include in an explanation. Defaults to 5.

  • min_importance_threshold (float) – The minimum percent of total importance a single feature must have to be considered important. Defaults to 0.05.

  • linear_importance (bool) – When True, negative feature importances are not considered detrimental. Defaults to False.

Returns

Lists of feature names corresponding to heavily influential, somewhat influential, and detrimental features, respectively.

Return type

(list, list, list)
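
A minimal usage sketch follows. The feature names and importance values are made up for illustration, and the DataFrame is assumed to use the two-column format ("feature", "importance") produced by a fitted pipeline's feature_importance attribute.

```python
import pandas as pd

from evalml.model_understanding.feature_explanations import get_influential_features

# Hypothetical feature-importance DataFrame in the assumed format: one row
# per feature, with "feature" and "importance" columns.
imp_df = pd.DataFrame(
    {
        "feature": ["lot_size", "year_built", "num_rooms", "zip_code", "agent_id"],
        "importance": [0.45, 0.25, 0.15, 0.10, -0.02],
    }
)

# Split the features into heavily influential, somewhat influential,
# and detrimental groups.
heavy, somewhat, detrimental = get_influential_features(
    imp_df,
    max_features=5,
    min_importance_threshold=0.05,
    linear_importance=False,
)

print("Heavily influential:", heavy)
print("Somewhat influential:", somewhat)
print("Detrimental:", detrimental)
```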

evalml.model_understanding.feature_explanations.readable_explanation(pipeline, X=None, y=None, importance_method='permutation', max_features=5, min_importance_threshold=0.05, objective='auto')[source]#

Outputs a human-readable explanation of trained pipeline behavior.

Parameters
  • pipeline (PipelineBase) – The pipeline to explain.

  • X (pd.DataFrame) – If importance_method is permutation, the holdout X data to compute importance with. Ignored otherwise.

  • y (pd.Series) – The holdout y data, used to obtain the name of the target class. If importance_method is permutation, it is also used to compute importance.

  • importance_method (str) – The method of determining feature importance. One of [“permutation”, “feature”]. Defaults to “permutation”.

  • max_features (int) – The maximum number of influential features to include in an explanation. This does not affect the number of detrimental features reported. Defaults to 5.

  • min_importance_threshold (float) – The minimum percent of total importance a single feature must have to be considered important. Defaults to 0.05.

  • objective (str, ObjectiveBase) – If importance_method is permutation, the objective to compute importance with. Ignored otherwise. Defaults to “auto”.

Raises

ValueError – If any of the arguments passed in are invalid, or if the pipeline is not fitted.
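
A hedged usage sketch: `pipeline`, `X_holdout`, and `y_holdout` below are placeholders (an already-fitted evalml pipeline and a holdout split kept aside for importance computation), not objects defined by this module.

```python
from evalml.model_understanding.feature_explanations import readable_explanation

# With the default importance_method="permutation", the holdout X, y, and an
# objective are used to compute permutation importance on the fitted pipeline.
readable_explanation(
    pipeline,
    X=X_holdout,
    y=y_holdout,
    importance_method="permutation",
    max_features=5,
    min_importance_threshold=0.05,
    objective="auto",
)

# Alternatively, use the pipeline's own feature importance, which does not
# require holdout data.
readable_explanation(pipeline, importance_method="feature")
```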