feature_explanations
Human Readable Pipeline Explanations.
Module Contents
Functions
get_influential_features | Finds the most influential features as well as any detrimental features from a dataframe of feature importances.
readable_explanation | Outputs a human-readable explanation of trained pipeline behavior.
Contents
evalml.model_understanding.feature_explanations.get_influential_features(imp_df, max_features=5, min_importance_threshold=0.05, linear_importance=False)
Finds the most influential features as well as any detrimental features from a dataframe of feature importances.
- Parameters
imp_df (pd.DataFrame) – DataFrame containing feature names and associated importances.
max_features (int) – The maximum number of features to include in an explanation. Defaults to 5.
min_importance_threshold (float) – The minimum share of total importance a single feature must have to be considered important, expressed as a fraction (0.05 corresponds to 5%). Defaults to 0.05.
linear_importance (bool) – When True, negative feature importances are not considered detrimental. Defaults to False.
- Returns
Lists of feature names corresponding to heavily influential, somewhat influential, and detrimental features, respectively.
- Return type
(list, list, list)
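The three-way split described above can be sketched in plain Python. This is a hypothetical stand-in, not evalml's implementation: the function name `split_features`, the list-of-pairs input (instead of a DataFrame), the `heavy_threshold` cutoff of 0.2, and the use of absolute values when `linear_importance` is True are all assumptions made for illustration.

```python
def split_features(importances, max_features=5, min_importance_threshold=0.05,
                   linear_importance=False, heavy_threshold=0.2):
    """Hypothetical illustration only; not evalml's code.

    importances: list of (feature_name, importance) pairs.
    heavy_threshold: assumed cutoff for "heavily influential"; the
    documentation above does not state one.
    """
    if linear_importance:
        # Assumption: the sign only encodes direction, so rank by magnitude
        # and report nothing as detrimental.
        positive = [(name, abs(value)) for name, value in importances]
        detrimental = []
    else:
        positive = [(name, value) for name, value in importances if value >= 0]
        detrimental = [name for name, value in importances if value < 0]

    total = sum(value for _, value in positive)
    if total == 0:
        return [], [], detrimental

    # Rank by importance and keep at most max_features candidates.
    ranked = sorted(positive, key=lambda pair: pair[1], reverse=True)[:max_features]
    shares = [(name, value / total) for name, value in ranked]

    heavy = [name for name, share in shares if share >= heavy_threshold]
    somewhat = [name for name, share in shares
                if min_importance_threshold <= share < heavy_threshold]
    return heavy, somewhat, detrimental


example = [("age", 0.6), ("income", 0.3), ("zip_code", 0.08),
           ("noise", 0.02), ("id", -0.1)]
heavy, somewhat, detrimental = split_features(example)
```

In this sketch, `max_features` caps only the influential candidates; detrimental features are collected separately and are not truncated.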
evalml.model_understanding.feature_explanations.readable_explanation(pipeline, X=None, y=None, importance_method='permutation', max_features=5, min_importance_threshold=0.05, objective='auto')
Outputs a human-readable explanation of trained pipeline behavior.
- Parameters
pipeline (PipelineBase) – The pipeline to explain.
X (pd.DataFrame) – If importance_method is permutation, the holdout X data to compute importance with. Ignored otherwise.
y (pd.Series) – The holdout y data, used to obtain the name of the target class. If importance_method is permutation, it is also used to compute importance.
importance_method (str) – The method of determining feature importance. One of [“permutation”, “feature”]. Defaults to “permutation”.
max_features (int) – The maximum number of influential features to include in an explanation. This does not affect the number of detrimental features reported. Defaults to 5.
min_importance_threshold (float) – The minimum share of total importance a single feature must have to be considered important, expressed as a fraction (0.05 corresponds to 5%). Defaults to 0.05.
objective (str, ObjectiveBase) – If importance_method is permutation, the objective to compute importance with. Ignored otherwise. Defaults to “auto”.
- Raises
ValueError – If any arguments passed in are invalid or the pipeline is not fitted.
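How the parameters interact can be sketched as a pre-flight check. The helper below is hypothetical: `check_explanation_args` and its `is_fitted` flag are not part of evalml, and the exact conditions under which evalml raises ValueError are not spelled out above, so the rules here are assumptions inferred from the parameter descriptions.

```python
def check_explanation_args(is_fitted, X=None, y=None, importance_method="permutation"):
    """Hypothetical argument check; evalml's own validation may differ."""
    if not is_fitted:
        raise ValueError("Pipeline must be fitted before it can be explained.")
    if importance_method not in ("permutation", "feature"):
        raise ValueError(f"Unknown importance method: {importance_method!r}")
    # Assumption: permutation importance needs holdout data to score against.
    if importance_method == "permutation" and (X is None or y is None):
        raise ValueError("Permutation importance requires holdout X and y.")
    return importance_method
```

Under these assumptions, `importance_method="feature"` passes the check without X, while the default `"permutation"` requires both X and y.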