prediction_explanations

Prediction explanation tools.

Submodules

explainers

Package Contents

Functions

`explain_predictions`	Creates a report summarizing the top contributing features for each data point in the input features.
`explain_predictions_best_worst`	Creates a report summarizing the top contributing features for the best and worst points in the dataset as measured by error to true labels.

Contents

evalml.model_understanding.prediction_explanations.explain_predictions(pipeline, input_features, y, indices_to_explain, top_k_features=3, include_shap_values=False, include_expected_value=False, output_format='text', training_data=None, training_target=None)

Creates a report summarizing the top contributing features for each data point in the input features.

XGBoost and Stacked Ensemble models, as well as CatBoost multiclass classifiers, are not currently supported.

Parameters

pipeline (PipelineBase) – Fitted pipeline whose predictions we want to explain with SHAP.
input_features (pd.DataFrame) – Dataframe of input data to evaluate the pipeline on.
y (pd.Series) – Labels for the input data.
indices_to_explain (list[int]) – List of integer indices to explain.
top_k_features (int) – How many of the highest/lowest contributing features to include in the table for each data point. Default is 3.
include_shap_values (bool) – Whether SHAP values should be included in the table. Default is False.
include_expected_value (bool) – Whether the expected value should be included in the table. Default is False.
output_format (str) – Either “text”, “dict”, or “dataframe”. Default is “text”.
training_data (pd.DataFrame, np.ndarray) – Data the pipeline was trained on. Required for time series pipelines and ignored otherwise.
training_target (pd.Series, np.ndarray) – Targets used to train the pipeline. Required for time series pipelines and ignored otherwise.

Returns

A report explaining the top contributing features to each prediction for each row of input_features. The report will include the feature names, prediction contribution, and SHAP value (optional).

Return type

str, dict, or pd.DataFrame

Raises

ValueError – if input_features is empty.
ValueError – if an output_format other than “text”, “dict”, or “dataframe” is provided.
ValueError – if a requested index falls outside the bounds of input_features.
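
The role of top_k_features described above can be pictured with a small standalone sketch. It is not evalml's implementation: the helper name `top_k_contributions`, the feature names, and the contribution values are all hypothetical, and the rule for rows with few features (return everything) is an assumption made for illustration. It only shows the idea of keeping the k most positive and k most negative per-feature contributions for one data point.

```python
from typing import Dict, List, Tuple


def top_k_contributions(
    contributions: Dict[str, float], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k most positive and top_k most negative contributions.

    Hypothetical helper for illustration only; not part of evalml.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Sort features from most positive to most negative contribution.
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= 2 * top_k:
        # Assumption: with too few features to form two disjoint groups,
        # keep every feature rather than listing any of them twice.
        return ranked
    return ranked[:top_k] + ranked[-top_k:]


# Made-up per-feature contributions (for example, SHAP values) for one row.
row_contributions = {
    "age": 0.42,
    "income": -0.31,
    "tenure": 0.05,
    "region": -0.02,
    "visits": 0.18,
    "balance": -0.27,
}

selected = top_k_contributions(row_contributions, top_k=2)
for name, value in selected:
    print(f"{name:>8}  {value:+.2f}")
```

With top_k=2 the sketch keeps "age" and "visits" (highest) followed by "balance" and "income" (lowest), which is the kind of per-row table the report summarizes.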

evalml.model_understanding.prediction_explanations.explain_predictions_best_worst(pipeline, input_features, y_true, num_to_explain=5, top_k_features=3, include_shap_values=False, metric=None, output_format='text', callback=None, training_data=None, training_target=None)

Creates a report summarizing the top contributing features for the best and worst points in the dataset as measured by error to true labels.
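
As a rough illustration of "best and worst points as measured by error to true labels", the sketch below ranks rows by a per-row error and picks the extremes. It is a standalone example, not evalml code: the helper `best_and_worst_indices` is hypothetical, the data is made up, and plain absolute error is an assumed metric chosen for simplicity (in evalml the error measure is governed by the `metric` argument).

```python
from typing import List, Sequence, Tuple


def best_and_worst_indices(
    y_true: Sequence[float], y_pred: Sequence[float], num_to_explain: int = 5
) -> Tuple[List[int], List[int]]:
    """Return (best, worst) row indices ranked by absolute error.

    Hypothetical helper for illustration only; not part of evalml.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Assumed metric: absolute difference between label and prediction.
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    # Indices ordered from smallest error (best) to largest error (worst).
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    best = order[:num_to_explain]
    # Reverse the ordering so the worst prediction comes first.
    worst = order[::-1][:num_to_explain]
    return best, worst


# Made-up regression labels and predictions.
y_true = [10.0, 12.0, 9.0, 15.0, 11.0, 14.0]
y_pred = [10.5, 12.1, 4.0, 15.0, 13.0, 13.2]

best, worst = best_and_worst_indices(y_true, y_pred, num_to_explain=2)
print("best:", best)
print("worst:", worst)
```

Here rows 3 and 1 have the smallest errors and rows 2 and 4 the largest. Note that if num_to_explain exceeds half the number of rows, the two lists in this sketch overlap.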