metrics#
Standard metrics used for model understanding.
Module Contents#
Functions#
check_distribution – Determines if the distribution of the predicted data is likely to match that of the ground truth data.
confusion_matrix – Confusion matrix for binary and multiclass classification.
graph_confusion_matrix – Generate and display a confusion matrix plot.
graph_precision_recall_curve – Generate and display a precision-recall plot.
graph_roc_curve – Generate and display a Receiver Operating Characteristic (ROC) plot for binary and multiclass classification problems.
normalize_confusion_matrix – Normalizes a confusion matrix.
precision_recall_curve – Given labels and binary classifier predicted probabilities, compute and return the data representing a precision-recall curve.
roc_curve – Given labels and classifier predicted probabilities, compute and return the data representing a Receiver Operating Characteristic (ROC) curve. Works with binary or multiclass problems.
Contents#
- evalml.model_understanding.metrics.check_distribution(y_true, y_pred, problem_type, threshold=0.1)[source]#
Determines if the distribution of the predicted data is likely to match that of the ground truth data.
Will use a different statistical test based on the given problem type:
- Classification (Binary or Multiclass): chi-squared test
- Regression: Kolmogorov-Smirnov test
- Time Series Regression: Wilcoxon signed-rank test
- Parameters
y_true (pd.Series) – The ground truth data.
y_pred (pd.Series) – Predictions from a pipeline.
problem_type (str or ProblemType) – The pipeline’s problem type, used to determine which statistical test to use.
threshold (float) – The threshold for the p-value at which we choose to accept or reject the null hypothesis. Should be between 0 and 1, non-inclusive. Defaults to 0.1.
- Returns
0 if the distribution of predicted values is not likely to match the true distribution, 1 if it is.
- Return type
int
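A minimal usage sketch based on the signature above; the labels and predictions are illustrative, and the problem type is passed as the string "binary", which the docstring allows (str or ProblemType).

```python
import pandas as pd
from evalml.model_understanding.metrics import check_distribution

# Illustrative ground-truth labels and pipeline predictions for a binary problem.
y_true = pd.Series([0, 1, 1, 0, 1, 0, 1, 1])
y_pred = pd.Series([0, 1, 0, 0, 1, 0, 1, 1])

# Returns 1 if the predicted distribution likely matches the true one, otherwise 0.
result = check_distribution(y_true, y_pred, problem_type="binary", threshold=0.1)
print(result)
```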
- evalml.model_understanding.metrics.confusion_matrix(y_true, y_predicted, normalize_method='true')[source]#
Confusion matrix for binary and multiclass classification.
- Parameters
y_true (pd.Series or np.ndarray) – True binary labels.
y_predicted (pd.Series or np.ndarray) – Predictions from a binary classifier.
normalize_method ({'true', 'pred', 'all', None}) – Normalization method to use, if not None. Supported options are: ‘true’ to normalize by row, ‘pred’ to normalize by column, or ‘all’ to normalize by all values. Defaults to ‘true’.
- Returns
Confusion matrix. The column header represents the predicted labels while row header represents the actual labels.
- Return type
pd.DataFrame
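A short illustrative example of building a row-normalized confusion matrix; the labels and predictions are made up for demonstration.

```python
import pandas as pd
from evalml.model_understanding.metrics import confusion_matrix

# Illustrative labels and predictions from a binary classifier.
y_true = pd.Series([0, 1, 1, 0, 1, 0])
y_pred = pd.Series([0, 1, 0, 0, 1, 1])

# Rows are actual labels, columns are predicted labels; rows are normalized by default.
cm = confusion_matrix(y_true, y_pred, normalize_method="true")
print(cm)
```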
- evalml.model_understanding.metrics.graph_confusion_matrix(y_true, y_pred, normalize_method='true', title_addition=None)[source]#
Generate and display a confusion matrix plot.
If normalize_method is set, hover text will show raw count, otherwise hover text will show count normalized with method ‘true’.
- Parameters
y_true (pd.Series or np.ndarray) – True binary labels.
y_pred (pd.Series or np.ndarray) – Predictions from a binary classifier.
normalize_method ({'true', 'pred', 'all', None}) – Normalization method to use, if not None. Supported options are: ‘true’ to normalize by row, ‘pred’ to normalize by column, or ‘all’ to normalize by all values. Defaults to ‘true’.
title_addition (str) – If not None, append to plot title. Defaults to None.
- Returns
plotly.Figure representing the confusion matrix plot generated.
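A sketch of generating the plot, assuming plotly is available for rendering the returned plotly.Figure; the data is illustrative.

```python
import pandas as pd
from evalml.model_understanding.metrics import graph_confusion_matrix

# Illustrative labels and predictions from a binary classifier.
y_true = pd.Series([0, 1, 1, 0, 1, 0])
y_pred = pd.Series([0, 1, 0, 0, 1, 1])

# Build the plotly figure; .show() renders it in a notebook or browser session.
fig = graph_confusion_matrix(y_true, y_pred, normalize_method="true")
fig.show()
```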
- evalml.model_understanding.metrics.graph_precision_recall_curve(y_true, y_pred_proba, title_addition=None)[source]#
Generate and display a precision-recall plot.
- Parameters
y_true (pd.Series or np.ndarray) – True binary labels.
y_pred_proba (pd.Series or np.ndarray) – Predictions from a binary classifier, before thresholding has been applied. Note this should be the predicted probability for the “true” label.
title_addition (str or None) – If not None, append to plot title. Defaults to None.
- Returns
plotly.Figure representing the precision-recall plot generated.
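A brief illustrative call, again assuming plotly is available; the probabilities stand in for real pipeline output for the positive ("true") label.

```python
import pandas as pd
from evalml.model_understanding.metrics import graph_precision_recall_curve

# Illustrative labels and predicted probabilities for the positive label, before thresholding.
y_true = pd.Series([0, 1, 1, 0, 1])
y_pred_proba = pd.Series([0.1, 0.8, 0.6, 0.3, 0.9])

# The optional title_addition string is appended to the plot title.
fig = graph_precision_recall_curve(y_true, y_pred_proba, title_addition="(holdout)")
fig.show()
```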
- evalml.model_understanding.metrics.graph_roc_curve(y_true, y_pred_proba, custom_class_names=None, title_addition=None)[source]#
Generate and display a Receiver Operating Characteristic (ROC) plot for binary and multiclass classification problems.
- Parameters
y_true (pd.Series or np.ndarray) – True labels.
y_pred_proba (pd.Series or np.ndarray) – Predictions from a classifier, before thresholding has been applied. Note this should be a one-dimensional array with the predicted probability for the “true” label in the binary case.
custom_class_names (list or None) – If not None, custom labels for classes. Defaults to None.
title_addition (str or None) – If not None, append to plot title. Defaults to None.
- Returns
plotly.Figure representing the ROC plot generated.
- Raises
ValueError – If the number of custom class names does not match number of classes in the input data.
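A minimal binary-classification sketch, assuming plotly is available; the one-dimensional probabilities for the "true" label are illustrative.

```python
import pandas as pd
from evalml.model_understanding.metrics import graph_roc_curve

# Illustrative labels and one-dimensional probabilities for the "true" label.
y_true = pd.Series([0, 1, 1, 0, 1])
y_pred_proba = pd.Series([0.2, 0.7, 0.65, 0.4, 0.9])

# Build and display the ROC plot for this binary problem.
fig = graph_roc_curve(y_true, y_pred_proba)
fig.show()
```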
- evalml.model_understanding.metrics.normalize_confusion_matrix(conf_mat, normalize_method='true')[source]#
Normalizes a confusion matrix.
- Parameters
conf_mat (pd.DataFrame or np.ndarray) – Confusion matrix to normalize.
normalize_method ({'true', 'pred', 'all'}) – Normalization method. Supported options are: ‘true’ to normalize by row, ‘pred’ to normalize by column, or ‘all’ to normalize by all values. Defaults to ‘true’.
- Returns
normalized version of the input confusion matrix. The column header represents the predicted labels while row header represents the actual labels.
- Return type
pd.DataFrame
- Raises
ValueError – If configuration is invalid, or if the sum of a given axis is zero and normalization by axis is specified.
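An illustrative sketch that first builds a raw-count confusion matrix and then normalizes it by column; the data is made up.

```python
import pandas as pd
from evalml.model_understanding.metrics import confusion_matrix, normalize_confusion_matrix

# Illustrative labels and predictions from a binary classifier.
y_true = pd.Series([0, 1, 1, 0, 1, 0])
y_pred = pd.Series([0, 1, 0, 0, 1, 1])

# Start from a raw-count matrix, then normalize so each predicted-label column sums to 1.
raw = confusion_matrix(y_true, y_pred, normalize_method=None)
by_column = normalize_confusion_matrix(raw, normalize_method="pred")
print(by_column)
```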
- evalml.model_understanding.metrics.precision_recall_curve(y_true, y_pred_proba, pos_label_idx=-1)[source]#
Given labels and binary classifier predicted probabilities, compute and return the data representing a precision-recall curve.
- Parameters
y_true (pd.Series or np.ndarray) – True binary labels.
y_pred_proba (pd.Series or np.ndarray) – Predictions from a binary classifier, before thresholding has been applied. Note this should be the predicted probability for the “true” label.
pos_label_idx (int) – The column index corresponding to the positive class. If predicted probabilities are two-dimensional, this will be used to access the probabilities for the positive class. Defaults to -1.
- Returns
Dictionary containing metrics used to generate a precision-recall plot, with the following keys:
precision: Precision values.
recall: Recall values.
thresholds: Threshold values used to produce the precision and recall.
auc_score: The area under the precision-recall curve.
- Return type
dict
- Raises
NoPositiveLabelException – If predicted probabilities do not contain a column at the specified label.
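A short sketch using one-dimensional predicted probabilities, so pos_label_idx is left at its default; the values are illustrative.

```python
import pandas as pd
from evalml.model_understanding.metrics import precision_recall_curve

# Illustrative labels and probabilities for the positive ("true") label.
y_true = pd.Series([0, 1, 1, 0, 1])
y_pred_proba = pd.Series([0.1, 0.8, 0.6, 0.3, 0.9])

# Returns a dictionary with precision, recall, thresholds, and auc_score keys.
curve = precision_recall_curve(y_true, y_pred_proba)
print(curve["precision"], curve["recall"], curve["auc_score"])
```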
- evalml.model_understanding.metrics.roc_curve(y_true, y_pred_proba)[source]#
Given labels and classifier predicted probabilities, compute and return the data representing a Receiver Operating Characteristic (ROC) curve. Works with binary or multiclass problems.
- Parameters
y_true (pd.Series or np.ndarray) – True labels.
y_pred_proba (pd.Series or pd.DataFrame or np.ndarray) – Predictions from a classifier, before thresholding has been applied.
- Returns
- A list of dictionaries (with one for each class) is returned. Binary classification problems return a list with one dictionary.
- Each dictionary contains metrics used to generate an ROC plot with the following keys:
fpr_rate: False positive rate.
tpr_rate: True positive rate.
threshold: Threshold values used to produce each pair of true/false positive rates.
auc_score: The area under the ROC curve.
- Return type
list(dict)
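A minimal binary example; because the problem is binary, the returned list contains a single dictionary of curve data. The probabilities are illustrative.

```python
import pandas as pd
from evalml.model_understanding.metrics import roc_curve

# Illustrative labels and predicted probabilities, before thresholding.
y_true = pd.Series([0, 1, 1, 0, 1])
y_pred_proba = pd.Series([0.2, 0.7, 0.65, 0.4, 0.9])

# Binary problems return a list with a single dictionary of curve data.
curve_data = roc_curve(y_true, y_pred_proba)[0]
print(curve_data["fpr_rate"], curve_data["tpr_rate"], curve_data["auc_score"])
```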