Simply examining a model’s performance metrics is not enough to select a model and promote it for use in a production setting. While developing an ML algorithm, it is important to understand how the model behaves on the data, to examine the key factors influencing its predictions, and to consider where it may be deficient. What “success” means for an ML project depends first and foremost on the user’s domain expertise.
EvalML includes a variety of tools for understanding models.
First, let’s train a pipeline on some data.
[1]:
import evalml


class RFBinaryClassificationPipeline(evalml.pipelines.BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Random Forest Classifier']


X, y = evalml.demos.load_breast_cancer()

pipeline = RFBinaryClassificationPipeline({})
pipeline.fit(X, y)
print(pipeline.score(X, y, objectives=['log_loss_binary']))
OrderedDict([('Log Loss Binary', 0.03840382802787619)])
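Note that this scores the pipeline on the same data it was trained on, which tends to overstate performance. As a minimal sketch (not part of the original walkthrough), one could hold out an evaluation set with scikit-learn's train_test_split:

from sklearn.model_selection import train_test_split

# Hold out 20% of the data so the score reflects unseen examples.
X_train, X_holdout, y_train, y_holdout = train_test_split(X, y, test_size=0.2, random_state=0)

holdout_pipeline = RFBinaryClassificationPipeline({})
holdout_pipeline.fit(X_train, y_train)
print(holdout_pipeline.score(X_holdout, y_holdout, objectives=['log_loss_binary']))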
We can get the importance associated with each feature of the resulting pipeline.
[2]:
pipeline.feature_importance
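Assuming feature_importance returns a pandas DataFrame with "feature" and "importance" columns (as in this version of EvalML), standard DataFrame operations can be used to inspect it; for example, a short sketch listing the five most important features:

# Sort explicitly (defensively) and take the five most important features.
importance = pipeline.feature_importance
print(importance.sort_values('importance', ascending=False).head(5))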
We can also create a bar plot of the feature importances.
[3]:
pipeline.graph_feature_importance()
We can also compute and plot the permutation importance of the pipeline. Permutation importance measures how much the pipeline's score degrades when a single feature's values are randomly shuffled, which indicates how heavily the pipeline relies on that feature (a manual sketch of the idea follows the two cells below).
[4]:
evalml.pipelines.calculate_permutation_importance(pipeline, X, y, 'log_loss_binary')
[5]:
evalml.pipelines.graph_permutation_importance(pipeline, X, y, 'log_loss_binary')
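For intuition, here is a minimal manual sketch of what permutation importance does for a single feature, assuming X is a pandas DataFrame as returned by the demo loader. For real use, prefer calculate_permutation_importance above, which handles every feature for you:

import numpy as np

# Score the fitted pipeline on the intact data.
baseline = pipeline.score(X, y, objectives=['log_loss_binary'])['Log Loss Binary']

# Shuffle a single illustrative column and score again.
col = X.columns[0]
X_shuffled = X.copy()
X_shuffled[col] = np.random.RandomState(0).permutation(X_shuffled[col].values)
shuffled = pipeline.score(X_shuffled, y, objectives=['log_loss_binary'])['Log Loss Binary']

# For log loss (lower is better), the increase after shuffling indicates
# how much the pipeline relies on this feature.
print(f"Increase in log loss after shuffling '{col}': {shuffled - baseline:.4f}")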
For binary classification, we can view the precision-recall curve of the pipeline.
[6]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_precision_recall_curve(y, y_pred_proba)
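If you want the numbers behind the plot, this version's graph_utils module also appears to expose a precision_recall_curve helper that returns the raw curve data; a sketch, assuming it returns a dictionary with 'precision', 'recall', 'thresholds', and 'auc_score' entries (check the API reference for your installed version):

# Assumption: precision_recall_curve returns a dict of raw curve data.
pr_data = evalml.pipelines.graph_utils.precision_recall_curve(y, y_pred_proba)
# Area under the precision-recall curve, assuming the 'auc_score' key exists.
print(pr_data['auc_score'])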
For binary and multiclass classification, we can view the Receiver Operating Characteristic (ROC) curve of the pipeline.
[7]:
# get the predicted probabilities associated with the "true" label
y_pred_proba = pipeline.predict_proba(X)[1]
evalml.pipelines.graph_utils.graph_roc_curve(y, y_pred_proba)
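To get the area under the ROC curve as a single number rather than a figure, one option is a sketch using scikit-learn directly, assuming y holds binary labels and y_pred_proba the positive-class probabilities as above:

from sklearn.metrics import roc_auc_score

# AUC of the same curve graphed above, computed directly from the labels
# and positive-class probabilities.
print(roc_auc_score(y, y_pred_proba))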
For binary or multiclass classification, we can view a confusion matrix of the classifier’s predictions.
[8]:
y_pred = pipeline.predict(X)
evalml.pipelines.graph_utils.graph_confusion_matrix(y, y_pred)
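To work with the raw counts behind the figure, a minimal sketch using scikit-learn's confusion_matrix (rows are true classes, columns are predicted classes, in sorted label order):

from sklearn.metrics import confusion_matrix

# Raw counts: rows correspond to true labels, columns to predicted labels.
print(confusion_matrix(y, y_pred))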