Model Understanding

Examining a model’s performance metrics alone is not enough to select a model and promote it for use in a production setting. While developing an ML algorithm, it is important to understand how the model behaves on the data, to examine the key factors influencing its predictions, and to consider where it may be deficient. What counts as “success” for an ML project depends first and foremost on the user’s domain expertise.

EvalML includes a variety of tools for understanding models, from graphing utilities to methods for explaining predictions.

Note: Graphing methods in Jupyter Notebook and JupyterLab require ipywidgets to be installed.

Note: Graphing in JupyterLab also requires jupyterlab-plotly. To install it, make sure you have npm installed.
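Before calling the graphing methods, it can help to confirm that the Python-side dependency is importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of EvalML, and the check does not cover the jupyterlab-plotly extension.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# ipywidgets is the package named in the note above.
missing = missing_packages(["ipywidgets"])
if missing:
    print("Install before graphing:", ", ".join(missing))
else:
    print("Graphing dependencies found.")
```

The function only looks packages up; it does not import them, so it is cheap to run at the top of a notebook.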

Graphing Utilities

First, let’s train a pipeline on some data.

[1]:
import evalml
from evalml.pipelines import BinaryClassificationPipeline

# Load the demo breast cancer dataset (features X, binary target y).
X, y = evalml.demos.load_breast_cancer()

# Hold out 20% of the rows for evaluation.
X_train, X_holdout, y_train, y_holdout = evalml.preprocessing.split_data(
    X, y, problem_type='binary', test_size=0.2, random_seed=0
)

# Impute missing values, then fit a random forest classifier.
pipeline_binary = BinaryClassificationPipeline(['Simple Imputer', 'Random Forest Classifier'])
pipeline_binary.fit(X_train, y_train)

# Score the fitted pipeline on the holdout set.
print(pipeline_binary.score(X_holdout, y_holdout, objectives=['log loss binary']))
         Number of Features
Numeric                  30

Number of training examples: 569
Targets
benign       62.74%
malignant    37.26%
Name: target, dtype: object
OrderedDict([('Log Loss Binary', 0.1686746297113362)])
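The score reported above is a binary log loss, where lower is better. As a rough illustration of what that number measures, here is a minimal pure-Python sketch of the standard formula. It is not EvalML's implementation; the function name, the clipping constant `eps`, and the toy labels and probabilities are all assumptions made for the example.

```python
import math


def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels given predicted P(label == 1)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    total = 0.0
    for label, p in zip(y_true, y_prob):
        # Clip probabilities so log(0) never occurs.
        p = min(max(p, eps), 1 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(y_true)


# Toy data (hypothetical): confident, mostly correct predictions give a low loss.
labels = [1, 0, 1, 0]
probs = [0.9, 0.1, 0.8, 0.3]
print(binary_log_loss(labels, probs))
```

A model that always predicts 0.5 scores ln(2), roughly 0.693, so values well below that indicate predictions that are both confident and mostly correct.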

Feature Importance

We can get the importance associated with each feature of the fitted pipeline through its feature_importance attribute:

[2]:
pipeline_binary.feature_importance
[2]:
    feature                  importance
0   mean concave points      0.138857
1   worst perimeter          0.137780
2   worst concave points     0.117782
3   worst radius             0.100584
4   mean concavity           0.086402
5   worst area               0.072027
6   mean perimeter           0.046500
7   worst concavity          0.043408
8   mean radius              0.037664
9   mean area                0.033683
10  radius error             0.025036
11  area error               0.019324
12  worst texture            0.014754
13  worst compactness        0.014462
14  mean texture             0.013856
15  worst smoothness         0.013710
16  worst symmetry           0.011395
17  perimeter error          0.010284
18  mean compactness         0.008162
19  mean smoothness          0.008154
20  worst fractal dimension  0.007034
21  fractal dimension error  0.005502
22  compactness error        0.004953
23  smoothness error         0.004728
24  texture error            0.004384
25  symmetry error           0.004250
26  mean fractal dimension   0.004164
27  concavity error          0.004089
28  mean symmetry            0.003997
29  concave points error     0.003076
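One way to read a ranking like this is to ask how many of the top features account for a given share of the total importance. The sketch below does that in plain Python, using the values copied from the table above. The helper name `top_features_covering` and the choice of threshold are assumptions for illustration, not part of EvalML.

```python
# (feature, importance) pairs copied from the table above.
IMPORTANCES = [
    ("mean concave points", 0.138857),
    ("worst perimeter", 0.137780),
    ("worst concave points", 0.117782),
    ("worst radius", 0.100584),
    ("mean concavity", 0.086402),
    ("worst area", 0.072027),
    ("mean perimeter", 0.046500),
    ("worst concavity", 0.043408),
    ("mean radius", 0.037664),
    ("mean area", 0.033683),
    ("radius error", 0.025036),
    ("area error", 0.019324),
    ("worst texture", 0.014754),
    ("worst compactness", 0.014462),
    ("mean texture", 0.013856),
    ("worst smoothness", 0.013710),
    ("worst symmetry", 0.011395),
    ("perimeter error", 0.010284),
    ("mean compactness", 0.008162),
    ("mean smoothness", 0.008154),
    ("worst fractal dimension", 0.007034),
    ("fractal dimension error", 0.005502),
    ("compactness error", 0.004953),
    ("smoothness error", 0.004728),
    ("texture error", 0.004384),
    ("symmetry error", 0.004250),
    ("mean fractal dimension", 0.004164),
    ("concavity error", 0.004089),
    ("mean symmetry", 0.003997),
    ("concave points error", 0.003076),
]


def top_features_covering(importances, threshold):
    """Return the fewest top-ranked features whose share of total importance reaches threshold."""
    if not 0 < threshold <= 1:
        raise ValueError("threshold must be in (0, 1]")
    ranked = sorted(importances, key=lambda pair: pair[1], reverse=True)
    total = sum(value for _, value in ranked)
    selected, running = [], 0.0
    for name, value in ranked:
        selected.append(name)
        running += value
        if running / total >= threshold:
            break
    return selected


# Features needed to cover half of the total importance.
print(top_features_covering(IMPORTANCES, 0.5))
```

With these values, the five top-ranked features account for about 58% of the total importance, while the top four fall just short of half.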

We can also create a bar plot of the feature importances:

[3]:
pipeline_binary.graph_feature_importance()