Model Understanding

Simply examining a model’s performance metrics is not enough to select a model and promote it for use in a production setting. While developing an ML algorithm, it is important to understand how the model behaves on the data, to examine the key factors influencing its predictions, and to consider where it may be deficient. Determining what “success” means for an ML project depends first and foremost on the user’s domain expertise.

EvalML includes a variety of tools for understanding models, from graphing utilities to methods for explaining predictions.

** Graphing methods in Jupyter Notebook and Jupyter Lab require ipywidgets to be installed.

** If graphing in Jupyter Lab, the jupyterlab-plotly extension is also required. To install it, make sure you have npm installed.
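
A minimal setup sketch, assuming pip and npm are already available; the labextension command applies to JupyterLab versions that install extensions via npm, so newer versions may differ:

[ ]:
# Install the widget support needed by the graphing methods.
!pip install ipywidgets

# For Jupyter Lab only: install the plotly extension (requires npm).
!jupyter labextension install jupyterlab-plotly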

Graphing Utilities

First, let’s train a pipeline on some data.

[1]:
import evalml

class DTBinaryClassificationPipeline(evalml.pipelines.BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Decision Tree Classifier']

X, y = evalml.demos.load_breast_cancer()

pipeline_dt = DTBinaryClassificationPipeline({})
pipeline_dt.fit(X, y)
[1]:
DTBinaryClassificationPipeline(parameters={'Simple Imputer':{'impute_strategy': 'most_frequent', 'fill_value': None}, 'Decision Tree Classifier':{'criterion': 'gini', 'max_features': 'auto', 'max_depth': 6, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0},})

Tree Visualization

We can visualize the structure of the Decision Tree that was fit to that data, and save it if necessary.

[2]:
from evalml.model_understanding.graphs import visualize_decision_tree

visualize_decision_tree(pipeline_dt.estimator, max_depth=2, rotate=False, filled=True, filepath=None)
[2]:
[Output: decision tree visualization (SVG image)]
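
visualize_decision_tree can also write the rendered tree to disk via its filepath argument. A brief sketch, assuming write access to the working directory; the filename here is purely illustrative:

[ ]:
# Passing a filepath saves the figure instead of only displaying it;
# 'decision_tree.png' is an illustrative name.
visualize_decision_tree(pipeline_dt.estimator, max_depth=2, rotate=False,
                        filled=True, filepath='decision_tree.png')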

Let’s replace the Decision Tree Classifier with a Random Forest Classifier.

[3]:
class RFBinaryClassificationPipeline(evalml.pipelines.BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Random Forest Classifier']

pipeline = RFBinaryClassificationPipeline({})
pipeline.fit(X, y)
print(pipeline.score(X, y, objectives=['log loss binary']))
OrderedDict([('Log Loss Binary', 0.038403828027876195)])
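
Because score accepts a list of objectives, several metrics can be computed in one call. A quick sketch, assuming 'f1' and 'auc' resolve to EvalML's standard binary-classification objectives:

[ ]:
# score returns an OrderedDict keyed by objective name,
# one entry per requested objective.
pipeline.score(X, y, objectives=['log loss binary', 'f1', 'auc'])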

Feature Importance

We can get the importance associated with each feature of the resulting pipeline.

[4]:
pipeline.feature_importance
[4]:
   feature                   importance
0  worst perimeter             0.176488
1  worst concave points        0.125260
2  worst radius                0.124161
3  mean concave points         0.086443
4  worst area                  0.072465
5  mean concavity              0.072320
6  mean perimeter              0.056685
7  mean area                   0.049599
8  area error                  0.037229
9  worst concavity             0.028181
10 mean radius                 0.023294
11 radius error                0.019457
12 worst texture               0.014990
13 perimeter error             0.014103
14 mean texture                0.013618
15 worst compactness           0.011310
16 worst smoothness            0.011139
17 worst fractal dimension     0.008118
18 worst symmetry              0.007818
19 mean smoothness             0.006152
20 concave points error        0.005887
21 fractal dimension error     0.005059
22 concavity error             0.004510
23 smoothness error            0.004493
24 texture error               0.004476
25 mean compactness            0.004050
26 compactness error           0.003559
27 mean symmetry               0.003243
28 symmetry error              0.003124
29 mean fractal dimension      0.002768
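
Because feature_importance comes back as a pandas DataFrame with feature and importance columns, we can slice it like any other DataFrame. A short sketch, with a 0.05 cutoff chosen purely for illustration:

[ ]:
# Keep only the features above an (arbitrary) importance cutoff.
fi = pipeline.feature_importance
fi[fi['importance'] > 0.05]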

We can also create a bar plot of the feature importances.

[5]:
pipeline.graph_feature_importance()
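
If the long tail of near-zero importances clutters the plot, the graph can be restricted to the larger values. A sketch, assuming graph_feature_importance accepts an importance_threshold parameter that filters out features below the given value:

[ ]:
# Only plot features whose importance exceeds the threshold
# (importance_threshold is assumed to default to 0).
pipeline.graph_feature_importance(importance_threshold=0.01)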