Exploring search results

After finishing a pipeline search, we can inspect the results. First, let’s build a search of 5 different pipelines to explore.

[1]:
import evalml

X, y = evalml.demos.load_breast_cancer()

clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=5)

clf.fit(X, y)
*****************************
* Beginning pipeline search *
*****************************

Optimizing for F1. Greater score is better.

Searching up to 5 pipelines.
Possible model types: xgboost, linear_model, random_forest

✔ XGBoost Classifier w/ One Hot Encod...     0%|          | Elapsed:00:02
✔ XGBoost Classifier w/ One Hot Encod...    20%|██        | Elapsed:00:04
✔ Random Forest Classifier w/ One Hot...    40%|████      | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod...    60%|██████    | Elapsed:00:17
✔ Logistic Regression Classifier w/ O...    80%|████████  | Elapsed:00:19
✔ Logistic Regression Classifier w/ O...   100%|██████████| Elapsed:00:19

✔ Optimization finished

View Rankings

A summary of all the pipelines built can be returned as a dataframe, sorted by score. Based on your objective function, EvalML knows whether a higher or lower score is better.

[2]:
clf.rankings
[2]:
   id               pipeline_name     score  high_variance_cv                                         parameters
0   4  LogisticRegressionPipeline  0.973411             False  {'penalty': 'l2', 'C': 8.444214828324364, 'imp...
1   1             XGBoostPipeline  0.970626             False  {'eta': 0.38438170729269994, 'min_child_weight...
2   2    RFClassificationPipeline  0.966846             False  {'n_estimators': 569, 'max_depth': 22, 'impute...
3   0             XGBoostPipeline  0.965192             False  {'eta': 0.5928446182250184, 'min_child_weight'...
4   3             XGBoostPipeline  0.952237             False  {'eta': 0.5288949197529046, 'min_child_weight'...
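
Since the rankings are returned as a standard pandas DataFrame, ordinary DataFrame operations apply. For example, a sketch of filtering down to one model family, using the pipeline_name values shown above:

[ ]:
# rankings is a pandas DataFrame, so standard filtering applies;
# "XGBoostPipeline" is one of the pipeline_name values shown above
xgb_rankings = clf.rankings[clf.rankings["pipeline_name"] == "XGBoostPipeline"]
xgb_rankings[["id", "score"]]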

Describe Pipeline

Each pipeline is given an id. We can get more information about any particular pipeline using that id.

[3]:
clf.describe_pipeline(0)
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************

Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18

Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
         * impute_strategy : most_frequent
3. RF Classifier Select From Model
         * percent_features : 0.6273280598181127
         * threshold : -inf
4. XGBoost Classifier
         * eta : 0.5928446182250184
         * max_depth : 4
         * min_child_weight : 8.598391737229157

Training
========
Training for Binary Classification problems.
Total training time (including CV): 2.4 seconds

Cross Validation
----------------
               F1  Precision  Recall   AUC  Log Loss   MCC # Training # Testing
0           0.950      0.935   0.950 0.985     0.154 0.864    379.000   190.000
1           0.975      0.959   0.975 0.996     0.102 0.933    379.000   190.000
2           0.970      0.991   0.970 0.983     0.137 0.923    380.000   189.000
mean        0.965      0.962   0.965 0.988     0.131 0.907          -         -
std         0.013      0.028   0.013 0.007     0.026 0.037          -         -
coef of var 0.014      0.029   0.014 0.007     0.202 0.041          -         -
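
The id does not need to be hard-coded. As a sketch, we could describe whichever pipeline currently ranks first by pulling its id out of the rankings dataframe:

[ ]:
# look up the id of the top-ranked pipeline and describe it
best_id = clf.rankings.iloc[0]["id"]
clf.describe_pipeline(best_id)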

Get Pipeline

You can also get the object for any pipeline.

[4]:
clf.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x7fb39df65710>

Get Best Pipeline

If you specifically want the best pipeline, there is a convenient accessor.

[5]:
clf.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x7fb398e0dac8>
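
The best pipeline is a pipeline object like any other, so it can be used for predictions. A minimal sketch; we refit on the full dataset here as a precaution, in case the stored pipeline was only fit on cross-validation folds during the search:

[ ]:
# a minimal sketch: refit the best pipeline on all the data, then predict.
# refitting is a precaution in case it was only fit on CV folds during search
best = clf.best_pipeline
best.fit(X, y)
predictions = best.predict(X)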

Feature Importances

We can get the feature importances of the resulting pipeline.

[6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances
[6]:
    feature  importance
0        22    0.407441
1         7    0.239457
2        27    0.120609
3        20    0.072031
4        23    0.052818
5         6    0.038344
6         1    0.033962
7        21    0.028949
8         4    0.003987
9        25    0.002403
10        0    0.000000
11        2    0.000000
12        3    0.000000
13       12    0.000000
14       13    0.000000
15       18    0.000000
16       19    0.000000
17       29    0.000000
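
Note the feature column holds positional indices into the feature matrix rather than column names (the RF feature selector kept roughly 63% of the 30 breast cancer features, which is where the 18 rows come from). A sketch of mapping the indices back to names, assuming X is a pandas DataFrame whose columns line up with these indices (plausible here, since the all-numeric features pass through the one-hot encoder unchanged):

[ ]:
# map positional feature indices back to column names; assumes X is a
# DataFrame and that the indices line up with X's columns (all-numeric
# data should pass through the one-hot encoder unchanged)
importances = pipeline.feature_importances
importances["feature_name"] = [X.columns[i] for i in importances["feature"]]
importances.sort_values("importance", ascending=False)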

Access Raw Results

You can also access all of the underlying data like this:

[7]:
clf.results
[7]:
{0: {'id': 0,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.5928446182250184,
   'min_child_weight': 8.598391737229157,
   'max_depth': 4,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.6273280598181127},
  'score': 0.9651923054186028,
  'high_variance_cv': False,
  'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
  'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
                ('Precision', 0.9349593495934959),
                ('Recall', 0.9504132231404958),
                ('AUC', 0.984731920937389),
                ('Log Loss', 0.1536501646286955),
                ('MCC', 0.8644170412909863),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9752066115702479),
                ('Precision', 0.959349593495935),
                ('Recall', 0.9752066115702479),
                ('AUC', 0.9960350337318026),
                ('Log Loss', 0.10194972527066344),
                ('MCC', 0.9327267201397125),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9699570815450643),
                ('Precision', 0.9912280701754386),
                ('Recall', 0.9699570815450643),
                ('AUC', 0.983313325330132),
                ('Log Loss', 0.13664108974533895),
                ('MCC', 0.9231826763268304),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 2.4058921337127686},
 1: {'id': 1,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.38438170729269994,
   'min_child_weight': 3.677811458900251,
   'max_depth': 13,
   'impute_strategy': 'median',
   'percent_features': 0.793807787701838},
  'score': 0.9706261399583499,
  'high_variance_cv': False,
  'scores': [0.9707112970711297, 0.9709543568464729, 0.9702127659574468],
  'all_objective_scores': [OrderedDict([('F1', 0.9707112970711297),
                ('Precision', 0.9666666666666667),
                ('Recall', 0.9707112970711297),
                ('AUC', 0.9917149958574978),
                ('Log Loss', 0.11573912222979982),
                ('MCC', 0.9211268105467613),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9709543568464729),
                ('Precision', 0.9590163934426229),
                ('Recall', 0.9709543568464729),
                ('AUC', 0.9969227127470707),
                ('Log Loss', 0.07704140603003141),
                ('MCC', 0.9211492315750531),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9702127659574468),
                ('Precision', 0.9827586206896551),
                ('Recall', 0.9702127659574468),
                ('AUC', 0.9857142857142858),
                ('Log Loss', 0.12628072745317012),
                ('MCC', 0.9218075091290715),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 2.4553000926971436},
 2: {'id': 2,
  'pipeline_name': 'RFClassificationPipeline',
  'parameters': {'n_estimators': 569,
   'max_depth': 22,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.8593661614465293},
  'score': 0.9668456397284798,
  'high_variance_cv': False,
  'scores': [0.9508196721311476, 0.979253112033195, 0.970464135021097],
  'all_objective_scores': [OrderedDict([('F1', 0.9508196721311476),
                ('Precision', 0.928),
                ('Recall', 0.9508196721311476),
                ('AUC', 0.9889336016096579),
                ('Log Loss', 0.1388421748025717),
                ('MCC', 0.8647724688764672),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.979253112033195),
                ('Precision', 0.9672131147540983),
                ('Recall', 0.979253112033195),
                ('AUC', 0.9898804592259438),
                ('Log Loss', 0.11232987225229708),
                ('MCC', 0.943843520216036),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.970464135021097),
                ('Precision', 0.9745762711864406),
                ('Recall', 0.970464135021097),
                ('AUC', 0.9906362545018007),
                ('Log Loss', 0.11575295379524118),
                ('MCC', 0.9208800271662652),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 10.012419939041138},
 3: {'id': 3,
  'pipeline_name': 'XGBoostPipeline',
  'parameters': {'eta': 0.5288949197529046,
   'min_child_weight': 6.112401049845392,
   'max_depth': 6,
   'impute_strategy': 'most_frequent',
   'percent_features': 0.34402219881309576},
  'score': 0.9522372250281359,
  'high_variance_cv': False,
  'scores': [0.9367088607594938, 0.9672131147540983, 0.9527896995708156],
  'all_objective_scores': [OrderedDict([('F1', 0.9367088607594938),
                ('Precision', 0.940677966101695),
                ('Recall', 0.9367088607594938),
                ('AUC', 0.9821872410936205),
                ('Log Loss', 0.16857726289400538),
                ('MCC', 0.8318710075349047),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9672131147540983),
                ('Precision', 0.944),
                ('Recall', 0.9672131147540983),
                ('AUC', 0.9937270682921056),
                ('Log Loss', 0.10433676970853029),
                ('MCC', 0.9106361866954563),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9527896995708156),
                ('Precision', 0.9736842105263158),
                ('Recall', 0.9527896995708156),
                ('AUC', 0.9845138055222089),
                ('Log Loss', 0.14270813122179812),
                ('MCC', 0.8783921421654207),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 2.381497859954834},
 4: {'id': 4,
  'pipeline_name': 'LogisticRegressionPipeline',
  'parameters': {'penalty': 'l2',
   'C': 8.444214828324364,
   'impute_strategy': 'most_frequent'},
  'score': 0.9734109818152151,
  'high_variance_cv': False,
  'scores': [0.970464135021097, 0.9754098360655737, 0.9743589743589743],
  'all_objective_scores': [OrderedDict([('F1', 0.970464135021097),
                ('Precision', 0.9745762711864406),
                ('Recall', 0.970464135021097),
                ('AUC', 0.9885193514025328),
                ('Log Loss', 0.1943294590818862),
                ('MCC', 0.9215733295732883),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9754098360655737),
                ('Precision', 0.952),
                ('Recall', 0.9754098360655737),
                ('AUC', 0.9849686353414605),
                ('Log Loss', 0.1533799764180264),
                ('MCC', 0.933568045604951),
                ('# Training', 379),
                ('# Testing', 190)]),
   OrderedDict([('F1', 0.9743589743589743),
                ('Precision', 0.991304347826087),
                ('Recall', 0.9743589743589743),
                ('AUC', 0.990516206482593),
                ('Log Loss', 0.1164316714613053),
                ('MCC', 0.9336637889421326),
                ('# Training', 380),
                ('# Testing', 189)])],
  'training_time': 2.423741579055786}}
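
Because results is a plain dictionary keyed by pipeline id, it is easy to reshape. As a sketch, here is one way to flatten the per-fold F1 scores into a dataframe:

[ ]:
import pandas as pd

# flatten the per-fold scores out of the results dictionary
rows = [
    {"id": res["id"], "pipeline_name": res["pipeline_name"], "fold": fold, "F1": f1}
    for res in clf.results.values()
    for fold, f1 in enumerate(res["scores"])
]
pd.DataFrame(rows)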