Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let’s run a search of five different pipelines to explore.
[1]:
import evalml

X, y = evalml.demos.load_breast_cancer()

clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=5)
clf.fit(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1. Greater score is better.
Searching up to 5 pipelines.
Possible model types: xgboost, linear_model, random_forest
✔ XGBoost Classifier w/ One Hot Encod... 0%| | Elapsed:00:02
✔ XGBoost Classifier w/ One Hot Encod... 20%|██ | Elapsed:00:04
✔ Random Forest Classifier w/ One Hot... 40%|████ | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod... 60%|██████ | Elapsed:00:17
✔ Logistic Regression Classifier w/ O... 80%|████████ | Elapsed:00:19
✔ Logistic Regression Classifier w/ O... 100%|██████████| Elapsed:00:19
✔ Optimization finished
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score. EvalML knows, based on your objective function, whether a higher or lower score is better.
[2]:
clf.rankings
[2]:
   id               pipeline_name     score  high_variance_cv                                         parameters
0   4  LogisticRegressionPipeline  0.973411             False  {'penalty': 'l2', 'C': 8.444214828324364, 'imp...
1   1             XGBoostPipeline  0.970626             False  {'eta': 0.38438170729269994, 'min_child_weight...
2   2    RFClassificationPipeline  0.966846             False  {'n_estimators': 569, 'max_depth': 22, 'impute...
3   0             XGBoostPipeline  0.965192             False  {'eta': 0.5928446182250184, 'min_child_weight'...
4   3             XGBoostPipeline  0.952237             False  {'eta': 0.5288949197529046, 'min_child_weight'...
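Because the rankings are returned as a standard pandas DataFrame, the usual pandas operations apply to them. As a minimal sketch (assuming clf has been fit as above), you could filter the rankings down to a single model type:

[ ]:
# rankings is a pandas DataFrame, so standard filtering and sorting apply
rankings = clf.rankings

# keep only the XGBoost pipelines, still sorted by score
xgb_rankings = rankings[rankings["pipeline_name"] == "XGBoostPipeline"]
xgb_rankings[["id", "score"]]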
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id.
[3]:
clf.describe_pipeline(0)
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************
Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18
Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
* impute_strategy : most_frequent
3. RF Classifier Select From Model
* percent_features : 0.6273280598181127
* threshold : -inf
4. XGBoost Classifier
* eta : 0.5928446182250184
* max_depth : 4
* min_child_weight : 8.598391737229157
Training
========
Training for Binary Classification problems.
Total training time (including CV): 2.4 seconds
Cross Validation
----------------
F1 Precision Recall AUC Log Loss MCC # Training # Testing
0 0.950 0.935 0.950 0.985 0.154 0.864 379.000 190.000
1 0.975 0.959 0.975 0.996 0.102 0.933 379.000 190.000
2 0.970 0.991 0.970 0.983 0.137 0.923 380.000 189.000
mean 0.965 0.962 0.965 0.988 0.131 0.907 - -
std 0.013 0.028 0.013 0.007 0.026 0.037 - -
coef of var 0.014 0.029 0.014 0.007 0.202 0.041 - -
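Since the rankings are sorted by score, the id in their first row belongs to the top-ranked pipeline, and it can be passed straight to describe_pipeline. A small sketch:

[ ]:
# look up the id of the top-ranked pipeline and describe it
best_id = clf.rankings.iloc[0]["id"]
clf.describe_pipeline(best_id)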
Get Pipeline¶
You can also get the object for any pipeline.
[4]:
clf.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x7fb39df65710>
Get best pipeline¶
If you specifically want the best pipeline, there is a convenient accessor.
[5]:
clf.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x7fb398e0dac8>
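The object returned by best_pipeline (or get_pipeline) can be used directly. For example, here is a sketch that generates predictions; it assumes the returned pipeline was fit during the search and exposes a scikit-learn-style predict method:

[ ]:
# generate predictions with the best pipeline
# (assumes the pipeline was fit during the search and has a predict method)
best = clf.best_pipeline
predictions = best.predict(X)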
Feature Importances¶
We can get the feature importances of the resulting pipeline.
[6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances
[6]:
    feature  importance
0        22    0.407441
1         7    0.239457
2        27    0.120609
3        20    0.072031
4        23    0.052818
5         6    0.038344
6         1    0.033962
7        21    0.028949
8         4    0.003987
9        25    0.002403
10        0    0.000000
11        2    0.000000
12        3    0.000000
13       12    0.000000
14       13    0.000000
15       18    0.000000
16       19    0.000000
17       29    0.000000
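The feature column holds integer column indices rather than names. To get human-readable names, you can map the indices back to the columns of X; a sketch, assuming X is a pandas DataFrame (as the demo loader returns it) and that the indices refer to the original columns of X:

[ ]:
# map feature indices back to the original column names
# (assumes the indices refer to the columns of the original X)
importances = pipeline.feature_importances
importances["feature_name"] = importances["feature"].apply(lambda i: X.columns[i])
importances.head()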
Access raw results¶
You can also access all of the underlying data, like this:
[7]:
clf.results
[7]:
{0: {'id': 0,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5928446182250184,
'min_child_weight': 8.598391737229157,
'max_depth': 4,
'impute_strategy': 'most_frequent',
'percent_features': 0.6273280598181127},
'score': 0.9651923054186028,
'high_variance_cv': False,
'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
('Precision', 0.9349593495934959),
('Recall', 0.9504132231404958),
('AUC', 0.984731920937389),
('Log Loss', 0.1536501646286955),
('MCC', 0.8644170412909863),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9752066115702479),
('Precision', 0.959349593495935),
('Recall', 0.9752066115702479),
('AUC', 0.9960350337318026),
('Log Loss', 0.10194972527066344),
('MCC', 0.9327267201397125),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9699570815450643),
('Precision', 0.9912280701754386),
('Recall', 0.9699570815450643),
('AUC', 0.983313325330132),
('Log Loss', 0.13664108974533895),
('MCC', 0.9231826763268304),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.4058921337127686},
1: {'id': 1,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.38438170729269994,
'min_child_weight': 3.677811458900251,
'max_depth': 13,
'impute_strategy': 'median',
'percent_features': 0.793807787701838},
'score': 0.9706261399583499,
'high_variance_cv': False,
'scores': [0.9707112970711297, 0.9709543568464729, 0.9702127659574468],
'all_objective_scores': [OrderedDict([('F1', 0.9707112970711297),
('Precision', 0.9666666666666667),
('Recall', 0.9707112970711297),
('AUC', 0.9917149958574978),
('Log Loss', 0.11573912222979982),
('MCC', 0.9211268105467613),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9709543568464729),
('Precision', 0.9590163934426229),
('Recall', 0.9709543568464729),
('AUC', 0.9969227127470707),
('Log Loss', 0.07704140603003141),
('MCC', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9702127659574468),
('Precision', 0.9827586206896551),
('Recall', 0.9702127659574468),
('AUC', 0.9857142857142858),
('Log Loss', 0.12628072745317012),
('MCC', 0.9218075091290715),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.4553000926971436},
2: {'id': 2,
'pipeline_name': 'RFClassificationPipeline',
'parameters': {'n_estimators': 569,
'max_depth': 22,
'impute_strategy': 'most_frequent',
'percent_features': 0.8593661614465293},
'score': 0.9668456397284798,
'high_variance_cv': False,
'scores': [0.9508196721311476, 0.979253112033195, 0.970464135021097],
'all_objective_scores': [OrderedDict([('F1', 0.9508196721311476),
('Precision', 0.928),
('Recall', 0.9508196721311476),
('AUC', 0.9889336016096579),
('Log Loss', 0.1388421748025717),
('MCC', 0.8647724688764672),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.979253112033195),
('Precision', 0.9672131147540983),
('Recall', 0.979253112033195),
('AUC', 0.9898804592259438),
('Log Loss', 0.11232987225229708),
('MCC', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9906362545018007),
('Log Loss', 0.11575295379524118),
('MCC', 0.9208800271662652),
('# Training', 380),
('# Testing', 189)])],
'training_time': 10.012419939041138},
3: {'id': 3,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5288949197529046,
'min_child_weight': 6.112401049845392,
'max_depth': 6,
'impute_strategy': 'most_frequent',
'percent_features': 0.34402219881309576},
'score': 0.9522372250281359,
'high_variance_cv': False,
'scores': [0.9367088607594938, 0.9672131147540983, 0.9527896995708156],
'all_objective_scores': [OrderedDict([('F1', 0.9367088607594938),
('Precision', 0.940677966101695),
('Recall', 0.9367088607594938),
('AUC', 0.9821872410936205),
('Log Loss', 0.16857726289400538),
('MCC', 0.8318710075349047),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9672131147540983),
('Precision', 0.944),
('Recall', 0.9672131147540983),
('AUC', 0.9937270682921056),
('Log Loss', 0.10433676970853029),
('MCC', 0.9106361866954563),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9527896995708156),
('Precision', 0.9736842105263158),
('Recall', 0.9527896995708156),
('AUC', 0.9845138055222089),
('Log Loss', 0.14270813122179812),
('MCC', 0.8783921421654207),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.381497859954834},
4: {'id': 4,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 8.444214828324364,
'impute_strategy': 'most_frequent'},
'score': 0.9734109818152151,
'high_variance_cv': False,
'scores': [0.970464135021097, 0.9754098360655737, 0.9743589743589743],
'all_objective_scores': [OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9885193514025328),
('Log Loss', 0.1943294590818862),
('MCC', 0.9215733295732883),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9849686353414605),
('Log Loss', 0.1533799764180264),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9743589743589743),
('Precision', 0.991304347826087),
('Recall', 0.9743589743589743),
('AUC', 0.990516206482593),
('Log Loss', 0.1164316714613053),
('MCC', 0.9336637889421326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.423741579055786}}
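Because results is a plain dictionary keyed by pipeline id, you can pull out whatever you need programmatically. For example, a minimal sketch that collapses the raw results into a one-row-per-pipeline summary, using only the keys shown above:

[ ]:
# summarize the raw results: one row per pipeline
import pandas as pd

summary = pd.DataFrame([
    {
        "id": result["id"],
        "pipeline_name": result["pipeline_name"],
        "score": result["score"],
        "training_time": result["training_time"],
    }
    for result in clf.results.values()
])
summary.sort_values("score", ascending=False)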