Exploring search results¶
After a pipeline search finishes, we can inspect the results. First, let's run a search over 10 different pipelines to explore.
[1]:
import evalml
X, y = evalml.demos.load_breast_cancer()
clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=10)
clf.fit(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1. Greater score is better.
Searching up to 10 pipelines.
Possible model types: xgboost, random_forest, linear_model
✔ XGBoost Classifier w/ One Hot Encod... 0%| | Elapsed:00:00
✔ XGBoost Classifier w/ One Hot Encod... 10%|█ | Elapsed:00:00
✔ Random Forest Classifier w/ One Hot... 20%|██ | Elapsed:00:06
✔ XGBoost Classifier w/ One Hot Encod... 30%|███ | Elapsed:00:06
✔ Logistic Regression Classifier w/ O... 40%|████ | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod... 50%|█████ | Elapsed:00:14
✔ Logistic Regression Classifier w/ O... 60%|██████ | Elapsed:00:21
✔ XGBoost Classifier w/ One Hot Encod... 70%|███████ | Elapsed:00:22
✔ Logistic Regression Classifier w/ O... 80%|████████ | Elapsed:00:29
✔ Logistic Regression Classifier w/ O... 90%|█████████ | Elapsed:00:37
✔ Logistic Regression Classifier w/ O... 100%|██████████| Elapsed:00:37
✔ Optimization finished
View Rankings¶
A summary of all the pipelines built can be returned as a dataframe, sorted by score. Based on your objective function, EvalML knows whether a higher or lower score is better.
[2]:
clf.rankings
[2]:
| id | pipeline_name | score | high_variance_cv | parameters |
---|---|---|---|---|---|
0 | 8 | LogisticRegressionPipeline | 0.980527 | False | {'penalty': 'l2', 'C': 0.5765626434012575, 'im... |
1 | 6 | LogisticRegressionPipeline | 0.974853 | False | {'penalty': 'l2', 'C': 6.239401330891865, 'imp... |
2 | 9 | LogisticRegressionPipeline | 0.974853 | False | {'penalty': 'l2', 'C': 8.123565600467177, 'imp... |
3 | 4 | LogisticRegressionPipeline | 0.973411 | False | {'penalty': 'l2', 'C': 8.444214828324364, 'imp... |
4 | 1 | XGBoostPipeline | 0.970626 | False | {'eta': 0.38438170729269994, 'min_child_weight... |
5 | 2 | RFClassificationPipeline | 0.966846 | False | {'n_estimators': 569, 'max_depth': 22, 'impute... |
6 | 5 | XGBoostPipeline | 0.966592 | False | {'eta': 0.6481718720511973, 'min_child_weight'... |
7 | 0 | XGBoostPipeline | 0.965192 | False | {'eta': 0.5928446182250184, 'min_child_weight'... |
8 | 7 | XGBoostPipeline | 0.963913 | False | {'eta': 0.9786183422327642, 'min_child_weight'... |
9 | 3 | XGBoostPipeline | 0.952237 | False | {'eta': 0.5288949197529046, 'min_child_weight'... |
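Since the rankings are returned as a pandas dataframe, standard pandas filtering and sorting apply. As a minimal sketch, the snippet below uses a small stand-in dataframe mirroring a few rows of the table above (not the live search output) to pull out the pipelines of one model type:

```python
import pandas as pd

# Stand-in mirroring a few rows of clf.rankings above
rankings = pd.DataFrame({
    "id": [8, 6, 1, 2],
    "pipeline_name": ["LogisticRegressionPipeline", "LogisticRegressionPipeline",
                      "XGBoostPipeline", "RFClassificationPipeline"],
    "score": [0.980527, 0.974853, 0.970626, 0.966846],
    "high_variance_cv": [False, False, False, False],
})

# Keep only the logistic regression pipelines
logreg = rankings[rankings["pipeline_name"] == "LogisticRegressionPipeline"]
print(logreg[["id", "score"]])
```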
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id.
[3]:
clf.describe_pipeline(0)
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************
Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18
Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
* impute_strategy : most_frequent
3. RF Classifier Select From Model
* percent_features : 0.6273280598181127
* threshold : -inf
4. XGBoost Classifier
* eta : 0.5928446182250184
* max_depth : 4
* min_child_weight : 8.598391737229157
Training
========
Training for Binary Classification problems.
Total training time (including CV): 0.2 seconds
Cross Validation
----------------
F1 Precision Recall AUC Log Loss MCC # Training # Testing
0 0.950 0.935 0.950 0.985 0.154 0.864 379.000 190.000
1 0.975 0.959 0.975 0.996 0.102 0.933 379.000 190.000
2 0.970 0.991 0.970 0.983 0.137 0.923 380.000 189.000
mean 0.965 0.962 0.965 0.988 0.131 0.907 - -
std 0.013 0.028 0.013 0.007 0.026 0.037 - -
coef of var 0.014 0.029 0.014 0.007 0.202 0.041 - -
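The summary rows in the cross-validation table are straightforward to verify by hand: "coef of var" is the sample standard deviation divided by the mean of the fold scores. A quick check for the F1 column, using the three fold scores reported above:

```python
from statistics import mean, stdev

# The three F1 cross-validation scores reported above
f1_scores = [0.9504132231404958, 0.9752066115702479, 0.9699570815450643]

m = mean(f1_scores)
s = stdev(f1_scores)      # sample standard deviation (n - 1 denominator)
coef_of_var = s / m

print(round(m, 3), round(s, 3), round(coef_of_var, 3))  # → 0.965 0.013 0.014
```

These match the mean, std, and coef of var rows of the F1 column in the table.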
Get Pipeline¶
You can also retrieve the pipeline object itself for any pipeline:
[4]:
clf.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x135081990>
Get best pipeline¶
If you specifically want the best pipeline, there is a convenient accessor for it.
[5]:
clf.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x1372054d0>
Feature Importances¶
We can get the feature importances of the resulting pipeline:
[6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances
[6]:
| feature | importance |
---|---|---|
0 | 22 | 0.407441 |
1 | 7 | 0.239457 |
2 | 27 | 0.120609 |
3 | 20 | 0.072031 |
4 | 23 | 0.052818 |
5 | 6 | 0.038344 |
6 | 1 | 0.033962 |
7 | 21 | 0.028949 |
8 | 4 | 0.003987 |
9 | 25 | 0.002403 |
10 | 0 | 0.000000 |
11 | 2 | 0.000000 |
12 | 3 | 0.000000 |
13 | 12 | 0.000000 |
14 | 13 | 0.000000 |
15 | 18 | 0.000000 |
16 | 19 | 0.000000 |
17 | 29 | 0.000000 |
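The feature importances also come back as a pandas dataframe, so it is easy to filter out the features the model never used. The sketch below uses a small stand-in dataframe mirroring a few rows of the table above (not the live pipeline output):

```python
import pandas as pd

# Stand-in mirroring a few rows of pipeline.feature_importances above
importances = pd.DataFrame({
    "feature": [22, 7, 27, 20, 0, 2],
    "importance": [0.407441, 0.239457, 0.120609, 0.072031, 0.0, 0.0],
})

# Drop features with zero importance
nonzero = importances[importances["importance"] > 0]
print(nonzero)
```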
Access raw results¶
You can also access all of the underlying data like this:
[7]:
clf.results
[7]:
{0: {'id': 0,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5928446182250184,
'min_child_weight': 8.598391737229157,
'max_depth': 4,
'impute_strategy': 'most_frequent',
'percent_features': 0.6273280598181127},
'score': 0.9651923054186028,
'high_variance_cv': False,
'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
('Precision', 0.9349593495934959),
('Recall', 0.9504132231404958),
('AUC', 0.984731920937389),
('Log Loss', 0.1536501646237938),
('MCC', 0.8644170412909863),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9752066115702479),
('Precision', 0.959349593495935),
('Recall', 0.9752066115702479),
('AUC', 0.9960350337318026),
('Log Loss', 0.10194972519713798),
('MCC', 0.9327267201397125),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9699570815450643),
('Precision', 0.9912280701754386),
('Recall', 0.9699570815450643),
('AUC', 0.983313325330132),
('Log Loss', 0.13664108953345075),
('MCC', 0.9231826763268304),
('# Training', 380),
('# Testing', 189)])],
'training_time': 0.248244047164917},
1: {'id': 1,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.38438170729269994,
'min_child_weight': 3.677811458900251,
'max_depth': 13,
'impute_strategy': 'median',
'percent_features': 0.793807787701838},
'score': 0.9706261399583499,
'high_variance_cv': False,
'scores': [0.9707112970711297, 0.9709543568464729, 0.9702127659574468],
'all_objective_scores': [OrderedDict([('F1', 0.9707112970711297),
('Precision', 0.9666666666666667),
('Recall', 0.9707112970711297),
('AUC', 0.9917149958574978),
('Log Loss', 0.11573912222489813),
('MCC', 0.9211268105467613),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9709543568464729),
('Precision', 0.9590163934426229),
('Recall', 0.9709543568464729),
('AUC', 0.9969227127470707),
('Log Loss', 0.07704140599817037),
('MCC', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9702127659574468),
('Precision', 0.9827586206896551),
('Recall', 0.9702127659574468),
('AUC', 0.9857142857142858),
('Log Loss', 0.12628072744331484),
('MCC', 0.9218075091290715),
('# Training', 380),
('# Testing', 189)])],
'training_time': 0.29195380210876465},
2: {'id': 2,
'pipeline_name': 'RFClassificationPipeline',
'parameters': {'n_estimators': 569,
'max_depth': 22,
'impute_strategy': 'most_frequent',
'percent_features': 0.8593661614465293},
'score': 0.9668456397284798,
'high_variance_cv': False,
'scores': [0.9508196721311476, 0.979253112033195, 0.970464135021097],
'all_objective_scores': [OrderedDict([('F1', 0.9508196721311476),
('Precision', 0.928),
('Recall', 0.9508196721311476),
('AUC', 0.9889336016096579),
('Log Loss', 0.1388421748025717),
('MCC', 0.8647724688764672),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.979253112033195),
('Precision', 0.9672131147540983),
('Recall', 0.979253112033195),
('AUC', 0.9898804592259438),
('Log Loss', 0.11232987225229708),
('MCC', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9906362545018007),
('Log Loss', 0.11575295379524118),
('MCC', 0.9208800271662652),
('# Training', 380),
('# Testing', 189)])],
'training_time': 6.06977105140686},
3: {'id': 3,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5288949197529046,
'min_child_weight': 6.112401049845392,
'max_depth': 6,
'impute_strategy': 'most_frequent',
'percent_features': 0.34402219881309576},
'score': 0.9522372250281359,
'high_variance_cv': False,
'scores': [0.9367088607594938, 0.9672131147540983, 0.9527896995708156],
'all_objective_scores': [OrderedDict([('F1', 0.9367088607594938),
('Precision', 0.940677966101695),
('Recall', 0.9367088607594938),
('AUC', 0.9821872410936205),
('Log Loss', 0.16857726289155453),
('MCC', 0.8318710075349047),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9672131147540983),
('Precision', 0.944),
('Recall', 0.9672131147540983),
('AUC', 0.9937270682921056),
('Log Loss', 0.10433676971098114),
('MCC', 0.9106361866954563),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9527896995708156),
('Precision', 0.9736842105263158),
('Recall', 0.9527896995708156),
('AUC', 0.9845138055222089),
('Log Loss', 0.14270813120701523),
('MCC', 0.8783921421654207),
('# Training', 380),
('# Testing', 189)])],
'training_time': 0.20792675018310547},
4: {'id': 4,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 8.444214828324364,
'impute_strategy': 'most_frequent'},
'score': 0.9734109818152151,
'high_variance_cv': False,
'scores': [0.970464135021097, 0.9754098360655737, 0.9743589743589743],
'all_objective_scores': [OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9885193514025328),
('Log Loss', 0.1943294590819038),
('MCC', 0.9215733295732883),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9849686353414605),
('Log Loss', 0.1533799764176819),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9743589743589743),
('Precision', 0.991304347826087),
('Recall', 0.9743589743589743),
('AUC', 0.990516206482593),
('Log Loss', 0.1164316714613053),
('MCC', 0.9336637889421326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 7.461816072463989},
5: {'id': 5,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.6481718720511973,
'min_child_weight': 4.314173858564932,
'max_depth': 6,
'impute_strategy': 'most_frequent',
'percent_features': 0.871312026764351},
'score': 0.966592074666908,
'high_variance_cv': False,
'scores': [0.9543568464730291, 0.9752066115702479, 0.9702127659574468],
'all_objective_scores': [OrderedDict([('F1', 0.9543568464730291),
('Precision', 0.9426229508196722),
('Recall', 0.9543568464730291),
('AUC', 0.9899396378269618),
('Log Loss', 0.12702225128151967),
('MCC', 0.8757606542930872),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9752066115702479),
('Precision', 0.959349593495935),
('Recall', 0.9752066115702479),
('AUC', 0.9965676411409634),
('Log Loss', 0.0801103590350402),
('MCC', 0.9327267201397125),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9702127659574468),
('Precision', 0.9827586206896551),
('Recall', 0.9702127659574468),
('AUC', 0.9858343337334934),
('Log Loss', 0.1270006743029361),
('MCC', 0.9218075091290715),
('# Training', 380),
('# Testing', 189)])],
'training_time': 0.33750486373901367},
6: {'id': 6,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 6.239401330891865,
'impute_strategy': 'median'},
'score': 0.9748529087969783,
'high_variance_cv': False,
'scores': [0.9747899159663865, 0.9754098360655737, 0.9743589743589743],
'all_objective_scores': [OrderedDict([('F1', 0.9747899159663865),
('Precision', 0.9747899159663865),
('Recall', 0.9747899159663865),
('AUC', 0.9889927802106758),
('Log Loss', 0.17491241567239438),
('MCC', 0.932536394839626),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9870990649781038),
('Log Loss', 0.13982009938625542),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9743589743589743),
('Precision', 0.991304347826087),
('Recall', 0.9743589743589743),
('AUC', 0.990516206482593),
('Log Loss', 0.1109645583402926),
('MCC', 0.9336637889421326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 7.343135118484497},
7: {'id': 7,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.9786183422327642,
'min_child_weight': 8.192427077950514,
'max_depth': 20,
'impute_strategy': 'median',
'percent_features': 0.6820907348177707},
'score': 0.9639126305792973,
'high_variance_cv': False,
'scores': [0.9547325102880658, 0.9711934156378601, 0.9658119658119659],
'all_objective_scores': [OrderedDict([('F1', 0.9547325102880658),
('Precision', 0.9354838709677419),
('Recall', 0.9547325102880658),
('AUC', 0.9853237069475678),
('Log Loss', 0.15021697619047605),
('MCC', 0.8759603969361893),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9711934156378601),
('Precision', 0.9516129032258065),
('Recall', 0.9711934156378601),
('AUC', 0.9950289975144987),
('Log Loss', 0.10607622409680564),
('MCC', 0.9216584956231404),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9658119658119659),
('Precision', 0.9826086956521739),
('Recall', 0.9658119658119659),
('AUC', 0.9834333733493397),
('Log Loss', 0.13131227825704234),
('MCC', 0.9112159507396058),
('# Training', 380),
('# Testing', 189)])],
'training_time': 0.26775383949279785},
8: {'id': 8,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 0.5765626434012575,
'impute_strategy': 'mean'},
'score': 0.9805269796885542,
'high_variance_cv': False,
'scores': [0.9874476987447698, 0.9754098360655737, 0.9787234042553192],
'all_objective_scores': [OrderedDict([('F1', 0.9874476987447698),
('Precision', 0.9833333333333333),
('Recall', 0.9874476987447698),
('AUC', 0.994910640312463),
('Log Loss', 0.08726565374201126),
('MCC', 0.9662335358054943),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9979879275653923),
('Log Loss', 0.0764559127800754),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9787234042553192),
('Precision', 0.9913793103448276),
('Recall', 0.9787234042553192),
('AUC', 0.9903961584633854),
('Log Loss', 0.09774553003325108),
('MCC', 0.9443109474170326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 7.57702112197876},
9: {'id': 9,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 8.123565600467177,
'impute_strategy': 'median'},
'score': 0.9748529087969783,
'high_variance_cv': False,
'scores': [0.9747899159663865, 0.9754098360655737, 0.9743589743589743],
'all_objective_scores': [OrderedDict([('F1', 0.9747899159663865),
('Precision', 0.9747899159663865),
('Recall', 0.9747899159663865),
('AUC', 0.9886377086045686),
('Log Loss', 0.19170510282820305),
('MCC', 0.932536394839626),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9850869925434962),
('Log Loss', 0.15159254810085362),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9743589743589743),
('Precision', 0.991304347826087),
('Recall', 0.9743589743589743),
('AUC', 0.990516206482593),
('Log Loss', 0.11566930634571038),
('MCC', 0.9336637889421326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 7.280526161193848}}
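Because the raw results are a plain dictionary keyed by pipeline id, you can reshape them however you like. As a minimal sketch, the snippet below flattens a trimmed stand-in dictionary (mirroring the structure printed above, not the live search output) into a best-first summary dataframe:

```python
import pandas as pd

# Trimmed stand-in mirroring the structure of clf.results above
results = {
    0: {"id": 0, "pipeline_name": "XGBoostPipeline",
        "score": 0.9651923054186028, "training_time": 0.248244047164917},
    8: {"id": 8, "pipeline_name": "LogisticRegressionPipeline",
        "score": 0.9805269796885542, "training_time": 7.57702112197876},
}

# One row per pipeline, sorted best-first (for F1, greater is better)
summary = (pd.DataFrame(list(results.values()))
             .sort_values("score", ascending=False)
             .reset_index(drop=True))
print(summary)
```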