Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let’s run a search of five different pipelines to explore.
[1]:
import evalml

X, y = evalml.demos.load_breast_cancer()

clf = evalml.AutoClassifier(objective="f1",
                            max_pipelines=5)
clf.fit(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1. Greater score is better.
Searching up to 5 pipelines.
Possible model types: xgboost, linear_model, random_forest
✔ XGBoost Classifier w/ One Hot Encod... 0%| | Elapsed:00:02
✔ XGBoost Classifier w/ One Hot Encod... 20%|██ | Elapsed:00:04
✔ Random Forest Classifier w/ One Hot... 40%|████ | Elapsed:00:14
✔ XGBoost Classifier w/ One Hot Encod... 60%|██████ | Elapsed:00:17
✔ Logistic Regression Classifier w/ O... 80%|████████ | Elapsed:00:19
✔ Logistic Regression Classifier w/ O... 100%|██████████| Elapsed:00:19
✔ Optimization finished
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score. EvalML knows, based on your objective function, whether a higher or lower score is better.
[2]:
clf.rankings
[2]:
   id               pipeline_name     score  high_variance_cv                                         parameters
0   4  LogisticRegressionPipeline  0.973411             False  {'penalty': 'l2', 'C': 8.444214828324364, 'imp...
1   1             XGBoostPipeline  0.970626             False  {'eta': 0.38438170729269994, 'min_child_weight...
2   2    RFClassificationPipeline  0.966846             False  {'n_estimators': 569, 'max_depth': 22, 'impute...
3   0             XGBoostPipeline  0.965192             False  {'eta': 0.5928446182250184, 'min_child_weight'...
4   3             XGBoostPipeline  0.952237             False  {'eta': 0.5288949197529046, 'min_child_weight'...
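Because the rankings are returned as a standard pandas DataFrame, the usual pandas operations apply to them. As a minimal sketch (assuming clf has been fit as above), you could filter the rankings down to a single model type:

[ ]:
# rankings is a pandas DataFrame, so standard filtering and sorting apply
rankings = clf.rankings

# keep only the XGBoost pipelines, still sorted by score
xgb_rankings = rankings[rankings["pipeline_name"] == "XGBoostPipeline"]
xgb_rankings[["id", "score"]]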
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id.
[3]:
clf.describe_pipeline(0)
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************
Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: F1 (greater is better)
Number of features: 18
Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
* impute_strategy : most_frequent
3. RF Classifier Select From Model
* percent_features : 0.6273280598181127
* threshold : -inf
4. XGBoost Classifier
* eta : 0.5928446182250184
* max_depth : 4
* min_child_weight : 8.598391737229157
Training
========
Training for Binary Classification problems.
Total training time (including CV): 2.4 seconds
Cross Validation
----------------
F1 Precision Recall AUC Log Loss MCC # Training # Testing
0 0.950 0.935 0.950 0.985 0.154 0.864 379.000 190.000
1 0.975 0.959 0.975 0.996 0.102 0.933 379.000 190.000
2 0.970 0.991 0.970 0.983 0.137 0.923 380.000 189.000
mean 0.965 0.962 0.965 0.988 0.131 0.907 - -
std 0.013 0.028 0.013 0.007 0.026 0.037 - -
coef of var 0.014 0.029 0.014 0.007 0.202 0.041 - -
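Since the rankings are sorted by score, the id in their first row belongs to the top-ranked pipeline, and it can be passed straight to describe_pipeline. A small sketch:

[ ]:
# look up the id of the top-ranked pipeline and describe it
best_id = clf.rankings.iloc[0]["id"]
clf.describe_pipeline(best_id)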
Get Pipeline¶
You can also get the object for any pipeline.
[4]:
clf.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost.XGBoostPipeline at 0x7fb39df65710>
Get best pipeline¶
If you specifically want the best pipeline, there is a convenient accessor.
[5]:
clf.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression.LogisticRegressionPipeline at 0x7fb398e0dac8>
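The object returned by best_pipeline (or get_pipeline) can be used directly. For example, here is a sketch that generates predictions; it assumes the returned pipeline was fit during the search and exposes a scikit-learn-style predict method:

[ ]:
# generate predictions with the best pipeline
# (assumes the pipeline was fit during the search and has a predict method)
best = clf.best_pipeline
predictions = best.predict(X)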
Feature Importances¶
We can get the feature importances of the resulting pipeline.
[6]:
pipeline = clf.get_pipeline(0)
pipeline.feature_importances
[6]:
    feature  importance
0        22    0.407441
1         7    0.239457
2        27    0.120609
3        20    0.072031
4        23    0.052818
5         6    0.038344
6         1    0.033962
7        21    0.028949
8         4    0.003987
9        25    0.002403
10        0    0.000000
11        2    0.000000
12        3    0.000000
13       12    0.000000
14       13    0.000000
15       18    0.000000
16       19    0.000000
17       29    0.000000
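The feature column holds integer column indices rather than names. To get human-readable names, you can map the indices back to the columns of X; a sketch, assuming X is a pandas DataFrame (as the demo loader returns it) and that the indices refer to the original columns of X:

[ ]:
# map feature indices back to the original column names
# (assumes the indices refer to the columns of the original X)
importances = pipeline.feature_importances
importances["feature_name"] = importances["feature"].apply(lambda i: X.columns[i])
importances.head()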
Access raw results¶
You can also access all of the underlying data, like this:
[7]:
clf.results
[7]:
{0: {'id': 0,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5928446182250184,
'min_child_weight': 8.598391737229157,
'max_depth': 4,
'impute_strategy': 'most_frequent',
'percent_features': 0.6273280598181127},
'score': 0.9651923054186028,
'high_variance_cv': False,
'scores': [0.9504132231404958, 0.9752066115702479, 0.9699570815450643],
'all_objective_scores': [OrderedDict([('F1', 0.9504132231404958),
('Precision', 0.9349593495934959),
('Recall', 0.9504132231404958),
('AUC', 0.984731920937389),
('Log Loss', 0.1536501646286955),
('MCC', 0.8644170412909863),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9752066115702479),
('Precision', 0.959349593495935),
('Recall', 0.9752066115702479),
('AUC', 0.9960350337318026),
('Log Loss', 0.10194972527066344),
('MCC', 0.9327267201397125),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9699570815450643),
('Precision', 0.9912280701754386),
('Recall', 0.9699570815450643),
('AUC', 0.983313325330132),
('Log Loss', 0.13664108974533895),
('MCC', 0.9231826763268304),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.4058921337127686},
1: {'id': 1,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.38438170729269994,
'min_child_weight': 3.677811458900251,
'max_depth': 13,
'impute_strategy': 'median',
'percent_features': 0.793807787701838},
'score': 0.9706261399583499,
'high_variance_cv': False,
'scores': [0.9707112970711297, 0.9709543568464729, 0.9702127659574468],
'all_objective_scores': [OrderedDict([('F1', 0.9707112970711297),
('Precision', 0.9666666666666667),
('Recall', 0.9707112970711297),
('AUC', 0.9917149958574978),
('Log Loss', 0.11573912222979982),
('MCC', 0.9211268105467613),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9709543568464729),
('Precision', 0.9590163934426229),
('Recall', 0.9709543568464729),
('AUC', 0.9969227127470707),
('Log Loss', 0.07704140603003141),
('MCC', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9702127659574468),
('Precision', 0.9827586206896551),
('Recall', 0.9702127659574468),
('AUC', 0.9857142857142858),
('Log Loss', 0.12628072745317012),
('MCC', 0.9218075091290715),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.4553000926971436},
2: {'id': 2,
'pipeline_name': 'RFClassificationPipeline',
'parameters': {'n_estimators': 569,
'max_depth': 22,
'impute_strategy': 'most_frequent',
'percent_features': 0.8593661614465293},
'score': 0.9668456397284798,
'high_variance_cv': False,
'scores': [0.9508196721311476, 0.979253112033195, 0.970464135021097],
'all_objective_scores': [OrderedDict([('F1', 0.9508196721311476),
('Precision', 0.928),
('Recall', 0.9508196721311476),
('AUC', 0.9889336016096579),
('Log Loss', 0.1388421748025717),
('MCC', 0.8647724688764672),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.979253112033195),
('Precision', 0.9672131147540983),
('Recall', 0.979253112033195),
('AUC', 0.9898804592259438),
('Log Loss', 0.11232987225229708),
('MCC', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9906362545018007),
('Log Loss', 0.11575295379524118),
('MCC', 0.9208800271662652),
('# Training', 380),
('# Testing', 189)])],
'training_time': 10.012419939041138},
3: {'id': 3,
'pipeline_name': 'XGBoostPipeline',
'parameters': {'eta': 0.5288949197529046,
'min_child_weight': 6.112401049845392,
'max_depth': 6,
'impute_strategy': 'most_frequent',
'percent_features': 0.34402219881309576},
'score': 0.9522372250281359,
'high_variance_cv': False,
'scores': [0.9367088607594938, 0.9672131147540983, 0.9527896995708156],
'all_objective_scores': [OrderedDict([('F1', 0.9367088607594938),
('Precision', 0.940677966101695),
('Recall', 0.9367088607594938),
('AUC', 0.9821872410936205),
('Log Loss', 0.16857726289400538),
('MCC', 0.8318710075349047),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9672131147540983),
('Precision', 0.944),
('Recall', 0.9672131147540983),
('AUC', 0.9937270682921056),
('Log Loss', 0.10433676970853029),
('MCC', 0.9106361866954563),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9527896995708156),
('Precision', 0.9736842105263158),
('Recall', 0.9527896995708156),
('AUC', 0.9845138055222089),
('Log Loss', 0.14270813122179812),
('MCC', 0.8783921421654207),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.381497859954834},
4: {'id': 4,
'pipeline_name': 'LogisticRegressionPipeline',
'parameters': {'penalty': 'l2',
'C': 8.444214828324364,
'impute_strategy': 'most_frequent'},
'score': 0.9734109818152151,
'high_variance_cv': False,
'scores': [0.970464135021097, 0.9754098360655737, 0.9743589743589743],
'all_objective_scores': [OrderedDict([('F1', 0.970464135021097),
('Precision', 0.9745762711864406),
('Recall', 0.970464135021097),
('AUC', 0.9885193514025328),
('Log Loss', 0.1943294590818862),
('MCC', 0.9215733295732883),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9754098360655737),
('Precision', 0.952),
('Recall', 0.9754098360655737),
('AUC', 0.9849686353414605),
('Log Loss', 0.1533799764180264),
('MCC', 0.933568045604951),
('# Training', 379),
('# Testing', 190)]),
OrderedDict([('F1', 0.9743589743589743),
('Precision', 0.991304347826087),
('Recall', 0.9743589743589743),
('AUC', 0.990516206482593),
('Log Loss', 0.1164316714613053),
('MCC', 0.9336637889421326),
('# Training', 380),
('# Testing', 189)])],
'training_time': 2.423741579055786}}
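Because results is a plain dictionary keyed by pipeline id, you can pull out whatever you need programmatically. For example, a minimal sketch that collapses the raw results into a one-row-per-pipeline summary, using only the keys shown above:

[ ]:
# summarize the raw results: one row per pipeline
import pandas as pd

summary = pd.DataFrame([
    {
        "id": result["id"],
        "pipeline_name": result["pipeline_name"],
        "score": result["score"],
        "training_time": result["training_time"],
    }
    for result in clf.results.values()
])
summary.sort_values("score", ascending=False)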