Exploring search results¶
After finishing a pipeline search, we can inspect the results. First, let's run a search over 5 different pipelines to explore.
[1]:
import evalml
from evalml import AutoClassificationSearch
X, y = evalml.demos.load_breast_cancer()
automl = AutoClassificationSearch(objective="f1",
                                  max_pipelines=5)
automl.search(X, y)
*****************************
* Beginning pipeline search *
*****************************
Optimizing for F1. Greater score is better.
Searching up to 5 pipelines.
✔ XGBoost Binary Classification Pipel... 20%|██ | Elapsed:00:05
✔ Random Forest Binary Classification... 40%|████ | Elapsed:00:18
✔ Logistic Regression Binary Pipeline: 60%|██████ | Elapsed:00:19
✔ XGBoost Binary Classification Pipel... 80%|████████ | Elapsed:00:26
✔ XGBoost Binary Classification Pipel... 100%|██████████| Elapsed:00:32
✔ Optimization finished 100%|██████████| Elapsed:00:32
View Rankings¶
A summary of all the pipelines built can be returned as a pandas DataFrame, sorted by score. Based on our objective function, EvalML knows whether a higher or lower score is better.
[2]:
automl.rankings
[2]:
| | id | pipeline_name | score | high_variance_cv | parameters |
|---|---|---|---|---|---|
| 0 | 2 | Logistic Regression Binary Pipeline | 0.982042 | False | {'impute_strategy': 'mean', 'penalty': 'l2', '... |
| 1 | 0 | XGBoost Binary Classification Pipeline | 0.976191 | False | {'impute_strategy': 'most_frequent', 'percent_... |
| 2 | 1 | Random Forest Binary Classification Pipeline | 0.958032 | False | {'impute_strategy': 'median', 'percent_feature... |
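Since the rankings are a plain pandas DataFrame, standard DataFrame operations apply to them. A minimal sketch, using a hypothetical stand-in DataFrame with the same columns rather than an actual search result:

```python
import pandas as pd

# Hypothetical stand-in mirroring the columns of automl.rankings
rankings = pd.DataFrame({
    "id": [2, 0, 1],
    "pipeline_name": [
        "Logistic Regression Binary Pipeline",
        "XGBoost Binary Classification Pipeline",
        "Random Forest Binary Classification Pipeline",
    ],
    "score": [0.982042, 0.976191, 0.958032],
    "high_variance_cv": [False, False, False],
})

# The table is already sorted by score, so the best pipeline's id is in row 0
best_id = rankings.iloc[0]["id"]

# Filter to pipelines scoring above some threshold
strong = rankings[rankings["score"] > 0.97]
```

The `id` column (not the row index) is what gets passed to methods like `describe_pipeline` and `get_pipeline` below.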
Describe Pipeline¶
Each pipeline is given an id. We can get more information about any particular pipeline using that id. Here, we will get more information about the pipeline with id = 0.
[3]:
automl.describe_pipeline(0)
******************************************
* XGBoost Binary Classification Pipeline *
******************************************
Problem Type: Binary Classification
Model Family: XGBoost
Number of features: 25
Pipeline Steps
==============
1. One Hot Encoder
* top_n : 10
2. Simple Imputer
* impute_strategy : most_frequent
* fill_value : None
3. RF Classifier Select From Model
* percent_features : 0.8487792213962843
* threshold : -inf
4. XGBoost Classifier
* eta : 0.38438170729269994
* max_depth : 7
* min_child_weight : 1.5104167958569887
* n_estimators : 397
Training
========
Training for Binary Classification problems.
Total training time (including CV): 5.4 seconds
Cross Validation
----------------
F1 Accuracy Binary Balanced Accuracy Binary Precision Recall AUC Log Loss Binary MCC Binary # Training # Testing
0 0.962 0.953 0.954 0.974 0.950 0.988 0.138 0.900 379.000 190.000
1 0.979 0.974 0.965 0.960 1.000 0.997 0.071 0.945 379.000 190.000
2 0.987 0.984 0.982 0.983 0.992 0.997 0.075 0.966 380.000 189.000
mean 0.976 0.970 0.967 0.972 0.980 0.994 0.095 0.937 - -
std 0.013 0.016 0.014 0.012 0.027 0.005 0.037 0.034 - -
coef of var 0.013 0.017 0.015 0.012 0.028 0.006 0.395 0.036 - -
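The summary rows at the bottom of the cross-validation table are simple per-column statistics over the folds: "coef of var" is the coefficient of variation, i.e. the (sample) standard deviation divided by the mean. A quick sketch of that arithmetic on the F1 column above:

```python
import statistics

# Per-fold F1 scores from the cross-validation table above
f1_scores = [0.962, 0.979, 0.987]

mean = statistics.mean(f1_scores)   # ~0.976, matches the "mean" row
std = statistics.stdev(f1_scores)   # sample std dev, ~0.013, matches "std"
coef_of_var = std / mean            # ~0.013, matches "coef of var"
```

A large coefficient of variation on the scoring objective is what would trigger the `high_variance_cv` flag in the rankings table.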
Get Pipeline¶
We can also get the object for any pipeline via its id:
[4]:
automl.get_pipeline(0)
[4]:
<evalml.pipelines.classification.xgboost_binary.XGBoostBinaryPipeline at 0x7fb4afd23b00>
Get best pipeline¶
If we specifically want the best pipeline, there is a convenient accessor:
[5]:
automl.best_pipeline
[5]:
<evalml.pipelines.classification.logistic_regression_binary.LogisticRegressionBinaryPipeline at 0x7fb4af46ef28>
Feature Importances¶
We can get the feature importances of the resulting pipeline:
[6]:
pipeline = automl.get_pipeline(0)
pipeline.feature_importances
[6]:
| | feature | importance |
|---|---|---|
| 0 | mean concave points | 0.465049 |
| 1 | worst concave points | 0.246494 |
| 2 | worst radius | 0.089427 |
| 3 | worst area | 0.045472 |
| 4 | mean texture | 0.029848 |
| 5 | worst concavity | 0.020971 |
| 6 | area error | 0.020298 |
| 7 | radius error | 0.018571 |
| 8 | worst texture | 0.014910 |
| 9 | worst smoothness | 0.010209 |
| 10 | mean area | 0.006383 |
| 11 | mean concavity | 0.004976 |
| 12 | mean smoothness | 0.004681 |
| 13 | worst perimeter | 0.004660 |
| 14 | worst symmetry | 0.004073 |
| 15 | concavity error | 0.003436 |
| 16 | mean compactness | 0.003422 |
| 17 | worst fractal dimension | 0.002782 |
| 18 | smoothness error | 0.001911 |
| 19 | fractal dimension error | 0.001905 |
| 20 | symmetry error | 0.000420 |
| 21 | perimeter error | 0.000101 |
| 22 | mean radius | 0.000000 |
| 23 | mean perimeter | 0.000000 |
| 24 | worst compactness | 0.000000 |
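Since the feature importances also come back as a DataFrame, it is easy to, for example, drop zero-importance features or keep only the top few. A minimal sketch on a hypothetical stand-in DataFrame with the same two columns:

```python
import pandas as pd

# Hypothetical stand-in mirroring pipeline.feature_importances
fi = pd.DataFrame({
    "feature": ["mean concave points", "worst concave points",
                "worst radius", "mean radius"],
    "importance": [0.465049, 0.246494, 0.089427, 0.0],
})

# Features the model actually relied on (nonzero importance)
used = fi[fi["importance"] > 0]["feature"].tolist()

# Top-2 features by importance
top2 = fi.nlargest(2, "importance")["feature"].tolist()
```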
We can also create a bar plot of the feature importances:
[7]:
pipeline.graph_feature_importance()
Access raw results¶
You can also access all of the underlying data, like this:
[8]:
automl.results
[8]:
{'pipeline_results': {0: {'id': 0,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'most_frequent',
'percent_features': 0.8487792213962843,
'threshold': -inf,
'eta': 0.38438170729269994,
'max_depth': 7,
'min_child_weight': 1.5104167958569887,
'n_estimators': 397},
'score': 0.9761912315723671,
'high_variance_cv': False,
'training_time': 5.410717964172363,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9617021276595743),
('Accuracy Binary', 0.9526315789473684),
('Balanced Accuracy Binary', 0.9536631554030062),
('Precision', 0.9741379310344828),
('Recall', 0.9495798319327731),
('AUC', 0.9876908509882827),
('Log Loss Binary', 0.13808748615334288),
('MCC Binary', 0.9001633057441626),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9617021276595743},
{'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9647887323943662),
('Precision', 0.9596774193548387),
('Recall', 1.0),
('AUC', 0.9973961415552136),
('Log Loss Binary', 0.07131786501827025),
('MCC Binary', 0.9445075449666159),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9794238683127572},
{'all_objective_scores': OrderedDict([('F1', 0.9874476987447698),
('Accuracy Binary', 0.9841269841269841),
('Balanced Accuracy Binary', 0.9815126050420169),
('Precision', 0.9833333333333333),
('Recall', 0.9915966386554622),
('AUC', 0.996998799519808),
('Log Loss Binary', 0.07531116866342562),
('MCC Binary', 0.9659285184801715),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9874476987447698}]},
1: {'id': 1,
'pipeline_name': 'Random Forest Binary Classification Pipeline',
'pipeline_summary': 'Random Forest Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'median',
'percent_features': 0.8140470414877383,
'threshold': 'mean',
'n_estimators': 859,
'max_depth': 6},
'score': 0.9580315415303952,
'high_variance_cv': False,
'training_time': 12.62360143661499,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9361702127659575),
('Accuracy Binary', 0.9210526315789473),
('Balanced Accuracy Binary', 0.9199313528228192),
('Precision', 0.9482758620689655),
('Recall', 0.9243697478991597),
('AUC', 0.9766836311989585),
('Log Loss Binary', 0.20455160484518806),
('MCC Binary', 0.833232300751445),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9361702127659575},
{'all_objective_scores': OrderedDict([('F1', 0.9672131147540983),
('Accuracy Binary', 0.9578947368421052),
('Balanced Accuracy Binary', 0.9465025446798438),
('Precision', 0.944),
('Recall', 0.9915966386554622),
('AUC', 0.9838442419221209),
('Log Loss Binary', 0.14826817405619716),
('MCC Binary', 0.9106361866954563),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9672131147540983},
{'all_objective_scores': OrderedDict([('F1', 0.9707112970711297),
('Accuracy Binary', 0.9629629629629629),
('Balanced Accuracy Binary', 0.9588235294117646),
('Precision', 0.9666666666666667),
('Recall', 0.9747899159663865),
('AUC', 0.9942376950780312),
('Log Loss Binary', 0.10344817959803934),
('MCC Binary', 0.9204135621119959),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9707112970711297}]},
2: {'id': 2,
'pipeline_name': 'Logistic Regression Binary Pipeline',
'pipeline_summary': 'Logistic Regression Classifier w/ One Hot Encoder + Simple Imputer + Standard Scaler',
'parameters': {'impute_strategy': 'mean',
'penalty': 'l2',
'C': 0.21198179042885398},
'score': 0.9820415596969072,
'high_variance_cv': False,
'training_time': 1.3488342761993408,
'cv_data': [{'all_objective_scores': OrderedDict([('F1', 0.979253112033195),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9676293052432241),
('Precision', 0.9672131147540983),
('Recall', 0.9915966386554622),
('AUC', 0.9904130666351048),
('Log Loss Binary', 0.10058063355386729),
('MCC Binary', 0.943843520216036),
('# Training', 379),
('# Testing', 190)]),
'score': 0.979253112033195},
{'all_objective_scores': OrderedDict([('F1', 0.9794238683127572),
('Accuracy Binary', 0.9736842105263158),
('Balanced Accuracy Binary', 0.9647887323943662),
('Precision', 0.9596774193548387),
('Recall', 1.0),
('AUC', 0.9989347851816782),
('Log Loss Binary', 0.07682029301742287),
('MCC Binary', 0.9445075449666159),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9794238683127572},
{'all_objective_scores': OrderedDict([('F1', 0.9874476987447698),
('Accuracy Binary', 0.9841269841269841),
('Balanced Accuracy Binary', 0.9815126050420169),
('Precision', 0.9833333333333333),
('Recall', 0.9915966386554622),
('AUC', 0.997358943577431),
('Log Loss Binary', 0.08090403408994591),
('MCC Binary', 0.9659285184801715),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9874476987447698}]},
3: {'id': 3,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'most_frequent',
'percent_features': 0.14894727260851873,
'threshold': -inf,
'eta': 0.4736080452737106,
'max_depth': 18,
'min_child_weight': 5.153314260276387,
'n_estimators': 660},
'score': 0.941255546698183,
'high_variance_cv': False,
'training_time': 6.667321681976318,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9264069264069265),
('Accuracy Binary', 0.9105263157894737),
('Balanced Accuracy Binary', 0.9143685643271393),
('Precision', 0.9553571428571429),
('Recall', 0.8991596638655462),
('AUC', 0.9715942715114214),
('Log Loss Binary', 0.2351054900534157),
('MCC Binary', 0.8150103776135726),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9264069264069265},
{'all_objective_scores': OrderedDict([('F1', 0.9482071713147411),
('Accuracy Binary', 0.9315789473684211),
('Balanced Accuracy Binary', 0.9084507042253521),
('Precision', 0.9015151515151515),
('Recall', 1.0),
('AUC', 0.9784589892294946),
('Log Loss Binary', 0.18131056035061574),
('MCC Binary', 0.858166066103978),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9482071713147411},
{'all_objective_scores': OrderedDict([('F1', 0.9491525423728814),
('Accuracy Binary', 0.9365079365079365),
('Balanced Accuracy Binary', 0.9348739495798319),
('Precision', 0.9572649572649573),
('Recall', 0.9411764705882353),
('AUC', 0.9841536614645858),
('Log Loss Binary', 0.16492396169563844),
('MCC Binary', 0.8648817040445186),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9491525423728814}]},
4: {'id': 4,
'pipeline_name': 'XGBoost Binary Classification Pipeline',
'pipeline_summary': 'XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model',
'parameters': {'impute_strategy': 'mean',
'percent_features': 0.6435218111142487,
'threshold': 'mean',
'eta': 0.9446689170495841,
'max_depth': 11,
'min_child_weight': 4.731957459914713,
'n_estimators': 676},
'score': 0.9486606279409701,
'high_variance_cv': False,
'training_time': 6.609421014785767,
'cv_data': [{'all_objective_scores': OrderedDict([('F1',
0.9210526315789473),
('Accuracy Binary', 0.9052631578947369),
('Balanced Accuracy Binary', 0.9130074565037283),
('Precision', 0.963302752293578),
('Recall', 0.8823529411764706),
('AUC', 0.975085808971476),
('Log Loss Binary', 0.2385086150043016),
('MCC Binary', 0.8080435814236837),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9210526315789473},
{'all_objective_scores': OrderedDict([('F1', 0.9709543568464729),
('Accuracy Binary', 0.9631578947368421),
('Balanced Accuracy Binary', 0.9563853710498283),
('Precision', 0.9590163934426229),
('Recall', 0.9831932773109243),
('AUC', 0.9697597348798673),
('Log Loss Binary', 0.13901819948468505),
('MCC Binary', 0.9211492315750531),
('# Training', 379),
('# Testing', 190)]),
'score': 0.9709543568464729},
{'all_objective_scores': OrderedDict([('F1', 0.9539748953974896),
('Accuracy Binary', 0.9417989417989417),
('Balanced Accuracy Binary', 0.9361344537815126),
('Precision', 0.95),
('Recall', 0.957983193277311),
('AUC', 0.9845738295318127),
('Log Loss Binary', 0.13538144654258666),
('MCC Binary', 0.8748986057438203),
('# Training', 380),
('# Testing', 189)]),
'score': 0.9539748953974896}]}},
'search_order': [0, 1, 2, 3, 4]}
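Because `results` is a plain nested dictionary, it can be traversed programmatically, e.g. to recompute each pipeline's overall score as the mean of its per-fold CV scores. A minimal sketch on a trimmed-down dict with the same shape (the scores below are illustrative, rounded from the output above):

```python
# Trimmed-down dict with the same shape as automl.results
results = {
    "pipeline_results": {
        0: {"pipeline_name": "XGBoost Binary Classification Pipeline",
            "cv_data": [{"score": 0.9617}, {"score": 0.9794}, {"score": 0.9874}]},
        2: {"pipeline_name": "Logistic Regression Binary Pipeline",
            "cv_data": [{"score": 0.9793}, {"score": 0.9794}, {"score": 0.9874}]},
    },
    "search_order": [0, 2],
}

# Mean CV score per pipeline id, visited in search order
mean_scores = {}
for pid in results["search_order"]:
    if pid not in results["pipeline_results"]:
        continue
    cv = results["pipeline_results"][pid]["cv_data"]
    mean_scores[pid] = sum(fold["score"] for fold in cv) / len(cv)
```

This mean over folds is exactly the `score` field reported per pipeline (and in the rankings table).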