EvalML Components and Pipelines

EvalML searches and trains multiple machine learning pipelines in order to find the best one for your data. Each pipeline is made up of various components that can learn from the data, transform the data, and ultimately predict labels given new data. Below we’ll show an example of an EvalML pipeline. You can find a more in-depth look into components or learn how you can construct and use your own pipelines.

XGBoost Pipeline

The EvalML XGBoost Pipeline is made up of four different components: a one-hot encoder, a missing value imputer, a feature selector and an XGBoost estimator. We can see them here by calling .graph():

[1]:
from evalml.pipelines import XGBoostPipeline

xgp = XGBoostPipeline(
    objective='recall',
    eta=0.5,
    min_child_weight=5,
    max_depth=10,
    impute_strategy='mean',
    percent_features=0.5,
    number_features=10,
)
xgp.graph()
[1]:
../_images/pipelines_overview_3_0.svg

From the above graph we can see each component and its parameters. Each component takes in data and feeds it to the next. You can see more detailed information by calling .describe():

[2]:
xgp.describe()
********************************************************************************************
* XGBoost Classifier w/ One Hot Encoder + Simple Imputer + RF Classifier Select From Model *
********************************************************************************************

Problem Types: Binary Classification, Multiclass Classification
Model Type: XGBoost Classifier
Objective to Optimize: Recall (greater is better)

Pipeline Steps
==============
1. One Hot Encoder
2. Simple Imputer
         * impute_strategy : mean
3. RF Classifier Select From Model
         * percent_features : 0.5
         * threshold : -inf
4. XGBoost Classifier
         * eta : 0.5
         * max_depth : 10
         * min_child_weight : 5
         * n_estimators : 10
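
To make the idea of chained steps concrete, here is a minimal pure-Python sketch of how a pipeline can pass data from one component to the next. The classes MeanImputer, MajorityClassifier and ToyPipeline are hypothetical names invented for this illustration; they are not EvalML's API and do not reflect how EvalML implements its components.

```python
# Toy illustration only: these classes are hypothetical, not EvalML's API.

class MeanImputer:
    """Transformer: learns column means, then fills missing (None) values."""

    def fit(self, X, y=None):
        n_cols = len(X[0])
        self.means_ = []
        for j in range(n_cols):
            present = [row[j] for row in X if row[j] is not None]
            self.means_.append(sum(present) / len(present))
        return self

    def transform(self, X):
        return [
            [self.means_[j] if value is None else value
             for j, value in enumerate(row)]
            for row in X
        ]


class MajorityClassifier:
    """Estimator: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ToyPipeline:
    """Runs each transformer in order, then hands the result to the estimator."""

    def __init__(self, transformers, estimator):
        self.transformers = transformers
        self.estimator = estimator

    def fit(self, X, y):
        for transformer in self.transformers:
            # Each step learns from the data, then transforms it for the next step.
            X = transformer.fit(X, y).transform(X)
        self.estimator.fit(X, y)
        return self

    def predict(self, X):
        for transformer in self.transformers:
            # At prediction time, steps only transform; they do not refit.
            X = transformer.transform(X)
        return self.estimator.predict(X)


X = [[1.0, None], [3.0, 4.0], [None, 8.0]]
y = [1, 0, 1]

pipe = ToyPipeline([MeanImputer()], MajorityClassifier()).fit(X, y)
predictions = pipe.predict([[None, 2.0]])
```

The point of the sketch is the ordering: transformers are fit and applied in sequence during fit, and the same learned transformations are replayed before the estimator predicts.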

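You can then fit and score an individual pipeline. Note that this example scores the pipeline on the same data it was fit on, so the result reflects training performance rather than performance on unseen data: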

[3]:
import evalml

X, y = evalml.demos.load_breast_cancer()
xgp.fit(X, y)

xgp.score(X, y)
[3]:
(0.9775910364145658, {})
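
Since the pipeline was created with objective='recall', the first value in the tuple above is presumably the recall score; that reading is an assumption based on the describe() output rather than something stated here. As a rough sketch of what recall measures, the hypothetical helper below computes it by hand from true and predicted labels. It is written for illustration and is not EvalML's implementation.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive: TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    true_positives = sum(1 for t, p in pairs if t == positive and p == positive)
    false_negatives = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_positives + false_negatives == 0:
        raise ValueError("recall is undefined when there are no positive labels")
    return true_positives / (true_positives + false_negatives)


# Made-up labels for illustration: 4 actual positives, 3 of them recovered.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]

example_recall = recall(y_true, y_pred)
```

Recall ignores false positives entirely, which is why the describe() output lists it as "greater is better": a higher value means fewer positive cases were missed.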