Custom Pipelines in EvalML
EvalML pipelines consist of modular components combining any number of transformers and an estimator. This allows you to create pipelines that fit the needs of your data to achieve the best results.
Requirements
A custom pipeline must adhere to the following requirements:
Inherit from the proper pipeline base class:
    Binary classification - BinaryClassificationPipeline
    Multiclass classification - MulticlassClassificationPipeline
    Regression - RegressionPipeline
Have a component_graph list as a class variable detailing the structure of the pipeline. Each component in the graph can be provided as either a string name or an instance.
Pipeline Configuration
There are a few other options to configure your custom pipeline.
Custom Name
By default, a pipeline class’s name property is derived by inserting a space before each Pascal case capital in the class name. For example, LogisticRegressionPipeline.name returns ‘Logistic Regression Pipeline’. We therefore suggest using Pascal case for custom pipeline class names.
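The name derivation described above can be sketched in plain Python. This is only an illustration of the idea, not EvalML's actual implementation: the helper default_pipeline_name and the regular expression are assumptions made for this example.

```python
import re


def default_pipeline_name(class_name: str) -> str:
    """Hypothetical helper: split a Pascal case class name into words.

    A space is inserted at every position where a lowercase letter is
    immediately followed by an uppercase letter.
    """
    return re.sub(r'(?<=[a-z])(?=[A-Z])', ' ', class_name)


# Pascal case names split cleanly into readable words.
print(default_pipeline_name('LogisticRegressionPipeline'))

# A snake case name has no lowercase-to-uppercase boundary, so it is
# returned unchanged, which is why Pascal case is the better choice.
print(default_pipeline_name('custom_pipeline'))
```

Under this sketch, the first call prints Logistic Regression Pipeline and the second prints custom_pipeline unchanged.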
If you’d like to override the pipeline class’s name attribute so it isn’t derived from the class name, you can set the custom_name attribute, like so:
[1]:
from evalml.pipelines import BinaryClassificationPipeline

class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']
    custom_name = 'A custom pipeline name'

print(CustomPipeline.name)
A custom pipeline name
Custom Hyperparameters
To specify custom hyperparameter ranges, set the custom_hyperparameters property to a dictionary in which each key is a parameter name and each value is the range to search for that parameter. AutoML will use this dictionary to override the hyperparameter ranges collected from each component in the component graph.
[2]:
class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']

print("Without custom hyperparameters:")
print(CustomPipeline.hyperparameters)

class CustomPipeline(BinaryClassificationPipeline):
    component_graph = ['Simple Imputer', 'Logistic Regression Classifier']
    custom_hyperparameters = {
        'impute_strategy': ['most_frequent']
    }

print()
print("With custom hyperparameters:")
print(CustomPipeline.hyperparameters)
Without custom hyperparameters:
{'impute_strategy': ['mean', 'median', 'most_frequent'], 'penalty': ['l2'], 'C': Real(low=0.01, high=10, prior='uniform', transform='identity')}
With custom hyperparameters:
{'impute_strategy': ['most_frequent'], 'penalty': ['l2'], 'C': Real(low=0.01, high=10, prior='uniform', transform='identity')}
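One way to picture the override shown in the output above is as a dictionary merge: the ranges from each component are collected first, and any matching key in custom_hyperparameters then replaces the collected value. The sketch below is an assumption about the behavior, not EvalML's code. The function merge_hyperparameters is hypothetical, and the tuple (0.01, 10) is a simple stand-in for the Real range object printed above.

```python
def merge_hyperparameters(component_ranges, custom_hyperparameters=None):
    """Hypothetical helper: combine per-component ranges, then apply overrides."""
    merged = {}
    # Collect the default ranges from each component in graph order.
    for ranges in component_ranges:
        merged.update(ranges)
    # Custom entries replace the collected defaults for matching keys.
    merged.update(custom_hyperparameters or {})
    return merged


# Stand-ins for the ranges each component would report.
imputer_ranges = {'impute_strategy': ['mean', 'median', 'most_frequent']}
classifier_ranges = {'penalty': ['l2'], 'C': (0.01, 10)}

defaults = merge_hyperparameters([imputer_ranges, classifier_ranges])
overridden = merge_hyperparameters(
    [imputer_ranges, classifier_ranges],
    custom_hyperparameters={'impute_strategy': ['most_frequent']},
)

print(defaults)
print(overridden)
```

In this sketch only impute_strategy changes between the two results. The penalty and C entries keep their collected defaults, matching the pattern in the notebook output.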