engine_base

Module Contents

Classes Summary

EngineBase

Helper class that provides a standard way to create an ABC using inheritance.

EngineComputation

Wrapper around the result of a (possibly asynchronous) engine computation.

JobLogger

Mimics the behavior of a Python logging.Logger but stores all messages rather than actually logging them.

Functions

evaluate_pipeline

Function submitted to the submit_evaluation_job engine method.

score_pipeline

Wrapper around pipeline.score method to make it easy to score pipelines with dask.

train_and_score_pipeline

Given a pipeline, config and data, train and score the pipeline and return the CV or TV scores.

train_pipeline

Train a pipeline and tune the threshold if necessary.

Contents

class evalml.automl.engine.engine_base.EngineBase

Helper class that provides a standard way to create an ABC using inheritance.

Methods

setup_job_log

submit_evaluation_job

Submit job for pipeline evaluation during AutoMLSearch.

submit_scoring_job

Submit job for pipeline scoring.

submit_training_job

Submit job for pipeline training.

static setup_job_log()

abstract submit_evaluation_job(self, automl_config, pipeline, X, y)

Submit job for pipeline evaluation during AutoMLSearch.

abstract submit_scoring_job(self, automl_config, pipeline, X, y, objectives)

Submit job for pipeline scoring.

abstract submit_training_job(self, automl_config, pipeline, X, y)

Submit job for pipeline training.
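The abstract interface above can be illustrated with a minimal sketch. It does not import evalml: `SketchEngineBase` is a stand-in that only mirrors the three abstract method signatures listed here, and `InlineEngine` with its `evaluate`, `score` and `train` callables is a hypothetical subclass invented for illustration, not one of the library's engines.

```python
from abc import ABC, abstractmethod


class SketchEngineBase(ABC):
    """Stand-in mirroring the abstract methods documented above (not evalml's class)."""

    @abstractmethod
    def submit_evaluation_job(self, automl_config, pipeline, X, y):
        ...

    @abstractmethod
    def submit_scoring_job(self, automl_config, pipeline, X, y, objectives):
        ...

    @abstractmethod
    def submit_training_job(self, automl_config, pipeline, X, y):
        ...


class InlineEngine(SketchEngineBase):
    """Hypothetical engine that runs every job immediately in the calling process."""

    def __init__(self, evaluate, score, train):
        # Assumption: the three callables stand in for the real work functions.
        self._evaluate = evaluate
        self._score = score
        self._train = train

    def submit_evaluation_job(self, automl_config, pipeline, X, y):
        return self._evaluate(pipeline, automl_config, X, y)

    def submit_scoring_job(self, automl_config, pipeline, X, y, objectives):
        return self._score(pipeline, X, y, objectives)

    def submit_training_job(self, automl_config, pipeline, X, y):
        return self._train(pipeline, X, y, automl_config)
```

Because all three methods are abstract, instantiating the base class directly raises `TypeError`; a concrete engine has to override each of them.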

class evalml.automl.engine.engine_base.EngineComputation

Wrapper around the result of a (possibly asynchronous) engine computation.

Methods

cancel

Cancel the computation.

done

Whether the computation is done.

get_result

Gets the computation result.

abstract cancel(self)

Cancel the computation.

abstract done(self)

Whether the computation is done.

abstract get_result(self)

Gets the computation result. Will block until the computation is finished.

Raises Exception: If computation fails. Returns traceback.
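One way to picture the `cancel` / `done` / `get_result` contract is a thin wrapper around a standard-library future. This is only a sketch under stated assumptions: `SketchComputation`, `FutureComputation` and `run_demo` are hypothetical names, and the code does not reproduce how evalml's own engines implement the wrapper.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class SketchComputation(ABC):
    """Stand-in mirroring the three abstract methods documented above."""

    @abstractmethod
    def cancel(self):
        ...

    @abstractmethod
    def done(self):
        ...

    @abstractmethod
    def get_result(self):
        ...


class FutureComputation(SketchComputation):
    """Hypothetical wrapper around a concurrent.futures.Future."""

    def __init__(self, future):
        self._future = future

    def cancel(self):
        return self._future.cancel()

    def done(self):
        return self._future.done()

    def get_result(self):
        # Blocks until the job finishes; re-raises any exception from the job.
        return self._future.result()


def run_demo():
    """Submit one job that succeeds and one that fails, then collect both."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        ok = FutureComputation(pool.submit(sum, [1, 2, 3]))
        bad = FutureComputation(pool.submit(int, "not a number"))
        value = ok.get_result()
        try:
            bad.get_result()
            error = None
        except ValueError as exc:
            error = exc
    return value, ok.done(), error
```

In this sketch a failed job surfaces as the original exception when `get_result` is called, which matches the "raises if computation fails" behaviour described above.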

evalml.automl.engine.engine_base.evaluate_pipeline(pipeline, automl_config, X, y, logger)

Function submitted to the submit_evaluation_job engine method.

Parameters
  • pipeline (PipelineBase) – The pipeline to score

  • automl_config (AutoMLConfig) – The AutoMLSearch object, used to access config and the error callback

  • X (pd.DataFrame) – Training features

  • y (pd.Series) – Training target

Returns

First - A dict containing cv_score_mean, cv_scores, training_time and a cv_data structure with details.

Second - The pipeline class we trained and scored.

Third - The job logger instance with all the recorded messages.

Return type

tuple of three items

class evalml.automl.engine.engine_base.JobLogger

Mimics the behavior of a Python logging.Logger but stores all messages rather than actually logging them.

It is used during engine jobs so that log messages are written only after the job completes, which keeps all of the messages for a single job grouped together in the log.

Methods

debug

Store message at the debug level.

error

Store message at the error level.

info

Store message at the info level.

warning

Store message at the warning level.

write_to_logger

Write all the messages to the logger. First In First Out order.

debug(self, msg)

Store message at the debug level.

error(self, msg)

Store message at the error level.

info(self, msg)

Store message at the info level.

warning(self, msg)

Store message at the warning level.

write_to_logger(self, logger)

Write all the messages to the logger. First In First Out order.
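The store-then-replay behaviour described for this class can be sketched with the standard `logging` module. `DeferredLogger`, `ListHandler` and `replay_demo` are hypothetical names used only for illustration; the real JobLogger may store its messages differently.

```python
import logging


class DeferredLogger:
    """Hypothetical stand-in: records messages and replays them later."""

    def __init__(self):
        self._records = []  # (level name, message) pairs in arrival order

    def debug(self, msg):
        self._records.append(("debug", msg))

    def info(self, msg):
        self._records.append(("info", msg))

    def warning(self, msg):
        self._records.append(("warning", msg))

    def error(self, msg):
        self._records.append(("error", msg))

    def write_to_logger(self, logger):
        # Replay in first-in, first-out order at the originally requested level.
        for level, msg in self._records:
            getattr(logger, level)(msg)


class ListHandler(logging.Handler):
    """Collects emitted records so the replay order can be inspected."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def emit(self, record):
        self.seen.append((record.levelname, record.getMessage()))


def replay_demo():
    logger = logging.getLogger("sketch.job")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = ListHandler()
    logger.addHandler(handler)

    job_log = DeferredLogger()
    job_log.info("start")
    job_log.warning("slow fold")
    job_log.error("failed")
    before = list(handler.seen)  # nothing has reached the real logger yet
    job_log.write_to_logger(logger)
    logger.removeHandler(handler)
    return before, handler.seen
```

Nothing reaches the real logger until `write_to_logger` is called, so messages from concurrently running jobs do not interleave in the final log.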

evalml.automl.engine.engine_base.score_pipeline(pipeline, X, y, objectives, X_schema=None, y_schema=None)

Wrapper around pipeline.score method to make it easy to score pipelines with dask.

Parameters
  • pipeline (PipelineBase) – The pipeline to score.

  • X (pd.DataFrame) – Features to score on.

  • y (pd.Series) – Target used to calculate scores.

  • X_schema (ww.TableSchema) – Schema for features.

  • y_schema (ww.ColumnSchema) – Schema for columns.

Returns

dict containing pipeline scores.

evalml.automl.engine.engine_base.train_and_score_pipeline(pipeline, automl_config, full_X_train, full_y_train, logger)

Given a pipeline, config and data, train and score the pipeline and return the CV or TV scores.

Parameters
  • pipeline (PipelineBase) – The pipeline to score

  • automl_config (AutoMLSearch) – The AutoMLSearch object, used to access config and the error callback

  • full_X_train (pd.DataFrame) – Training features

  • full_y_train (pd.Series) – Training target

Returns

First - A dict containing cv_score_mean, cv_scores, training_time and a cv_data structure with details.

Second - The pipeline class we trained and scored.

Third - The job logger instance with all the recorded messages.

Return type

tuple of three items

evalml.automl.engine.engine_base.train_pipeline(pipeline, X, y, automl_config, schema=True)

Train a pipeline and tune the threshold if necessary.

Parameters
  • pipeline (PipelineBase) – Pipeline to train.

  • X (pd.DataFrame) – Features to train on.

  • y (pd.Series) – Target to train on.

  • automl_config (AutoMLSearch) – The AutoMLSearch object, used to access config and the error callback

  • schema (bool) – Whether to use the schemas for X and y

Returns

trained pipeline.

Return type

pipeline (PipelineBase)