cf_engine

Custom CFClient API that matches Dask’s Client and allows context management.

Module Contents

Classes Summary

CFClient

Custom CFClient API that matches Dask's Client and allows context management.

CFComputation

A Future-like wrapper around jobs created by the CFEngine.

CFEngine

The concurrent.futures (CF) engine.

Contents

class evalml.automl.engine.cf_engine.CFClient(pool)

Custom CFClient API that matches Dask’s Client and allows context management.

Parameters

pool (cf.ThreadPoolExecutor or cf.ProcessPoolExecutor) – The resource pool to execute the futures work on.

Methods

close

Closes the underlying Executor.

is_closed

Property that determines whether the Engine's Client's resources are closed.

submit

Pass through to imitate Dask's Client API.

close(self)

Closes the underlying Executor.

property is_closed(self)

Property that determines whether the Engine’s Client’s resources are closed.

submit(self, *args, **kwargs)

Pass through to imitate Dask’s Client API.
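The wrapper pattern described for this class can be sketched with the standard library alone. `PoolClient` below is a hypothetical stand-in, not evalml's `CFClient`; it assumes only that a client wraps an executor, forwards `submit`, and shuts the pool down on `close`. Tracking the closed state in a private flag is an assumption of this sketch.

```python
import concurrent.futures as cf


class PoolClient:
    """Hypothetical stand-in for a client that wraps an executor."""

    def __init__(self, pool):
        self.pool = pool
        self._closed = False  # tracked locally; an assumption of this sketch

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def submit(self, *args, **kwargs):
        # Forward everything to the executor unchanged.
        return self.pool.submit(*args, **kwargs)

    def close(self):
        self.pool.shutdown()
        self._closed = True

    @property
    def is_closed(self):
        return self._closed


# Leaving the `with` block closes the pool automatically.
with PoolClient(cf.ThreadPoolExecutor(max_workers=2)) as client:
    future = client.submit(pow, 2, 10)
    result = future.result()  # blocks until the job finishes

closed_after_exit = client.is_closed
```

Supporting the context-manager protocol means callers do not have to remember to call `close` themselves.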

class evalml.automl.engine.cf_engine.CFComputation(future)

A Future-like wrapper around jobs created by the CFEngine.

Parameters

future (cf.Future) – The concurrent.futures.Future representing the job to execute.

Methods

cancel

Cancel the current computation.

done

Returns whether the computation is done.

get_result

Gets the computation result. Will block until the computation is finished.

is_cancelled

Returns whether computation was cancelled.

cancel(self)

Cancel the current computation.

Returns

False if the call is currently being executed or has finished running and cannot be cancelled. True if the call was cancelled.

Return type

bool
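The return values described above match those of `concurrent.futures.Future.cancel`. The following is a minimal standard-library sketch of that behavior, assuming the wrapper simply defers to the future it wraps. The job function and event names are illustrative only.

```python
import concurrent.futures as cf
import threading

started = threading.Event()
release = threading.Event()


def blocking_job():
    started.set()    # signal that the worker has picked up this job
    release.wait()   # hold the only worker until released
    return "finished"


with cf.ThreadPoolExecutor(max_workers=1) as pool:
    running = pool.submit(blocking_job)
    started.wait()                       # `running` is now executing
    queued = pool.submit(lambda: "never runs")

    cancel_running = running.cancel()    # already executing, cannot cancel
    cancel_queued = queued.cancel()      # still pending behind the first job

    release.set()                        # let the first job finish
    first_result = running.result()

cancel_finished = running.cancel()       # already finished, cannot cancel
```

Only the job still waiting in the queue can be cancelled; the running and finished calls both report `False`.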

done(self)

Returns whether the computation is done.

get_result(self)

Gets the computation result. Will block until the computation is finished.

Raises
  • Exception – If computation fails. Returns traceback.

  • cf.TimeoutError – If computation takes longer than the default timeout.

  • cf.CancelledError – If computation was cancelled before completing.

Returns

The result of the requested job.
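The three failure modes listed above can be reproduced directly with `concurrent.futures.Future.result`. This is a sketch under the assumption that `get_result` surfaces the wrapped future's exceptions; the job functions and the 0.05-second timeout are illustrative choices, not values taken from evalml.

```python
import concurrent.futures as cf
import threading


def failing_job():
    raise ValueError("bad input")


release = threading.Event()

with cf.ThreadPoolExecutor(max_workers=1) as pool:
    # 1. An exception raised inside the job is re-raised by result().
    failed = pool.submit(failing_job)
    try:
        failed.result()
    except ValueError as err:
        failure_message = str(err)

    # 2. Waiting longer than the timeout raises cf.TimeoutError.
    slow = pool.submit(release.wait)
    try:
        slow.result(timeout=0.05)
        timed_out = False
    except cf.TimeoutError:
        timed_out = True
    finally:
        release.set()  # unblock the worker so the pool can shut down

# 3. Asking a cancelled future for its result raises cf.CancelledError.
orphan = cf.Future()  # a bare future that was never scheduled
orphan.cancel()
try:
    orphan.result()
    was_cancelled = False
except cf.CancelledError:
    was_cancelled = True
```

A timeout does not stop the job: once the event is set, `slow` still completes and its result can be read normally.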

property is_cancelled(self)

Returns whether computation was cancelled.

class evalml.automl.engine.cf_engine.CFEngine(client=None)

The concurrent.futures (CF) engine.

Parameters

client (None or CFClient) – If None, creates a threaded pool for processing. Defaults to None.

Methods

close

Function to properly shut down the Engine's Client's resources.

is_closed

Property that determines whether the Engine's Client's resources are shut down.

setup_job_log

Set up logger for job.

submit_evaluation_job

Send evaluation job to cluster.

submit_scoring_job

Send scoring job to cluster.

submit_training_job

Send training job to cluster.

close(self)

Function to properly shut down the Engine’s Client’s resources.

property is_closed(self)

Property that determines whether the Engine’s Client’s resources are shut down.

static setup_job_log()

Set up logger for job.

submit_evaluation_job(self, automl_config, pipeline, X, y, X_holdout=None, y_holdout=None)

Send evaluation job to cluster.

Parameters
  • automl_config – Structure containing data passed from AutoMLSearch instance.

  • pipeline (pipeline.PipelineBase) – Pipeline to evaluate.

  • X (pd.DataFrame) – Input data for modeling.

  • y (pd.Series) – Target data for modeling.

  • X_holdout (pd.DataFrame) – Holdout input data for holdout scoring.

  • y_holdout (pd.Series) – Holdout target data for holdout scoring.

Returns

An object wrapping a reference to a future-like computation occurring in the resource pool.

Return type

CFComputation

submit_scoring_job(self, automl_config, pipeline, X, y, objectives, X_train=None, y_train=None)

Send scoring job to cluster.

Parameters
  • automl_config – Structure containing data passed from AutoMLSearch instance.

  • pipeline (pipeline.PipelineBase) – Pipeline to score.

  • X (pd.DataFrame) – Input data for modeling.

  • y (pd.Series) – Target data for modeling.

  • objectives (list[ObjectiveBase]) – Objectives to score on.

  • X_train (pd.DataFrame) – Training features. Used for feature engineering in time series.

  • y_train (pd.Series) – Training target. Used for feature engineering in time series.

Returns

An object wrapping a reference to a future-like computation occurring in the resource pool.

Return type

CFComputation

submit_training_job(self, automl_config, pipeline, X, y)

Send training job to cluster.

Parameters
  • automl_config – Structure containing data passed from AutoMLSearch instance.

  • pipeline (pipeline.PipelineBase) – Pipeline to train.

  • X (pd.DataFrame) – Input data for modeling.

  • y (pd.Series) – Target data for modeling.

Returns

An object wrapping a reference to a future-like computation occurring in the resource pool.

Return type

CFComputation
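The submit methods above share one shape: hand a job to the resource pool and return a wrapper around the resulting future. The sketch below illustrates that shape with the standard library only. `MiniEngine`, `Computation`, `MeanPipeline`, and `train_pipeline` are all hypothetical names, plain lists stand in for the pandas inputs, and the `automl_config` argument is omitted; none of this is evalml's actual implementation.

```python
import concurrent.futures as cf


class Computation:
    """Hypothetical Future-like wrapper using the method names listed above."""

    def __init__(self, future):
        self.work = future

    def done(self):
        return self.work.done()

    def get_result(self):
        return self.work.result()  # blocks until the job finishes


class MeanPipeline:
    """Toy stand-in for a pipeline: remembers the mean of the training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self


def train_pipeline(pipeline, X, y):
    # Hypothetical job function that runs inside the pool.
    return pipeline.fit(X, y)


class MiniEngine:
    """Hypothetical engine: submits jobs to a pool and wraps the futures."""

    def __init__(self, pool=None):
        # Fall back to a thread pool when none is supplied, an assumption
        # modeled on the documented client=None default.
        self.pool = cf.ThreadPoolExecutor() if pool is None else pool

    def submit_training_job(self, pipeline, X, y):
        future = self.pool.submit(train_pipeline, pipeline, X, y)
        return Computation(future)

    def close(self):
        self.pool.shutdown()


engine = MiniEngine()
computation = engine.submit_training_job(
    MeanPipeline(), X=[[1], [2], [3]], y=[2.0, 4.0, 6.0]
)
fitted = computation.get_result()
engine.close()
```

Returning a wrapper rather than the raw future lets different engines expose the same `done` and `get_result` interface regardless of the backend that runs the job.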