dask_engine
A Future-like wrapper around jobs created by the DaskEngine.
Module Contents
Classes Summary
DaskComputation | A Future-like wrapper around jobs created by the DaskEngine.
DaskEngine | The dask engine.
Contents
class evalml.automl.engine.dask_engine.DaskComputation(dask_future)
A Future-like wrapper around jobs created by the DaskEngine.
- Parameters
dask_future (callable) – Computation to do.
Methods
cancel | Cancel the current computation.
done | Returns whether the computation is done.
get_result | Gets the computation result. Will block until the computation is finished.
is_cancelled | Returns whether computation was cancelled.
get_result(self)
Gets the computation result. Will block until the computation is finished.
- Raises
Exception – If computation fails. Returns traceback.
- Returns
Computation results.
property is_cancelled(self)
Returns whether computation was cancelled.
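The Future-like interface described above can be illustrated with a minimal sketch. It assumes the standard library's `concurrent.futures.Future` as a stand-in for a dask future, and the class name `FutureWrapper` is hypothetical; this is not evalml's implementation, only the general shape of such a wrapper.

```python
from concurrent.futures import Future


class FutureWrapper:
    """Hypothetical stand-in for a Future-like wrapper (not evalml's class)."""

    def __init__(self, future):
        self._future = future

    def done(self):
        # True once the future has finished, failed, or been cancelled.
        return self._future.done()

    def get_result(self):
        # Blocks until the result is ready; re-raises the job's exception.
        return self._future.result()

    def cancel(self):
        # Ask the underlying future to cancel; returns True on success.
        return self._future.cancel()

    @property
    def is_cancelled(self):
        return self._future.cancelled()


# A manually completed future, so the example needs no executor.
finished = Future()
finished.set_result(42)
wrapped = FutureWrapper(finished)

# A pending future that is cancelled before it ever runs.
pending = FutureWrapper(Future())
pending.cancel()
```

Keeping the wrapper this thin means callers depend only on four operations, so the backing future type can change without touching the calling code.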
class evalml.automl.engine.dask_engine.DaskEngine(cluster=None)
The dask engine.
- Parameters
cluster (None or dd.Client) – If None, creates a local, threaded Dask client for processing. Defaults to None.
Methods
close | Closes the underlying cluster.
is_closed | Property that determines whether the Engine’s Client’s resources are shut down.
send_data_to_cluster | Send data to the cluster.
setup_job_log | Set up logger for job.
submit_evaluation_job | Send evaluation job to cluster.
submit_scoring_job | Send scoring job to cluster.
submit_training_job | Send training job to cluster.
property is_closed(self)
Property that determines whether the Engine’s Client’s resources are shut down.
send_data_to_cluster(self, X, y)
Send data to the cluster.
The implementation caches the data so that it is sent to the cluster only once, following dask best practices.
- Parameters
X (pd.DataFrame) – Input data for modeling.
y (pd.Series) – Target data for modeling.
- Returns
The modeling data.
- Return type
dask.Future
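The send-once caching idea can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: `CachingSender` and `fake_scatter` are hypothetical names, the cache key is a SHA-256 hash of the pickled data, and the transfer is simulated by a plain function. With a real dask client, the transfer step would typically be a call such as `Client.scatter`.

```python
import hashlib
import pickle


class CachingSender:
    """Hypothetical helper: sends each distinct (X, y) pair only once."""

    def __init__(self, scatter):
        self._scatter = scatter  # callable that ships data and returns a handle
        self._cache = {}         # content hash -> handle from the first send

    @staticmethod
    def _key(X, y):
        # Hash the pickled bytes so equal data maps to the same cache entry.
        return hashlib.sha256(pickle.dumps((X, y))).hexdigest()

    def send(self, X, y):
        key = self._key(X, y)
        if key not in self._cache:
            self._cache[key] = self._scatter(X, y)
        return self._cache[key]


calls = []


def fake_scatter(X, y):
    # Stand-in for a real transfer; records each call.
    calls.append((X, y))
    return ("handle", len(calls))


sender = CachingSender(fake_scatter)
first = sender.send([[1, 2], [3, 4]], [0, 1])
second = sender.send([[1, 2], [3, 4]], [0, 1])  # same content, served from cache
third = sender.send([[5, 6]], [1])              # new content, sent again
```

Keying on content rather than object identity means that a fresh copy of the same data still hits the cache, at the cost of hashing the data on every call.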
static setup_job_log()
Set up logger for job.
submit_evaluation_job(self, automl_config, pipeline, X, y)
Send evaluation job to cluster.
- Parameters
automl_config – Structure containing data passed from AutoMLSearch instance.
pipeline (pipeline.PipelineBase) – Pipeline to evaluate.
X (pd.DataFrame) – Input data for modeling.
y (pd.Series) – Target data for modeling.
- Returns
An object wrapping a reference to a future-like computation occurring in the dask cluster.
- Return type
DaskComputation
submit_scoring_job(self, automl_config, pipeline, X, y, objectives)
Send scoring job to cluster.
- Parameters
automl_config – Structure containing data passed from AutoMLSearch instance.
pipeline (pipeline.PipelineBase) – Pipeline to score.
X (pd.DataFrame) – Input data for modeling.
y (pd.Series) – Target data for modeling.
objectives (list[ObjectiveBase]) – List of objectives to score on.
- Returns
An object wrapping a reference to a future-like computation occurring in the dask cluster.
- Return type
DaskComputation
submit_training_job(self, automl_config, pipeline, X, y)
Send training job to cluster.
- Parameters
automl_config – Structure containing data passed from AutoMLSearch instance.
pipeline (pipeline.PipelineBase) – Pipeline to train.
X (pd.DataFrame) – Input data for modeling.
y (pd.Series) – Target data for modeling.
- Returns
An object wrapping a reference to a future-like computation occurring in the dask cluster.
- Return type
DaskComputation
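The submit-then-collect pattern behind the `submit_*` methods can be sketched with the standard library. Everything here is an assumption made for illustration: `ToyEngine` and `train` are hypothetical, a `ThreadPoolExecutor` stands in for the dask cluster, and the "pipeline" is just a string label.

```python
from concurrent.futures import ThreadPoolExecutor


def train(pipeline_name, X, y):
    # Placeholder "training": returns the label and the number of rows seen.
    return pipeline_name, len(X)


class ToyEngine:
    """Hypothetical engine backed by a thread pool, not a dask cluster."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)

    def submit_training_job(self, pipeline_name, X, y):
        # Returns immediately with a future; the work runs in the background.
        return self._pool.submit(train, pipeline_name, X, y)

    def close(self):
        # Wait for outstanding work, then release the pool's threads.
        self._pool.shutdown(wait=True)


engine = ToyEngine()
X, y = [[0], [1], [2]], [0, 1, 0]
jobs = [engine.submit_training_job(name, X, y) for name in ("a", "b")]
results = [job.result() for job in jobs]  # block until each job finishes
engine.close()
```

Because each submit call returns right away, several jobs can be queued before any result is requested, which is what allows the work to overlap.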