dask_engine
==========================================

.. py:module:: evalml.automl.engine.dask_engine

.. autoapi-nested-parse::

   A Future-like wrapper around jobs created by the DaskEngine.


Module Contents
---------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.automl.engine.dask_engine.DaskComputation
   evalml.automl.engine.dask_engine.DaskEngine


Contents
~~~~~~~~~~~~~~~~~~~

.. py:class:: DaskComputation(dask_future)

   A Future-like wrapper around jobs created by the DaskEngine.

   :param dask_future: Computation to do.
   :type dask_future: callable

   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.automl.engine.dask_engine.DaskComputation.cancel
      evalml.automl.engine.dask_engine.DaskComputation.done
      evalml.automl.engine.dask_engine.DaskComputation.get_result
      evalml.automl.engine.dask_engine.DaskComputation.is_cancelled

   .. py:method:: cancel(self)

      Cancel the current computation.

   .. py:method:: done(self)

      Returns whether the computation is done.

   .. py:method:: get_result(self)

      Gets the computation result. Will block until the computation is finished.

      :raises Exception: If computation fails. Returns traceback.

      :returns: Computation results.

   .. py:method:: is_cancelled(self)
      :property:

      Returns whether the computation was cancelled.


.. py:class:: DaskEngine(cluster=None)

   The dask engine.

   :param cluster: If None, creates a local, threaded Dask client for processing. Defaults to None.
   :type cluster: None or dd.Client

   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.automl.engine.dask_engine.DaskEngine.close
      evalml.automl.engine.dask_engine.DaskEngine.is_closed
      evalml.automl.engine.dask_engine.DaskEngine.send_data_to_cluster
      evalml.automl.engine.dask_engine.DaskEngine.setup_job_log
      evalml.automl.engine.dask_engine.DaskEngine.submit_evaluation_job
      evalml.automl.engine.dask_engine.DaskEngine.submit_scoring_job
      evalml.automl.engine.dask_engine.DaskEngine.submit_training_job

   .. py:method:: close(self)

      Closes the underlying cluster.

   .. py:method:: is_closed(self)
      :property:

      Returns whether the resources of the Engine's Client have been shut down.

   .. py:method:: send_data_to_cluster(self, X, y)

      Send data to the cluster.

      The implementation uses caching so the data is only sent once. This follows dask best practices.

      :param X: Input data for modeling.
      :type X: pd.DataFrame
      :param y: Target data for modeling.
      :type y: pd.Series

      :returns: The modeling data.
      :rtype: dask.Future

   .. py:method:: setup_job_log()
      :staticmethod:

      Set up logger for job.

   .. py:method:: submit_evaluation_job(self, automl_config, pipeline, X, y, X_holdout=None, y_holdout=None)

      Send evaluation job to cluster.

      :param automl_config: Structure containing data passed from AutoMLSearch instance.
      :param pipeline: Pipeline to evaluate.
      :type pipeline: pipeline.PipelineBase
      :param X: Input data for modeling.
      :type X: pd.DataFrame
      :param y: Target data for modeling.
      :type y: pd.Series
      :param X_holdout: Holdout input data for holdout scoring.
      :type X_holdout: pd.DataFrame
      :param y_holdout: Holdout target data for holdout scoring.
      :type y_holdout: pd.Series

      :returns: An object wrapping a reference to a future-like computation
                occurring in the dask cluster.
      :rtype: DaskComputation

   .. py:method:: submit_scoring_job(self, automl_config, pipeline, X, y, objectives, X_train=None, y_train=None)

      Send scoring job to cluster.

      :param automl_config: Structure containing data passed from AutoMLSearch instance.
      :param pipeline: Pipeline to score.
      :type pipeline: pipeline.PipelineBase
      :param X: Input data for modeling.
      :type X: pd.DataFrame
      :param y: Target data for modeling.
      :type y: pd.Series
      :param objectives: List of objectives to score on.
      :type objectives: list[ObjectiveBase]
      :param X_train: Training features. Used for feature engineering in time series.
      :type X_train: pd.DataFrame
      :param y_train: Training target. Used for feature engineering in time series.
      :type y_train: pd.Series

      :returns: An object wrapping a reference to a future-like computation
                occurring in the dask cluster.
      :rtype: DaskComputation

   .. py:method:: submit_training_job(self, automl_config, pipeline, X, y)

      Send training job to cluster.

      :param automl_config: Structure containing data passed from AutoMLSearch instance.
      :param pipeline: Pipeline to train.
      :type pipeline: pipeline.PipelineBase
      :param X: Input data for modeling.
      :type X: pd.DataFrame
      :param y: Target data for modeling.
      :type y: pd.Series

      :returns: An object wrapping a reference to a future-like computation
                occurring in the dask cluster.
      :rtype: DaskComputation
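
The pattern documented on this page (an engine that submits jobs and hands back a Future-like wrapper, with data sent once and cached) can be illustrated with the standard library alone. The following is a minimal sketch, not evalml's implementation: ``SketchComputation`` and ``SketchEngine`` are hypothetical names, and ``concurrent.futures`` stands in for Dask so the example runs without a cluster. Keying the cache on object identity is an assumption made for brevity.

.. code-block:: python

   from concurrent.futures import Future, ThreadPoolExecutor


   class SketchComputation:
       """Hypothetical Future-like wrapper (stand-in, not evalml's class)."""

       def __init__(self, future: Future):
           self._future = future

       def done(self) -> bool:
           # True once the job has finished, failed, or been cancelled.
           return self._future.done()

       def get_result(self):
           # Blocks until the job finishes; re-raises any exception it raised.
           return self._future.result()

       def cancel(self) -> bool:
           # Only succeeds if the job has not started running yet.
           return self._future.cancel()

       @property
       def is_cancelled(self) -> bool:
           return self._future.cancelled()


   class SketchEngine:
       """Hypothetical engine backed by a thread pool instead of Dask."""

       def __init__(self, max_workers: int = 2):
           self._pool = ThreadPoolExecutor(max_workers=max_workers)
           self._cache = {}
           self._closed = False
           self.sends = 0  # counts how often data was actually "sent"

       def send_data(self, X, y):
           # Cache on object identity so repeated submissions reuse the data.
           key = (id(X), id(y))
           if key not in self._cache:
               self.sends += 1
               self._cache[key] = (X, y)
           return self._cache[key]

       def submit(self, fn, X, y) -> SketchComputation:
           X, y = self.send_data(X, y)
           return SketchComputation(self._pool.submit(fn, X, y))

       def close(self) -> None:
           self._pool.shutdown(wait=True)
           self._closed = True

       @property
       def is_closed(self) -> bool:
           return self._closed


   def mean_target(X, y):
       # Toy "job": the mean of the target values.
       return sum(y) / len(y)


   engine = SketchEngine()
   X = [[1.0], [2.0], [3.0]]
   y = [2.0, 4.0, 6.0]

   first = engine.submit(mean_target, X, y)
   second = engine.submit(mean_target, X, y)  # reuses the cached data
   result = first.get_result()
   second.get_result()
   engine.close()

In this sketch the caller never touches the underlying future directly, which mirrors how the submit methods above return a ``DaskComputation`` rather than a raw Dask future.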