Utils
======================

.. py:module:: evalml.utils

.. autoapi-nested-parse::

   Utility methods.


Submodules
----------

.. toctree::
   :titlesonly:
   :maxdepth: 1

   base_meta/index.rst
   cli_utils/index.rst
   gen_utils/index.rst
   logger/index.rst
   nullable_type_utils/index.rst
   update_checker/index.rst
   woodwork_utils/index.rst


Package Contents
----------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.utils.classproperty


Functions
~~~~~~~~~

.. autoapisummary::
   :nosignatures:

   evalml.utils.convert_to_seconds
   evalml.utils.deprecate_arg
   evalml.utils.downcast_nullable_types
   evalml.utils.drop_rows_with_nans
   evalml.utils.get_importable_subclasses
   evalml.utils.get_logger
   evalml.utils.get_random_seed
   evalml.utils.get_random_state
   evalml.utils.get_time_index
   evalml.utils.import_or_raise
   evalml.utils.infer_feature_types
   evalml.utils.is_all_numeric
   evalml.utils.jupyter_check
   evalml.utils.log_subtitle
   evalml.utils.log_title
   evalml.utils.pad_with_nans
   evalml.utils.safe_repr
   evalml.utils.save_plot


Attributes Summary
~~~~~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.utils.SEED_BOUNDS


Contents
~~~~~~~~~~~~~~~~~~~

.. py:class:: classproperty(func)

   Allows a function to be accessed as a class-level property.

   Example:

   .. code-block:: python

      class LogisticRegressionBinaryPipeline(PipelineBase):
          component_graph = ['Simple Imputer', 'Logistic Regression Classifier']

          @classproperty
          def summary(cls):
              summary = ""
              for component in cls.component_graph:
                  component = handle_component_class(component)
                  summary += component.name + " + "
              return summary

      assert LogisticRegressionBinaryPipeline.summary == "Simple Imputer + Logistic Regression Classifier + "
      assert LogisticRegressionBinaryPipeline().summary == "Simple Imputer + Logistic Regression Classifier + "


.. py:function:: convert_to_seconds(input_str)

   Converts a string describing a length of time to its length in seconds.

   :param input_str: The string to be parsed and converted to seconds.
   :type input_str: str

   :returns: The length of the described time period, in seconds.
   :rtype: float

   :raises AssertionError: If an invalid unit is used.

   .. rubric:: Examples

   >>> assert convert_to_seconds("10 hr") == 36000.0
   >>> assert convert_to_seconds("30 minutes") == 1800.0
   >>> assert convert_to_seconds("2.5 min") == 150.0


.. py:function:: deprecate_arg(old_arg, new_arg, old_value, new_value)

   Helper to raise warnings when a deprecated argument is used.

   :param old_arg: Name of old/deprecated argument.
   :type old_arg: str
   :param new_arg: Name of new argument.
   :type new_arg: str
   :param old_value: Value the user passed in for the old argument.
   :type old_value: Any
   :param new_value: Value the user passed in for the new argument.
   :type new_value: Any

   :returns: old_value if not None, else new_value.


.. py:function:: downcast_nullable_types(data, ignore_null_cols=True)

   Downcasts IntegerNullable and BooleanNullable types to Double and Boolean in order to support certain estimators like ARIMA, CatBoost, and LightGBM.

   :param data: Feature data.
   :type data: pd.DataFrame, pd.Series
   :param ignore_null_cols: Whether to ignore downcasting columns with null values or not. Defaults to True.
   :type ignore_null_cols: bool

   :returns: DataFrame or Series initialized with logical type information, where IntegerNullable is cast as Double and BooleanNullable as Boolean.
   :rtype: data


.. py:function:: drop_rows_with_nans(*pd_data)

   Drop any row that contains a NaN in any of the given dataframes or series.

   :param \*pd_data: sequence of pd.Series or pd.DataFrame or None

   :returns: list of pd.DataFrame or pd.Series or None


.. py:function:: get_importable_subclasses(base_class, used_in_automl=True)

   Get importable subclasses of a base class. Used to list all of our estimators, transformers, components, and pipelines dynamically.

   :param base_class: Base class to find all of the subclasses for.
   :type base_class: abc.ABCMeta
   :param used_in_automl: Not all components/pipelines/estimators are used in automl search. If True, only include those subclasses that are used in the search.
      This would mean excluding classes related to ExtraTrees, ElasticNet, and Baseline estimators.

   :returns: List of subclasses.


.. py:function:: get_logger(name)

   Get the logger with the associated name.

   :param name: Name of the logger to get.
   :type name: str

   :returns: The logger object with the associated name.


.. py:function:: get_random_seed(random_state, min_bound=SEED_BOUNDS.min_bound, max_bound=SEED_BOUNDS.max_bound)

   Given a numpy.random.RandomState object, generate an int representing a seed value for another random number generator. Or, if given an int, return that int.

   To protect against invalid input to a particular library's random number generator, if an int value is provided and it is outside the bounds "[min_bound, max_bound)", the value will be projected into the range between min_bound (inclusive) and max_bound (exclusive) using modular arithmetic.

   :param random_state: random state
   :type random_state: int, numpy.random.RandomState
   :param min_bound: if not default of None, will be min bound when generating seed (inclusive). Must be less than max_bound.
   :type min_bound: None, int
   :param max_bound: if not default of None, will be max bound when generating seed (exclusive). Must be greater than min_bound.
   :type max_bound: None, int

   :returns: Seed for random number generator
   :rtype: int

   :raises ValueError: If boundaries are not valid.


.. py:function:: get_random_state(seed)

   Generates a numpy.random.RandomState instance using seed.

   :param seed: seed to use to generate numpy.random.RandomState. Must be between SEED_BOUNDS.min_bound and SEED_BOUNDS.max_bound, inclusive.
   :type seed: None, int, np.random.RandomState object

   :raises ValueError: If the input seed is not within the acceptable range.

   :returns: A numpy.random.RandomState instance.


.. py:function:: get_time_index(X: pandas.DataFrame, y: pandas.Series, time_index_name: str)

   Determines the column in the given data that should be used as the time index.
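The modular projection that ``get_random_seed`` performs can be sketched as follows. This is an illustrative, standalone snippet (the helper name ``project_seed`` is hypothetical), not evalml's implementation:

.. code-block:: python

   def project_seed(seed, min_bound, max_bound):
       """Wrap an integer seed into [min_bound, max_bound) using modular arithmetic."""
       if min_bound >= max_bound:
           raise ValueError("min_bound must be less than max_bound")
       span = max_bound - min_bound
       # Seeds below min_bound or at/above max_bound wrap around into the range;
       # in-range seeds are returned unchanged.
       return min_bound + (seed - min_bound) % span

For example, with bounds ``[0, 10)`` a seed of ``-1`` projects to ``9``, while an in-range seed such as ``5`` is returned unchanged.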
.. py:function:: import_or_raise(library, error_msg=None, warning=False)

   Attempts to import the requested library by name. If the import fails, raises an ImportError or warning.

   :param library: The name of the library.
   :type library: str
   :param error_msg: Error message to return if the import fails.
   :type error_msg: str
   :param warning: If True, import_or_raise gives a warning instead of ImportError. Defaults to False.
   :type warning: bool

   :returns: Returns the library if importing succeeded.

   :raises ImportError: If attempting to import the library fails because the library is not installed.
   :raises Exception: If importing the library fails for any other reason.


.. py:function:: infer_feature_types(data, feature_types=None)

   Create a Woodwork structure from the given list, pandas, or numpy input, with specified types for columns. If a column's type is not specified, it will be inferred by Woodwork.

   :param data: Input data to convert to a Woodwork data structure.
   :type data: pd.DataFrame, pd.Series
   :param feature_types: If data is a 2D structure, feature_types must be a dictionary mapping column names to the type of data represented in the column. If data is a 1D structure, then feature_types must be a Woodwork logical type or a string representing a Woodwork logical type ("Double", "Integer", "Boolean", "Categorical", "Datetime", "NaturalLanguage").
   :type feature_types: string, ww.logical_type obj, dict, optional

   :returns: A Woodwork data structure where the data type of each column was either specified or inferred.

   :raises ValueError: If there is a mismatch between the dataframe and the woodwork schema.


.. py:function:: is_all_numeric(df)

   Checks if the given DataFrame contains only numeric values.

   :param df: The DataFrame to check data types of.
   :type df: pd.DataFrame

   :returns: True if all the columns are numeric and are not missing any values, False otherwise.
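The optional-import pattern that ``import_or_raise`` describes can be sketched with the standard library as follows (a simplified illustration under a hypothetical name, not evalml's code):

.. code-block:: python

   import importlib
   import warnings


   def import_or_raise_sketch(library, error_msg=None, warning=False):
       """Import a library by name; on failure, warn or raise with a custom message."""
       try:
           return importlib.import_module(library)
       except ImportError:
           msg = error_msg or "Missing dependency '%s'." % library
           if warning:
               # Emit a warning and return None instead of raising.
               warnings.warn(msg)
               return None
           raise ImportError(msg)

Calling it with an installed module returns the module object; with a missing one it raises ``ImportError``, or only warns and returns None when ``warning=True``.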
.. py:function:: jupyter_check()

   Get whether or not the code is being run in an IPython environment (such as Jupyter Notebook or Jupyter Lab).

   :returns: True if IPython, False otherwise.
   :rtype: boolean


.. py:function:: log_subtitle(logger, title, underline='=')

   Log with a subtitle.


.. py:function:: log_title(logger, title)

   Log with a title.


.. py:function:: pad_with_nans(pd_data, num_to_pad)

   Pad the beginning of the data with num_to_pad rows of NaNs.

   :param pd_data: Data to pad.
   :type pd_data: pd.DataFrame or pd.Series
   :param num_to_pad: Number of NaN rows to pad.
   :type num_to_pad: int

   :returns: pd.DataFrame or pd.Series


.. py:function:: safe_repr(value)

   Convert the given value into a string that can safely be used for repr.

   :param value: The item to convert.

   :returns: String representation of the value.


.. py:function:: save_plot(fig, filepath=None, format='png', interactive=False, return_filepath=False)

   Saves fig to filepath if specified, or to a default location if not.

   :param fig: Figure to be saved.
   :type fig: Figure
   :param filepath: Location to save file. Default is with filename "test_plot".
   :type filepath: str or Path, optional
   :param format: Extension for figure to be saved as. Ignored if interactive is True and fig is of type plotly.Figure. Defaults to 'png'.
   :type format: str
   :param interactive: If True and fig is of type plotly.Figure, saves the fig as interactive instead of static, and format will be set to 'html'. Defaults to False.
   :type interactive: bool, optional
   :param return_filepath: Whether to return the final filepath the image is saved to. Defaults to False.
   :type return_filepath: bool, optional

   :returns: String representing the final filepath the image was saved to if return_filepath is set to True. Defaults to None.


.. py:data:: SEED_BOUNDS
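The padding behavior of ``pad_with_nans`` can be illustrated with plain Python lists (a hypothetical stand-in for the pandas-based helper, shown only to clarify the semantics):

.. code-block:: python

   import math


   def pad_front_with_nans(values, num_to_pad):
       """Prepend num_to_pad NaN entries to a sequence, mirroring pad_with_nans."""
       return [math.nan] * num_to_pad + list(values)

With ``num_to_pad=2``, a sequence ``[1.0, 2.0]`` becomes ``[nan, nan, 1.0, 2.0]``; the original values keep their relative order after the padding.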