gen_utils
Module Contents
Classes Summary
classproperty | Allows a function to be accessed as a class-level property.
Functions
convert_to_seconds | Converts a string describing a length of time to its length in seconds.
deprecate_arg | Helper to raise warnings when a deprecated arg is used.
drop_rows_with_nans | Drop rows that have a NaN in any of the given DataFrames or Series.
get_importable_subclasses | Get importable subclasses of a base class. Used to list all of our estimators, transformers, components and pipelines dynamically.
get_random_seed | Given a numpy.random.RandomState object, generate an int representing a seed value for another random number generator. Or, if given an int, return that int.
get_random_state | Generates a numpy.random.RandomState instance using seed.
import_or_raise | Attempts to import the requested library by name.
is_all_numeric | Checks if the given DataFrame contains only numeric values.
jupyter_check | Get whether or not the code is being run in an IPython environment (such as Jupyter Notebook or Jupyter Lab).
pad_with_nans | Pad the beginning num_to_pad rows with NaNs.
safe_repr | Convert the given value into a string that can safely be used for repr.
save_plot | Saves fig to filepath if specified, or to a default location if not.
Attributes Summary
Contents
-
class evalml.utils.gen_utils.classproperty(func)
Allows a function to be accessed as a class-level property.
Example:
    class LogisticRegressionBinaryPipeline(PipelineBase):
        component_graph = ['Simple Imputer', 'Logistic Regression Classifier']

        @classproperty
        def summary(cls):
            summary = ""
            for component in cls.component_graph:
                component = handle_component_class(component)
                summary += component.name + " + "
            return summary

    assert LogisticRegressionBinaryPipeline.summary == "Simple Imputer + Logistic Regression Classifier + "
    assert LogisticRegressionBinaryPipeline().summary == "Simple Imputer + Logistic Regression Classifier + "
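One common way to build a decorator like this is a small descriptor whose `__get__` passes the owning class to the wrapped function. The sketch below illustrates that idea only; it is not necessarily how evalml implements `classproperty`, and the names `classproperty_sketch` and `Pipeline` are hypothetical.

```python
class classproperty_sketch:
    """Hypothetical stand-in for classproperty (not evalml's code)."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        # Invoked for both Class.attr and Class().attr; the class is
        # always what gets passed to the wrapped function.
        return self.func(owner)


class Pipeline:
    component_graph = ["Simple Imputer", "Logistic Regression Classifier"]

    @classproperty_sketch
    def summary(cls):
        return " + ".join(cls.component_graph)


class_level = Pipeline.summary       # accessed on the class
instance_level = Pipeline().summary  # accessed on an instance
```

Both accesses give the same string because the descriptor ignores the instance and works from the class alone.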
-
evalml.utils.gen_utils.convert_to_seconds(input_str)
Converts a string describing a length of time to its length in seconds.
-
evalml.utils.gen_utils.deprecate_arg(old_arg, new_arg, old_value, new_value)
Helper to raise warnings when a deprecated arg is used.
- Parameters
old_arg (str) – Name of old/deprecated argument.
new_arg (str) – Name of new argument.
old_value (Any) – Value the user passed in for the old argument.
new_value (Any) – Value the user passed in for the new argument.
- Returns
old_value if not None, else new_value
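The documented contract (warn when the old argument is used, then return `old_value` if it is not None, else `new_value`) can be sketched with the standard `warnings` module. This is an illustration under assumptions, not evalml's source: the function name `deprecate_arg_sketch`, the warning category, the message wording, and the argument names `random_state` and `random_seed` are all hypothetical.

```python
import warnings


def deprecate_arg_sketch(old_arg, new_arg, old_value, new_value):
    """Hypothetical re-creation of the documented behaviour."""
    if old_value is not None:
        # Assumption: category and message text are illustrative only.
        warnings.warn(
            f"Argument '{old_arg}' is deprecated; use '{new_arg}' instead.",
            FutureWarning,
        )
        return old_value
    return new_value


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Old argument supplied: warns and returns the old value.
    from_old = deprecate_arg_sketch("random_state", "random_seed", 7, 0)
    # Old argument left as None: no warning, the new value is returned.
    from_new = deprecate_arg_sketch("random_state", "random_seed", None, 3)
```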
-
evalml.utils.gen_utils.drop_rows_with_nans(*pd_data)
Drop rows that have a NaN in any of the given DataFrames or Series.
- Parameters
*pd_data (sequence of pd.Series or pd.DataFrame or None) –
- Returns
list of pd.DataFrame or pd.Series or None
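A minimal sketch of the idea in plain pandas, assuming the intended behaviour is that a row is removed from every input as soon as any input has a NaN in that row. The variable names are hypothetical and the code does not call evalml.

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})
y = pd.Series([0.0, 1.0, np.nan])

# A row survives only if it has no NaN in X and no NaN in y.
keep = X.notna().all(axis=1) & y.notna()
X_clean, y_clean = X[keep], y[keep]
```

Row 1 is dropped because of the NaN in `X`, and row 2 because of the NaN in `y`, so only row 0 remains in both outputs.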
-
evalml.utils.gen_utils.get_importable_subclasses(base_class, used_in_automl=True)
Get importable subclasses of a base class. Used to list all of our estimators, transformers, components and pipelines dynamically.
- Parameters
base_class (abc.ABCMeta) – Base class to find all of the subclasses for.
args (list) – Args used to instantiate the subclass. [{}] for a pipeline, and [] for all other classes.
used_in_automl (bool) – Not all components/pipelines/estimators are used in automl search. If True, only include those subclasses that are used in the search. This would mean excluding classes related to ExtraTrees, ElasticNet, and Baseline estimators. Defaults to True.
- Returns
List of subclasses.
-
evalml.utils.gen_utils.get_random_seed(random_state, min_bound=SEED_BOUNDS.min_bound, max_bound=SEED_BOUNDS.max_bound)
Given a numpy.random.RandomState object, generate an int representing a seed value for another random number generator. Or, if given an int, return that int.
To protect against invalid input to a particular library’s random number generator, if an int value is provided, and it is outside the bounds “[min_bound, max_bound)”, the value will be projected into the range between the min_bound (inclusive) and max_bound (exclusive) using modular arithmetic.
- Parameters
random_state (int, numpy.random.RandomState) – An int to use as the seed, or a RandomState from which to generate one.
min_bound (None, int) – Minimum bound (inclusive) for the generated seed. Defaults to SEED_BOUNDS.min_bound. Must be less than max_bound.
max_bound (None, int) – Maximum bound (exclusive) for the generated seed. Defaults to SEED_BOUNDS.max_bound. Must be greater than min_bound.
- Returns
seed for random number generator
- Return type
int
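The projection of an out-of-range int into "[min_bound, max_bound)" can be sketched with modular arithmetic as follows. This is one plausible formula, not a copy of evalml's code: the helper name `project_seed` and the bounds 0 and 10 are hypothetical, and the RandomState branch is left out.

```python
def project_seed(seed, min_bound, max_bound):
    """Hypothetical helper: wrap an int seed into [min_bound, max_bound)."""
    if not min_bound < max_bound:
        raise ValueError("min_bound must be less than max_bound")
    if min_bound <= seed < max_bound:
        return seed  # already valid, returned unchanged
    # Python's % yields a non-negative result for a positive modulus,
    # so negative seeds wrap into the range as well.
    return (seed - min_bound) % (max_bound - min_bound) + min_bound


in_range = project_seed(5, 0, 10)   # stays 5
too_big = project_seed(12, 0, 10)   # wraps to 2
negative = project_seed(-1, 0, 10)  # wraps to 9
```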
-
evalml.utils.gen_utils.get_random_state(seed)
Generates a numpy.random.RandomState instance using seed.
- Parameters
seed (None, int, np.random.RandomState object) – seed to use to generate numpy.random.RandomState. Must be between SEED_BOUNDS.min_bound and SEED_BOUNDS.max_bound, inclusive. Otherwise, an exception will be thrown.
-
evalml.utils.gen_utils.import_or_raise(library, error_msg=None, warning=False)
Attempts to import the requested library by name. If the import fails, raises an ImportError or issues a warning.
- Parameters
library (str) – the name of the library
error_msg (str) – error message to return if the import fails
warning (bool) – if True, import_or_raise gives a warning instead of ImportError. Defaults to False.
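The behaviour described by these parameters can be sketched with `importlib` from the standard library. This is an illustration, not evalml's source: the name `import_or_raise_sketch`, the fallback message text, and returning the imported module (or None after a warning) are assumptions.

```python
import importlib
import warnings


def import_or_raise_sketch(library, error_msg=None, warning=False):
    """Hypothetical re-creation of the documented behaviour."""
    try:
        return importlib.import_module(library)
    except ImportError:
        # Assumption: the fallback message wording is illustrative.
        message = error_msg or f"Missing optional dependency '{library}'."
        if warning:
            warnings.warn(message)
            return None
        raise ImportError(message)


# A library that exists is simply imported and returned.
json_module = import_or_raise_sketch("json")

# A missing library only produces a warning when warning=True.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    missing = import_or_raise_sketch("no_such_library_xyz", warning=True)
```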
-
evalml.utils.gen_utils.is_all_numeric(df)
Checks if the given DataFrame contains only numeric values.
- Parameters
df (pd.DataFrame) – The DataFrame to check data types of.
- Returns
True if all the columns are numeric and are not missing any values, False otherwise.
-
evalml.utils.gen_utils.jupyter_check()
Get whether or not the code is being run in an IPython environment (such as Jupyter Notebook or Jupyter Lab).
- Parameters
None –
- Returns
True if running in an IPython environment, False otherwise
- Return type
bool
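A check of this kind is often written by asking IPython whether an interactive shell is active. The sketch below assumes that approach and is not taken from evalml; the function name `in_ipython_sketch` is hypothetical.

```python
def in_ipython_sketch():
    """Hypothetical check: True only when an IPython shell is active."""
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so this cannot be an IPython session.
        return False
    # get_ipython() returns None when no interactive shell is running.
    return get_ipython() is not None


running_in_ipython = in_ipython_sketch()
```

Run as a plain script this evaluates to False; inside a Jupyter kernel it would be expected to evaluate to True.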
-
evalml.utils.gen_utils.logger
-
evalml.utils.gen_utils.pad_with_nans(pd_data, num_to_pad)
Pad the beginning num_to_pad rows with NaNs.
- Parameters
pd_data (pd.DataFrame or pd.Series) – Data to pad.
num_to_pad (int) – Number of NaN rows to add at the beginning.
- Returns
pd.DataFrame or pd.Series
-
evalml.utils.gen_utils.safe_repr(value)
Convert the given value into a string that can safely be used for repr.
- Parameters
value – the item to convert
- Returns
String representation of the value
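One reason a plain `repr` can be unsafe is that `repr(float('nan'))` is `nan` and `repr(float('inf'))` is `inf`, neither of which is a valid Python expression. The sketch below assumes that this is the concern being addressed; it is not evalml's implementation, and the name `safe_repr_sketch` and the exact output strings are hypothetical.

```python
import math


def safe_repr_sketch(value):
    """Hypothetical: return a repr string that evaluates back to the value."""
    if isinstance(value, float):
        if math.isnan(value):
            return "float('nan')"
        if math.isinf(value):
            return "float('inf')" if value > 0 else "float('-inf')"
    # Everything else falls back to the built-in repr.
    return repr(value)


nan_text = safe_repr_sketch(float("nan"))
inf_text = safe_repr_sketch(float("-inf"))
plain_text = safe_repr_sketch("abc")
```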
-
evalml.utils.gen_utils.save_plot(fig, filepath=None, format='png', interactive=False, return_filepath=False)
Saves fig to filepath if specified, or to a default location if not.
- Parameters
fig (Figure) – Figure to be saved.
filepath (str or Path, optional) – Location to save the file. If not specified, the file is saved under the default filename “test_plot”.
format (str) – Extension for figure to be saved as. Ignored if interactive is True and fig is of type plotly.Figure. Defaults to 'png'.
interactive (bool, optional) – If True and fig is of type plotly.Figure, saves the fig as interactive instead of static, and format will be set to 'html'. Defaults to False.
return_filepath (bool, optional) – Whether to return the final filepath the image is saved to. Defaults to False.
- Returns
String representing the final filepath the image was saved to if return_filepath is True; otherwise None.
-
evalml.utils.gen_utils.SEED_BOUNDS