evalml.guardrails.detect_highly_null
evalml.guardrails.detect_highly_null(X, percent_threshold=0.95)
Checks if there are any highly-null columns in a dataframe.
- Parameters
X (pd.DataFrame) – features
percent_threshold (float) – Minimum fraction of null values (between 0 and 1) for a column to be considered “highly-null”. Defaults to 0.95.
- Returns
A dictionary mapping the name or index of each highly-null column to its fraction of null values
Example
>>> import pandas as pd
>>> from evalml.guardrails import detect_highly_null
>>> df = pd.DataFrame({
...     'lots_of_null': [None, None, None, None, 5],
...     'no_null': [1, 2, 3, 4, 5]
... })
>>> detect_highly_null(df, percent_threshold=0.8)
{'lots_of_null': 0.8}
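
For readers who want to see how a check like this might work, here is a minimal sketch written with plain pandas. It is not evalml's source code: the function name `sketch_detect_highly_null` is hypothetical, and the inclusive `>=` comparison is an assumption inferred from the documented example, where a column that is exactly 80% null is reported at a threshold of 0.8.

```python
import pandas as pd


def sketch_detect_highly_null(X, percent_threshold=0.95):
    """Hypothetical stand-in for illustration; not evalml's implementation."""
    # isnull() marks each missing cell as True; the column-wise mean of
    # those booleans is the fraction of null values per column.
    null_fraction = X.isnull().mean()
    # Assumption: the threshold is inclusive, matching the documented example.
    highly_null = null_fraction[null_fraction >= percent_threshold]
    # Convert the filtered Series to a {column: fraction} dictionary.
    return highly_null.to_dict()


df = pd.DataFrame({
    'lots_of_null': [None, None, None, None, 5],
    'no_null': [1, 2, 3, 4, 5],
})
result = sketch_detect_highly_null(df, percent_threshold=0.8)
print(result)
```

With the default threshold of 0.95, the same dataframe would yield an empty dictionary under this sketch, because the only column containing nulls is 80% null.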