evalml.guardrails.detect_outliers
evalml.guardrails.detect_outliers(X, random_state=0)
Checks a dataframe for outliers in two steps: an Isolation Forest first assigns an anomaly score to each index, then the interquartile range (IQR) of those scores is used to identify anomalous scores. Indices with anomalous scores are considered outliers.
- Parameters
X (pd.DataFrame) – features
random_state (int) – seed for the random number generator. Defaults to 0.
- Returns
A list of indices that may have outlier data.
Example
>>> import pandas as pd
>>> from evalml.guardrails import detect_outliers
>>> df = pd.DataFrame({
...     'x': [1, 2, 3, 40, 5],
...     'y': [6, 7, 8, 990, 10],
...     'z': [-1, -2, -3, -1201, -4]
... })
>>> detect_outliers(df)
[3]
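The two-step idea in the description (score each row, then apply IQR to the scores) can be sketched roughly as below. This is not evalml's source code. The helper names (`iqr_bounds`, `score_outlier_indices`, `isolation_forest_scores`, `detect_outliers_sketch`), the 1.5 × IQR multiplier, and the use of scikit-learn's `decision_function` as the anomaly score are all assumptions made for illustration. The sample scores are made up so that the IQR step can be followed by hand.

```python
import statistics


def iqr_bounds(scores, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional Tukey multiplier, assumed here.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def score_outlier_indices(scores, k=1.5):
    """Return positions whose score falls outside the IQR fences."""
    lower, upper = iqr_bounds(scores, k)
    return [i for i, s in enumerate(scores) if s < lower or s > upper]


def isolation_forest_scores(X, random_state=0):
    """Score each row with an Isolation Forest (lower means more anomalous)."""
    # Imported here so the IQR helpers above run without scikit-learn installed.
    from sklearn.ensemble import IsolationForest

    clf = IsolationForest(random_state=random_state)
    clf.fit(X)
    return clf.decision_function(X).tolist()


def detect_outliers_sketch(X, random_state=0):
    """Hypothetical end-to-end version: forest scores, then IQR on the scores."""
    return score_outlier_indices(isolation_forest_scores(X, random_state))


# Made-up anomaly scores, one per row; row 3 is scored far lower than the rest.
scores = [0.10, 0.12, 0.11, -0.30, 0.09]
flagged = score_outlier_indices(scores)
print(flagged)  # [3]
```

Note that this sketch returns row positions rather than index labels, and applies the fences to the scores rather than to the raw feature values, so a row is flagged only when its score is unusual relative to the other rows' scores.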