multicollinearity_data_check

Module Contents

Classes Summary

MulticollinearityDataCheck

Check if any set of features are likely to be multicollinear.

Contents

class evalml.data_checks.multicollinearity_data_check.MulticollinearityDataCheck(threshold=0.9)[source]

Check if any set of features are likely to be multicollinear.

Parameters

threshold (float) – The threshold at which a set of features is considered likely to be multicollinear. Defaults to 0.9.
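To illustrate how a threshold of this kind behaves, here is a minimal standard-library sketch. It is not evalml's implementation: this page does not state which association measure the check uses, so absolute Pearson correlation is swapped in as a stand-in. The helpers pearson and flag_related_pairs are hypothetical names, and the inclusive comparison (>=) is also an assumption.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (stand-in measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear association with anything
    return cov / sqrt(var_x * var_y)


def flag_related_pairs(columns, threshold=0.9):
    """Hypothetical helper: return column-name pairs whose absolute score is >= threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        if abs(pearson(col_a, col_b)) >= threshold:
            flagged.append((name_a, name_b))
    return flagged


columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],  # exact multiple of "a"
    "c": [5, 1, 4, 2, 3],   # only weakly related to "a" and "b"
}
flagged = flag_related_pairs(columns, threshold=0.9)
print(flagged)  # [('a', 'b')]
```

Lowering the threshold flags more pairs, while raising it toward 1.0 leaves only near-duplicate columns.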

Methods

name

Returns a name describing the data check.

validate

Check if any set of features are likely to be multicollinear.

name(cls)

Returns a name describing the data check.

validate(self, X, y=None)[source]

Check if any set of features are likely to be multicollinear.

Parameters

X (pd.DataFrame, np.ndarray) – The input features to check.

y – The target data. Optional; defaults to None.

Returns

A dict containing a DataCheckWarning if any columns are potentially multicollinear.

Return type

dict