evalml.data_checks.TargetLeakageDataCheck.__init__
TargetLeakageDataCheck.__init__(pct_corr_threshold=0.95, method='mutual')
Check if any of the features are highly correlated with the target by using mutual information or Pearson correlation.
If method='mutual', this data check uses mutual information and supports all target and feature types. If method='pearson', it uses Pearson correlation and supports only binary targets with numeric and boolean dtypes. Pearson correlation returns a value in [-1, 1], while mutual information returns a value in [0, 1].
- Parameters
pct_corr_threshold (float) – The correlation threshold to be considered leakage. Defaults to 0.95.
method (string) – The method to determine correlation. Use 'mutual' for mutual information or 'pearson' for Pearson correlation. Defaults to 'mutual'.
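The role of pct_corr_threshold can be illustrated with a minimal, standard-library sketch of a Pearson-style check. This is not evalml's implementation: the helper names `pearson` and `flag_leaky_features` are hypothetical, and two choices here are assumptions made for illustration: comparing the absolute value of the correlation (since Pearson ranges over [-1, 1]) and using "greater than or equal to" against the threshold.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / math.sqrt(var_x * var_y)


def flag_leaky_features(features, target, pct_corr_threshold=0.95):
    """Return names of features whose |correlation| with the target meets the threshold.

    Hypothetical helper: taking the absolute value and using >= are
    assumptions of this sketch, not documented evalml behavior.
    """
    return [
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= pct_corr_threshold
    ]


# Toy binary target with three numeric features.
target = [0, 1, 0, 1, 1, 0]
features = {
    "copy_of_target": [0, 1, 0, 1, 1, 0],   # identical to the target
    "inverted_target": [1, 0, 1, 0, 0, 1],  # perfectly anti-correlated
    "noise": [3, 1, 4, 1, 5, 9],            # only weakly related
}

leaky = flag_leaky_features(features, target)
print(leaky)
```

In this sketch, lowering pct_corr_threshold flags more features, so the default of 0.95 flags only columns that nearly reproduce the target.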