evalml.pipelines.components.PerColumnImputer.__init__

PerColumnImputer.__init__(impute_strategies=None, default_impute_strategy='most_frequent', random_seed=0, **kwargs)

Initializes a transformer that imputes missing data according to the specified imputation strategy per column.

Parameters

impute_strategies (dict) –
Dictionary mapping each column to a {“impute_strategy”: strategy, “fill_value”: value} dictionary. Valid values for impute_strategy include “mean”, “median”, “most_frequent”, “constant” for numerical data, and “most_frequent”, “constant” for object data types. Defaults to None, in which case default_impute_strategy (“most_frequent” unless overridden) is used for all columns.

When impute_strategy == “constant”, fill_value is used to replace missing data. fill_value defaults to 0 when imputing numerical data and to “missing_value” for strings or object data types.
default_impute_strategy (str) – Impute strategy to fall back on when none is provided for a given column. Valid values include “mean”, “median”, “most_frequent”, “constant” for numerical data, and “most_frequent”, “constant” for object data types. Defaults to “most_frequent”.
random_seed (int) – Seed for the random number generator. Defaults to 0.
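
The way impute_strategies and default_impute_strategy interact can be illustrated with a small, library-free sketch. This is not evalml's implementation: impute_column and impute_per_column are hypothetical helpers written only to mirror the behavior documented above, missing values are assumed to be represented as None, and the column names are made up. The impute_strategies dictionary in the example has the same shape as the one described for PerColumnImputer.

```python
from statistics import mean, median, mode


def impute_column(values, impute_strategy="most_frequent", fill_value=None):
    """Hypothetical helper (not part of evalml): fill None entries in one column."""
    present = [v for v in values if v is not None]
    if impute_strategy == "constant":
        if fill_value is None:
            # Documented defaults: 0 for numerical data,
            # "missing_value" for string/object data.
            numeric = all(isinstance(v, (int, float)) for v in present)
            fill_value = 0 if numeric else "missing_value"
        fill = fill_value
    elif impute_strategy == "mean":
        fill = mean(present)
    elif impute_strategy == "median":
        fill = median(present)
    elif impute_strategy == "most_frequent":
        fill = mode(present)
    else:
        raise ValueError(f"Unknown impute strategy: {impute_strategy!r}")
    return [fill if v is None else v for v in values]


def impute_per_column(data, impute_strategies=None,
                      default_impute_strategy="most_frequent"):
    """Hypothetical helper: apply a per-column strategy, with a fallback."""
    impute_strategies = impute_strategies or {}
    result = {}
    for column, values in data.items():
        settings = impute_strategies.get(column, {})
        # Columns without an entry use default_impute_strategy.
        strategy = settings.get("impute_strategy", default_impute_strategy)
        result[column] = impute_column(values, strategy, settings.get("fill_value"))
    return result


# Made-up example data; None marks a missing value.
data = {
    "age": [20, None, 40],
    "city": ["Oslo", None, "Oslo"],
    "score": [1.0, None, 3.0],
}

# Same shape as the impute_strategies parameter described above.
impute_strategies = {
    "age": {"impute_strategy": "mean"},
    "score": {"impute_strategy": "constant", "fill_value": -1.0},
}

imputed = impute_per_column(data, impute_strategies)
print(imputed)
```

Here "age" is filled with its mean, "score" with the supplied constant, and "city", which has no entry in impute_strategies, falls back to the default "most_frequent" strategy.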
