column_selectors#
Initializes a transformer that selects specified columns in input data.
Module Contents#
Classes Summary#
ColumnSelector | Initializes a transformer that selects specified columns in input data.
DropColumns | Drops specified columns in input data.
SelectByType | Selects columns by specified Woodwork logical type or semantic tag in input data.
SelectColumns | Selects specified columns in input data.
Contents#
- class evalml.pipelines.components.transformers.column_selectors.ColumnSelector(columns=None, random_seed=0, **kwargs)[source]#
Initializes a transformer that selects specified columns in input data.
- Parameters
columns (list(string)) – List of column names, used to determine which columns to select.
random_seed (int) – Seed for the random number generator. Defaults to 0.
Attributes
modifies_features
True
modifies_target
False
training_only
False
Methods
clone – Constructs a new component with the same parameters and random state.
default_parameters – Returns the default parameters for this component.
describe – Describe a component and its parameters.
fit – Fits the transformer by checking if column names are present in the dataset.
fit_transform – Fits on X and transforms X.
load – Loads component at file path.
name – Returns string name of this component.
needs_fitting – Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances.
parameters – Returns the parameters which were used to initialize the component.
save – Saves component at file path.
transform – Transform data using fitted column selector component.
update_parameters – Updates the parameter dictionary of the component.
- clone(self)#
Constructs a new component with the same parameters and random state.
- Returns
A new instance of this component with identical parameters and random state.
- default_parameters(cls)#
Returns the default parameters for this component.
Our convention is that Component.default_parameters == Component().parameters.
- Returns
Default parameters for this component.
- Return type
dict
- describe(self, print_name=False, return_dict=False)#
Describe a component and its parameters.
- Parameters
print_name (bool, optional) – whether to print name of component
return_dict (bool, optional) – whether to return description as dictionary in the format {“name”: name, “parameters”: parameters}
- Returns
Returns dictionary if return_dict is True, else None.
- Return type
None or dict
- fit(self, X, y=None)[source]#
Fits the transformer by checking if column names are present in the dataset.
- Parameters
X (pd.DataFrame) – Data to check.
y (pd.Series, ignored) – Targets.
- Returns
self
- fit_transform(self, X, y=None)#
Fits on X and transforms X.
- Parameters
X (pd.DataFrame) – Data to fit and transform.
y (pd.Series) – Target data.
- Returns
Transformed X.
- Return type
pd.DataFrame
- Raises
MethodPropertyNotFoundError – If transformer does not have a transform method or a component_obj that implements transform.
- static load(file_path)#
Loads component at file path.
- Parameters
file_path (str) – Location to load file.
- Returns
ComponentBase object
- property name(cls)#
Returns string name of this component.
- needs_fitting(self)#
Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances.
This can be overridden to False for components that do not need to be fit or whose fit methods do nothing.
- Returns
True.
- property parameters(self)#
Returns the parameters which were used to initialize the component.
- save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)#
Saves component at file path.
- Parameters
file_path (str) – Location to save file.
pickle_protocol (int) – The pickle data stream format.
- transform(self, X, y=None)[source]#
Transform data using fitted column selector component.
- Parameters
X (pd.DataFrame) – The input training data of shape [n_samples, n_features].
y (pd.Series, optional) – The target training data of length [n_samples].
- Returns
Transformed data.
- Return type
pd.DataFrame
- update_parameters(self, update_dict, reset_fit=True)#
Updates the parameter dictionary of the component.
- Parameters
update_dict (dict) – A dict of parameters to update.
reset_fit (bool, optional) – If True, will set _is_fitted to False.
- class evalml.pipelines.components.transformers.column_selectors.DropColumns(columns=None, random_seed=0, **kwargs)[source]#
Drops specified columns in input data.
- Parameters
columns (list(string)) – List of column names, used to determine which columns to drop.
random_seed (int) – Seed for the random number generator. Defaults to 0.
Attributes
hyperparameter_ranges
{}
modifies_features
True
modifies_target
False
name
Drop Columns Transformer
needs_fitting
False
training_only
False
Methods
clone – Constructs a new component with the same parameters and random state.
default_parameters – Returns the default parameters for this component.
describe – Describe a component and its parameters.
fit – Fits the transformer by checking if column names are present in the dataset.
fit_transform – Fits on X and transforms X.
load – Loads component at file path.
parameters – Returns the parameters which were used to initialize the component.
save – Saves component at file path.
transform – Transforms data X by dropping columns.
update_parameters – Updates the parameter dictionary of the component.
- clone(self)#
Constructs a new component with the same parameters and random state.
- Returns
A new instance of this component with identical parameters and random state.
- default_parameters(cls)#
Returns the default parameters for this component.
Our convention is that Component.default_parameters == Component().parameters.
- Returns
Default parameters for this component.
- Return type
dict
- describe(self, print_name=False, return_dict=False)#
Describe a component and its parameters.
- Parameters
print_name (bool, optional) – whether to print name of component
return_dict (bool, optional) – whether to return description as dictionary in the format {“name”: name, “parameters”: parameters}
- Returns
Returns dictionary if return_dict is True, else None.
- Return type
None or dict
- fit(self, X, y=None)#
Fits the transformer by checking if column names are present in the dataset.
- Parameters
X (pd.DataFrame) – Data to check.
y (pd.Series, ignored) – Targets.
- Returns
self
- fit_transform(self, X, y=None)#
Fits on X and transforms X.
- Parameters
X (pd.DataFrame) – Data to fit and transform.
y (pd.Series) – Target data.
- Returns
Transformed X.
- Return type
pd.DataFrame
- Raises
MethodPropertyNotFoundError – If transformer does not have a transform method or a component_obj that implements transform.
- static load(file_path)#
Loads component at file path.
- Parameters
file_path (str) – Location to load file.
- Returns
ComponentBase object
- property parameters(self)#
Returns the parameters which were used to initialize the component.
- save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)#
Saves component at file path.
- Parameters
file_path (str) – Location to save file.
pickle_protocol (int) – The pickle data stream format.
- transform(self, X, y=None)[source]#
Transforms data X by dropping columns.
- Parameters
X (pd.DataFrame) – Data to transform.
y (pd.Series, optional) – Targets.
- Returns
Transformed X.
- Return type
pd.DataFrame
- update_parameters(self, update_dict, reset_fit=True)#
Updates the parameter dictionary of the component.
- Parameters
update_dict (dict) – A dict of parameters to update.
reset_fit (bool, optional) – If True, will set _is_fitted to False.
- class evalml.pipelines.components.transformers.column_selectors.SelectByType(column_types=None, exclude=False, random_seed=0, **kwargs)[source]#
Selects columns by specified Woodwork logical type or semantic tag in input data.
- Parameters
column_types (string, ww.LogicalType, list(string), list(ww.LogicalType)) – List of Woodwork types or tags, used to determine which columns to select or exclude.
exclude (bool) – If true, exclude the column_types instead of including them. Defaults to False.
random_seed (int) – Seed for the random number generator. Defaults to 0.
Attributes
hyperparameter_ranges
{}
modifies_features
True
modifies_target
False
name
Select Columns By Type Transformer
needs_fitting
False
training_only
False
Methods
clone – Constructs a new component with the same parameters and random state.
default_parameters – Returns the default parameters for this component.
describe – Describe a component and its parameters.
fit – Fits the transformer by checking if column names are present in the dataset.
fit_transform – Fits on X and transforms X.
load – Loads component at file path.
parameters – Returns the parameters which were used to initialize the component.
save – Saves component at file path.
transform – Transforms data X by selecting columns.
update_parameters – Updates the parameter dictionary of the component.
- clone(self)#
Constructs a new component with the same parameters and random state.
- Returns
A new instance of this component with identical parameters and random state.
- default_parameters(cls)#
Returns the default parameters for this component.
Our convention is that Component.default_parameters == Component().parameters.
- Returns
Default parameters for this component.
- Return type
dict
- describe(self, print_name=False, return_dict=False)#
Describe a component and its parameters.
- Parameters
print_name (bool, optional) – whether to print name of component
return_dict (bool, optional) – whether to return description as dictionary in the format {“name”: name, “parameters”: parameters}
- Returns
Returns dictionary if return_dict is True, else None.
- Return type
None or dict
- fit(self, X, y=None)[source]#
Fits the transformer by checking if column names are present in the dataset.
- Parameters
X (pd.DataFrame) – Data to check.
y (pd.Series, ignored) – Targets.
- Returns
self
- fit_transform(self, X, y=None)#
Fits on X and transforms X.
- Parameters
X (pd.DataFrame) – Data to fit and transform.
y (pd.Series) – Target data.
- Returns
Transformed X.
- Return type
pd.DataFrame
- Raises
MethodPropertyNotFoundError – If transformer does not have a transform method or a component_obj that implements transform.
- static load(file_path)#
Loads component at file path.
- Parameters
file_path (str) – Location to load file.
- Returns
ComponentBase object
- property parameters(self)#
Returns the parameters which were used to initialize the component.
- save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)#
Saves component at file path.
- Parameters
file_path (str) – Location to save file.
pickle_protocol (int) – The pickle data stream format.
- transform(self, X, y=None)[source]#
Transforms data X by selecting columns.
- Parameters
X (pd.DataFrame) – Data to transform.
y (pd.Series, optional) – Targets.
- Returns
Transformed X.
- Return type
pd.DataFrame
- update_parameters(self, update_dict, reset_fit=True)#
Updates the parameter dictionary of the component.
- Parameters
update_dict (dict) – A dict of parameters to update.
reset_fit (bool, optional) – If True, will set _is_fitted to False.
- class evalml.pipelines.components.transformers.column_selectors.SelectColumns(columns=None, random_seed=0, **kwargs)[source]#
Selects specified columns in input data.
- Parameters
columns (list(string)) – List of column names, used to determine which columns to select. If columns are not present, they will not be selected.
random_seed (int) – Seed for the random number generator. Defaults to 0.
Attributes
hyperparameter_ranges
{}
modifies_features
True
modifies_target
False
name
Select Columns Transformer
needs_fitting
False
training_only
False
Methods
clone – Constructs a new component with the same parameters and random state.
default_parameters – Returns the default parameters for this component.
describe – Describe a component and its parameters.
fit – Fits the transformer by checking if column names are present in the dataset.
fit_transform – Fits on X and transforms X.
load – Loads component at file path.
parameters – Returns the parameters which were used to initialize the component.
save – Saves component at file path.
transform – Transform data using fitted column selector component.
update_parameters – Updates the parameter dictionary of the component.
- clone(self)#
Constructs a new component with the same parameters and random state.
- Returns
A new instance of this component with identical parameters and random state.
- default_parameters(cls)#
Returns the default parameters for this component.
Our convention is that Component.default_parameters == Component().parameters.
- Returns
Default parameters for this component.
- Return type
dict
- describe(self, print_name=False, return_dict=False)#
Describe a component and its parameters.
- Parameters
print_name (bool, optional) – whether to print name of component
return_dict (bool, optional) – whether to return description as dictionary in the format {“name”: name, “parameters”: parameters}
- Returns
Returns dictionary if return_dict is True, else None.
- Return type
None or dict
- fit(self, X, y=None)[source]#
Fits the transformer by checking if column names are present in the dataset.
- Parameters
X (pd.DataFrame) – Data to check.
y (pd.Series, optional) – Targets.
- Returns
self
- fit_transform(self, X, y=None)#
Fits on X and transforms X.
- Parameters
X (pd.DataFrame) – Data to fit and transform.
y (pd.Series) – Target data.
- Returns
Transformed X.
- Return type
pd.DataFrame
- Raises
MethodPropertyNotFoundError – If transformer does not have a transform method or a component_obj that implements transform.
- static load(file_path)#
Loads component at file path.
- Parameters
file_path (str) – Location to load file.
- Returns
ComponentBase object
- property parameters(self)#
Returns the parameters which were used to initialize the component.
- save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)#
Saves component at file path.
- Parameters
file_path (str) – Location to save file.
pickle_protocol (int) – The pickle data stream format.
- transform(self, X, y=None)#
Transform data using fitted column selector component.
- Parameters
X (pd.DataFrame) – The input training data of shape [n_samples, n_features].
y (pd.Series, optional) – The target training data of length [n_samples].
- Returns
Transformed data.
- Return type
pd.DataFrame
- update_parameters(self, update_dict, reset_fit=True)#
Updates the parameter dictionary of the component.
- Parameters
update_dict (dict) – A dict of parameters to update.
reset_fit (bool, optional) – If True, will set _is_fitted to False.