natural_language_featurizer
============================================================================================

.. py:module:: evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer

.. autoapi-nested-parse::

   Transformer that can automatically featurize text columns using featuretools' nlp_primitives.


Module Contents
---------------

Classes Summary
~~~~~~~~~~~~~~~

.. autoapisummary::

   evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer


Contents
~~~~~~~~~~~~~~~~~~~

.. py:class:: NaturalLanguageFeaturizer(random_seed=0, **kwargs)

   Transformer that can automatically featurize text columns using featuretools' nlp_primitives.

   Since models cannot handle non-numeric data, any text must be broken down into features that provide useful information about that text. This component splits each text column into several informative features: Diversity Score, Mean Characters per Word, Polarity Score, LSA (Latent Semantic Analysis), Number of Characters, and Number of Words. Calling transform on this component will replace any text columns in the given dataset with these numeric columns.

   :param random_seed: Seed for the random number generator. Defaults to 0.
   :type random_seed: int

   **Attributes**

   .. list-table::
      :widths: 15 85
      :header-rows: 0

      * - **hyperparameter_ranges**
        - {}
      * - **modifies_features**
        - True
      * - **modifies_target**
        - False
      * - **name**
        - Natural Language Featurizer
      * - **training_only**
        - False

   **Methods**

   .. autoapisummary::
      :nosignatures:

      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.clone
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.default_parameters
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.describe
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.fit
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.fit_transform
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.load
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.needs_fitting
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.parameters
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.save
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.transform
      evalml.pipelines.components.transformers.preprocessing.natural_language_featurizer.NaturalLanguageFeaturizer.update_parameters

   .. py:method:: clone(self)

      Constructs a new component with the same parameters and random state.

      :returns: A new instance of this component with identical parameters and random state.

   .. py:method:: default_parameters(cls)

      Returns the default parameters for this component.

      Our convention is that Component.default_parameters == Component().parameters.

      :returns: Default parameters for this component.
      :rtype: dict

   .. py:method:: describe(self, print_name=False, return_dict=False)

      Describe a component and its parameters.

      :param print_name: whether to print name of component
      :type print_name: bool, optional
      :param return_dict: whether to return description as dictionary in the format {"name": name, "parameters": parameters}
      :type return_dict: bool, optional

      :returns: Returns dictionary if return_dict is True, else None.
      :rtype: None or dict

   .. py:method:: fit(self, X, y=None)

      Fits component to data.

      :param X: The input training data of shape [n_samples, n_features]
      :type X: pd.DataFrame or np.ndarray
      :param y: The target training data of length [n_samples]
      :type y: pd.Series

      :returns: self

   .. py:method:: fit_transform(self, X, y=None)

      Fits on X and transforms X.

      :param X: Data to fit and transform.
      :type X: pd.DataFrame
      :param y: Target data.
      :type y: pd.Series

      :returns: Transformed X.
      :rtype: pd.DataFrame

      :raises MethodPropertyNotFoundError: If transformer does not have a transform method or a component_obj that implements transform.

   .. py:method:: load(file_path)
      :staticmethod:

      Loads component at file path.

      :param file_path: Location to load file.
      :type file_path: str

      :returns: ComponentBase object

   .. py:method:: needs_fitting(self)

      Returns boolean determining if component needs fitting before calling predict, predict_proba, transform, or feature_importances.

      This can be overridden to False for components that do not need to be fit or whose fit methods do nothing.

      :returns: True.

   .. py:method:: parameters(self)
      :property:

      Returns the parameters which were used to initialize the component.

   .. py:method:: save(self, file_path, pickle_protocol=cloudpickle.DEFAULT_PROTOCOL)

      Saves component at file path.

      :param file_path: Location to save file.
      :type file_path: str
      :param pickle_protocol: The pickle data stream format.
      :type pickle_protocol: int

   .. py:method:: transform(self, X, y=None)

      Transforms data X by creating new features using existing text columns.

      :param X: The data to transform.
      :type X: pd.DataFrame
      :param y: Ignored.
      :type y: pd.Series, optional

      :returns: Transformed X
      :rtype: pd.DataFrame

   .. py:method:: update_parameters(self, update_dict, reset_fit=True)

      Updates the parameter dictionary of the component.

      :param update_dict: A dict of parameters to update.
      :type update_dict: dict
      :param reset_fit: If True, will set `_is_fitted` to False.
      :type reset_fit: bool, optional
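
To make the idea of replacing a text column with numeric columns more concrete, the sketch below computes rough analogues of four of the features named in the class description (Number of Characters, Number of Words, Mean Characters per Word, Diversity Score) using only the Python standard library. It is not evalml or nlp_primitives code: the function name ``simple_text_features``, the whitespace tokenization, and the definition of diversity as unique lowercase words divided by total words are all assumptions made for illustration, and the library's actual definitions may differ. Polarity Score and LSA are left out because they need a sentiment lexicon and a fitted decomposition, respectively.

.. code-block:: python

   def simple_text_features(text):
       """Return a few numeric features for one text value (illustrative only)."""
       words = text.split()  # assumption: split on whitespace
       num_words = len(words)
       features = {
           "num_characters": len(text),
           "num_words": num_words,
           "mean_chars_per_word": 0.0,
           "diversity_score": 0.0,
       }
       if num_words > 0:
           # Average word length, ignoring the whitespace between words.
           features["mean_chars_per_word"] = sum(len(w) for w in words) / num_words
           # Assumed definition: share of distinct (case-insensitive) words.
           features["diversity_score"] = len({w.lower() for w in words}) / num_words
       return features


   # Hypothetical text column: each string becomes one row of numeric features.
   rows = ["the cat sat on the mat", "hello world"]
   features = [simple_text_features(r) for r in rows]
   for row, feats in zip(rows, features):
       print(repr(row), feats)

Each input string maps to a fixed-length set of numbers, which is the property a downstream estimator needs. The real component additionally requires fitting (``needs_fitting`` returns True), which a purely per-row sketch like this one does not capture.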